Re: [DISCUSS] GitBox

2018-05-16 Thread Chesnay Schepler

I couldn't find any such setting in one of my repos :(

On 16.05.2018 21:03, Kenneth Knowles wrote:

When I open a pull request to Beam, it is on by default. I have just run an
experiment to see if it is remembering the last option I checked and it is
not. Even after I disable it for one pull request, the next one has it
checked again. So it may be a repository-level setting that you can set up.

Kenn

On Wed, May 16, 2018 at 11:19 AM Chesnay Schepler 
wrote:


This however has to be enabled by the contributor, separately for each PR.
We'll see how often we get the opportunity to use it.

On 16.05.2018 17:43, Kenneth Knowles wrote:

Actually, GitHub has a feature so you do not require picture-perfect
commits:


https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/

If the owner of the PR checks the box, it will give committers write

access

to their branch (on their fork). A nice bonus is you can make the changes
and then continue the review, too.

Kenn

On Wed, May 16, 2018 at 8:31 AM Stefan Richter <

s.rich...@data-artisans.com>

wrote:


+1


On 16.05.2018 at 12:40, Chesnay Schepler wrote:

Hello,

during the discussion about how to better manage pull requests [1] the

topic of GitBox integration came up again.

This seems like a good opportunity to restart this discussion that we

had about a year ago [2].

* What is GitBox

Essentially, GitBox allows us to use GitHub features.
We can decide for ourselves which features we want enabled.

We could merge PRs directly on GitHub at the click of a button.
That said, the merge functionality is fairly limited and would
require picture-perfect commits in the pull requests.
Commits can be squashed, but you cannot amend commits in any way, be
it fixing typos or changing the commit message. Realistically this
limits how much we can use this feature, and it may lead to a
decline in the quality of commit messages.

Labels can be useful for the management of PRs (such as "ready for review",
"delayed for next release", "waiting for changes"). This is really what
I'm personally most interested in.

We've been using GitBox for flink-shaded for a while now and I
didn't run into any issues. AFAIK GitBox is also the default for new
projects.

* What this means for committers:

The Apache git remote URL will change, which will require all
committers to update their git setup.
This also implies that we may have to update the website build
scripts.

The new URL would (probably) be
https://gitbox.apache.org/repos/asf/flink.git.

To make use of GitHub features you have to link your GitHub and
Apache accounts. [3]
This also requires setting up two-factor authentication on GitHub.

Update the scm entry in the parent pom.xml.

* What this means for contributors:

Nothing should change for contributors. Small changes (like typos)
may be merged more quickly, if the commit message is appropriate, as
it could be done directly through GitHub.

[1]

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html

[2]

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html

[3] https://gitbox.apache.org/setup/






Re: CloudWatch Metrics Reporter

2018-05-16 Thread Bowen Li
To @Chesnay's question on AWS libraries: yes, it would require using
aws-java-sdk-core and aws-java-sdk-cloudwatch.
Both are licensed under the Apache License 2.0. This is different from
flink-connector-kinesis's Kinesis Producer/Consumer Libraries, which use
the Amazon Software License. Thus Flink can build a CloudWatch reporter
module and publish it to Maven.

However, I'm neutral on developing a CloudWatch reporter module itself.
In my own experience with CloudWatch, it's fine for metrics of other AWS
services, but because of its design limitations it's neither handy nor
friendly for metrics of external services. I didn't use CloudWatch even
when I ran Flink on AWS EMR, and I'm not sure how popular it is in the
Flink community. Besides, it takes an ongoing commitment to maintain.
Thus, we can give it more thought if we decide to build a CloudWatch
reporter module.
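
For reference, a minimal sketch of what such a reporter could look like, assuming
the aws-java-sdk-cloudwatch v1 client; the namespace, metric naming, and batching
below are illustrative placeholders, not a worked-out design:

import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;
import org.apache.flink.metrics.reporter.Scheduled;

import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.MetricDatum;
import com.amazonaws.services.cloudwatch.model.PutMetricDataRequest;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CloudWatchReporter implements MetricReporter, Scheduled {

    private final Map<String, Metric> metrics = new ConcurrentHashMap<>();
    private AmazonCloudWatch client;
    private String namespace;

    @Override
    public void open(MetricConfig config) {
        // "Flink" is an illustrative default namespace, configurable like other reporters.
        namespace = config.getString("namespace", "Flink");
        client = AmazonCloudWatchClientBuilder.defaultClient();
    }

    @Override
    public void close() {
        client.shutdown();
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        // The scoped identifier is used as the CloudWatch metric name; flattening Flink's
        // metric hierarchy into CloudWatch-friendly names/dimensions is the hard part.
        metrics.put(group.getMetricIdentifier(metricName), metric);
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        metrics.remove(group.getMetricIdentifier(metricName));
    }

    @Override
    public void report() {
        PutMetricDataRequest request = new PutMetricDataRequest().withNamespace(namespace);
        for (Map.Entry<String, Metric> entry : metrics.entrySet()) {
            Double value = null;
            if (entry.getValue() instanceof Counter) {
                value = (double) ((Counter) entry.getValue()).getCount();
            } else if (entry.getValue() instanceof Gauge
                    && ((Gauge<?>) entry.getValue()).getValue() instanceof Number) {
                value = ((Number) ((Gauge<?>) entry.getValue()).getValue()).doubleValue();
            }
            // Histograms and meters are omitted for brevity.
            if (value != null) {
                request.withMetricData(new MetricDatum()
                        .withMetricName(entry.getKey())
                        .withValue(value));
            }
        }
        // PutMetricData accepts at most 20 datums per call; a real reporter would batch.
        client.putMetricData(request);
    }
}

Like the existing reporters, it would be enabled via the metrics.reporter.<name>.class
option in flink-conf.yaml; the hard part, as Dyana notes below, is mapping Flink's
metric scopes onto CloudWatch names and dimensions.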

Thanks, Bowen



On Wed, May 16, 2018 at 1:41 AM, Dyana Rose 
wrote:

> I've written a CloudWatch reporter for our own use. It's not pretty to
> extract the metrics correctly for CloudWatch, as the current metrics don't
> all set the metric names in a good hierarchy, and they aren't all added
> to the metric variables either.
>
> If someone opens the Jira I can see about getting our code up as an example
> branch of what I had to do. Unless I missed something, I think the current
> metrics need a bit of a brush up.
>
> Dyana
>
> On 16 May 2018 at 09:23, Chesnay Schepler  wrote:
>
> > Hello,
> >
> > there has been no demand for a CloudWatch reporter so far.
> >
> > I only quickly skimmed the API docs, but it appears that the data is
> > inserted via REST.
> > Would the reporter require the usage of any AWS library, or could we use
> > an arbitrary HTTP client?
> > If it is the latter, there shouldn't be a licensing issue as I understand
> > it.
> >
> > Please open a JIRA, let's move the discussion there.
> >
> >
> > On 16.05.2018 10:12, Rafi Aroch wrote:
> >
> >> Hi,
> >>
> >> In my team we use CloudWatch as our monitoring & alerting system.
> >> I noticed that CloudWatch does not appear in the list of supported
> >> Reporters.
> >> I was wondering why that is. Was there no demand from the community? Is
> >> it related to a licensing issue with AWS? Was it a technical concern?
> >>
> >> Would you accept this contribution into Flink?
> >>
> >> Thanks,
> >> Rafi
> >>
> >>
> >
>
>
> --
>
> Dyana Rose
> Software Engineer
>
>
> W: www.salecycle.com 
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Kenneth Knowles
Rong Rong, see my reply. It can be enabled by default. I think it may be
already.

Kenn

On Wed, May 16, 2018 at 4:24 PM Rong Rong  wrote:

> +1
>
> One question regarding "This however has to be enabled by the contributor,
> separately for each PR."
> can it be enabled by default when creating a PR?
>
> On Wed, May 16, 2018 at 2:08 PM, Ted Yu  wrote:
>
> > +1
> >  Original message From: Shuyi Chen 
> > Date: 5/16/18  1:12 PM  (GMT-08:00) To: dev@flink.apache.org Subject:
> Re:
> > [DISCUSS] GitBox
> > +1 :) A lot of projects  are
> already
> > using it.
> >
> > On Wed, May 16, 2018 at 3:40 AM, Chesnay Schepler 
> > wrote:
> >
> > > Hello,
> > >
> > > during the discussion about how to better manage pull requests [1] the
> > > topic of GitBox integration came up again.
> > >
> > > This seems like a good opportunity to restart this discussion that we
> had
> > > about a year ago [2].
> > >
> > > * What is GitBox
> > >
> > >Essentially, GitBox allow us to use GitHub features.
> > >We can decide for ourselves which features we want enabled.
> > >
> > >We could merge PRs directly on GitHub at the button of a click.
> > >That said the merge functionality is fairly limited and would
> > >require picture-perfect commits in the pull requests.
> > >Commits can be squashed, but you cannot amend commits in any way, be
> > >it fixing typos or changing the commit message. Realistically this
> > >limits how much we can use this feature, and it may lead to a
> > >decline in the quality of commit messages.
> > >
> > >Labels can be useful for the management of PRs as (ready for review,
> > >delayed for next release, waiting for changes). This is really what
> > >I'm personally most interested in.
> > >
> > >We've been using GitBox for flink-shaded for a while now and i
> > >didn't run into any issue. AFAIK GitBox is also the default for new
> > >projects.
> > >
> > > * What this means for committers:
> > >
> > >The apache git remote URL will change, which will require all
> > >committers to update their git setup.
> > >This also implies that we may have to update the website build
> > scripts.
> > >The new URL would (probably) be
> > >/https://gitbox.apache.org/repos/asf/flink.git/.
> > >
> > >To make use of GitHub features you have to link your GitHub and
> > >Apache accounts. [3]
> > >This also requires setting up two-factor authentication on GitHub.
> > >
> > >Update the scm entry in the parent pom.xml.
> > >
> > > * What this means for contributors:
> > >
> > >Nothing should change for contributors. Small changes (like typos)
> > >may be merged more quickly, if the commit message is appropriate, as
> > >it could be done directly through GitHub.
> > >
> > > [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> > > com/Closing-automatically-inactive-pull-requests-tt22248.html
> > > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> > > com/DISCUSS-GitBox-td18027.html
> > > [3] https://gitbox.apache.org/setup/
> > >
> >
> >
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."
> >
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Rong Rong
+1

One question regarding "This however has to be enabled by the contributor,
separately for each PR."
can it be enabled by default when creating a PR?

On Wed, May 16, 2018 at 2:08 PM, Ted Yu  wrote:

> +1
>  Original message From: Shuyi Chen 
> Date: 5/16/18  1:12 PM  (GMT-08:00) To: dev@flink.apache.org Subject: Re:
> [DISCUSS] GitBox
> +1 :) A lot of projects  are already
> using it.
>
> On Wed, May 16, 2018 at 3:40 AM, Chesnay Schepler 
> wrote:
>
> > Hello,
> >
> > during the discussion about how to better manage pull requests [1] the
> > topic of GitBox integration came up again.
> >
> > This seems like a good opportunity to restart this discussion that we had
> > about a year ago [2].
> >
> > * What is GitBox
> >
> >Essentially, GitBox allow us to use GitHub features.
> >We can decide for ourselves which features we want enabled.
> >
> >We could merge PRs directly on GitHub at the button of a click.
> >That said the merge functionality is fairly limited and would
> >require picture-perfect commits in the pull requests.
> >Commits can be squashed, but you cannot amend commits in any way, be
> >it fixing typos or changing the commit message. Realistically this
> >limits how much we can use this feature, and it may lead to a
> >decline in the quality of commit messages.
> >
> >Labels can be useful for the management of PRs as (ready for review,
> >delayed for next release, waiting for changes). This is really what
> >I'm personally most interested in.
> >
> >We've been using GitBox for flink-shaded for a while now and i
> >didn't run into any issue. AFAIK GitBox is also the default for new
> >projects.
> >
> > * What this means for committers:
> >
> >The apache git remote URL will change, which will require all
> >committers to update their git setup.
> >This also implies that we may have to update the website build
> scripts.
> >The new URL would (probably) be
> >/https://gitbox.apache.org/repos/asf/flink.git/.
> >
> >To make use of GitHub features you have to link your GitHub and
> >Apache accounts. [3]
> >This also requires setting up two-factor authentication on GitHub.
> >
> >Update the scm entry in the parent pom.xml.
> >
> > * What this means for contributors:
> >
> >Nothing should change for contributors. Small changes (like typos)
> >may be merged more quickly, if the commit message is appropriate, as
> >it could be done directly through GitHub.
> >
> > [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> > com/Closing-automatically-inactive-pull-requests-tt22248.html
> > [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> > com/DISCUSS-GitBox-td18027.html
> > [3] https://gitbox.apache.org/setup/
> >
>
>
>
> --
> "So you have to trust that the dots will somehow connect in your future."
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Ted Yu
+1
 Original message From: Shuyi Chen  Date: 
5/16/18  1:12 PM  (GMT-08:00) To: dev@flink.apache.org Subject: Re: [DISCUSS] 
GitBox 
+1 :) A lot of projects  are already
using it.

On Wed, May 16, 2018 at 3:40 AM, Chesnay Schepler 
wrote:

> Hello,
>
> during the discussion about how to better manage pull requests [1] the
> topic of GitBox integration came up again.
>
> This seems like a good opportunity to restart this discussion that we had
> about a year ago [2].
>
> * What is GitBox
>
>    Essentially, GitBox allow us to use GitHub features.
>    We can decide for ourselves which features we want enabled.
>
>    We could merge PRs directly on GitHub at the button of a click.
>    That said the merge functionality is fairly limited and would
>    require picture-perfect commits in the pull requests.
>    Commits can be squashed, but you cannot amend commits in any way, be
>    it fixing typos or changing the commit message. Realistically this
>    limits how much we can use this feature, and it may lead to a
>    decline in the quality of commit messages.
>
>    Labels can be useful for the management of PRs as (ready for review,
>    delayed for next release, waiting for changes). This is really what
>    I'm personally most interested in.
>
>    We've been using GitBox for flink-shaded for a while now and i
>    didn't run into any issue. AFAIK GitBox is also the default for new
>    projects.
>
> * What this means for committers:
>
>    The apache git remote URL will change, which will require all
>    committers to update their git setup.
>    This also implies that we may have to update the website build scripts.
>    The new URL would (probably) be
>    /https://gitbox.apache.org/repos/asf/flink.git/.
>
>    To make use of GitHub features you have to link your GitHub and
>    Apache accounts. [3]
>    This also requires setting up two-factor authentication on GitHub.
>
>    Update the scm entry in the parent pom.xml.
>
> * What this means for contributors:
>
>    Nothing should change for contributors. Small changes (like typos)
>    may be merged more quickly, if the commit message is appropriate, as
>    it could be done directly through GitHub.
>
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> com/Closing-automatically-inactive-pull-requests-tt22248.html
> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> com/DISCUSS-GitBox-td18027.html
> [3] https://gitbox.apache.org/setup/
>



-- 
"So you have to trust that the dots will somehow connect in your future."


Re: [DISCUSS] GitBox

2018-05-16 Thread Shuyi Chen
+1 :) A lot of projects  are already
using it.

On Wed, May 16, 2018 at 3:40 AM, Chesnay Schepler 
wrote:

> Hello,
>
> during the discussion about how to better manage pull requests [1] the
> topic of GitBox integration came up again.
>
> This seems like a good opportunity to restart this discussion that we had
> about a year ago [2].
>
> * What is GitBox
>
>Essentially, GitBox allow us to use GitHub features.
>We can decide for ourselves which features we want enabled.
>
>We could merge PRs directly on GitHub at the button of a click.
>That said the merge functionality is fairly limited and would
>require picture-perfect commits in the pull requests.
>Commits can be squashed, but you cannot amend commits in any way, be
>it fixing typos or changing the commit message. Realistically this
>limits how much we can use this feature, and it may lead to a
>decline in the quality of commit messages.
>
>Labels can be useful for the management of PRs as (ready for review,
>delayed for next release, waiting for changes). This is really what
>I'm personally most interested in.
>
>We've been using GitBox for flink-shaded for a while now and i
>didn't run into any issue. AFAIK GitBox is also the default for new
>projects.
>
> * What this means for committers:
>
>The apache git remote URL will change, which will require all
>committers to update their git setup.
>This also implies that we may have to update the website build scripts.
>The new URL would (probably) be
>/https://gitbox.apache.org/repos/asf/flink.git/.
>
>To make use of GitHub features you have to link your GitHub and
>Apache accounts. [3]
>This also requires setting up two-factor authentication on GitHub.
>
>Update the scm entry in the parent pom.xml.
>
> * What this means for contributors:
>
>Nothing should change for contributors. Small changes (like typos)
>may be merged more quickly, if the commit message is appropriate, as
>it could be done directly through GitHub.
>
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> com/Closing-automatically-inactive-pull-requests-tt22248.html
> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.
> com/DISCUSS-GitBox-td18027.html
> [3] https://gitbox.apache.org/setup/
>



-- 
"So you have to trust that the dots will somehow connect in your future."


Re: [DISCUSS] GitBox

2018-05-16 Thread Fabian Hueske
+1

Kenneth Knowles wrote on Wed., 16 May 2018,
21:04:

> When I open a pull request to Beam, it is on by default. I have just run an
> experiment to see if it is remembering the last option I checked and it is
> not. Even after I disable it for one pull request, the next one has it
> checked again. So it may be a repository-level setting that you can set up.
>
> Kenn
>
> On Wed, May 16, 2018 at 11:19 AM Chesnay Schepler 
> wrote:
>
> > This however has to be enabled by the contributor, separately for each
> PR.
> > We'll see how often we get the opportunity to use it.
> >
> > On 16.05.2018 17:43, Kenneth Knowles wrote:
> > > Actually, GitHub has a feature so you do not require picture-perfect
> > > commits:
> > >
> >
> https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/
> > >
> > > If the owner of the PR checks the box, it will give committers write
> > access
> > > to their branch (on their fork). A nice bonus is you can make the
> changes
> > > and then continue the review, too.
> > >
> > > Kenn
> > >
> > > On Wed, May 16, 2018 at 8:31 AM Stefan Richter <
> > s.rich...@data-artisans.com>
> > > wrote:
> > >
> > >> +1
> > >>
> > >>> Am 16.05.2018 um 12:40 schrieb Chesnay Schepler  >:
> > >>>
> > >>> Hello,
> > >>>
> > >>> during the discussion about how to better manage pull requests [1]
> the
> > >> topic of GitBox integration came up again.
> > >>> This seems like a good opportunity to restart this discussion that we
> > >> had about a year ago [2].
> > >>> * What is GitBox
> > >>>
> > >>>Essentially, GitBox allow us to use GitHub features.
> > >>>We can decide for ourselves which features we want enabled.
> > >>>
> > >>>We could merge PRs directly on GitHub at the button of a click.
> > >>>That said the merge functionality is fairly limited and would
> > >>>require picture-perfect commits in the pull requests.
> > >>>Commits can be squashed, but you cannot amend commits in any way,
> be
> > >>>it fixing typos or changing the commit message. Realistically this
> > >>>limits how much we can use this feature, and it may lead to a
> > >>>decline in the quality of commit messages.
> > >>>
> > >>>Labels can be useful for the management of PRs as (ready for
> review,
> > >>>delayed for next release, waiting for changes). This is really
> what
> > >>>I'm personally most interested in.
> > >>>
> > >>>We've been using GitBox for flink-shaded for a while now and i
> > >>>didn't run into any issue. AFAIK GitBox is also the default for
> new
> > >>>projects.
> > >>>
> > >>> * What this means for committers:
> > >>>
> > >>>The apache git remote URL will change, which will require all
> > >>>committers to update their git setup.
> > >>>This also implies that we may have to update the website build
> > scripts.
> > >>>The new URL would (probably) be
> > >>>/https://gitbox.apache.org/repos/asf/flink.git/.
> > >>>
> > >>>To make use of GitHub features you have to link your GitHub and
> > >>>Apache accounts. [3]
> > >>>This also requires setting up two-factor authentication on GitHub.
> > >>>
> > >>>Update the scm entry in the parent pom.xml.
> > >>>
> > >>> * What this means for contributors:
> > >>>
> > >>>Nothing should change for contributors. Small changes (like typos)
> > >>>may be merged more quickly, if the commit message is appropriate,
> as
> > >>>it could be done directly through GitHub.
> > >>>
> > >>> [1]
> > >>
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
> > >>> [2]
> > >>
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html
> > >>> [3] https://gitbox.apache.org/setup/
> > >>
> >
> >
>


Re: [DISCUSS] Adding new interfaces in [Stream]ExecutionEnvironment

2018-05-16 Thread Shuyi Chen
Hi Aljoscha, Fabian, Rong, Ted and Timo,

Thanks a lot for the feedback. Let me clarify the usage scenario in a bit
more detail. The context is that we want to add support for a SQL DDL that loads
UDFs from external JARs located in a local filesystem, HDFS, or an HTTP
endpoint in Flink SQL. The local FS option is more for debugging purposes,
for users to submit the job jar locally, and the latter two are for production
use. Below is an example user application with the *CREATE FUNCTION* DDL
(note: grammar and interface are not finalized yet).


-----

val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)
// setup the DataStream
// ...

// register the DataStream under the name "OrderA"
tEnv.registerDataStream("OrderA", orderA, 'user, 'product, 'amount)
tEnv.sqlDDL(
  "create function helloFunc as 'com.example.udf.HelloWorld' using jars ('hdfs:///users/david/libraries/my-udf-1.0.1-SNAPSHOT.jar')")
val result = tEnv.sqlQuery(
  "SELECT user, helloFunc(product), amount FROM OrderA WHERE amount > 2")
result.toAppendStream[Order].print()
env.execute()

-----

The example application above does the following:
1) it registers a DataStream as a Calcite table
(*org.apache.calcite.schema.Table*) under the name "OrderA", so SQL can
reference the DataStream as table "OrderA".
2) it uses the SQL *CREATE FUNCTION* DDL (grammar and interface not
finalized yet) to create a SQL UDF called *helloFunc* from a JAR located in
a remote HDFS path.
3) it issues a SQL query that uses the *helloFunc* UDF defined above and
generates a Flink table (*org.apache.flink.table.api.Table*).
4) it converts the Flink table back to a DataStream and prints it.

Steps 1), 3), and 4) are already implemented. To implement 2), we need to do
the following in the *tEnv.sqlDDL()* function (a rough sketch follows below):

a) parse the DDL into a SqlNode to extract the UDF class path *udfClasspath*,
the UDF remote paths *udfUrls[]*, and the UDF SQL name *udfName*.
b) use a URLClassLoader to load the JARs specified in *udfUrls[]*, and
register the SQL UDF with the {Batch/Stream/}TableEnvironment
registerFunction methods using *udfClasspath* under the name *udfName*.
c) register the JARs *udfUrls[]* through the {Stream}ExecutionEnvironment,
so that the JARs can be distributed to all the TaskManagers at runtime.
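
A minimal sketch of steps b) and c), assuming a scalar UDF and the proposed
registerUserJarFile() interface discussed further below; names, casts, and error
handling are illustrative only:

// Sketch: udfName, udfClasspath and udfUrls[] come from the parsed DDL in step a).
// Remote jars (e.g. hdfs://) may need to be fetched to a local temporary file
// before a URLClassLoader can load them.
URLClassLoader udfClassLoader =
    new URLClassLoader(udfUrls, Thread.currentThread().getContextClassLoader());

// Step b): load the UDF class and register it with the TableEnvironment.
Class<?> udfClass = Class.forName(udfClasspath, true, udfClassLoader);
ScalarFunction udf = (ScalarFunction) udfClass.newInstance();
tableEnv.registerFunction(udfName, udf);

// Step c): register the jars with the execution environment so that they are
// shipped to the TaskManagers (registerUserJarFile() is the proposed interface).
for (URL udfUrl : udfUrls) {
    env.registerUserJarFile(udfUrl.toString());
}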


Since the CREATE FUNCTION DDL is executed within the user application, I
don't think we have access to the ClusterClient at the point when
*tEnv.sqlDDL()* is executed. Also, the JARs can be in a remote filesystem
(which is the main usage scenario), so the user can't really prepare the
jar in advance statically.

For a normal user application, I think {Stream}ExecutionEnvironment is the
right place for this functionality, since it provides methods to control the
job execution and to interact with the outside world, and it actually
already does something similar through the *registerCachedFile*
interface.

However, in that case, the SQL FUNCTION DDL and the SQL Client would use two
different routes to register UDF jars, one through *JobGraph.jobConfiguration*
and the other through *JobGraph.userJars*. So maybe we can, as Fabian
suggests, add *registerUserJarFile()/getUserJarFiles()* interfaces
to {Stream}ExecutionEnvironment, which store the jars internally in a
List, and when generating the JobGraph, copy the jars to it using
{Stream}ExecutionEnvironment.getUserJarFiles() and
JobGraph.addJar() (note:
streaming and batch implementations might vary). In that case, both the SQL
FUNCTION DDL and the SQL Client would use *JobGraph.userJars* to ship the UDF
jars.
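
A minimal sketch of what those additions could look like, extracted into a
standalone helper class purely for illustration; the names mirror the proposal
above and are not a finalized API:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.flink.core.fs.Path;
import org.apache.flink.runtime.jobgraph.JobGraph;

// Sketch only: proposed registerUserJarFile()/getUserJarFiles() behaviour for
// (Stream)ExecutionEnvironment, shown here as a small class for readability.
public class UserJarRegistry {

    private final List<Path> userJarFiles = new ArrayList<>();

    // Registers a jar (local path or remote URI such as hdfs:// or http://) that
    // should be shipped with the job and loaded into the user code classloader.
    public void registerUserJarFile(String jarFile) {
        userJarFiles.add(new Path(jarFile));
    }

    public List<Path> getUserJarFiles() {
        return Collections.unmodifiableList(userJarFiles);
    }

    // Called while the JobGraph is generated (streaming and batch differ in where
    // this happens); the registered jars end up in JobGraph.userJars and are
    // distributed to the TaskManagers via the BlobServer.
    public void attachUserJars(JobGraph jobGraph) {
        for (Path jar : userJarFiles) {
            jobGraph.addJar(jar);
        }
    }
}

With something like this in place, both the SQL CREATE FUNCTION DDL and the SQL
Client would go through JobGraph.userJars, so the jars are shipped the same way
in both cases.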

Hope that clarifies better. What do you guys think? Thanks a lot.

Cheers!
Shuyi

On Wed, May 16, 2018 at 9:45 AM, Rong Rong  wrote:

> I think the question here is whether registering Jar files (or other
> executable files) during job submission is sufficient for @shuyi's use
> case.
>
> If I understand correctly regarding the part of dynamic distribution of the
> external libraries in runtime. This is used to deal with DDL/DSL such as:
> CREATE FUNCTION my_fun FROM url://
> during execution. Correct me if I am wrong @shuyi, The basic assumption
> that "we can locate and ship all executable JARs during job submission" no
> longer holds for your use case right?
>
> I guess we are missing details here regarding the "distribution of external
> libraries in runtime" part. Maybe you can share more example of this use
> case. Would this be included in the design doc @Timo?
>
> --
> Rong
>
> On Wed, May 16, 2018 at 5:41 AM, Timo Walther  wrote:
>
> > Yes, we are using the addJar functionionality of the JobGraph as well for
> > the SQL Client.
> >
> > I think the execution environment is not the right place to specify jars.
> > The location of the jars 

Re: [VOTE] Release 1.5.0, release candidate #3

2018-05-16 Thread Till Rohrmann
Testing the RC has surfaced a problem with the release of blobs of finished
jobs [1]. This is a release blocker and, thus, I have to cancel the RC 3.
I'll prepare a new RC once the problem has been fixed.

Thanks to all for testing the release candidate!

[1] https://issues.apache.org/jira/browse/FLINK-9381?filter=12327438

Cheers,
Till

On Tue, May 15, 2018 at 5:32 PM, Till Rohrmann  wrote:

> Hi everyone,
>
> Please review and vote on the release candidate #3 for the version 1.5.0,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint 1F302569A96CFFD5 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag "release-1.5.0-rc3" [5],
>
> Please use this document for coordinating testing efforts: [6]
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Your friendly Release Manager
>
> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> projectId=12315522&version=12341764
> [2] http://people.apache.org/~trohrmann/flink-1.5.0-rc3/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://repository.apache.org/content/repositories/orgapacheflink-1156
> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=commit;h=
> e7252690af7c4c1167dac8f2dd08ff33d6d829b5
> [6] https://docs.google.com/document/d/11DmsJDe-
> 4ljHgXByHqFQsDRH_gjuKGQHpDZ9N-Frtyg/edit?usp=sharing
>
> Pro-tip: you can create a settings.xml file with these contents:
>
> <settings>
>   <activeProfiles>
>     <activeProfile>flink-1.5.0</activeProfile>
>   </activeProfiles>
>   <profiles>
>     <profile>
>       <id>flink-1.5.0</id>
>       <repositories>
>         <repository>
>           <id>flink-1.5.0</id>
>           <url>https://repository.apache.org/content/repositories/orgapacheflink-1156/</url>
>         </repository>
>         <repository>
>           <id>archetype</id>
>           <url>https://repository.apache.org/content/repositories/orgapacheflink-1156/</url>
>         </repository>
>       </repositories>
>     </profile>
>   </profiles>
> </settings>
>
> And reference that in your Maven commands via --settings
> path/to/settings.xml. This is useful for creating a quickstart based on the
> staged release and for building against the staged jars.
>
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Kenneth Knowles
When I open a pull request to Beam, it is on by default. I have just run an
experiment to see if it is remembering the last option I checked and it is
not. Even after I disable it for one pull request, the next one has it
checked again. So it may be a repository-level setting that you can set up.

Kenn

On Wed, May 16, 2018 at 11:19 AM Chesnay Schepler 
wrote:

> This however has to be enabled by the contributor, separately for each PR.
> We'll see how often we get the opportunity to use it.
>
> On 16.05.2018 17:43, Kenneth Knowles wrote:
> > Actually, GitHub has a feature so you do not require picture-perfect
> > commits:
> >
> https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/
> >
> > If the owner of the PR checks the box, it will give committers write
> access
> > to their branch (on their fork). A nice bonus is you can make the changes
> > and then continue the review, too.
> >
> > Kenn
> >
> > On Wed, May 16, 2018 at 8:31 AM Stefan Richter <
> s.rich...@data-artisans.com>
> > wrote:
> >
> >> +1
> >>
> >>> Am 16.05.2018 um 12:40 schrieb Chesnay Schepler :
> >>>
> >>> Hello,
> >>>
> >>> during the discussion about how to better manage pull requests [1] the
> >> topic of GitBox integration came up again.
> >>> This seems like a good opportunity to restart this discussion that we
> >> had about a year ago [2].
> >>> * What is GitBox
> >>>
> >>>Essentially, GitBox allow us to use GitHub features.
> >>>We can decide for ourselves which features we want enabled.
> >>>
> >>>We could merge PRs directly on GitHub at the button of a click.
> >>>That said the merge functionality is fairly limited and would
> >>>require picture-perfect commits in the pull requests.
> >>>Commits can be squashed, but you cannot amend commits in any way, be
> >>>it fixing typos or changing the commit message. Realistically this
> >>>limits how much we can use this feature, and it may lead to a
> >>>decline in the quality of commit messages.
> >>>
> >>>Labels can be useful for the management of PRs as (ready for review,
> >>>delayed for next release, waiting for changes). This is really what
> >>>I'm personally most interested in.
> >>>
> >>>We've been using GitBox for flink-shaded for a while now and i
> >>>didn't run into any issue. AFAIK GitBox is also the default for new
> >>>projects.
> >>>
> >>> * What this means for committers:
> >>>
> >>>The apache git remote URL will change, which will require all
> >>>committers to update their git setup.
> >>>This also implies that we may have to update the website build
> scripts.
> >>>The new URL would (probably) be
> >>>/https://gitbox.apache.org/repos/asf/flink.git/.
> >>>
> >>>To make use of GitHub features you have to link your GitHub and
> >>>Apache accounts. [3]
> >>>This also requires setting up two-factor authentication on GitHub.
> >>>
> >>>Update the scm entry in the parent pom.xml.
> >>>
> >>> * What this means for contributors:
> >>>
> >>>Nothing should change for contributors. Small changes (like typos)
> >>>may be merged more quickly, if the commit message is appropriate, as
> >>>it could be done directly through GitHub.
> >>>
> >>> [1]
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
> >>> [2]
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html
> >>> [3] https://gitbox.apache.org/setup/
> >>
>
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Chesnay Schepler

This however has to be enabled by the contributor, separately for each PR.
We'll see how often we get the opportunity to use it.

On 16.05.2018 17:43, Kenneth Knowles wrote:

Actually, GitHub has a feature so you do not require picture-perfect
commits:
https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/

If the owner of the PR checks the box, it will give committers write access
to their branch (on their fork). A nice bonus is you can make the changes
and then continue the review, too.

Kenn

On Wed, May 16, 2018 at 8:31 AM Stefan Richter 
wrote:


+1


On 16.05.2018 at 12:40, Chesnay Schepler wrote:

Hello,

during the discussion about how to better manage pull requests [1] the

topic of GitBox integration came up again.

This seems like a good opportunity to restart this discussion that we

had about a year ago [2].

* What is GitBox

   Essentially, GitBox allows us to use GitHub features.
   We can decide for ourselves which features we want enabled.

   We could merge PRs directly on GitHub at the click of a button.
   That said, the merge functionality is fairly limited and would
   require picture-perfect commits in the pull requests.
   Commits can be squashed, but you cannot amend commits in any way, be
   it fixing typos or changing the commit message. Realistically this
   limits how much we can use this feature, and it may lead to a
   decline in the quality of commit messages.

   Labels can be useful for the management of PRs (such as "ready for review",
   "delayed for next release", "waiting for changes"). This is really what
   I'm personally most interested in.

   We've been using GitBox for flink-shaded for a while now and I
   didn't run into any issues. AFAIK GitBox is also the default for new
   projects.

* What this means for committers:

   The Apache git remote URL will change, which will require all
   committers to update their git setup.
   This also implies that we may have to update the website build scripts.
   The new URL would (probably) be
   https://gitbox.apache.org/repos/asf/flink.git.

   To make use of GitHub features you have to link your GitHub and
   Apache accounts. [3]
   This also requires setting up two-factor authentication on GitHub.

   Update the scm entry in the parent pom.xml.

* What this means for contributors:

   Nothing should change for contributors. Small changes (like typos)
   may be merged more quickly, if the commit message is appropriate, as
   it could be done directly through GitHub.

[1]

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html

[2]

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html

[3] https://gitbox.apache.org/setup/






Re: [DISCUSS] GitBox

2018-05-16 Thread Suneel Marthi
+1

On Wed, May 16, 2018 at 2:09 PM, Thomas Weise  wrote:

> +1
>
>
> On Wed, May 16, 2018 at 8:31 AM, Stefan Richter <
> s.rich...@data-artisans.com
> > wrote:
>
> > +1
> >
> > > Am 16.05.2018 um 12:40 schrieb Chesnay Schepler :
> > >
> > > Hello,
> > >
> > > during the discussion about how to better manage pull requests [1] the
> > topic of GitBox integration came up again.
> > >
> > > This seems like a good opportunity to restart this discussion that we
> > had about a year ago [2].
> > >
> > > * What is GitBox
> > >
> > >   Essentially, GitBox allow us to use GitHub features.
> > >   We can decide for ourselves which features we want enabled.
> > >
> > >   We could merge PRs directly on GitHub at the button of a click.
> > >   That said the merge functionality is fairly limited and would
> > >   require picture-perfect commits in the pull requests.
> > >   Commits can be squashed, but you cannot amend commits in any way, be
> > >   it fixing typos or changing the commit message. Realistically this
> > >   limits how much we can use this feature, and it may lead to a
> > >   decline in the quality of commit messages.
> > >
> > >   Labels can be useful for the management of PRs as (ready for review,
> > >   delayed for next release, waiting for changes). This is really what
> > >   I'm personally most interested in.
> > >
> > >   We've been using GitBox for flink-shaded for a while now and i
> > >   didn't run into any issue. AFAIK GitBox is also the default for new
> > >   projects.
> > >
> > > * What this means for committers:
> > >
> > >   The apache git remote URL will change, which will require all
> > >   committers to update their git setup.
> > >   This also implies that we may have to update the website build
> scripts.
> > >   The new URL would (probably) be
> > >   /https://gitbox.apache.org/repos/asf/flink.git/.
> > >
> > >   To make use of GitHub features you have to link your GitHub and
> > >   Apache accounts. [3]
> > >   This also requires setting up two-factor authentication on GitHub.
> > >
> > >   Update the scm entry in the parent pom.xml.
> > >
> > > * What this means for contributors:
> > >
> > >   Nothing should change for contributors. Small changes (like typos)
> > >   may be merged more quickly, if the commit message is appropriate, as
> > >   it could be done directly through GitHub.
> > >
> > > [1] http://apache-flink-mailing-list-archive.1008284.n3.
> > nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
> > > [2] http://apache-flink-mailing-list-archive.1008284.n3.
> > nabble.com/DISCUSS-GitBox-td18027.html
> > > [3] https://gitbox.apache.org/setup/
> >
> >
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Jean-Baptiste Onofré
+1

Regards
JB

On 16 May 2018 at 20:09, Thomas Weise wrote:
>+1
>
>
>On Wed, May 16, 2018 at 8:31 AM, Stefan Richter
>> wrote:
>
>> +1
>>
>> > Am 16.05.2018 um 12:40 schrieb Chesnay Schepler
>:
>> >
>> > Hello,
>> >
>> > during the discussion about how to better manage pull requests [1]
>the
>> topic of GitBox integration came up again.
>> >
>> > This seems like a good opportunity to restart this discussion that
>we
>> had about a year ago [2].
>> >
>> > * What is GitBox
>> >
>> >   Essentially, GitBox allow us to use GitHub features.
>> >   We can decide for ourselves which features we want enabled.
>> >
>> >   We could merge PRs directly on GitHub at the button of a click.
>> >   That said the merge functionality is fairly limited and would
>> >   require picture-perfect commits in the pull requests.
>> >   Commits can be squashed, but you cannot amend commits in any way,
>be
>> >   it fixing typos or changing the commit message. Realistically
>this
>> >   limits how much we can use this feature, and it may lead to a
>> >   decline in the quality of commit messages.
>> >
>> >   Labels can be useful for the management of PRs as (ready for
>review,
>> >   delayed for next release, waiting for changes). This is really
>what
>> >   I'm personally most interested in.
>> >
>> >   We've been using GitBox for flink-shaded for a while now and i
>> >   didn't run into any issue. AFAIK GitBox is also the default for
>new
>> >   projects.
>> >
>> > * What this means for committers:
>> >
>> >   The apache git remote URL will change, which will require all
>> >   committers to update their git setup.
>> >   This also implies that we may have to update the website build
>scripts.
>> >   The new URL would (probably) be
>> >   /https://gitbox.apache.org/repos/asf/flink.git/.
>> >
>> >   To make use of GitHub features you have to link your GitHub and
>> >   Apache accounts. [3]
>> >   This also requires setting up two-factor authentication on
>GitHub.
>> >
>> >   Update the scm entry in the parent pom.xml.
>> >
>> > * What this means for contributors:
>> >
>> >   Nothing should change for contributors. Small changes (like
>typos)
>> >   may be merged more quickly, if the commit message is appropriate,
>as
>> >   it could be done directly through GitHub.
>> >
>> > [1] http://apache-flink-mailing-list-archive.1008284.n3.
>> nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
>> > [2] http://apache-flink-mailing-list-archive.1008284.n3.
>> nabble.com/DISCUSS-GitBox-td18027.html
>> > [3] https://gitbox.apache.org/setup/
>>
>>


Re: [DISCUSS] GitBox

2018-05-16 Thread Thomas Weise
+1


On Wed, May 16, 2018 at 8:31 AM, Stefan Richter  wrote:

> +1
>
> > Am 16.05.2018 um 12:40 schrieb Chesnay Schepler :
> >
> > Hello,
> >
> > during the discussion about how to better manage pull requests [1] the
> topic of GitBox integration came up again.
> >
> > This seems like a good opportunity to restart this discussion that we
> had about a year ago [2].
> >
> > * What is GitBox
> >
> >   Essentially, GitBox allow us to use GitHub features.
> >   We can decide for ourselves which features we want enabled.
> >
> >   We could merge PRs directly on GitHub at the button of a click.
> >   That said the merge functionality is fairly limited and would
> >   require picture-perfect commits in the pull requests.
> >   Commits can be squashed, but you cannot amend commits in any way, be
> >   it fixing typos or changing the commit message. Realistically this
> >   limits how much we can use this feature, and it may lead to a
> >   decline in the quality of commit messages.
> >
> >   Labels can be useful for the management of PRs as (ready for review,
> >   delayed for next release, waiting for changes). This is really what
> >   I'm personally most interested in.
> >
> >   We've been using GitBox for flink-shaded for a while now and i
> >   didn't run into any issue. AFAIK GitBox is also the default for new
> >   projects.
> >
> > * What this means for committers:
> >
> >   The apache git remote URL will change, which will require all
> >   committers to update their git setup.
> >   This also implies that we may have to update the website build scripts.
> >   The new URL would (probably) be
> >   /https://gitbox.apache.org/repos/asf/flink.git/.
> >
> >   To make use of GitHub features you have to link your GitHub and
> >   Apache accounts. [3]
> >   This also requires setting up two-factor authentication on GitHub.
> >
> >   Update the scm entry in the parent pom.xml.
> >
> > * What this means for contributors:
> >
> >   Nothing should change for contributors. Small changes (like typos)
> >   may be merged more quickly, if the commit message is appropriate, as
> >   it could be done directly through GitHub.
> >
> > [1] http://apache-flink-mailing-list-archive.1008284.n3.
> nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
> > [2] http://apache-flink-mailing-list-archive.1008284.n3.
> nabble.com/DISCUSS-GitBox-td18027.html
> > [3] https://gitbox.apache.org/setup/
>
>


Re: [DISCUSS] Adding new interfaces in [Stream]ExecutionEnvironment

2018-05-16 Thread Rong Rong
I think the question here is whether registering Jar files (or other
executable files) during job submission is sufficient for @shuyi's use
case.

If I understand the part about dynamic distribution of external libraries at
runtime correctly, this is used to deal with DDL/DSL such as:
CREATE FUNCTION my_fun FROM url://
during execution. Correct me if I am wrong, @shuyi: the basic assumption
that "we can locate and ship all executable JARs during job submission" no
longer holds for your use case, right?

I guess we are missing details here regarding the "distribution of external
libraries at runtime" part. Maybe you can share more examples of this use
case. Would this be included in the design doc @Timo?

--
Rong

On Wed, May 16, 2018 at 5:41 AM, Timo Walther  wrote:

> Yes, we are using the addJar functionionality of the JobGraph as well for
> the SQL Client.
>
> I think the execution environment is not the right place to specify jars.
> The location of the jars depends on the submission method. If a local path
> is specified in the main() method of a packaged Flink jar, it would not
> work when such a program is submitted through the REST API.
>
> Regards,
> Timo
>
> Am 16.05.18 um 14:32 schrieb Aljoscha Krettek:
>
> I think this functionality is already there, we just have to expose it in
>> the right places: ClusterClient.submitJob() takes a JobGraph, JobGraph has
>> method addJar() for adding jars that need to be in the classloader for
>> executing a user program.
>>
>> On 16. May 2018, at 12:34, Fabian Hueske  wrote:
>>>
>>> Hi Ted,
>>>
>>> The design doc is in late draft status and proposes support for SQL DDL
>>> statements (CREATE TABLE, CREATE  FUNCTION, etc.).
>>> The question about registering JARs came up because we need a way to
>>> distribute JAR files that contain the code of user-defined functions.
>>>
>>> The design doc will soon be shared on the dev mailing list to gather
>>> feedback from the community.
>>>
>>> Best, Fabian
>>>
>>> 2018-05-16 10:45 GMT+02:00 Ted Yu :
>>>
>>> bq. In a design document, Timo mentioned that we can ship multiple JAR
 files

 Mind telling us where the design doc can be retrieved ?

 Thanks

 On Wed, May 16, 2018 at 1:29 AM, Fabian Hueske 
 wrote:

 Hi,
>
> I'm not sure if we need to modify the existing method.
> What we need is a bit different from what registerCachedFile()
> provides.
> The method ensures that a file is copied to each TaskManager and can be
> locally accessed from a function's RuntimeContext.
> In our case, we don't need to access the file but would like to make
> sure
> that it is loaded into the class loader.
> So, we could also just add a method like registerUserJarFile().
>
> In a design document, Timo mentioned that we can ship multiple JAR
> files
> with a job.
> So, we could also implement the UDF shipping logic by loading the Jar
> file(s) to the client and distribute them from there.
> In that case, we would not need to add new method to the execution
> environment.
>
> Best,
> Fabian
>
> 2018-05-15 3:50 GMT+02:00 Rong Rong :
>
> +1. This could be very useful for "dynamic" UDF.
>>
>> Just to clarify, if I understand correctly, we are tying to use an
>> ENUM
>> indicator to
>> (1) Replace the current Boolean isExecutable flag.
>> (2) Provide additional information used by ExecutionEnvironment to
>>
> decide

> when/where to use the DistributedCached file.
>>
>> In this case, DistributedCache.CacheType or DistributedCache.FileType
>> sounds more intuitive, what do you think?
>>
>> Also, I was wondering is there any other useful information for the
>>
> cached
>
>> file to be passed to runtime.
>> If we are just talking about including the library to the classloader,
>>
> can
>
>> we directly extend the interface with
>>
>> public void registerCachedFile(
>> String filePath,
>> String name,
>> boolean executable,
>> boolean includeInClassLoader)
>>
>>
>> Thanks,
>> Rong
>>
>> On Sun, May 13, 2018 at 11:14 PM, Shuyi Chen 
>>
> wrote:

> Hi Flink devs,
>>>
>>> In an effort to support loading external libraries and creating UDFs
>>>
>> from
>
>> external libraries using DDL in Flink SQL, we want to use Flink’s
>>>
>> Blob

> Server to distribute the external libraries in runtime and load those
>>> libraries into the user code classloader automatically.
>>>
>>> However, the current [Stream]ExecutionEnvironment.registerCachedFile
>>> interface limits only to registering executable or non-executable
>>>
>> blobs.
>
>> It’s not 

[jira] [Created] (FLINK-9386) Remove netty-router dependency

2018-05-16 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-9386:
-

 Summary: Remove netty-router dependency
 Key: FLINK-9386
 URL: https://issues.apache.org/jira/browse/FLINK-9386
 Project: Flink
  Issue Type: Sub-task
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski
 Fix For: 1.6.0


netty-router 1.10 blocks the upgrade to Netty 4.1, while netty-router 2.2.0 has broken 
compatibility in a way that makes it unusable for us (it doesn't allow sorting 
router paths as in https://issues.apache.org/jira/browse/FLINK-8000 ). I 
propose to copy, simplify, and modify the netty-router code to suit our needs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9385) Operators with two inputs should show "Records Received" in Web UI separately, rather than added together

2018-05-16 Thread Josh Lemer (JIRA)
Josh Lemer created FLINK-9385:
-

 Summary: Operators with two inputs should show "Records Received" 
in Web UI separately, rather than added together
 Key: FLINK-9385
 URL: https://issues.apache.org/jira/browse/FLINK-9385
 Project: Flink
  Issue Type: Task
Reporter: Josh Lemer


In the Flink Web UI, there is a column in the Subtasks information view which 
shows how many records each subtask has received. However, for subtasks such as 
CoProcess operators which take two inputs, the number shown for "Records 
Received" is the sum of the records received from the two upstream datastreams. 
It would be much more helpful if it displayed the number of records received 
from each of the two separately.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] GitBox

2018-05-16 Thread Kenneth Knowles
Actually, GitHub has a feature so you do not require picture-perfect
commits:
https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/

If the owner of the PR checks the box, it will give committers write access
to their branch (on their fork). A nice bonus is you can make the changes
and then continue the review, too.

Kenn

On Wed, May 16, 2018 at 8:31 AM Stefan Richter 
wrote:

> +1
>
> > Am 16.05.2018 um 12:40 schrieb Chesnay Schepler :
> >
> > Hello,
> >
> > during the discussion about how to better manage pull requests [1] the
> topic of GitBox integration came up again.
> >
> > This seems like a good opportunity to restart this discussion that we
> had about a year ago [2].
> >
> > * What is GitBox
> >
> >   Essentially, GitBox allow us to use GitHub features.
> >   We can decide for ourselves which features we want enabled.
> >
> >   We could merge PRs directly on GitHub at the button of a click.
> >   That said the merge functionality is fairly limited and would
> >   require picture-perfect commits in the pull requests.
> >   Commits can be squashed, but you cannot amend commits in any way, be
> >   it fixing typos or changing the commit message. Realistically this
> >   limits how much we can use this feature, and it may lead to a
> >   decline in the quality of commit messages.
> >
> >   Labels can be useful for the management of PRs as (ready for review,
> >   delayed for next release, waiting for changes). This is really what
> >   I'm personally most interested in.
> >
> >   We've been using GitBox for flink-shaded for a while now and i
> >   didn't run into any issue. AFAIK GitBox is also the default for new
> >   projects.
> >
> > * What this means for committers:
> >
> >   The apache git remote URL will change, which will require all
> >   committers to update their git setup.
> >   This also implies that we may have to update the website build scripts.
> >   The new URL would (probably) be
> >   /https://gitbox.apache.org/repos/asf/flink.git/.
> >
> >   To make use of GitHub features you have to link your GitHub and
> >   Apache accounts. [3]
> >   This also requires setting up two-factor authentication on GitHub.
> >
> >   Update the scm entry in the parent pom.xml.
> >
> > * What this means for contributors:
> >
> >   Nothing should change for contributors. Small changes (like typos)
> >   may be merged more quickly, if the commit message is appropriate, as
> >   it could be done directly through GitHub.
> >
> > [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
> > [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html
> > [3] https://gitbox.apache.org/setup/
>
>


Re: [DISCUSS] GitBox

2018-05-16 Thread Stefan Richter
+1

> On 16.05.2018 at 12:40, Chesnay Schepler wrote:
> 
> Hello,
> 
> during the discussion about how to better manage pull requests [1] the topic 
> of GitBox integration came up again.
> 
> This seems like a good opportunity to restart this discussion that we had 
> about a year ago [2].
> 
> * What is GitBox
> 
>   Essentially, GitBox allow us to use GitHub features.
>   We can decide for ourselves which features we want enabled.
> 
>   We could merge PRs directly on GitHub at the button of a click.
>   That said the merge functionality is fairly limited and would
>   require picture-perfect commits in the pull requests.
>   Commits can be squashed, but you cannot amend commits in any way, be
>   it fixing typos or changing the commit message. Realistically this
>   limits how much we can use this feature, and it may lead to a
>   decline in the quality of commit messages.
> 
>   Labels can be useful for the management of PRs as (ready for review,
>   delayed for next release, waiting for changes). This is really what
>   I'm personally most interested in.
> 
>   We've been using GitBox for flink-shaded for a while now and i
>   didn't run into any issue. AFAIK GitBox is also the default for new
>   projects.
> 
> * What this means for committers:
> 
>   The apache git remote URL will change, which will require all
>   committers to update their git setup.
>   This also implies that we may have to update the website build scripts.
>   The new URL would (probably) be
>   /https://gitbox.apache.org/repos/asf/flink.git/.
> 
>   To make use of GitHub features you have to link your GitHub and
>   Apache accounts. [3]
>   This also requires setting up two-factor authentication on GitHub.
> 
>   Update the scm entry in the parent pom.xml.
> 
> * What this means for contributors:
> 
>   Nothing should change for contributors. Small changes (like typos)
>   may be merged more quickly, if the commit message is appropriate, as
>   it could be done directly through GitHub.
> 
> [1] 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
> [2] 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html
> [3] https://gitbox.apache.org/setup/



[jira] [Created] (FLINK-9384) KafkaAvroTableSource failed to work due to type mismatch

2018-05-16 Thread Jun Zhang (JIRA)
Jun Zhang created FLINK-9384:


 Summary: KafkaAvroTableSource failed to work due to type mismatch
 Key: FLINK-9384
 URL: https://issues.apache.org/jira/browse/FLINK-9384
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Affects Versions: 1.6.0
Reporter: Jun Zhang
 Fix For: 1.6.0


An exception was thrown when using KafkaAvroTableSource as follows:

Exception in thread "main" org.apache.flink.table.api.TableException: 
TableSource of type 
org.apache.flink.streaming.connectors.kafka.Kafka011AvroTableSource returned a 
DataStream of type GenericType that does not match 
with the type Row(id: Integer, name: String, age: Integer, event: 
GenericType) declared by the TableSource.getReturnType() method. 
Please validate the implementation of the TableSource.
 at 
org.apache.flink.table.plan.nodes.datastream.StreamTableSourceScan.translateToPlan(StreamTableSourceScan.scala:100)
 at 
org.apache.flink.table.api.StreamTableEnvironment.translateToCRow(StreamTableEnvironment.scala:885)
 at 
org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:812)
 at 
org.apache.flink.table.api.StreamTableEnvironment.writeToSink(StreamTableEnvironment.scala:279)
 at org.apache.flink.table.api.Table.writeToSink(table.scala:862)
 at org.apache.flink.table.api.Table.writeToSink(table.scala:830)
 at org.apache.flink.quickstart.StreamingJobAvro.main(StreamingJobAvro.java:85)

 

It is caused by a discrepancy between the type returned by the TableSource and 
the type returned by the DataStream. I've already fixed it; would someone 
please review the patch and see if it can be merged.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] GitBox

2018-05-16 Thread Piotr Nowojski
+1

> On 16 May 2018, at 14:24, Aljoscha Krettek  wrote:
> 
> +1
> 
> On Beam, we gradually enabled this, first for the website repo and then for 
> the main repo and we didn't run into problems.
> 
>> On 16. May 2018, at 12:45, Chesnay Schepler  wrote:
>> 
>> Forgot an important feature: It would allow committers to close pull 
>> requests.
>> 
>> On 16.05.2018 12:40, Chesnay Schepler wrote:
>>> Hello,
>>> 
>>> during the discussion about how to better manage pull requests [1] the 
>>> topic of GitBox integration came up again.
>>> 
>>> This seems like a good opportunity to restart this discussion that we had 
>>> about a year ago [2].
>>> 
>>> * What is GitBox
>>> 
>>>  Essentially, GitBox allow us to use GitHub features.
>>>  We can decide for ourselves which features we want enabled.
>>> 
>>>  We could merge PRs directly on GitHub at the button of a click.
>>>  That said the merge functionality is fairly limited and would
>>>  require picture-perfect commits in the pull requests.
>>>  Commits can be squashed, but you cannot amend commits in any way, be
>>>  it fixing typos or changing the commit message. Realistically this
>>>  limits how much we can use this feature, and it may lead to a
>>>  decline in the quality of commit messages.
>>> 
>>>  Labels can be useful for the management of PRs as (ready for review,
>>>  delayed for next release, waiting for changes). This is really what
>>>  I'm personally most interested in.
>>> 
>>>  We've been using GitBox for flink-shaded for a while now and i
>>>  didn't run into any issue. AFAIK GitBox is also the default for new
>>>  projects.
>>> 
>>> * What this means for committers:
>>> 
>>>  The apache git remote URL will change, which will require all
>>>  committers to update their git setup.
>>>  This also implies that we may have to update the website build scripts.
>>>  The new URL would (probably) be
>>>  /https://gitbox.apache.org/repos/asf/flink.git/.
>>> 
>>>  To make use of GitHub features you have to link your GitHub and
>>>  Apache accounts. [3]
>>>  This also requires setting up two-factor authentication on GitHub.
>>> 
>>>  Update the scm entry in the parent pom.xml.
>>> 
>>> * What this means for contributors:
>>> 
>>>  Nothing should change for contributors. Small changes (like typos)
>>>  may be merged more quickly, if the commit message is appropriate, as
>>>  it could be done directly through GitHub.
>>> 
>>> [1] 
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
>>> [2] 
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html
>>> [3] https://gitbox.apache.org/setup/
>>> 
>> 
> 



Re: Closing (automatically?) inactive pull requests

2018-05-16 Thread Piotr Nowojski
The question is: what would such a tool offer on top of GitHub’s view of PRs 
sorted by “least recently updated”?

https://github.com/apache/flink/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc 


Maybe it would be good enough to have a policy about waiting x months/weeks 
for a contributor to respond. If they don’t, we either take over the PR or close 
it. Having a “clean” view of the oldest PRs would be beneficial for reviewing PRs in 
~FIFO order as part of community work.

Having

> On 16 May 2018, at 10:57, Fabian Hueske  wrote:
> 
> Hi,
> 
> I'm not objecting closing stale PRs.
> We have quite a few PRs with very little chance of being merged and I would
> certainly appreciate cleaning up those.
> However, I think we should not automate closing PRs for the reasons I gave
> before.
> 
> A tool that reminds us of stale PRs as proposed by Till seems like a good
> idea and might actually help to re-adjust priorities from time to time.
> 
> @Yazdan: Right now there is no assignment happening. Committers decide
> which PRs they want to review and merge.
> 
> Best, Fabian
> 
> 2018-05-16 4:26 GMT+02:00 Yaz Sh :
> 
>> I have questions in this regard (you guys might have addressed it in this
>> email chain):
>> How do PRs get assigned to a reviewer, apart from a contributor tagging someone?
>> What if a PR never gets a reviewer's attention and becomes inactive due to a
>> long review response time? Should a bot assign a reviewer to a PR based on
>> reviewers' interest (i.e., defined via tags) and notify them if a PR is
>> waiting for review?
>> 
>> 
>> Cheers,
>> /Yazdan
>> 
>> 
>>> On May 15, 2018, at 12:06 PM, Thomas Weise  wrote:
>>> 
>>> I like Till's proposal to notify the participants on the PR to PTAL. But
>> I
>>> would also suggest to auto-close when no action is taken, with a friendly
>>> reminder that PRs can be reopened anytime.
>>> 
>>> The current situation with 350 open PRs may send a signal to contributors
>>> that it may actually be too much hassle to get a change committed in
>> Flink.
>>> Since the count keeps on rising, this is also not a past problem. Pruning
>>> inactive PRs may help, that will also give a more accurate picture of the
>>> lack of review bandwidth, if that is the root cause.
>>> 
>>> Thanks,
>>> Thomas
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Tue, May 15, 2018 at 8:24 AM, Ted Yu  wrote:
>>> 
 How does the bot decide whether the PR is waiting for reviews or is
>> being
 abandoned by contributor ?
 
 How about letting the bot count the number of times contributor pings
 committer(s) for review ?
 When unanswered ping count crosses some threshold, say 3, the bot
>> publishes
 the JIRA and PR somewhere.
 
 Cheers
 
 On Tue, May 15, 2018 at 8:19 AM, Till Rohrmann 
 wrote:
 
> I'm a bit torn here because I can see the pros and cons for both sides.
> 
> Maybe a compromise could be to not have a closing but a monitoring bot
> which notifies us about inactive PRs. This could then trigger an
> investigation of the underlying problem and ultimately lead to a
 conscious
> decision to close or keep the PR open. As such it is not strictly
 necessary
> to have such a bot but it would at least remind us to make a decision
 about
> older PRs with no activity.
> 
> Cheers,
> Till
> 
> On Tue, May 15, 2018 at 3:26 PM, Chesnay Schepler 
> wrote:
> 
>> /So far I did it twice for older PRs. In both cases I didn’t get any
>> response and I even forgot in which PRs I had asked this question, so
> now I
>> can not even close them :S/
>> 
>> To be honest this sounds more like an issue with how you organize
>> your
>> work. No amount of closing PRs can fix that.
>> With GitBox we can assign reviewers to a PR, but I'm not sure whether
 it
>> only allows committers to assign people.
>> Bookmarks or text files might help as well./
>> /
>> 
>> /Regarding only 30% blocked on contributor. I wonder what would be
>> this
>> number if we tried to ask in the rest of old PRs “Hey, are you still
>> interested in reviewing/merging this?”.  If old PR is waiting for a
>> reviewer for X months, it doesn’t mean that’s not abandoned. Even if
>> it
>> doesn’t, reducing our overhead by those 30% of the PRs is something./
>> 
>> No doubt the number would be higher if we were to go back, but as I
>> explained earlier that is not a reason to close it. If a PR is
 abandoned
>> because we messed up we should still /try /to get it in.
>> 
>> /This is kind of whole point of what I was proposing. If the PR author
 is
>> still there, and can respond/bump/interrupt the closing timeout,
>> that’s
>> great. Gives us even more sense of urgency 

[jira] [Created] (FLINK-9383) Extend DistributedCache E2E test to cover directories

2018-05-16 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9383:
---

 Summary: Extend DistributedCache E2E test to cover directories
 Key: FLINK-9383
 URL: https://issues.apache.org/jira/browse/FLINK-9383
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime, Tests
Affects Versions: 1.6.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.6.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9382) Inconsistent file storage behavior in FileCache

2018-05-16 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-9382:
---

 Summary: Inconsistent file storage behavior in FileCache
 Key: FLINK-9382
 URL: https://issues.apache.org/jira/browse/FLINK-9382
 Project: Flink
  Issue Type: Bug
  Components: Local Runtime
Affects Versions: 1.6.0
Reporter: Chesnay Schepler


The file storage behavior in the {{FileCache}} is inconsistent and depends 
on whether we're accessing a file or a zipped folder.

In the case of a file, we return the file from the {{PermanentBlobService}} 
as is; in this case the blob server takes care of deleting the file _at some 
point_.
In case of a directory (that is implicitly zipped!) we extract the zip into a 
separate storage directory and delete the zip afterwards (the deletion in 
particular is questionable as it interferes with the blob server).

The underlying issue to me is that the responsibilities between the 
{{PermanentBlobService}} and {{FileCache}} aren't clearly laid out, and they 
actually seem a bit redundant.
We may want to change the {{FileCache}} to be a thin wrapper around the 
{{PermanentBlobService}} that only takes care of unzipping directories and 
setting executable flags.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Adding new interfaces in [Stream]ExecutionEnvironment

2018-05-16 Thread Aljoscha Krettek
I think this functionality is already there, we just have to expose it in the 
right places: ClusterClient.submitJob() takes a JobGraph, JobGraph has method 
addJar() for adding jars that need to be in the classloader for executing a 
user program.
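For illustration, a minimal sketch of wiring that together (the helper class below is hypothetical; JobGraph#addJar and ClusterClient#submitJob are the existing hooks mentioned above):

import org.apache.flink.core.fs.Path;
import org.apache.flink.runtime.jobgraph.JobGraph;

// Hypothetical helper: attach extra jar files to a JobGraph before submission so
// that they end up on the user-code classpath of the executing cluster.
public final class JarAttachment {

    private JarAttachment() {}

    public static JobGraph withUserJars(JobGraph jobGraph, String... jarPaths) {
        for (String jar : jarPaths) {
            // the jar is shipped with the job and added to the user code classloader
            jobGraph.addJar(new Path(jar));
        }
        return jobGraph;
    }
}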

> On 16. May 2018, at 12:34, Fabian Hueske  wrote:
> 
> Hi Ted,
> 
> The design doc is in late draft status and proposes support for SQL DDL
> statements (CREATE TABLE, CREATE  FUNCTION, etc.).
> The question about registering JARs came up because we need a way to
> distribute JAR files that contain the code of user-defined functions.
> 
> The design doc will soon be shared on the dev mailing list to gather
> feedback from the community.
> 
> Best, Fabian
> 
> 2018-05-16 10:45 GMT+02:00 Ted Yu :
> 
>> bq. In a design document, Timo mentioned that we can ship multiple JAR
>> files
>> 
>> Mind telling us where the design doc can be retrieved ?
>> 
>> Thanks
>> 
>> On Wed, May 16, 2018 at 1:29 AM, Fabian Hueske  wrote:
>> 
>>> Hi,
>>> 
>>> I'm not sure if we need to modify the existing method.
>>> What we need is a bit different from what registerCachedFile() provides.
>>> The method ensures that a file is copied to each TaskManager and can be
>>> locally accessed from a function's RuntimeContext.
>>> In our case, we don't need to access the file but would like to make sure
>>> that it is loaded into the class loader.
>>> So, we could also just add a method like registerUserJarFile().
>>> 
>>> In a design document, Timo mentioned that we can ship multiple JAR files
>>> with a job.
>>> So, we could also implement the UDF shipping logic by loading the Jar
>>> file(s) to the client and distribute them from there.
>>> In that case, we would not need to add new method to the execution
>>> environment.
>>> 
>>> Best,
>>> Fabian
>>> 
>>> 2018-05-15 3:50 GMT+02:00 Rong Rong :
>>> 
 +1. This could be very useful for "dynamic" UDF.
 
 Just to clarify, if I understand correctly, we are trying to use an ENUM
 indicator to
 (1) Replace the current Boolean isExecutable flag.
 (2) Provide additional information used by ExecutionEnvironment to
>> decide
 when/where to use the DistributedCached file.
 
 In this case, DistributedCache.CacheType or DistributedCache.FileType
 sounds more intuitive, what do you think?
 
 Also, I was wondering whether there is any other useful information for the
>>> cached
 file to be passed to runtime.
 If we are just talking about including the library to the classloader,
>>> can
 we directly extend the interface with
 
 public void registerCachedFile(
String filePath,
String name,
boolean executable,
boolean includeInClassLoader)
 
 
 Thanks,
 Rong
 
 On Sun, May 13, 2018 at 11:14 PM, Shuyi Chen 
>> wrote:
 
> Hi Flink devs,
> 
> In an effort to support loading external libraries and creating UDFs
>>> from
> external libraries using DDL in Flink SQL, we want to use Flink’s
>> Blob
> Server to distribute the external libraries in runtime and load those
> libraries into the user code classloader automatically.
> 
> However, the current [Stream]ExecutionEnvironment.registerCachedFile
> interface only allows registering executable or non-executable
>>> blobs.
> It’s not possible to tell in runtime if the blob files are libraries
>>> and
> should be loaded into the user code classloader in RuntimeContext.
> Therefore, I want to propose to add an enum called *BlobType*
>>> explicitly
 to
> indicate the type of the Blob file being distributed, and the
>> following
> interface in [Stream]ExecutionEnvironment to support it. In general,
>> I
> think the new BlobType information can be used by Flink runtime to
> preprocess the Blob files if needed.
> 
> */***
> ** Registers a file at the distributed cache under the given name.
>> The
 file
> will be accessible*
> ** from any user-defined function in the (distributed) runtime under
>> a
> local path. Files*
> ** may be local files (as long as all relevant workers have access to
 it),
> or files in a distributed file system.*
> ** The runtime will copy the files temporarily to a local cache, if
> needed.*
> ***
> ** The {@link org.apache.flink.api.common.
>> functions.RuntimeContext}
 can
> be obtained inside UDFs via*
> ** {@link
> org.apache.flink.api.common.functions.RichFunction#
>>> getRuntimeContext()}
> and
> provides access*
> ** {@link org.apache.flink.api.common.ca
> che.DistributedCache} via*
> ** {@link
> org.apache.flink.api.common.functions.RuntimeContext#
> getDistributedCache()}.*
> ***
> ** @param filePath The path of the file, as a URI (e.g.
 

Re: Errors checkpointing to S3 for high-scale jobs

2018-05-16 Thread Stephan Ewen
For posterity: Here is the Jira Issue that tracks this:
https://issues.apache.org/jira/browse/FLINK-9061

On Thu, Mar 22, 2018 at 11:46 PM, Jamie Grier  wrote:

> I think we need to modify the way we write checkpoints to S3 for high-scale
> jobs (those with many total tasks).  The issue is that we are writing all
> the checkpoint data under a common key prefix.  This is the worst case
> scenario for S3 performance since the key is used as a partition key.
>
> In the worst case checkpoints fail with a 500 status code coming back from
> S3 and an internal error type of TooBusyException.
>
> One possible solution would be to add a hook in the Flink filesystem code
> that allows me to "rewrite" paths.  For example say I have the checkpoint
> directory set to:
>
> s3://bucket/flink/checkpoints
>
> I would hook that and rewrite that path to:
>
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the
> original path
>
> This would distribute the checkpoint write load around the S3 cluster
> evenly.
>
> For reference:
> https://aws.amazon.com/premiumsupport/knowledge-
> center/s3-bucket-performance-improve/
>
> Any other people hit this issue?  Any other ideas for solutions?  This is a
> pretty serious problem for people trying to checkpoint to S3.
>
> -Jamie
>
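For posterity as well, the rewrite Jamie sketches boils down to something like the following (illustrative only; the hook to plug this into Flink's checkpoint path handling does not exist yet, which is what FLINK-9061 is about):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustration: prefix the checkpoint path with a short hash so S3 keys are
// spread across partitions instead of sharing one common prefix.
public final class EntropyPath {

    public static String rewrite(String originalPath) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(originalPath.getBytes(StandardCharsets.UTF_8));
        // use the first two bytes as a hex entropy prefix,
        // e.g. s3://bucket/flink/checkpoints -> s3://bucket/1f8a/flink/checkpoints
        String hash = String.format("%02x%02x", digest[0] & 0xFF, digest[1] & 0xFF);
        return originalPath.replaceFirst("^(s3://[^/]+/)", "$1" + hash + "/");
    }
}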


Re: [DISCUSS] GitBox

2018-05-16 Thread Aljoscha Krettek
+1

On Beam, we gradually enabled this, first for the website repo and then for the 
main repo and we didn't run into problems.

> On 16. May 2018, at 12:45, Chesnay Schepler  wrote:
> 
> Forgot an important feature: It would allow committers to close pull requests.
> 
> On 16.05.2018 12:40, Chesnay Schepler wrote:
>> Hello,
>> 
>> during the discussion about how to better manage pull requests [1] the topic 
>> of GitBox integration came up again.
>> 
>> This seems like a good opportunity to restart this discussion that we had 
>> about a year ago [2].
>> 
>> * What is GitBox
>> 
>>   Essentially, GitBox allows us to use GitHub features.
>>   We can decide for ourselves which features we want enabled.
>> 
>>   We could merge PRs directly on GitHub at the click of a button.
>>   That said the merge functionality is fairly limited and would
>>   require picture-perfect commits in the pull requests.
>>   Commits can be squashed, but you cannot amend commits in any way, be
>>   it fixing typos or changing the commit message. Realistically this
>>   limits how much we can use this feature, and it may lead to a
>>   decline in the quality of commit messages.
>> 
>>   Labels can be useful for the management of PRs, such as (ready for review,
>>   delayed for next release, waiting for changes). This is really what
>>   I'm personally most interested in.
>> 
>>   We've been using GitBox for flink-shaded for a while now and I
>>   didn't run into any issues. AFAIK GitBox is also the default for new
>>   projects.
>> 
>> * What this means for committers:
>> 
>>   The apache git remote URL will change, which will require all
>>   committers to update their git setup.
>>   This also implies that we may have to update the website build scripts.
>>   The new URL would (probably) be
>>   /https://gitbox.apache.org/repos/asf/flink.git/.
>> 
>>   To make use of GitHub features you have to link your GitHub and
>>   Apache accounts. [3]
>>   This also requires setting up two-factor authentication on GitHub.
>> 
>>   Update the scm entry in the parent pom.xml.
>> 
>> * What this means for contributors:
>> 
>>   Nothing should change for contributors. Small changes (like typos)
>>   may be merged more quickly, if the commit message is appropriate, as
>>   it could be done directly through GitHub.
>> 
>> [1] 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
>> [2] 
>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html
>> [3] https://gitbox.apache.org/setup/
>> 
> 



[jira] [Created] (FLINK-9381) BlobServer data for a job is not getting cleaned up at JM

2018-05-16 Thread Amit Jain (JIRA)
Amit Jain created FLINK-9381:


 Summary: BlobServer data for a job is not getting cleaned up at JM
 Key: FLINK-9381
 URL: https://issues.apache.org/jira/browse/FLINK-9381
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.5.0
 Environment: Flink 1.5.0 RC3 Commit e725269
Reporter: Amit Jain


We are running Flink 1.5.0 RC3 with YARN as the cluster manager and found that 
the JobManager is getting killed due to an out-of-disk error.

Upon further analysis, we found that the BlobServer data for a job is not 
getting cleaned up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9380) Failing end-to-end tests should not clean up logs

2018-05-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9380:


 Summary: Failing end-to-end tests should not clean up logs
 Key: FLINK-9380
 URL: https://issues.apache.org/jira/browse/FLINK-9380
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.5.0
Reporter: Till Rohrmann


Some of the end-to-end tests clean up their logs even in the failure case. This 
makes debugging and understanding the problem extremely difficult. Ideally, the 
script would say where it stored the respective logs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9379) HA end-to-end test failing locally

2018-05-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-9379:


 Summary: HA end-to-end test failing locally
 Key: FLINK-9379
 URL: https://issues.apache.org/jira/browse/FLINK-9379
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.5.0
Reporter: Till Rohrmann


The HA end-to-end test fails sometimes with

{code}
The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Could not submit 
job 797547d5fd619ea240d4c6690adc9101.
at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:247)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
at 
org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:410)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:781)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:275)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:210)
at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1020)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1096)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1096)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$5(RestClusterClient.java:357)
at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at 
java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.rest.util.RestClientException: [Service temporarily 
unavailable due to an ongoing leader election. Please refresh.]
at 
java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at 
java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at 
java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Service 
temporarily unavailable due to an ongoing leader election. Please refresh.]
at 
org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:225)
at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:209)
at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
... 5 more
{code}

when executing it locally. 

I assume that the test does not properly wait until the cluster is ready for a 
job submission.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] GitBox

2018-05-16 Thread Chesnay Schepler
Forgot an important feature: It would allow committers to close pull 
requests.


On 16.05.2018 12:40, Chesnay Schepler wrote:

Hello,

during the discussion about how to better manage pull requests [1] the 
topic of GitBox integration came up again.


This seems like a good opportunity to restart this discussion that we 
had about a year ago [2].


* What is GitBox

   Essentially, GitBox allows us to use GitHub features.
   We can decide for ourselves which features we want enabled.

   We could merge PRs directly on GitHub at the click of a button.
   That said the merge functionality is fairly limited and would
   require picture-perfect commits in the pull requests.
   Commits can be squashed, but you cannot amend commits in any way, be
   it fixing typos or changing the commit message. Realistically this
   limits how much we can use this feature, and it may lead to a
   decline in the quality of commit messages.

   Labels can be useful for the management of PRs, such as (ready for review,
   delayed for next release, waiting for changes). This is really what
   I'm personally most interested in.

   We've been using GitBox for flink-shaded for a while now and I
   didn't run into any issues. AFAIK GitBox is also the default for new
   projects.

* What this means for committers:

   The apache git remote URL will change, which will require all
   committers to update their git setup.
   This also implies that we may have to update the website build 
scripts.

   The new URL would (probably) be
   /https://gitbox.apache.org/repos/asf/flink.git/.

   To make use of GitHub features you have to link your GitHub and
   Apache accounts. [3]
   This also requires setting up two-factor authentication on GitHub.

   Update the scm entry in the parent pom.xml.

* What this means for contributors:

   Nothing should change for contributors. Small changes (like typos)
   may be merged more quickly, if the commit message is appropriate, as
   it could be done directly through GitHub.

[1] 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
[2] 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html

[3] https://gitbox.apache.org/setup/





[DISCUSS] GitBox

2018-05-16 Thread Chesnay Schepler

Hello,

during the discussion about how to better manage pull requests [1] the 
topic of GitBox integration came up again.


This seems like a good opportunity to restart this discussion that we 
had about a year ago [2].


* What is GitBox

   Essentially, GitBox allows us to use GitHub features.
   We can decide for ourselves which features we want enabled.

   We could merge PRs directly on GitHub at the click of a button.
   That said the merge functionality is fairly limited and would
   require picture-perfect commits in the pull requests.
   Commits can be squashed, but you cannot amend commits in any way, be
   it fixing typos or changing the commit message. Realistically this
   limits how much we can use this feature, and it may lead to a
   decline in the quality of commit messages.

   Labels can be useful for the management of PRs, such as (ready for review,
   delayed for next release, waiting for changes). This is really what
   I'm personally most interested in.

   We've been using GitBox for flink-shaded for a while now and I
   didn't run into any issues. AFAIK GitBox is also the default for new
   projects.

* What this means for committers:

   The apache git remote URL will change, which will require all
   committers to update their git setup.
   This also implies that we may have to update the website build scripts.
   The new URL would (probably) be
   /https://gitbox.apache.org/repos/asf/flink.git/.

   To make use of GitHub features you have to link your GitHub and
   Apache accounts. [3]
   This also requires setting up two-factor authentication on GitHub.

   Update the scm entry in the parent pom.xml.

* What this means for contributors:

   Nothing should change for contributors. Small changes (like typos)
   may be merged more quickly, if the commit message is appropriate, as
   it could be done directly through GitHub.

[1] 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Closing-automatically-inactive-pull-requests-tt22248.html
[2] 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-GitBox-td18027.html

[3] https://gitbox.apache.org/setup/


Re: [DISCUSS] Adding new interfaces in [Stream]ExecutionEnvironment

2018-05-16 Thread Fabian Hueske
Hi Ted,

The design doc is in late draft status and proposes support for SQL DDL
statements (CREATE TABLE, CREATE  FUNCTION, etc.).
The question about registering JARs came up because we need a way to
distribute JAR files that contain the code of user-defined functions.

The design doc will soon be shared on the dev mailing list to gather
feedback from the community.

Best, Fabian

2018-05-16 10:45 GMT+02:00 Ted Yu :

> bq. In a design document, Timo mentioned that we can ship multiple JAR
> files
>
> Mind telling us where the design doc can be retrieved ?
>
> Thanks
>
> On Wed, May 16, 2018 at 1:29 AM, Fabian Hueske  wrote:
>
> > Hi,
> >
> > I'm not sure if we need to modify the existing method.
> > What we need is a bit different from what registerCachedFile() provides.
> > The method ensures that a file is copied to each TaskManager and can be
> > locally accessed from a function's RuntimeContext.
> > In our case, we don't need to access the file but would like to make sure
> > that it is loaded into the class loader.
> > So, we could also just add a method like registerUserJarFile().
> >
> > In a design document, Timo mentioned that we can ship multiple JAR files
> > with a job.
> > So, we could also implement the UDF shipping logic by loading the Jar
> > file(s) to the client and distribute them from there.
> > In that case, we would not need to add new method to the execution
> > environment.
> >
> > Best,
> > Fabian
> >
> > 2018-05-15 3:50 GMT+02:00 Rong Rong :
> >
> > > +1. This could be very useful for "dynamic" UDF.
> > >
> > > Just to clarify, if I understand correctly, we are trying to use an ENUM
> > > indicator to
> > > (1) Replace the current Boolean isExecutable flag.
> > > (2) Provide additional information used by ExecutionEnvironment to
> decide
> > > when/where to use the DistributedCached file.
> > >
> > > In this case, DistributedCache.CacheType or DistributedCache.FileType
> > > sounds more intuitive, what do you think?
> > >
> > > Also, I was wondering whether there is any other useful information for the
> > cached
> > > file to be passed to runtime.
> > > If we are just talking about including the library to the classloader,
> > can
> > > we directly extend the interface with
> > >
> > > public void registerCachedFile(
> > > String filePath,
> > > String name,
> > > boolean executable,
> > > boolean includeInClassLoader)
> > >
> > >
> > > Thanks,
> > > Rong
> > >
> > > On Sun, May 13, 2018 at 11:14 PM, Shuyi Chen 
> wrote:
> > >
> > > > Hi Flink devs,
> > > >
> > > > In an effort to support loading external libraries and creating UDFs
> > from
> > > > external libraries using DDL in Flink SQL, we want to use Flink’s
> Blob
> > > > Server to distribute the external libraries in runtime and load those
> > > > libraries into the user code classloader automatically.
> > > >
> > > > However, the current [Stream]ExecutionEnvironment.registerCachedFile
> > > > interface only allows registering executable or non-executable
> > blobs.
> > > > It’s not possible to tell in runtime if the blob files are libraries
> > and
> > > > should be loaded into the user code classloader in RuntimeContext.
> > > > Therefore, I want to propose to add an enum called *BlobType*
> > explicitly
> > > to
> > > > indicate the type of the Blob file being distributed, and the
> following
> > > > interface in [Stream]ExecutionEnvironment to support it. In general,
> I
> > > > think the new BlobType information can be used by Flink runtime to
> > > > preprocess the Blob files if needed.
> > > >
> > > > */***
> > > > ** Registers a file at the distributed cache under the given name.
> The
> > > file
> > > > will be accessible*
> > > > ** from any user-defined function in the (distributed) runtime under
> a
> > > > local path. Files*
> > > > ** may be local files (as long as all relevant workers have access to
> > > it),
> > > > or files in a distributed file system.*
> > > > ** The runtime will copy the files temporarily to a local cache, if
> > > > needed.*
> > > > ***
> > > > ** The {@link org.apache.flink.api.common.
> functions.RuntimeContext}
> > > can
> > > > be obtained inside UDFs via*
> > > > ** {@link
> > > > org.apache.flink.api.common.functions.RichFunction#
> > getRuntimeContext()}
> > > > and
> > > > provides access*
> > > > ** {@link org.apache.flink.api.common.ca
> > > > che.DistributedCache} via*
> > > > ** {@link
> > > > org.apache.flink.api.common.functions.RuntimeContext#
> > > > getDistributedCache()}.*
> > > > ***
> > > > ** @param filePath The path of the file, as a URI (e.g.
> > > "file:///some/path"
> > > > or "hdfs://host:port/and/path")*
> > > > ** @param name The name under which the file is registered.*
> > > > ** @param blobType indicating the type of the Blob file*
> > > > **/*
> > > >
> > > > *public void registerCachedFile(String filePath, String name,
> 

Re: Elasticsearch Sink

2018-05-16 Thread Tzu-Li (Gordon) Tai
Good to know! Thanks a lot for pushing this Christophe.

Please ping me when the new PR is opened, and we can continue the discussion 
there.
Ideally we get this in early in the 1.6 release cycle, so that the 
Elasticsearch e2e tests (I will be merging a PR for that soon) can catch 
anything unexpected.

Cheers,
Gordon

On 16 May 2018 at 4:26:36 PM, Christophe Jolif (cjo...@gmail.com) wrote:

Ok thanks for the feedback. 

> I agree. IIRC, the ES PRs that were opened also did this by changing the 
>return type from Client to AutoCloseable, as well as letting the call bridge 
>also handle creation of BulkProcessors, correct?

Correct.

> Instead, we maintain our own request class to abstract those classes away, 
>and only create the actual Elasticsearch requests internally.

I'm personally unsure I would like to have to use Flink specific request 
classes instead of Elastic ones but apart from that I think we are pretty much 
inline. 

I'm not exactly sure when I'll get the cycles but I will try to issue yet 
another PR along those lines so we can continue the discussion from there and 
hopefully we would get something in time for 1.6!

Thanks again,
--
Christophe

On Wed, May 16, 2018 at 7:19 AM, Tzu-Li (Gordon) Tai  
wrote:
Hi,

What if the user in a ES5.3+ case is calling the deprecated method? You 
agree it will fail? I'm not necessarily against that. I just want to make 
it clear that we don't have a perfect solution here either?

I think what we could do here is at least in the deprecated 
`add(ActionRequest)` method, try casting to either `IndexRequest` or 
`DeleteRequest` and forward calls to the new methods.
As a matter of fact, this is exactly what the Elasticsearch BulkProcessor API 
is doing internally [1] [2] [3], so we should be safe here.
Looking at the code in [1] [2] [3], we should actually also consider it being a 
`UpdateRequest` (just to correct my previous statement that it can only be 
either index or delete).
But yes, I think there wouldn’t be a perfect solution here; something has to be 
broken / deprecated in order for our implementation to be more future proof.
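For illustration, that forwarding could look roughly like the sketch below (the interface and the type-specific add(...) methods are assumptions about the reworked RequestIndexer, not the current API):

import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.update.UpdateRequest;

// Sketch only: the deprecated ActionRequest-based method forwards to type-specific methods.
public interface TypedRequestIndexer {

    void add(IndexRequest... requests);
    void add(DeleteRequest... requests);
    void add(UpdateRequest... requests);

    @Deprecated
    default void add(ActionRequest... requests) {
        for (ActionRequest request : requests) {
            if (request instanceof IndexRequest) {
                add((IndexRequest) request);
            } else if (request instanceof DeleteRequest) {
                add((DeleteRequest) request);
            } else if (request instanceof UpdateRequest) {
                add((UpdateRequest) request);
            } else {
                throw new IllegalArgumentException(
                        "Only IndexRequest, DeleteRequest and UpdateRequest are supported");
            }
        }
    }
}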

indeed we always considered that among 
ActionRequest only delete and index were actually supported (which makes 
sense to me). 

Looking at the code in [1] [2] [3], we should actually also consider it being a 
`UpdateRequest` (just to correct my previous statement that it can only be 
either index or delete).
And yes, the Elasticsearch Javadocs of the BulkProcessor also clearly state 
this.

For the REST part, there is another incompatibility at the "Client" API 
level. Indeed the RestClient is not implementing the "Client" interface 
that is exposed in the bridge. So "just" doing the change above would still 
not allow to provide a REST based implementation? 

I agree. IIRC, the ES PRs that were opened also did this by changing the return 
type from Client to AutoCloseable, as well as letting the call bridge also 
handle creation of BulkProcessors, correct?
I think this is definitely the way to go for us to be more future proof.
The general rule of thumb is that, for all APIs (either APIs of the base module 
that the ES connectors internally use, or user-facing APIs of the connector), 
we should move towards not leaking Elasticsearch classes as the API.
This goes for removing Client as the return type in the call bridge interface. 
When re-working the RequestIndexer interface, we can even consider not directly 
exposing the `IndexRequest`, `DeleteRequest`, `UpdateRequest` classes as the 
API.
Instead, we maintain our own request class to abstract those classes away, and 
only create the actual Elasticsearch requests internally.

Cheers,
Gordon

[1] 
https://github.com/elastic/elasticsearch/blob/v5.2.2/core/src/main/java/org/elasticsearch/action/bulk/BulkRequest.java#L110
[2] 
https://github.com/elastic/elasticsearch/blob/v6.2.4/server/src/main/java/org/elasticsearch/action/bulk/BulkRequest.java#L128
On 16 May 2018 at 4:12:15 AM, Christophe Jolif (cjo...@gmail.com) wrote:

Hi Gordon,

On Tue, May 15, 2018 at 6:16 AM, Tzu-Li (Gordon) Tai 
wrote:

> Hi,
>
> Let me first clarify a few things so that we are on the same page here:
>
> 1. The reason that we are looking into having a new base-module for
> versions 5.3+ is that
> new Elasticsearch BulkProcessor APIs are breaking some of our original
> base API assumptions.
> This is noticeable from the intended cast here [1].
>
> 2. Given that we are looking into a new base module, Christophe proposed
> that we focus on designing
> the new base module around Elasticsearch’s new REST API, so that we are
> more-future proof.
>
> 3. I proposed to remove / deprecate support for earlier versions, because
> once we introduce the new
> base module, we would be maintaining a lot of Elasticsearch connector
> modules (2 base modules, 4+ version specific).
> Of course, whether or not this really is a problem depends on how much
> capacity the 

[jira] [Created] (FLINK-9378) Improve TableException message with TypeName usage

2018-05-16 Thread Sergey Nuyanzin (JIRA)
Sergey Nuyanzin created FLINK-9378:
--

 Summary: Improve TableException message with TypeName usage
 Key: FLINK-9378
 URL: https://issues.apache.org/jira/browse/FLINK-9378
 Project: Flink
  Issue Type: Improvement
  Components: Table API  SQL
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin


Currently, TableException uses the simple class name. It is not clear what the 
issue is when the error message looks like: {noformat}
Exception in thread "main" org.apache.flink.table.api.TableException: Result 
field does not match requested type. Requested: Date; Actual: Date
at 
org.apache.flink.table.api.TableEnvironment$$anonfun$generateRowConverterFunction$1.apply(TableEnvironment.scala:953)
{noformat}
or
{noformat}Caused by: org.apache.flink.table.api.TableException: Type is not 
supported: Date
at 
org.apache.flink.table.api.TableException$.apply(exceptions.scala:53){noformat}
For more details, have a look at FLINK-9341.
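For illustration, assume the two clashing types were java.sql.Date and java.util.Date (an assumption; the report does not say which). Using the canonical instead of the simple name makes the mismatch visible:
{code}
public class TableExceptionMessageExample {
    public static void main(String[] args) {
        Class<?> requested = java.sql.Date.class;  // assumed requested type
        Class<?> actual = java.util.Date.class;    // assumed actual type

        // current style -> "Requested: Date; Actual: Date" (ambiguous)
        System.out.println("Requested: " + requested.getSimpleName()
                + "; Actual: " + actual.getSimpleName());

        // suggested style -> "Requested: java.sql.Date; Actual: java.util.Date"
        System.out.println("Requested: " + requested.getCanonicalName()
                + "; Actual: " + actual.getCanonicalName());
    }
}
{code}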



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9377) Remove writing serializers as part of the checkpoint meta information

2018-05-16 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-9377:
--

 Summary: Remove writing serializers as part of the checkpoint meta 
information
 Key: FLINK-9377
 URL: https://issues.apache.org/jira/browse/FLINK-9377
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai
 Fix For: 1.6.0


When writing meta information of a state in savepoints, we currently write both 
the state serializer as well as the state serializer's configuration snapshot.

Writing both is actually redundant, as most of the time they have identical 
information.
Moreover, the fact that we use Java serialization to write the serializer and 
rely on it to be re-readable on the restore run already poses problems for 
serializers such as the {{AvroSerializer}} (see discussion in FLINK-9202).

The proposal here is to leave only the config snapshot as meta information, and 
use that as the single source of truth of information about the schema of 
serialized state.
The config snapshot should be treated as a factory (or provided to a factory) 
to re-create serializers capable of reading old, serialized state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-9376) Allow upgrading to incompatible state serializers (state schema evolution)

2018-05-16 Thread Tzu-Li (Gordon) Tai (JIRA)
Tzu-Li (Gordon) Tai created FLINK-9376:
--

 Summary: Allow upgrading to incompatible state serializers (state 
schema evolution)
 Key: FLINK-9376
 URL: https://issues.apache.org/jira/browse/FLINK-9376
 Project: Flink
  Issue Type: New Feature
  Components: State Backends, Checkpointing, Type Serialization System
Reporter: Tzu-Li (Gordon) Tai
 Fix For: 1.6.0


Currently, users are able to upgrade state serializers on the restore run of 
a stateful job, as long as the upgraded new serializer remains backwards 
compatible with all previous written data in the savepoint (i.e. it can read 
all previous and current schema of serialized state objects).

What is still lacking is the ability to upgrade to incompatible serializers. 
When an incompatible serializer is registered for existing restored state, 
that state needs to go through the following process:
1. read serialized state with the previous serializer
2. passing each deserialized state object through a “migration map function”, 
and
3. writing back the state with the new serializer

This should be strictly limited to state registrations that occur before the 
actual processing begins (e.g. in the `open` or `initializeState` methods), so 
that we avoid performing these operations during processing.

Procedure 2. will allow even state type migrations, but that is out-of-scope of 
this JIRA.
This ticket focuses only on procedures 1. and 3., where we try to enable schema 
evolution without state type changes.

This is an umbrella JIRA ticket that tracks this feature, including a few 
preliminary tasks that work towards enabling it.
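A conceptual sketch of procedures 1. and 3. (class and method names are illustrative, not actual Flink internals):
{code}
import java.io.IOException;
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.core.memory.DataInputView;
import org.apache.flink.core.memory.DataOutputView;

// Illustrative sketch: read each state entry with the previous serializer and
// write it back with the newly registered (incompatible) serializer.
public final class StateMigrationSketch {

    public static <T> void migrate(
            TypeSerializer<T> previousSerializer,
            TypeSerializer<T> newSerializer,
            DataInputView in,
            DataOutputView out,
            long numEntries) throws IOException {

        for (long i = 0; i < numEntries; i++) {
            T value = previousSerializer.deserialize(in); // 1. read with the previous serializer
            // 2. (out of scope here) a migration map function could transform 'value'
            newSerializer.serialize(value, out);          // 3. write back with the new serializer
        }
    }
}
{code}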



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] Drop "canEqual" from TypeInformation, TypeSerializer, etc.

2018-05-16 Thread Timo Walther

+1

TypeInformation has too many methods that need to be implemented but 
provide little benefit for Flink.


Am 16.05.18 um 10:55 schrieb Ted Yu:

+1 from me as well.

I checked a few serializer classes. The `equals` method on serializers
contains the logic of the `canEqual` method, whose existence seems redundant.

On Wed, May 16, 2018 at 1:49 AM, Tzu-Li (Gordon) Tai 
wrote:


+1.

Looking at the implementations of the `canEqual` method in several
serializers, it seems like all that is done is a check whether the object
is of the same serializer class.
We’ll have to be careful and double check all `equals` method on
serializers that may have relied on the `canEqual` method to perform the
preliminary type check.
Otherwise, this sounds good.

On 16 May 2018 at 4:35:47 PM, Stephan Ewen (se...@apache.org) wrote:

Hi all!

As part of an attempt to simplify some code in the TypeInfo and
TypeSerializer area, I would like to drop the "canEqual" methods for the
following reason:

"canEqual()" is necessary to make proper equality checks across
hierarchies
of types. This is for example useful in a collection API, stating for
example whether a List can be equal to a Collection if they have the same
contents. We don't have that here.

A certain type information (and serializer) is equal to another one if
they
describe the same type, strictly. There is no necessity for cross
hierarchy
checks.

This has also led to the situation that most type infos and serializers
implement just a dummy/default version of "canEqual". Many "equals()"
methods do not even call the other object's "canEqual", etc.

As a first step, we could simply deprecate the method and implement an
empty default, and remove all calls to that method.

Best,
Stephan





Re: Rewriting a new file instead of writing a ".valid-length" file inBucketSink when restoring

2018-05-16 Thread Xinyu Zhang
Hi Till


Thanks for your suggestion. A small tool could run lightly and asynchronously. 
However, I don't know when others will use the data, so I would have to use the 
tool to check and truncate a finished file as soon as a valid-length file is 
found. I think such a tool is hard to maintain, and it shouldn't have to be 
maintained by users (it should work just like the current implementation of 
BucketingSink with the truncate function). 
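(For reference, the core truncation step such a tool would perform is roughly the sketch below, assuming a Hadoop 2.7+ FileSystem and that the valid length has already been read from the .valid-length file; discovering those files and cleaning them up is the part that would have to be maintained.)

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: truncate a finished part file down to its valid length (needs Hadoop 2.7+).
public final class ValidLengthTruncator {

    public static void truncateToValidLength(FileSystem fs, Path dataFile, long validLength)
            throws IOException, InterruptedException {
        if (fs.getFileStatus(dataFile).getLen() <= validLength) {
            return; // nothing to do
        }
        // truncate() may complete asynchronously; 'false' means recovery is still in progress
        if (!fs.truncate(dataFile, validLength)) {
            while (fs.getFileStatus(dataFile).getLen() != validLength) {
                Thread.sleep(100L);
            }
        }
    }
}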


Regards,
Zhang Xinyu


-- Original Message --
From: "Till Rohrmann";
Date: May 15, 2018 (Tue), 11:27
To: "dev";
Cc: "kkloudas";
Subject: Re: Rewriting a new file instead of writing a ".valid-length" file 
inBucketSink when restoring



Hi Xinyu,

would it help to have a small tool which can truncate the finished files
which have a valid-length file associated? That way, one could use this
> tool before others are using the data further downstream.

Cheers,
Till

On Tue, May 15, 2018 at 3:05 PM, Xinyu Zhang <342689...@qq.com> wrote:

> Yes, I'm glad to do it. but I'm not sure writing a new file is a good
> solution. So I want to discuss it here.
> Do you have any ideas? @Kostas
>
>
>
>
> -- Original Message --
> From: "twalthr";
> Date: May 15, 2018 (Tue), 8:21
> To: "Xinyu Zhang"<342689...@qq.com>;
> Cc: "dev"; "kkloudas";
> Subject: Re: Rewriting a new file instead of writing a ".valid-length" file 
> inBucketSink when restoring
>
>
>
> As far as I know, the bucketing sink is currently also limited by
> relying on Hadoop's file system abstraction. It is planned to switch to
> Flink's file system abstraction which might also improve this situation.
> Kostas (in CC) might know more about it.
>
> But I think we can discuss whether another behavior should be configurable 
> as well. Would you be willing to contribute?
>
> Regards,
> Timo
>
>
> Am 15.05.18 um 14:01 schrieb Xinyu Zhang:
> > Thanks for your reply.
> > Indeed, if a file is very large, it will take a long time. However,
> > the the ??.valid-length?? file is not convenient for others who use the
> > data in HDFS.
> > Maybe we should provide a configuration for users to choose which
> > strategy they prefer.
> > Do you have any ideas?
> >
> >
> > -- Original Message --
> > *From:* "Timo Walther";
> > *Date:* May 15, 2018 (Tue), 7:30
> > *To:* "dev";
> > *Subject:* Re: Rewriting a new file instead of writing a ".valid-length"
> > file inBucketSink when restoring
> >
> > I guess writing a new file would take much longer than just using the
> > .valid-length file, especially if the files are very large. The
> > restoring time should be as minimal as possible to ensure little
> > downtime on restarts.
> >
> > Regards,
> > Timo
> >
> >
> > Am 15.05.18 um 09:31 schrieb Gary Yao:
> > > Hi,
> > >
> > > The BucketingSink truncates the file if the Hadoop FileSystem
> > supports this
> > > operation (Hadoop 2.7 and above) [1]. What version of Hadoop are you
> > using?
> > >
> > > Best,
> > > Gary
> > >
> > > [1]
> > >
> > https://github.com/apache/flink/blob/bcd028d75b0e5c5c691e24640a2196
> b2fdaf85e0/flink-connectors/flink-connector-filesystem/
> src/main/java/org/apache/flink/streaming/connectors/fs/
> bucketing/BucketingSink.java#L301
> > >
> > > On Mon, May 14, 2018 at 1:37 PM, Xinyu Zhang <342689...@qq.com> wrote:
> > >
> > >> Hi
> > >>
> > >>
> > >> I'm trying to copy data from kafka to HDFS . The data in HDFS is
> > used to
> > >> do other computations by others in map/reduce.
> > >> If some tasks failed, the ".valid-length" file is created for the low
> > >> version hadoop. The problem is other people must know how to deal
> > with the
> > >> ".valid-length" file, otherwise, the data may be not exactly-once.
> > >> Hence, why not rewrite a new file when restoring instead of writing a
> > >> ".valid-length" file. In this way, others who use the data in HDFS
> > don't
> > >> need to know how to deal with the ".valid-length" file.
> > >>
> > >>
> > >> Thanks!
> > >>
> > >>
> > >> Zhang Xinyu
> >
>

Re: Closing (automatically?) inactive pull requests

2018-05-16 Thread Fabian Hueske
Hi,

I'm not objecting closing stale PRs.
We have quite a few PRs with very little chance of being merged and I would
certainly appreciate cleaning up those.
However, I think we should not automate closing PRs for the reasons I gave
before.

A tool that reminds us of stale PRs as proposed by Till seems like a good
idea and might actually help to re-adjust priorities from time to time.

@Yazdan: Right now there is no assignment happening. Committers decide
which PRs they want to review and merge.

Best, Fabian

2018-05-16 4:26 GMT+02:00 Yaz Sh :

> I have questions in this regard (you guys might have addressed it in this
> email chain):
>  How do PRs get assigned to a reviewer, apart from a contributor tagging someone?
> What if a PR never gets a reviewer's attention and becomes inactive due to a
> long review response time? Should a bot assign a reviewer to a PR based on
> reviewers' interest (i.e., defined via tags) and notify them if a PR is
> waiting for review?
>
>
> Cheers,
> /Yazdan
>
>
> > On May 15, 2018, at 12:06 PM, Thomas Weise  wrote:
> >
> > I like Till's proposal to notify the participants on the PR to PTAL. But
> I
> > would also suggest to auto-close when no action is taken, with a friendly
> > reminder that PRs can be reopened anytime.
> >
> > The current situation with 350 open PRs may send a signal to contributors
> > that it may actually be too much hassle to get a change committed in
> Flink.
> > Since the count keeps on rising, this is also not a past problem. Pruning
> > inactive PRs may help, that will also give a more accurate picture of the
> > lack of review bandwidth, if that is the root cause.
> >
> > Thanks,
> > Thomas
> >
> >
> >
> >
> >
> > On Tue, May 15, 2018 at 8:24 AM, Ted Yu  wrote:
> >
> >> How does the bot decide whether the PR is waiting for reviews or is
> being
> >> abandoned by contributor ?
> >>
> >> How about letting the bot count the number of times contributor pings
> >> committer(s) for review ?
> >> When unanswered ping count crosses some threshold, say 3, the bot
> publishes
> >> the JIRA and PR somewhere.
> >>
> >> Cheers
> >>
> >> On Tue, May 15, 2018 at 8:19 AM, Till Rohrmann 
> >> wrote:
> >>
> >>> I'm a bit torn here because I can see the pros and cons for both sides.
> >>>
> >>> Maybe a compromise could be to not have a closing but a monitoring bot
> >>> which notifies us about inactive PRs. This could then trigger an
> >>> investigation of the underlying problem and ultimately lead to a
> >> conscious
> >>> decision to close or keep the PR open. As such it is not strictly
> >> necessary
> >>> to have such a bot but it would at least remind us to make a decision
> >> about
> >>> older PRs with no activity.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Tue, May 15, 2018 at 3:26 PM, Chesnay Schepler 
> >>> wrote:
> >>>
>  /So far I did it twice for older PRs. In both cases I didn’t get any
>  response and I even forgot in which PRs I had asked this question, so
> >>> now I
>  can not even close them :S/
> 
>  To be honest this sounds more like an issue with how you organize
> your
>  work. No amount of closing PRs can fix that.
>  With GitBox we can assign reviewers to a PR, but I'm not sure whether
> >> it
>  only allows committers to assign people.
>  Bookmarks or text files might help as well./
>  /
> 
>  /Regarding only 30% blocked on contributor. I wonder what would be
> this
>  number if we tried to ask in the rest of old PRs “Hey, are you still
>  interested in reviewing/merging this?”.  If old PR is waiting for a
>  reviewer for X months, it doesn’t mean that’s not abandoned. Even if
> it
>  doesn’t, reducing our overhead by those 30% of the PRs is something./
> 
>  No doubt the number would be higher if we were to go back, but as I
>  explained earlier that is not a reason to close it. If a PR is
> >> abandoned
>  because we messed up we should still /try /to get it in.
> 
>  /This is kind of whole point of what I was proposing. If the PR author
> >> is
>  still there, and can respond/bump/interrupt the closing timeout,
> that’s
>  great. Gives us even more sense of urgency to review it./
> 
>  Unfortunately knowing that it is more urgent is irrelevant, as we
>  currently don't have the manpower to review them. Reviving them now
> >> would
>  serve no purpose. The alternative is to wait until more people become
>  active reviewers.
> 
>  /As a last resort, closing PR after timeout is not the end of the
> >> world.
>  It always can be reopened./
> 
>  Let's be realistic here, it will not be reopened.
> 
> 
>  On 15.05.2018 14:21, Piotr Nowojski wrote:
> 
> > I agree that we have other, even more important, problems with
> >> reviewing
> > PR and community, but that shouldn’t block us from trying to clean up
> 

Re: [DISCUSS] Drop "canEqual" from TypeInformation, TypeSerializer, etc.

2018-05-16 Thread Ted Yu
+1 from me as well.

I checked a few serializer classes. The `equals` method on serializers
contains the logic of the `canEqual` method, whose existence seems redundant.

On Wed, May 16, 2018 at 1:49 AM, Tzu-Li (Gordon) Tai 
wrote:

> +1.
>
> Looking at the implementations of the `canEqual` method in several
> serializers, it seems like all that is done is a check whether the object
> is of the same serializer class.
> We’ll have to be careful and double check all `equals` method on
> serializers that may have relied on the `canEqual` method to perform the
> preliminary type check.
> Otherwise, this sounds good.
>
> On 16 May 2018 at 4:35:47 PM, Stephan Ewen (se...@apache.org) wrote:
>
> Hi all!
>
> As part of an attempt to simplify some code in the TypeInfo and
> TypeSerializer area, I would like to drop the "canEqual" methods for the
> following reason:
>
> "canEqual()" is necessary to make proper equality checks across
> hierarchies
> of types. This is for example useful in a collection API, stating for
> example whether a List can be equal to a Collection if they have the same
> contents. We don't have that here.
>
> A certain type information (and serializer) is equal to another one if
> they
> describe the same type, strictly. There is no necessity for cross
> hierarchy
> checks.
>
> This has also led to the situation that most type infos and serializers
> implement just a dummy/default version of "canEqual". Many "equals()"
> methods do not even call the other object's "canEqual", etc.
>
> As a first step, we could simply deprecate the method and implement an
> empty default, and remove all calls to that method.
>
> Best,
> Stephan
>


Re: [DISCUSS] Drop "canEqual" from TypeInformation, TypeSerializer, etc.

2018-05-16 Thread Tzu-Li (Gordon) Tai
+1.

Looking at the implementations of the `canEqual` method in several serializers, 
it seems like all that is done is a check whether the object is of the same 
serializer class.
We’ll have to be careful and double check all `equals` method on serializers 
that may have relied on the `canEqual` method to perform the preliminary type 
check.
Otherwise, this sounds good.

On 16 May 2018 at 4:35:47 PM, Stephan Ewen (se...@apache.org) wrote:

Hi all!  

As part of an attempt to simplify some code in the TypeInfo and  
TypeSerializer area, I would like to drop the "canEqual" methods for the  
following reason:  

"canEqual()" is necessary to make proper equality checks across hierarchies  
of types. This is for example useful in a collection API, stating for  
example whether a List can be equal to a Collection if they have the same  
contents. We don't have that here.  

A certain type information (and serializer) is equal to another one if they  
describe the same type, strictly. There is no necessity for cross hierarchy  
checks.  

This has also led to the situation that most type infos and serializers
implement just a dummy/default version of "canEqual". Many "equals()"  
methods do not even call the other object's "canEqual", etc.  

As a first step, we could simply deprecate the method and implement an  
empty default, and remove all calls to that method.  

Best,  
Stephan  
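To illustrate the direction with a sketch (a hypothetical serializer-like class, not an actual Flink one): a strict class check in equals() makes canEqual() unnecessary:

// Sketch only: strict equality without canEqual, for a hypothetical serializer-like class.
public final class MySerializer {

    private final boolean compactFormat; // hypothetical configuration field

    public MySerializer(boolean compactFormat) {
        this.compactFormat = compactFormat;
    }

    @Override
    public boolean equals(Object obj) {
        // the strict class check replaces the canEqual() indirection
        if (obj == null || obj.getClass() != getClass()) {
            return false;
        }
        MySerializer other = (MySerializer) obj;
        return this.compactFormat == other.compactFormat;
    }

    @Override
    public int hashCode() {
        return Boolean.hashCode(compactFormat);
    }
}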


Re: [DISCUSS] Adding new interfaces in [Stream]ExecutionEnvironment

2018-05-16 Thread Ted Yu
bq. In a design document, Timo mentioned that we can ship multiple JAR files

Mind telling us where the design doc can be retrieved ?

Thanks

On Wed, May 16, 2018 at 1:29 AM, Fabian Hueske  wrote:

> Hi,
>
> I'm not sure if we need to modify the existing method.
> What we need is a bit different from what registerCachedFile() provides.
> The method ensures that a file is copied to each TaskManager and can be
> locally accessed from a function's RuntimeContext.
> In our case, we don't need to access the file but would like to make sure
> that it is loaded into the class loader.
> So, we could also just add a method like registerUserJarFile().
>
> In a design document, Timo mentioned that we can ship multiple JAR files
> with a job.
> So, we could also implement the UDF shipping logic by loading the Jar
> file(s) to the client and distribute them from there.
> In that case, we would not need to add new method to the execution
> environment.
>
> Best,
> Fabian
>
> 2018-05-15 3:50 GMT+02:00 Rong Rong :
>
> > +1. This could be very useful for "dynamic" UDF.
> >
> > Just to clarify, if I understand correctly, we are trying to use an ENUM
> > indicator to
> > (1) Replace the current Boolean isExecutable flag.
> > (2) Provide additional information used by ExecutionEnvironment to decide
> > when/where to use the DistributedCached file.
> >
> > In this case, DistributedCache.CacheType or DistributedCache.FileType
> > sounds more intuitive, what do you think?
> >
> > Also, I was wondering whether there is any other useful information for the
> cached
> > file to be passed to runtime.
> > If we are just talking about including the library to the classloader,
> can
> > we directly extend the interface with
> >
> > public void registerCachedFile(
> > String filePath,
> > String name,
> > boolean executable,
> > boolean includeInClassLoader)
> >
> >
> > Thanks,
> > Rong
> >
> > On Sun, May 13, 2018 at 11:14 PM, Shuyi Chen  wrote:
> >
> > > Hi Flink devs,
> > >
> > > In an effort to support loading external libraries and creating UDFs
> from
> > > external libraries using DDL in Flink SQL, we want to use Flink’s Blob
> > > Server to distribute the external libraries in runtime and load those
> > > libraries into the user code classloader automatically.
> > >
> > > However, the current [Stream]ExecutionEnvironment.registerCachedFile
> > > interface only allows registering executable or non-executable
> blobs.
> > > It’s not possible to tell in runtime if the blob files are libraries
> and
> > > should be loaded into the user code classloader in RuntimeContext.
> > > Therefore, I want to propose to add an enum called *BlobType*
> explicitly
> > to
> > > indicate the type of the Blob file being distributed, and the following
> > > interface in [Stream]ExecutionEnvironment to support it. In general, I
> > > think the new BlobType information can be used by Flink runtime to
> > > preprocess the Blob files if needed.
> > >
> > > */***
> > > ** Registers a file at the distributed cache under the given name. The
> > file
> > > will be accessible*
> > > ** from any user-defined function in the (distributed) runtime under a
> > > local path. Files*
> > > ** may be local files (as long as all relevant workers have access to
> > it),
> > > or files in a distributed file system.*
> > > ** The runtime will copy the files temporarily to a local cache, if
> > > needed.*
> > > ***
> > > ** The {@link org.apache.flink.api.common.functions.RuntimeContext}
> > can
> > > be obtained inside UDFs via*
> > > ** {@link
> > > org.apache.flink.api.common.functions.RichFunction#
> getRuntimeContext()}
> > > and
> > > provides access*
> > > ** {@link org.apache.flink.api.common.cache.DistributedCache} via*
> > > ** {@link
> > > org.apache.flink.api.common.functions.RuntimeContext#
> > > getDistributedCache()}.*
> > > ***
> > > ** @param filePath The path of the file, as a URI (e.g.
> > "file:///some/path"
> > > or "hdfs://host:port/and/path")*
> > > ** @param name The name under which the file is registered.*
> > > ** @param blobType indicating the type of the Blob file*
> > > **/*
> > >
> > > *public void registerCachedFile(String filePath, String name,
> > > DistributedCache.BlobType blobType) {...}*
> > >
> > > Optionally, we can add another interface to register UDF Jars which
> will
> > > use the interface above to implement.
> > >
> > > *public void registerJarFile(String filePath, String name) {...}*
> > >
> > > The existing interface in the following will be marked deprecated:
> > >
> > > *public void registerCachedFile(String filePath, String name, boolean
> > > executable) {...}*
> > >
> > > And the following interface will be implemented using the new interface
> > > proposed above with a EXECUTABLE BlobType:
> > >
> > > *public void registerCachedFile(String filePath, String name) { ... }*
> > >
> > > Thanks a lot.
> > > 

Re: CloudWatch Metrics Reporter

2018-05-16 Thread Dyana Rose
I've written a CloudWatch reporter for our own use. It's not pretty to
extract the metrics correctly for CloudWatch, as the current metrics don't
all set the metric names in a good hierarchy, and they aren't all added
to the metric variables either.

If someone opens the Jira I can see about getting our code up as an example
branch of what I had to do. Unless I missed something, I think the current
metrics need a bit of a brush up.

Dyana
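
For anyone curious what such a reporter roughly involves, below is a minimal
sketch built on Flink's MetricReporter/Scheduled interfaces and
aws-java-sdk-cloudwatch. The namespace handling and the Gauge/Counter-only
reporting are simplifying assumptions; this is a sketch, not the code
mentioned above.

```
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.MetricDatum;
import com.amazonaws.services.cloudwatch.model.PutMetricDataRequest;

import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.Gauge;
import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;
import org.apache.flink.metrics.reporter.Scheduled;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative CloudWatch reporter sketch; only Counters and numeric Gauges are reported. */
public class CloudWatchReporter implements MetricReporter, Scheduled {

    private final Map<String, Metric> metrics = new ConcurrentHashMap<>();
    private AmazonCloudWatch client;
    private String namespace;

    @Override
    public void open(MetricConfig config) {
        namespace = config.getString("namespace", "flink");
        client = AmazonCloudWatchClientBuilder.defaultClient();
    }

    @Override
    public void close() {
        client.shutdown();
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        // CloudWatch has no metric hierarchy, so the fully scoped name has to carry it.
        metrics.put(group.getMetricIdentifier(metricName), metric);
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        metrics.remove(group.getMetricIdentifier(metricName));
    }

    @Override
    public void report() {
        List<MetricDatum> data = new ArrayList<>();
        for (Map.Entry<String, Metric> entry : metrics.entrySet()) {
            Double value = null;
            if (entry.getValue() instanceof Counter) {
                value = (double) ((Counter) entry.getValue()).getCount();
            } else if (entry.getValue() instanceof Gauge) {
                Object v = ((Gauge<?>) entry.getValue()).getValue();
                if (v instanceof Number) {
                    value = ((Number) v).doubleValue();
                }
            }
            if (value != null) {
                data.add(new MetricDatum().withMetricName(entry.getKey()).withValue(value));
            }
        }
        if (!data.isEmpty()) {
            // PutMetricData limits the number of datums per call; batching is omitted here.
            client.putMetricData(new PutMetricDataRequest()
                    .withNamespace(namespace)
                    .withMetricData(data));
        }
    }
}
```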

On 16 May 2018 at 09:23, Chesnay Schepler  wrote:

> Hello,
>
> there was no demand for a CloudWatch reporter so far.
>
> I only quickly skimmed the API docs, but it appears that the data is
> inserted via REST.
> Would the reporter require the usage of any AWS library, or could we use
> an arbitrary HTTP client?
> If it is the latter, there shouldn't be a licensing issue as I understand
> it.
>
> Please open a JIRA, let's move the discussion there.
>
>
> On 16.05.2018 10:12, Rafi Aroch wrote:
>
>> Hi,
>>
>> In my team we use CloudWatch as our monitoring & alerting system.
>> I noticed that CloudWatch does not appear in the list of supported
>> Reporters.
>> I was wondering why that is. Was there no demand from the community? Is it
>> related to a licensing issue with AWS? Was it a technical concern?
>>
>> Would you accept this contribution into Flink?
>>
>> Thanks,
>> Rafi
>>
>>
>


-- 

Dyana Rose
Software Engineer


W: www.salecycle.com 



Re: [Discuss] Proposing FLIP-25 - Support User State TTL Natively in Flink

2018-05-16 Thread Fabian Hueske
Hi,

Yes. IMO it makes sense to put the logic into the abstract base classes to
share the implementation across different state backends and state
primitives.
The overhead of storing the key twice is a valid concern, but I'm not sure
about the approach of adding a timestamp to each value.
How would we discard state then? Always iterating over all (or a range of)
keys to check if their state should be expired?
That would only work efficiently if we relax the clean-up logic which could
be a valid design decision.

Best, Fabian
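
For readers following along, the "append a timestamp to the value" alternative
quoted below could look roughly like the following sketch, where expiration is
only checked lazily on read (i.e. a relaxed TTL). The wrapper classes and names
are purely illustrative.

```
import java.io.IOException;
import java.io.Serializable;

import org.apache.flink.api.common.state.ValueState;

/** Sketch only: the user value stored together with its last update time. */
class TtlValue<V> implements Serializable {
    final V value;
    final long lastUpdateTime;

    TtlValue(V value, long lastUpdateTime) {
        this.value = value;
        this.lastUpdateTime = lastUpdateTime;
    }
}

/** Sketch only: a TTL-aware wrapper around ValueState that expires entries lazily on read. */
class LazyTtlValueState<V> {
    private final ValueState<TtlValue<V>> wrapped;
    private final long ttlMillis;

    LazyTtlValueState(ValueState<TtlValue<V>> wrapped, long ttlMillis) {
        this.wrapped = wrapped;
        this.ttlMillis = ttlMillis;
    }

    V value() throws IOException {
        TtlValue<V> stored = wrapped.value();
        if (stored == null) {
            return null;
        }
        if (System.currentTimeMillis() - stored.lastUpdateTime > ttlMillis) {
            wrapped.clear();   // expired: discard lazily, no timer involved
            return null;
        }
        return stored.value;
    }

    void update(V value) throws IOException {
        wrapped.update(new TtlValue<>(value, System.currentTimeMillis()));
    }
}
```

Note that state which is never read again is also never physically removed in
this sketch, which is exactly the clean-up relaxation mentioned above.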

2018-05-14 9:33 GMT+02:00 sihua zhou :

> Hi Fabian,
> thank you very much for the reply, just an alternative. Can we implement
> the TTL logic in `AbstractStateBackend` and `AbstractState`? The simplest
> way is to append the `ts` to the state's value, and we use the backend's
> `current time` (it can also be event time or processing time) to judge
> whether the data is outdated. The pros are that:
> - state is purely backed by the state backend.
> - for each key-value, we only need to store one copy of the key (if we
> implement TTL based on timers, we need to store two copies of the key, one for the
> timer and one for the keyed state).
>
> What do you think?
>
> Best,
> Sihua
>
> On 05/14/2018 15:20,Fabian Hueske 
> wrote:
>
> Hi Sihua,
>
> I think it makes sense to couple state TTL to the timer service. We'll
> need some kind of timers to expire state, so I think we should reuse
> components that we have instead of implementing another timer service.
> Moreover, using the same timer service and using the public state APIs
> helps to have a consistent TTL behavior across different state backends.
>
> Best, Fabian
>
> 2018-05-14 8:51 GMT+02:00 sihua zhou :
>
>> Hi Bowen,
>> thanks for your doc! I left some comments on the doc, the main concern
>> is that it makes me feel like a coupling that the TTL needs to depend on
>> the `timer`. Because I think the TTL is a property of the state, it should
>> be backed by the state backend. If we implement the TTL based on the timer,
>> then it looks like a coupling... it makes me feel that the backend for
>> state becomes `state backend` + `timer`. And in fact, IMO, even the `timer`
>> should depend on the `state backend` in theory; it's a type of HeapState that
>> is scoped to the `key group` (not scoped to per key like the current keyed
>> state).
>>
>> And I found the doc is for exact TTL; I wonder if we can support a relaxed
>> TTL that could provide better performance. Because to me, the reason
>> that I need TTL is just to prevent the state size growing infinitely (I
>> believe I'm not the only one like this), a relaxed version is enough; if
>> there is a relaxed TTL with better performance, I would prefer that.
>>
>> What do you think?
>>
>> Best,
>> Sihua
>>
>>
>>
>> On 05/14/2018 14:31,Bowen Li 
>> wrote:
>>
>> Thank you, Fabian! I've created the FLIP-25 page
>> > 3A+Support+User+State+TTL+Natively>
>> .
>>
>> To continue our discussion points:
>> 1. I see what you mean now. I totally agree. Since we don't completely
>> know
>> it now, shall we experiment or prototype a little bit before deciding
>> this?
>> 2. -
>> 3. Adding tags to timers is an option.
>>
>> Another option I came up with recently is like this: let
>> *InternalTimerService* maintain user timers and TTL timers separately.
>> Implementation classes of
>> InternalTimerService should add two new collections of timers,  e.g.
>> *Ttl*ProcessingTimeTimersQueue
>> and *Ttl*EventTimeTimersQueue for HeapInternalTimerService. Upon
>> InternalTimerService#onProcessingTime() and advanceWatermark(), they will
>> first iterate through ProcessingTimeTimers and EventTimeTimers (user
>> timers) and then through *Ttl*ProcessingTimeTimers and
>> *Ttl*EventTimeTimers
>>
>> (Ttl timers).
>>
>> We'll also add the following new internal APIs to register Ttl timers:
>>
>> ```
>> @Internal
>> public void registerTtlProcessingTimeTimer(N namespace, long time);
>>
>> @Internal
>> public void registerTtlEventTimeTimer(N namespace, long time);
>> ```
>>
>> The biggest advantage, compared to option 1, is that it doesn't impact
>> existing timer-related checkpoint/savepoint, restore and migrations.
>>
>> What do you think?  And, any other Flink committers want to chime in for
>> ideas? I've also documented the above two discussion points to the FLIP
>> page.
>>
>> Thanks,
>> Bowen
>>
>>
>> On Wed, May 2, 2018 at 5:36 AM, Fabian Hueske  wrote:
>>
>> Hi Bowen,
>>
>> 1. The motivation to keep the TTL logic outside of the state backend was
>> mainly to avoid state backend custom implementations. If we have a generic
>> approach that would work for all state backends, we could try to put the
>> logic into a base class like AbstractStateBackend. After all, state
>> cleanup
>> is tightly related to the responsibilities of 

[DISCUSS] Drop "canEqual" from TypeInformation, TypeSerializer, etc.

2018-05-16 Thread Stephan Ewen
Hi all!

As part of an attempt to simplify some code in the TypeInfo and
TypeSerializer area, I would like to drop the "canEqual" methods for the
following reason:

"canEqual()" is necessary to make proper equality checks across hierarchies
of types. This is for example useful in a collection API, stating for
example whether a List can be equal to a Collection if they have the same
contents. We don't have that here.

A certain type information (and serializer) is equal to another one if they
describe the same type, strictly. There is no necessity for cross hierarchy
checks.

This has also led to the situation that most type infos and serializers
implement just a dummy/default version of "canEqual". Many "equals()"
methods do not even call the other object's "canEqual", etc.

As a first step, we could simply deprecate the method and implement an
empty default, and remove all calls to that method.

Best,
Stephan
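
As an illustration, the first step could be as small as the following sketch
(a hypothetical default in TypeSerializer/TypeInformation; "empty default"
here means a permissive one, so that equals() alone decides equality):

```
// Sketch only: deprecate canEqual() and give it a permissive default,
// so subclasses no longer need to implement it and equals() implementations
// can drop their canEqual(...) calls.
@Deprecated
public boolean canEqual(Object obj) {
    return true;
}
```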


Re: [DISCUSS] Adding new interfaces in [Stream]ExecutionEnvironment

2018-05-16 Thread Fabian Hueske
Hi,

I'm not sure if we need to modify the existing method.
What we need is a bit different from what registerCachedFile() provides.
The method ensures that a file is copied to each TaskManager and can be
locally accessed from a function's RuntimeContext.
In our case, we don't need to access the file but would like to make sure
that it is loaded into the class loader.
So, we could also just add a method like registerUserJarFile().

In a design document, Timo mentioned that we can ship multiple JAR files
with a job.
So, we could also implement the UDF shipping logic by loading the Jar
file(s) to the client and distributing them from there.
In that case, we would not need to add a new method to the execution
environment.

Best,
Fabian
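
To make the proposals concrete, here is a rough, self-contained sketch of the
API shape discussed in this thread. Class, enum and method names are
hypothetical, and the body only records the blob type instead of actually
shipping anything.

```
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the proposed API shape; not actual Flink code. */
public class ExecutionEnvironmentSketch {

    /** Proposed enum, which would live in DistributedCache. */
    public enum BlobType {
        NON_EXECUTABLE_FILE,   // today's registerCachedFile(path, name)
        EXECUTABLE_FILE,       // today's registerCachedFile(path, name, true)
        USER_LIBRARY           // new: jar to be loaded into the user code classloader
    }

    private final Map<String, BlobType> cachedFiles = new HashMap<>();

    /** Proposed new registration method carrying the blob type. */
    public void registerCachedFile(String filePath, String name, BlobType blobType) {
        // the real implementation would ship the file via the distributed cache / BlobServer;
        // here we only record the type so that the runtime could decide how to treat it
        cachedFiles.put(name + " -> " + filePath, blobType);
    }

    /** Convenience method along the lines of the registerUserJarFile() suggestion above. */
    public void registerUserJarFile(String filePath, String name) {
        registerCachedFile(filePath, name, BlobType.USER_LIBRARY);
    }
}
```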

2018-05-15 3:50 GMT+02:00 Rong Rong :

> +1. This could be very useful for "dynamic" UDF.
>
> Just to clarify, if I understand correctly, we are trying to use an ENUM
> indicator to
> (1) Replace the current Boolean isExecutable flag.
> (2) Provide additional information used by ExecutionEnvironment to decide
> when/where to use the DistributedCached file.
>
> In this case, DistributedCache.CacheType or DistributedCache.FileType
> sounds more intuitive, what do you think?
>
> Also, I was wondering whether there is any other useful information for the cached
> file to be passed to the runtime.
> If we are just talking about including the library to the classloader, can
> we directly extend the interface with
>
> public void registerCachedFile(
> String filePath,
> String name,
> boolean executable,
> boolean includeInClassLoader)
>
>
> Thanks,
> Rong
>
> On Sun, May 13, 2018 at 11:14 PM, Shuyi Chen  wrote:
>
> > Hi Flink devs,
> >
> > In an effort to support loading external libraries and creating UDFs from
> > external libraries using DDL in Flink SQL, we want to use Flink’s Blob
> > Server to distribute the external libraries at runtime and load those
> > libraries into the user code classloader automatically.
> >
> > However, the current [Stream]ExecutionEnvironment.registerCachedFile
> > interface is limited to registering executable or non-executable blobs.
> > It’s not possible to tell at runtime if the blob files are libraries and
> > should be loaded into the user code classloader in RuntimeContext.
> > Therefore, I want to propose to add an enum called *BlobType* explicitly
> to
> > indicate the type of the Blob file being distributed, and the following
> > interface in [Stream]ExecutionEnvironment to support it. In general, I
> > think the new BlobType information can be used by Flink runtime to
> > preprocess the Blob files if needed.
> >
> > */***
> > ** Registers a file at the distributed cache under the given name. The
> file
> > will be accessible*
> > ** from any user-defined function in the (distributed) runtime under a
> > local path. Files*
> > ** may be local files (as long as all relevant workers have access to
> it),
> > or files in a distributed file system.*
> > ** The runtime will copy the files temporarily to a local cache, if
> > needed.*
> > ***
> > ** The {@link org.apache.flink.api.common.functions.RuntimeContext}
> can
> > be obtained inside UDFs via*
> > ** {@link
> > org.apache.flink.api.common.functions.RichFunction#getRuntimeContext()}
> > and
> > provides access*
> > ** {@link org.apache.flink.api.common.cache.DistributedCache} via*
> > ** {@link
> > org.apache.flink.api.common.functions.RuntimeContext#
> > getDistributedCache()}.*
> > ***
> > ** @param filePath The path of the file, as a URI (e.g.
> "file:///some/path"
> > or "hdfs://host:port/and/path")*
> > ** @param name The name under which the file is registered.*
> > ** @param blobType indicating the type of the Blob file*
> > **/*
> >
> > *public void registerCachedFile(String filePath, String name,
> > DistributedCache.BlobType blobType) {...}*
> >
> > Optionally, we can add another interface to register UDF Jars which will
> > use the interface above to implement.
> >
> > *public void registerJarFile(String filePath, String name) {...}*
> >
> > The existing interface in the following will be marked deprecated:
> >
> > *public void registerCachedFile(String filePath, String name, boolean
> > executable) {...}*
> >
> > And the following interface will be implemented using the new interface
> > proposed above with a EXECUTABLE BlobType:
> >
> > *public void registerCachedFile(String filePath, String name) { ... }*
> >
> > Thanks a lot.
> > Shuyi
> >
> > "So you have to trust that the dots will somehow connect in your future."
> >
>


[jira] [Created] (FLINK-9375) Introduce AbortCheckpoint message from JM to TMs

2018-05-16 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-9375:
-

 Summary: Introduce AbortCheckpoint message from JM to TMs
 Key: FLINK-9375
 URL: https://issues.apache.org/jira/browse/FLINK-9375
 Project: Flink
  Issue Type: Improvement
  Components: State Backends, Checkpointing
Reporter: Stefan Richter


We should introduce an {{AbortCheckpoint}} message that a jobmanager can send 
to taskmanagers if a checkpoint is canceled so that the operators can eagerly 
stop their alignment phase and continue with normal processing. This can reduce 
some backpressure issues in the context of canceled and restarted checkpoints.
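
A purely illustrative sketch of what such a message could carry (the field set
is inferred from the description above, not an actual implementation):

```
// Sketch only: the minimal information a JM -> TM abort message would need so that
// an operator can stop its alignment phase for a canceled checkpoint.
public class AbortCheckpoint implements java.io.Serializable {

    private final long checkpointId;   // the checkpoint that was canceled

    public AbortCheckpoint(long checkpointId) {
        this.checkpointId = checkpointId;
    }

    public long getCheckpointId() {
        return checkpointId;
    }
}
```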



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Elasticsearch Sink

2018-05-16 Thread Christophe Jolif
Ok thanks for the feedback.

> I agree. IIRC, the ES PRs that were opened also did this by changing the
return type from Client to AutoCloseable, as well as letting the call bridge
also handle creation of BulkProcessors, correct?

Correct.

> Instead, we maintain our own request class to abstract those classes
away, and only create the actual Elasticsearch requests internally.

I'm personally unsure I would like to have to use Flink-specific request
classes instead of Elastic ones, but apart from that I think we are pretty
much in line.

I'm not exactly sure when I'll get the cycles, but I will try to issue yet
another PR along those lines so we can continue the discussion from there,
and hopefully we will get something in time for 1.6!

Thanks again,
--
Christophe
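
For illustration, the forwarding approach Gordon describes in the quoted reply
below (a deprecated add(ActionRequest...) delegating to typed methods) could
look roughly like this sketch; the typed overloads are assumed, not the current
interface:

```
import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.update.UpdateRequest;

/** Sketch only: how a reworked RequestIndexer could keep the old catch-all method working. */
public interface RequestIndexerSketch {

    void add(IndexRequest... requests);
    void add(DeleteRequest... requests);
    void add(UpdateRequest... requests);

    /** Deprecated catch-all, forwarding to the typed methods above
     *  (mirrors what the Elasticsearch BulkProcessor does internally). */
    @Deprecated
    default void add(ActionRequest... requests) {
        for (ActionRequest request : requests) {
            if (request instanceof IndexRequest) {
                add((IndexRequest) request);
            } else if (request instanceof DeleteRequest) {
                add((DeleteRequest) request);
            } else if (request instanceof UpdateRequest) {
                add((UpdateRequest) request);
            } else {
                throw new IllegalArgumentException(
                        "RequestIndexer only supports Index, Delete and Update requests");
            }
        }
    }
}
```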

On Wed, May 16, 2018 at 7:19 AM, Tzu-Li (Gordon) Tai 
wrote:

> Hi,
>
> What if the user in an ES5.3+ case is calling the deprecated method? You
> agree it will fail? I'm not necessarily against that. I just want to make
> it clear that we don't have a perfect solution here either?
>
>
> I think what we could do here is at least in the deprecated
> `add(ActionRequest)` method, try casting to either `IndexRequest` or
> `DeleteRequest` and forward calls to the new methods.
> As a matter of fact, this is exactly what the Elasticsearch BulkProcessor
> API is doing internally [1] [2] [3], so we should be safe here.
> Looking at the code in [1] [2] [3], we should actually also consider it
> being a `UpdateRequest` (just to correct my previous statement that it can
> only be either index or delete).
> But yes, I think there wouldn’t be a perfect solution here; something has
> to be broken / deprecated in order for our implementation to be more future
> proof.
>
> indeed we always considered that among
> ActionRequest only delete and index were actually supported (which makes
> sense to me).
>
>
> Looking at the code in [1] [2] [3], we should actually also consider it
> being a `UpdateRequest` (just to correct my previous statement that it can
> only be either index or delete).
> And yes, the Elasticsearch Javadocs of the BulkProcessor also clearly
> state this.
>
> For the REST part, there is another incompatibility at the "Client" API
> level. Indeed the RestClient is not implementing the "Client" interface
> that is exposed in the bridge. So "just" doing the change above would
> still
> not allow to provide a REST based implementation?
>
>
> I agree. IIRC, the ES PRs that were opened also did this by changing the
> return type from Client to AutoCloseable, as well as letting the call bridge
> also handle creation of BulkProcessors, correct?
> I think this is definitely the way to go for us to be more future proof.
> The general rule of thumb is that, for all APIs (either APIs of the base
> module that the ES connectors internally use, or user-facing APIs of the
> connector), we should move towards not leaking Elasticsearch classes as the
> API.
> This goes for removing Client as the return type in the call bridge
> interface. When re-working the RequestIndexer interface, we can even
> consider not directly exposing the `IndexRequest`, `DeleteRequest`,
> `UpdateRequest` classes as the API.
> Instead, we maintain our own request class to abstract those classes away,
> and only create the actual Elasticsearch requests internally.
>
> Cheers,
> Gordon
>
> [1] https://github.com/elastic/elasticsearch/blob/v5.2.2/core/src/main/java/org/elasticsearch/action/bulk/BulkRequest.java#L110
> [2] https://github.com/elastic/elasticsearch/blob/v6.2.4/server/src/main/java/org/elasticsearch/action/bulk/BulkRequest.java#L128
>
> On 16 May 2018 at 4:12:15 AM, Christophe Jolif (cjo...@gmail.com) wrote:
>
> Hi Gordon,
>
> On Tue, May 15, 2018 at 6:16 AM, Tzu-Li (Gordon) Tai 
>
> wrote:
>
> > Hi,
> >
> > Let me first clarify a few things so that we are on the same page here:
> >
> > 1. The reason that we are looking into having a new base-module for
> > versions 5.3+ is that
> > new Elasticsearch BulkProcessor APIs are breaking some of our original
> > base API assumptions.
> > This is noticeable from the intended cast here [1].
> >
> > 2. Given that we are looking into a new base module, Christophe proposed
> > that we focus on designing
> > the new base module around Elasticsearch’s new REST API, so that we are
> > more future-proof.
> >
> > 3. I proposed to remove / deprecate support for earlier versions,
> because
> > once we introduce the new
> > base module, we would be maintaining a lot of Elasticsearch connector
> > modules (2 base modules, 4+ version specific).
> > Of course, whether or not this really is a problem depends on how much
> > capacity the community has to maintain them, as Steve mentioned.
> >
> >
> > Now, moving on to another proposed solution that should work (at least
> for
> > now, depends on how Elasticsearch changes their API in the future):
> > The main problem we’ve bumped into is that 

Re: CloudWatch Metrics Reporter

2018-05-16 Thread Chesnay Schepler

Hello,

there was no demand for a CloudWatch reporter so far.

I only quickly skimmed the API docs, but it appears that the data is 
inserted via REST.
Would the reporter require the usage of any AWS library, or could we use 
an arbitrary HTTP client?

If it is the latter, there shouldn't be a licensing issue as I understand it.

Please open a JIRA, let's move the discussion there.

On 16.05.2018 10:12, Rafi Aroch wrote:

Hi,

In my team we use CloudWatch as our monitoring & alerting system.
I noticed that CloudWatch does not appear in the list of supported
Reporters.
I was wondering why that is. Was there no demand from the community? Is it
related to a licensing issue with AWS? Was it a technical concern?

Would you accept this contribution into Flink?

Thanks,
Rafi





CloudWatch Metrics Reporter

2018-05-16 Thread Rafi Aroch
Hi,

In my team we use CloudWatch as our monitoring & alerting system.
I noticed that CloudWatch does not appear in the list of supported
Reporters.
I was wondering why that is. Was there no demand from the community? Is it
related to a licensing issue with AWS? Was it a technical concern?

Would you accept this contribution into Flink?

Thanks,
Rafi


[jira] [Created] (FLINK-9374) Flink Kinesis Producer does not backpressure

2018-05-16 Thread Franz Thoma (JIRA)
Franz Thoma created FLINK-9374:
--

 Summary: Flink Kinesis Producer does not backpressure
 Key: FLINK-9374
 URL: https://issues.apache.org/jira/browse/FLINK-9374
 Project: Flink
  Issue Type: Bug
Reporter: Franz Thoma
 Attachments: after.png, before.png

The {{FlinkKinesisProducer}} just accepts records and forwards them to a 
{{KinesisProducer}} from the Amazon Kinesis Producer Library (KPL). The KPL 
internally holds an unbounded queue of records that have not yet been sent.

Since Kinesis is rate-limited to 1MB per second per shard, this queue may grow 
indefinitely if Flink sends records faster than the KPL can forward them to 
Kinesis.

One way to circumvent this problem is to set a record TTL, so that queued 
records are dropped after a certain amount of time, but this will lead to data 
loss under high loads.

Currently the only time the queue is flushed is during checkpointing: 
{{FlinkKinesisProducer}} consumes records at an arbitrary rate, either until a 
checkpoint is reached (and will wait until the queue is flushed), or until 
out-of-memory, whichever is reached first. (This gets worse due to the fact 
that the Java KPL is only a thin wrapper around a C++ process, so it is not 
even the Java process that runs out of memory, but the C++ process.) The 
implicit rate-limit due to checkpointing leads to a ragged throughput graph 
like this (the periods with zero throughput are the wait times before a 
checkpoint):

!before.png! Throughput limited by checkpointing only

My proposed solution is to add a config option {{queueLimit}} to set a maximum 
number of records that may be waiting in the KPL queue. If this limit is 
reached, the {{FlinkKinesisProducer}} should trigger a {{flush()}} and wait 
(blocking) until the queue length is below the limit again. This automatically 
leads to backpressuring, since the {{FlinkKinesisProducer}} cannot accept 
records while waiting. For compatibility, {{queueLimit}} is set to 
{{Integer.MAX_VALUE}} by default, so the behavior is unchanged unless a client 
explicitly sets the value. Setting a »sane« default value is not possible 
unfortunately, since sensible values for the limit depend on the record size 
(the limit should be chosen so that about 10–100MB of records per shard are 
accumulated before flushing, otherwise the maximum Kinesis throughput may not 
be reached).

!after.png! Throughput with a queue limit of 10 records (the spikes are 
checkpoints, where the queue is still flushed completely)
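
A simplified sketch of the proposed behaviour (the helper method, field names and
wait interval are illustrative, not the actual patch; {{producer}} is the KPL
{{KinesisProducer}} and {{queueLimit}} the new option):

```
// Sketch only: block before handing a record to the KPL while its queue is too long.
private void enforceQueueLimit() {
    while (producer.getOutstandingRecordsCount() >= queueLimit) {
        producer.flush();            // ask the KPL to drain its internal queue
        try {
            Thread.sleep(500);       // blocking here is what creates the backpressure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            break;
        }
    }
}

// called at the top of invoke(...):
// enforceQueueLimit();
// producer.addUserRecord(stream, partitionKey, data);
```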



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] Release 1.5.0, release candidate #2

2018-05-16 Thread shashank734
Same error on 3.5.2 ... Let me check rc3 also. 



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/


[jira] [Created] (FLINK-9373) Always call RocksIterator.status() to check the internal error of RocksDB

2018-05-16 Thread Sihua Zhou (JIRA)
Sihua Zhou created FLINK-9373:
-

 Summary: Always call RocksIterator.status() to check the internal 
error of RocksDB
 Key: FLINK-9373
 URL: https://issues.apache.org/jira/browse/FLINK-9373
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.5.0
Reporter: Sihua Zhou
Assignee: Sihua Zhou


Currently, when using RocksIterator we only use _iterator.isValid()_ to 
check whether we have reached the end of the iterator. But that is not enough: 
if we refer to RocksDB's wiki 
https://github.com/facebook/rocksdb/wiki/Iterator#error-handling we find 
that _iterator.isValid()_ may also return false because of an internal error. A safer way to 
use the _RocksIterator_ is to always call _iterator.status()_ to check for an 
internal error of _RocksDB_. There is a case from the user mailing list that seems to have lost data 
because of this: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Missing-MapState-when-Timer-fires-after-restored-state-td20134.html
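
For illustration, the safer iteration pattern described above, sketched with the
RocksJava API:

```
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;

// Sketch only: never rely on isValid() alone, always consult status() afterwards.
static void iterateAll(RocksDB db) throws RocksDBException {
    try (RocksIterator iterator = db.newIterator()) {
        for (iterator.seekToFirst(); iterator.isValid(); iterator.next()) {
            // ... process iterator.key() / iterator.value() ...
        }
        // isValid() == false can mean "end of data" OR an internal RocksDB error;
        // status() throws a RocksDBException if the iterator actually hit an error.
        iterator.status();
    }
}
```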



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)