Re: TaskManager job lifecycle hooks

2017-12-07 Thread Eron Wright
Could you speak to whether the lifecycle provided by RichFunction
(open/close) would fit the requirement?

https://ci.apache.org/projects/flink/flink-docs-release-1.3/api/java/org/apache/flink/api/common/functions/RichFunction.html#open-org.apache.flink.configuration.Configuration-

On Thu, Dec 7, 2017 at 1:57 PM, Ben Sidhom 
wrote:

> Hey,
>
> I'm working on the Apache Beam  portability
> story
> and trying to figure out how we can get the Flink runner to support
> the new portability
> API .
>
> In order to get the runner to work with portable SDKs, we need to be able
> to spin up and manage user containers from the TaskManagers themselves. All
> communication with user code (effectively user-defined functions) happens
> over RPC endpoints between the container and the Flink worker threads.
> Unfortunately, we cannot assume that the container images themselves are
> small or that they are cheap to start up. For this reason, we cannot
> reasonably start and stop these external services once per task (e.g., by
> wrapping service lifetimes within mapPartions). In order to support
> multiple jobs per JVM (either due to multiple task slots per manager or
> multiple jobs submitted to a cluster serially) , we need to know when to
> clean up resources associated with a particular job.
>
> Is there a way to do this in user code? Ideally, this would be something
> like a set of per-job startup and shutdown hooks that execute on each
> TaskManager that a particular job runs on. If this does not currently
> exist, how reasonable would it be to introduce client-facing APIs that
> would allow it? Is there a better approach for this lifecycle management
> that better fits into the Flink execution model?
>
> Thanks
> --
> -Ben
>


Dataworks Summit EU call for abstracts

2017-12-07 Thread Alan Gates
Dataworks Summit EU 2018 is in Berlin, April 16-19.   The call for
abstracts is open through December 15th.  One of the tracks is IoT and
Streaming, which is a great fit for Flink  talks.
https://dataworkssummit.com/abstracts/

Alan.


TaskManager job lifecycle hooks

2017-12-07 Thread Ben Sidhom
Hey,

I'm working on the Apache Beam  portability story
and trying to figure out how we can get the Flink runner to support
the new portability
API .

In order to get the runner to work with portable SDKs, we need to be able
to spin up and manage user containers from the TaskManagers themselves. All
communication with user code (effectively user-defined functions) happens
over RPC endpoints between the container and the Flink worker threads.
Unfortunately, we cannot assume that the container images themselves are
small or that they are cheap to start up. For this reason, we cannot
reasonably start and stop these external services once per task (e.g., by
wrapping service lifetimes within mapPartions). In order to support
multiple jobs per JVM (either due to multiple task slots per manager or
multiple jobs submitted to a cluster serially) , we need to know when to
clean up resources associated with a particular job.

Is there a way to do this in user code? Ideally, this would be something
like a set of per-job startup and shutdown hooks that execute on each
TaskManager that a particular job runs on. If this does not currently
exist, how reasonable would it be to introduce client-facing APIs that
would allow it? Is there a better approach for this lifecycle management
that better fits into the Flink execution model?

Thanks
-- 
-Ben


Re: [VOTE] Release 1.4.0, release candidate #3

2017-12-07 Thread Eron Wright
Just discovered:  the removal of Flink's Future (FLINK-7252) causes a
breaking change in connectors that use
`org.apache.flink.runtime.checkpoint.MasterTriggerRestoreHook`, because
`Future` is a type on one of the methods.

To my knowledge, this affects only the Pravega connector.  Curious to know
whether any other connectors are affected.  I don't think we (Dell EMC)
consider it a blocker but it will mean that the connector is Flink 1.4+.

Eron


On Thu, Dec 7, 2017 at 12:25 PM, Aljoscha Krettek 
wrote:

> I just noticed that I did a copy-and-paste error and the last paragraph
> about voting period should be this:
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Best,
> Aljoscha
>
> > On 7. Dec 2017, at 19:24, Bowen Li  wrote:
> >
> > I agree that it shouldn't block the release. The doc website part is even
> > better!
> >
> > On Thu, Dec 7, 2017 at 1:09 AM, Aljoscha Krettek 
> > wrote:
> >
> >> Good catch, yes. This shouldn't block the release, though, since the doc
> >> is always built form the latest state of a release branch, i.e. the 1.4
> doc
> >> on the website will update as soon as the doc on the release-1.4 branch
> is
> >> updated.
> >>
> >>> On 6. Dec 2017, at 20:47, Bowen Li  wrote:
> >>>
> >>> Hi Aljoscha,
> >>>
> >>> I found Flink's State doc and javaDoc are very ambiguous on what the
> >>> replacement of FoldingState is, which will confuse a lot of users. We
> >> need
> >>> to fix it in 1.4 release.
> >>>
> >>> I have submitted a PR at https://github.com/apache/flink/pull/5129
> >>>
> >>> Thanks,
> >>> Bowen
> >>>
> >>>
> >>> On Wed, Dec 6, 2017 at 5:56 AM, Aljoscha Krettek 
> >>> wrote:
> >>>
>  Hi everyone,
> 
>  Please review and vote on release candidate #3 for the version 1.4.0,
> as
>  follows:
>  [ ] +1, Approve the release
>  [ ] -1, Do not approve the release (please provide specific comments)
> 
> 
>  The complete staging area is available for your review, which
> includes:
>  * JIRA release notes [1],
>  * the official Apache source release and binary convenience releases
> to
> >> be
>  deployed to dist.apache.org[2], which are signed with the key with
>  fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>  * all artifacts to be deployed to the Maven Central Repository [4],
>  * source code tag "release-1.4.0-rc1" [5],
>  * website pull request listing the new release [6].
> 
>  Please have a careful look at the website PR because I changed some
>  wording and we're now also releasing a binary without Hadoop
> >> dependencies.
> 
>  Please use this document for coordinating testing efforts: [7]
> 
>  The only change between RC1 and this RC2 is that the source release
>  package does not include the erroneously included binary Ruby
> >> dependencies
>  of the documentation anymore. Because of this I would like to propose
> a
>  shorter voting time and close the vote around the time that RC1 would
> >> have
>  closed. This would mean closing by end of Wednesday. Please let me
> know
> >> if
>  you disagree with this. The vote is adopted by majority approval, with
> >> at
>  least 3 PMC affirmative votes.
> 
>  Thanks,
>  Your friendly Release Manager
> 
>  [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>  projectId=12315522&version=12340533
>  [2] http://people.apache.org/~aljoscha/flink-1.4.0-rc3/
>  [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>  [4] https://repository.apache.org/content/repositories/
> >> orgapacheflink-1141
>  [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=
>  8fb9635dd2e64dbb20887c84f646f02034b57cb1
>  [6] https://github.com/apache/flink-web/pull/95
>  [7] https://docs.google.com/document/d/1cOkycJwEKVjG_
>  onnpl3bQNTq7uebh48zDtIJxceyU2E/edit?usp=sharing
> 
>  Pro-tip: you can create a settings.xml file with these contents:
> 
>  
>  
>  flink-1.4.0
>  
>  
>  
>  flink-1.4.0
>  
>    
>  flink-1.4.0
>  
>  https://repository.apache.org/content/repositories/
>  orgapacheflink-1141/
>  
>    
>    
>  archetype
>  
>  https://repository.apache.org/content/repositories/
>  orgapacheflink-1141/
>  
>    
>  
>  
>  
>  
> 
>  And reference that in you maven commands via --settings
>  path/to/settings.xml. This is useful for creating a quickstart based
> on
> >> the
>  staged release and for building against the staged jars.
> >>
> >>
>
>


Re: [VOTE] Release 1.4.0, release candidate #3

2017-12-07 Thread Aljoscha Krettek
I just noticed that I did a copy-and-paste error and the last paragraph about 
voting period should be this:

The vote will be open for at least 72 hours. It is adopted by majority 
approval, with at least 3 PMC affirmative votes.

Best,
Aljoscha

> On 7. Dec 2017, at 19:24, Bowen Li  wrote:
> 
> I agree that it shouldn't block the release. The doc website part is even
> better!
> 
> On Thu, Dec 7, 2017 at 1:09 AM, Aljoscha Krettek 
> wrote:
> 
>> Good catch, yes. This shouldn't block the release, though, since the doc
>> is always built form the latest state of a release branch, i.e. the 1.4 doc
>> on the website will update as soon as the doc on the release-1.4 branch is
>> updated.
>> 
>>> On 6. Dec 2017, at 20:47, Bowen Li  wrote:
>>> 
>>> Hi Aljoscha,
>>> 
>>> I found Flink's State doc and javaDoc are very ambiguous on what the
>>> replacement of FoldingState is, which will confuse a lot of users. We
>> need
>>> to fix it in 1.4 release.
>>> 
>>> I have submitted a PR at https://github.com/apache/flink/pull/5129
>>> 
>>> Thanks,
>>> Bowen
>>> 
>>> 
>>> On Wed, Dec 6, 2017 at 5:56 AM, Aljoscha Krettek 
>>> wrote:
>>> 
 Hi everyone,
 
 Please review and vote on release candidate #3 for the version 1.4.0, as
 follows:
 [ ] +1, Approve the release
 [ ] -1, Do not approve the release (please provide specific comments)
 
 
 The complete staging area is available for your review, which includes:
 * JIRA release notes [1],
 * the official Apache source release and binary convenience releases to
>> be
 deployed to dist.apache.org[2], which are signed with the key with
 fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
 * all artifacts to be deployed to the Maven Central Repository [4],
 * source code tag "release-1.4.0-rc1" [5],
 * website pull request listing the new release [6].
 
 Please have a careful look at the website PR because I changed some
 wording and we're now also releasing a binary without Hadoop
>> dependencies.
 
 Please use this document for coordinating testing efforts: [7]
 
 The only change between RC1 and this RC2 is that the source release
 package does not include the erroneously included binary Ruby
>> dependencies
 of the documentation anymore. Because of this I would like to propose a
 shorter voting time and close the vote around the time that RC1 would
>> have
 closed. This would mean closing by end of Wednesday. Please let me know
>> if
 you disagree with this. The vote is adopted by majority approval, with
>> at
 least 3 PMC affirmative votes.
 
 Thanks,
 Your friendly Release Manager
 
 [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
 projectId=12315522&version=12340533
 [2] http://people.apache.org/~aljoscha/flink-1.4.0-rc3/
 [3] https://dist.apache.org/repos/dist/release/flink/KEYS
 [4] https://repository.apache.org/content/repositories/
>> orgapacheflink-1141
 [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=
 8fb9635dd2e64dbb20887c84f646f02034b57cb1
 [6] https://github.com/apache/flink-web/pull/95
 [7] https://docs.google.com/document/d/1cOkycJwEKVjG_
 onnpl3bQNTq7uebh48zDtIJxceyU2E/edit?usp=sharing
 
 Pro-tip: you can create a settings.xml file with these contents:
 
 
 
 flink-1.4.0
 
 
 
 flink-1.4.0
 
   
 flink-1.4.0
 
 https://repository.apache.org/content/repositories/
 orgapacheflink-1141/
 
   
   
 archetype
 
 https://repository.apache.org/content/repositories/
 orgapacheflink-1141/
 
   
 
 
 
 
 
 And reference that in you maven commands via --settings
 path/to/settings.xml. This is useful for creating a quickstart based on
>> the
 staged release and for building against the staged jars.
>> 
>> 



[jira] [Created] (FLINK-8222) Update Scala version

2017-12-07 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8222:
-

 Summary: Update Scala version
 Key: FLINK-8222
 URL: https://issues.apache.org/jira/browse/FLINK-8222
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.4.0
Reporter: Greg Hogan
Assignee: Greg Hogan


Update Scala to version {{2.11.12}}. I don't believe this affects the Flink 
distribution but rather anyone who is compiling Flink or a 
Flink-quickstart-derived program on a shared system.

"A privilege escalation vulnerability (CVE-2017-15288) has been identified in 
the Scala compilation daemon."

https://www.scala-lang.org/news/security-update-nov17.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release 1.4.0, release candidate #3

2017-12-07 Thread Bowen Li
I agree that it shouldn't block the release. The doc website part is even
better!

On Thu, Dec 7, 2017 at 1:09 AM, Aljoscha Krettek 
wrote:

> Good catch, yes. This shouldn't block the release, though, since the doc
> is always built form the latest state of a release branch, i.e. the 1.4 doc
> on the website will update as soon as the doc on the release-1.4 branch is
> updated.
>
> > On 6. Dec 2017, at 20:47, Bowen Li  wrote:
> >
> > Hi Aljoscha,
> >
> > I found Flink's State doc and javaDoc are very ambiguous on what the
> > replacement of FoldingState is, which will confuse a lot of users. We
> need
> > to fix it in 1.4 release.
> >
> > I have submitted a PR at https://github.com/apache/flink/pull/5129
> >
> > Thanks,
> > Bowen
> >
> >
> > On Wed, Dec 6, 2017 at 5:56 AM, Aljoscha Krettek 
> > wrote:
> >
> >> Hi everyone,
> >>
> >> Please review and vote on release candidate #3 for the version 1.4.0, as
> >> follows:
> >> [ ] +1, Approve the release
> >> [ ] -1, Do not approve the release (please provide specific comments)
> >>
> >>
> >> The complete staging area is available for your review, which includes:
> >> * JIRA release notes [1],
> >> * the official Apache source release and binary convenience releases to
> be
> >> deployed to dist.apache.org[2], which are signed with the key with
> >> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >> * all artifacts to be deployed to the Maven Central Repository [4],
> >> * source code tag "release-1.4.0-rc1" [5],
> >> * website pull request listing the new release [6].
> >>
> >> Please have a careful look at the website PR because I changed some
> >> wording and we're now also releasing a binary without Hadoop
> dependencies.
> >>
> >> Please use this document for coordinating testing efforts: [7]
> >>
> >> The only change between RC1 and this RC2 is that the source release
> >> package does not include the erroneously included binary Ruby
> dependencies
> >> of the documentation anymore. Because of this I would like to propose a
> >> shorter voting time and close the vote around the time that RC1 would
> have
> >> closed. This would mean closing by end of Wednesday. Please let me know
> if
> >> you disagree with this. The vote is adopted by majority approval, with
> at
> >> least 3 PMC affirmative votes.
> >>
> >> Thanks,
> >> Your friendly Release Manager
> >>
> >> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> >> projectId=12315522&version=12340533
> >> [2] http://people.apache.org/~aljoscha/flink-1.4.0-rc3/
> >> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >> [4] https://repository.apache.org/content/repositories/
> orgapacheflink-1141
> >> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=
> >> 8fb9635dd2e64dbb20887c84f646f02034b57cb1
> >> [6] https://github.com/apache/flink-web/pull/95
> >> [7] https://docs.google.com/document/d/1cOkycJwEKVjG_
> >> onnpl3bQNTq7uebh48zDtIJxceyU2E/edit?usp=sharing
> >>
> >> Pro-tip: you can create a settings.xml file with these contents:
> >>
> >> 
> >> 
> >> flink-1.4.0
> >> 
> >> 
> >> 
> >>  flink-1.4.0
> >>  
> >>
> >>  flink-1.4.0
> >>  
> >>  https://repository.apache.org/content/repositories/
> >> orgapacheflink-1141/
> >>  
> >>
> >>
> >>  archetype
> >>  
> >>  https://repository.apache.org/content/repositories/
> >> orgapacheflink-1141/
> >>  
> >>
> >>  
> >> 
> >> 
> >> 
> >>
> >> And reference that in you maven commands via --settings
> >> path/to/settings.xml. This is useful for creating a quickstart based on
> the
> >> staged release and for building against the staged jars.
>
>


[jira] [Created] (FLINK-8223) Update Hadoop versions

2017-12-07 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-8223:
-

 Summary: Update Hadoop versions
 Key: FLINK-8223
 URL: https://issues.apache.org/jira/browse/FLINK-8223
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.5.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Trivial


Update 2.7.3 to 2.7.4 and 2.8.0 to 2.8.2. See 
http://hadoop.apache.org/releases.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8221) Implement set of network latency benchmarks in Flink

2017-12-07 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-8221:
-

 Summary: Implement set of network latency benchmarks in Flink
 Key: FLINK-8221
 URL: https://issues.apache.org/jira/browse/FLINK-8221
 Project: Flink
  Issue Type: New Feature
  Components: Network
Reporter: Piotr Nowojski
Assignee: Nico Kruber






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8220) Define set of network throughput benchmarks in Flink

2017-12-07 Thread Piotr Nowojski (JIRA)
Piotr Nowojski created FLINK-8220:
-

 Summary: Define set of network throughput benchmarks in Flink
 Key: FLINK-8220
 URL: https://issues.apache.org/jira/browse/FLINK-8220
 Project: Flink
  Issue Type: New Feature
  Components: Network
Reporter: Piotr Nowojski
Assignee: Piotr Nowojski






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-8219) Kinesis Connector metrics

2017-12-07 Thread Gary Oslon (JIRA)
Gary Oslon created FLINK-8219:
-

 Summary: Kinesis Connector metrics
 Key: FLINK-8219
 URL: https://issues.apache.org/jira/browse/FLINK-8219
 Project: Flink
  Issue Type: Bug
  Components: Kinesis Connector
Reporter: Gary Oslon


I don't see in the documentation which metrics are emitted by Kinesis 
Connector. When working with Amazon's Kinesis Client library, it is common to 
get metrics via {{MillisBehindLatest}} which tells you whether your processor 
is delayed.

Where are those metrics being published?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] Release 1.4.0, release candidate #3

2017-12-07 Thread Aljoscha Krettek
Good catch, yes. This shouldn't block the release, though, since the doc is 
always built form the latest state of a release branch, i.e. the 1.4 doc on the 
website will update as soon as the doc on the release-1.4 branch is updated.

> On 6. Dec 2017, at 20:47, Bowen Li  wrote:
> 
> Hi Aljoscha,
> 
> I found Flink's State doc and javaDoc are very ambiguous on what the
> replacement of FoldingState is, which will confuse a lot of users. We need
> to fix it in 1.4 release.
> 
> I have submitted a PR at https://github.com/apache/flink/pull/5129
> 
> Thanks,
> Bowen
> 
> 
> On Wed, Dec 6, 2017 at 5:56 AM, Aljoscha Krettek 
> wrote:
> 
>> Hi everyone,
>> 
>> Please review and vote on release candidate #3 for the version 1.4.0, as
>> follows:
>> [ ] +1, Approve the release
>> [ ] -1, Do not approve the release (please provide specific comments)
>> 
>> 
>> The complete staging area is available for your review, which includes:
>> * JIRA release notes [1],
>> * the official Apache source release and binary convenience releases to be
>> deployed to dist.apache.org[2], which are signed with the key with
>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>> * all artifacts to be deployed to the Maven Central Repository [4],
>> * source code tag "release-1.4.0-rc1" [5],
>> * website pull request listing the new release [6].
>> 
>> Please have a careful look at the website PR because I changed some
>> wording and we're now also releasing a binary without Hadoop dependencies.
>> 
>> Please use this document for coordinating testing efforts: [7]
>> 
>> The only change between RC1 and this RC2 is that the source release
>> package does not include the erroneously included binary Ruby dependencies
>> of the documentation anymore. Because of this I would like to propose a
>> shorter voting time and close the vote around the time that RC1 would have
>> closed. This would mean closing by end of Wednesday. Please let me know if
>> you disagree with this. The vote is adopted by majority approval, with at
>> least 3 PMC affirmative votes.
>> 
>> Thanks,
>> Your friendly Release Manager
>> 
>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
>> projectId=12315522&version=12340533
>> [2] http://people.apache.org/~aljoscha/flink-1.4.0-rc3/
>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1141
>> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=
>> 8fb9635dd2e64dbb20887c84f646f02034b57cb1
>> [6] https://github.com/apache/flink-web/pull/95
>> [7] https://docs.google.com/document/d/1cOkycJwEKVjG_
>> onnpl3bQNTq7uebh48zDtIJxceyU2E/edit?usp=sharing
>> 
>> Pro-tip: you can create a settings.xml file with these contents:
>> 
>> 
>> 
>> flink-1.4.0
>> 
>> 
>> 
>>  flink-1.4.0
>>  
>>
>>  flink-1.4.0
>>  
>>  https://repository.apache.org/content/repositories/
>> orgapacheflink-1141/
>>  
>>
>>
>>  archetype
>>  
>>  https://repository.apache.org/content/repositories/
>> orgapacheflink-1141/
>>  
>>
>>  
>> 
>> 
>> 
>> 
>> And reference that in you maven commands via --settings
>> path/to/settings.xml. This is useful for creating a quickstart based on the
>> staged release and for building against the staged jars.