Re: Updated checkstyle version

2017-11-27 Thread Aljoscha Krettek
Thanks for the update!

> On 27. Nov 2017, at 22:14, Greg Hogan  wrote:
> 
> Hi devs,
> 
> Recent commits to the master and release-1.4 branches updated the checkstyle 
> version from 6.19 to 8.4. If you use the Checkstyle plugin for IntelliJ, you 
> will need to update this version manually in the preferences dialog. The 
> old version was not fully enforcing the rule set.
> 
> Greg Hogan



[jira] [Created] (FLINK-8162) Kinesis Connector to report millisBehindLatest metric

2017-11-27 Thread Cristian (JIRA)
Cristian created FLINK-8162:
---

 Summary: Kinesis Connector to report millisBehindLatest metric
 Key: FLINK-8162
 URL: https://issues.apache.org/jira/browse/FLINK-8162
 Project: Flink
  Issue Type: Improvement
  Components: Kinesis Connector
Reporter: Cristian
Priority: Minor


When reading from Kinesis streams, one of the most valuable metrics is 
"MillisBehindLatest" 
([see](https://github.com/aws/aws-sdk-java/blob/25f0821f69bf94ec456f602f2b83ea2b0ca15643/aws-java-sdk-kinesis/src/main/java/com/amazonaws/services/kinesis/model/GetRecordsResult.java#L187-L201)).

Flink should use its metrics mechanism to report this value as a gauge, tagging 
it with the shard id.
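
A minimal sketch of the idea, not the connector's actual code: it keeps the most recent MillisBehindLatest value per shard and exposes one gauge per shard id. The Gauge interface below is a local stand-in with a single getValue() method (assumed to resemble Flink's gauge type); the class ShardLagGauges and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Local stand-in for a gauge interface (assumption: one getValue() method).
interface Gauge<T> {
    T getValue();
}

// Hypothetical helper: tracks the latest millisBehindLatest per shard.
class ShardLagGauges {
    private final Map<String, AtomicLong> lagByShard = new ConcurrentHashMap<>();

    // Call after each GetRecords response with the value it reported.
    void update(String shardId, long millisBehindLatest) {
        lagByShard.computeIfAbsent(shardId, id -> new AtomicLong(-1L))
                  .set(millisBehindLatest);
    }

    // Returns a gauge bound to one shard; it reports -1 until the first update.
    Gauge<Long> gaugeFor(String shardId) {
        AtomicLong lag = lagByShard.computeIfAbsent(shardId, id -> new AtomicLong(-1L));
        return lag::get;
    }
}
```

In a real connector the gauge would presumably be registered on a metric group scoped by the shard id, so that each shard shows up as its own series.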



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Updated checkstyle version

2017-11-27 Thread Greg Hogan
Hi devs,

Recent commits to the master and release-1.4 branches updated the checkstyle 
version from 6.19 to 8.4. If you use the Checkstyle plugin for IntelliJ, you 
will need to update this version manually in the preferences dialog. The 
old version was not fully enforcing the rule set.

Greg Hogan

[DISCUSS] Service Authorization (SSL client authentication)

2017-11-27 Thread Eron Wright
I'd like to make some progress on hardening Flink using SSL client
authentication.   Here's the FLIP proposal:
https://docs.google.com/document/d/13IRPb2GdL842rIzMgEn0ibOQHNku6W8aMf1p7gCPJjg/edit?usp=sharing

1. What is the next step to have this FLIP be accepted?
2. Does anyone have any objections to the technical plan?

Thanks!
Eron Wright
Dell EMC


[VOTE] Release 1.4.0, release candidate #2

2017-11-27 Thread Aljoscha Krettek
Hi everyone,

Please review and vote on release candidate #2 for the version 1.4.0, as 
follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)


The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release and binary convenience releases to be 
deployed to dist.apache.org[2], which are signed with the key with fingerprint 
F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag "release-1.4.0-rc2" [5],
* website pull request listing the new release [6].

Please have a careful look at the website PR because I changed some wording and 
we're now also releasing a binary without Hadoop dependencies.

Please use this document for coordinating testing efforts: [7]

The only change between RC1 and this RC2 is that the source release package 
does not include the erroneously included binary Ruby dependencies of the 
documentation anymore. Because of this I would like to propose a shorter voting 
time and close the vote around the time that RC1 would have closed. This would 
mean closing by end of Wednesday. Please let me know if you disagree with this. 
The vote is adopted by majority approval, with at least 3 PMC affirmative votes.

Thanks,
Your friendly Release Manager

[1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12340533
[2] http://people.apache.org/~aljoscha/flink-1.4.0-rc2/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1140
[5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=ea751b7b23b23446ed3fcdeed564bbe8bf4adf9c
[6] https://github.com/apache/flink-web/pull/95
[7] https://docs.google.com/document/d/1HqYyrNoMSXwo8zBpZj7s39UzUdlFcFO8TRpHNZ_cl44/edit?usp=sharing

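Pro-tip: you can create a settings.xml file with these contents:

<settings>
<activeProfiles>
  <activeProfile>flink-1.4.0</activeProfile>
</activeProfiles>
<profiles>
  <profile>
    <id>flink-1.4.0</id>
    <repositories>
      <repository>
        <id>flink-1.4.0</id>
        <url>https://repository.apache.org/content/repositories/orgapacheflink-1140/</url>
      </repository>
      <repository>
        <id>archetype</id>
        <url>https://repository.apache.org/content/repositories/orgapacheflink-1140/</url>
      </repository>
    </repositories>
  </profile>
</profiles>
</settings>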

And reference that in your Maven commands via --settings path/to/settings.xml. 
This is useful for creating a quickstart based on the staged release and for 
building against the staged jars.

[jira] [Created] (FLINK-8161) Flakey YARNSessionCapacitySchedulerITCase on Travis

2017-11-27 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-8161:


 Summary: Flakey YARNSessionCapacitySchedulerITCase on Travis
 Key: FLINK-8161
 URL: https://issues.apache.org/jira/browse/FLINK-8161
 Project: Flink
  Issue Type: Bug
  Components: Tests, YARN
Affects Versions: 1.5.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
Priority: Critical
 Fix For: 1.5.0


The {{YARNSessionCapacitySchedulerITCase}} spuriously fails on Travis because 
it now contains {{2017-11-25 22:49:49,204 WARN  
akka.remote.transport.netty.NettyTransport- Remote 
connection to [null] failed with java.nio.channels.NotYetConnectedException}} 
from time to time in the logs. I suspect that this is due to switching from 
Flakka to Akka 2.4.0. In order to solve this problem I propose to add this log 
statement to the whitelisted log statements.
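
A rough sketch of how such a whitelist check can work, assuming the test scans log lines for prohibited strings and skips lines that match a whitelisted string. The class and method names are hypothetical and are not taken from the Flink test code.

```java
import java.util.List;

// Hypothetical log check: a line counts as a failure only if it contains a
// prohibited string and none of the whitelisted strings.
class LogWhitelist {
    static boolean isFailure(String line, List<String> prohibited, List<String> whitelisted) {
        boolean hit = prohibited.stream().anyMatch(line::contains);
        boolean allowed = whitelisted.stream().anyMatch(line::contains);
        return hit && !allowed;
    }
}
```

With "Exception" as a prohibited string, the NotYetConnectedException warning above would fail the test until its message is added to the whitelisted strings.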





[CANCEL] [VOTE] Release 1.4.0, release candidate #1

2017-11-27 Thread Aljoscha Krettek
I just realised that the source release is ridiculously huge (it includes the 
binary ruby dependencies of the documentation, by accident). I'll create a new 
RC.

> On 27. Nov 2017, at 14:59, Stefan Richter  wrote:
> 
> Hi,
> 
> thanks for creating this PR Aljoscha! I tested Flink in a cluster setup on 
> Google Cloud, YARN-per-job, and checked for all backends that HA, recovery, 
> at-least-once, end-to-end exactly-once (with Kafka11 Producer), savepoints, 
> externalized checkpoints, and rescaling work correctly. No problems found in 
> RC1.
> 
> +1 (non-binding) from me.
> 
> Best,
> Stefan 
> 
>> Am 24.11.2017 um 14:53 schrieb Ted Yu :
>> 
>> Long weekend should end this Sunday. 
>> Closing vote Wednesday would be great. 
>> Thanks
>> Original message
>> From: Aljoscha Krettek
>> Date: 11/24/17 5:34 AM (GMT-08:00)
>> To: dev@flink.apache.org
>> Subject: Re: [VOTE] Release 1.4.0, release candidate #1
>> 
>> How long will the long weekend be? I thought about closing the vote on 
>> Wednesday, i.e. not count the weekend. Would that work?
>> 
>> Best,
>> Aljoscha
>> 
>>> On 24. Nov 2017, at 12:18, Ted Yu  wrote:
>>> 
>>> Aljoscha:
>>> Thanks for spinning RC.
>>> 
>>> bq. The vote will be open for at least 72 hours
>>> 
>>> As you are aware, it is a long weekend in the US.
>>> 
>>> Is it possible to extend by 24 hours so that developers in US can
>>> participate in validation ?
>>> 
>>> Cheers
>>> 
>>> On Fri, Nov 24, 2017 at 2:57 AM, Aljoscha Krettek 
>>> wrote:
>>> 
>>>> Hi everyone,
>>>> 
>>>> Please review and vote on release candidate #1 for the version 1.4.0, as
>>>> follows:
>>>> [ ] +1, Approve the release
>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>> 
>>>> 
>>>> The complete staging area is available for your review, which includes:
>>>> * JIRA release notes [1],
>>>> * the official Apache source release and binary convenience releases to be
>>>> deployed to dist.apache.org [2], which are signed with the key with
>>>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>> * source code tag "release-1.4.0-rc1" [5],
>>>> * website pull request listing the new release [6].
>>>> 
>>>> Please have a careful look at the website PR because I changed some
>>>> wording and we're now also releasing a binary without Hadoop dependencies.
>>>> 
>>>> Please use this document for coordinating testing efforts: [7]
>>>> 
>>>> The vote will be open for at least 72 hours. It is adopted by majority
>>>> approval, with at least 3 PMC affirmative votes.
>>>> 
>>>> Thanks,
>>>> Your friendly Release Manager
>>>> 
>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12340533
>>>> [2] http://people.apache.org/~aljoscha/flink-1.4.0-rc1/
>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1139
>>>> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=a0b322cf77851d3b8589812a0c8e443e9e320e67
>>>> [6] https://github.com/apache/flink-web/pull/95
>>>> [7] https://docs.google.com/document/d/16fU1cpxoYf3o9cCDyakj7ZDnUoJTj4_CEmMTpCkY81s/edit?usp=sharing
>>>> 
>>>> Pro-tip: you can create a settings.xml file with these contents:
>>>> 
>>>> <settings>
>>>> <activeProfiles>
>>>>   <activeProfile>flink-1.4.0</activeProfile>
>>>> </activeProfiles>
>>>> <profiles>
>>>>   <profile>
>>>>     <id>flink-1.4.0</id>
>>>>     <repositories>
>>>>       <repository>
>>>>         <id>flink-1.4.0</id>
>>>>         <url>https://repository.apache.org/content/repositories/orgapacheflink-1139/</url>
>>>>       </repository>
>>>>       <repository>
>>>>         <id>archetype</id>
>>>>         <url>https://repository.apache.org/content/repositories/orgapacheflink-1139/</url>
>>>>       </repository>
>>>>     </repositories>
>>>>   </profile>
>>>> </profiles>
>>>> </settings>
>>>> 
>>>> And reference that in your Maven commands via --settings
>>>> path/to/settings.xml. This is useful for creating a quickstart based on the
>>>> staged release and for building against the staged jars.
>> 
> 



Re: [DISCUSS] FLIP-23 Model Serving

2017-11-27 Thread Fabian Hueske
Hi Stavros,

thanks for the detailed FLIP!
Model serving is an important use case and it's great to see efforts to add
a library for this to Flink!

I've read the FLIP and would like to ask a few questions and make some
suggestions.

1) Is it a strict requirement that a ML pipeline must be able to handle
different input types?
I understand that it makes sense to have different models for different
instances of the same type, i.e., same data type but different keys. Hence,
the key-based joins make sense to me. However, couldn't completely
different types be handled by different ML pipelines or would there be
major drawbacks?

2) I think from an API point of view it would be better to not require
input records to be encoded as ProtoBuf messages. Instead, the model server
could accept strongly-typed objects (Java/Scala) and (if necessary) convert
them to ProtoBuf messages internally. In case we need to support different
types of records (see my first point), we can introduce a Union type (i.e.,
an n-ary Either type). I see that we need some kind of binary encoding
format for the models but maybe also this can be designed to be pluggable
such that later other encodings can be added.

3) I think the DataStream Java API should be supported as a first-class
citizen for this library.

4) For the integration with the DataStream API, we could provide an API
that receives (typed) DataStream objects, internally constructs the
DataStream operators, and returns one (or more) result DataStreams. The
benefit is that we don't need to change the DataStream API directly, but
put a library on top. The other libraries (CEP, Table, Gelly) follow this
approach.

5) I'm skeptical about using queryable state to expose metrics. Did you
consider using Flink's metrics system [1]? It is easily configurable and we
provided several reporters that export the metrics.

What do you think?
Best, Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html

2017-11-23 12:32 GMT+01:00 Stavros Kontopoulos :

> Hi guys,
>
> Let's discuss the new FLIP proposal for model serving over Flink. The idea
> is to combine previous efforts there and provide a library on top of Flink
> for serving models.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
>
> Code from previous efforts can be found here: https://github.com/FlinkML
>
> Best,
> Stavros
>


Re: [VOTE] Release 1.4.0, release candidate #1

2017-11-27 Thread Stefan Richter
Hi,

thanks for creating this PR Aljoscha! I tested Flink in a cluster setup on 
Google Cloud, YARN-per-job, and checked for all backends that HA, recovery, 
at-least-once, end-to-end exactly-once (with Kafka11 Producer), savepoints, 
externalized checkpoints, and rescaling work correctly. No problems found in 
RC1.

+1 (non-binding) from me.

Best,
Stefan 

> Am 24.11.2017 um 14:53 schrieb Ted Yu :
> 
> Long weekend should end this Sunday. 
> Closing vote Wednesday would be great. 
> Thanks
> Original message
> From: Aljoscha Krettek
> Date: 11/24/17 5:34 AM (GMT-08:00)
> To: dev@flink.apache.org
> Subject: Re: [VOTE] Release 1.4.0, release candidate #1
> 
> How long will the long weekend be? I thought about closing the vote on 
> Wednesday, i.e. not count the weekend. Would that work?
> 
> Best,
> Aljoscha
> 
>> On 24. Nov 2017, at 12:18, Ted Yu  wrote:
>> 
>> Aljoscha:
>> Thanks for spinning RC.
>> 
>> bq. The vote will be open for at least 72 hours
>> 
>> As you are aware, it is a long weekend in the US.
>> 
>> Is it possible to extend by 24 hours so that developers in US can
>> participate in validation ?
>> 
>> Cheers
>> 
>> On Fri, Nov 24, 2017 at 2:57 AM, Aljoscha Krettek 
>> wrote:
>> 
>>> Hi everyone,
>>> 
>>> Please review and vote on release candidate #1 for the version 1.4.0, as
>>> follows:
>>> [ ] +1, Approve the release
>>> [ ] -1, Do not approve the release (please provide specific comments)
>>> 
>>> 
>>> The complete staging area is available for your review, which includes:
>>> * JIRA release notes [1],
>>> * the official Apache source release and binary convenience releases to be
>>> deployed to dist.apache.org [2], which are signed with the key with
>>> fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>> * source code tag "release-1.4.0-rc1" [5],
>>> * website pull request listing the new release [6].
>>> 
>>> Please have a careful look at the website PR because I changed some
>>> wording and we're now also releasing a binary without Hadoop dependencies.
>>> 
>>> Please use this document for coordinating testing efforts: [7]
>>> 
>>> The vote will be open for at least 72 hours. It is adopted by majority
>>> approval, with at least 3 PMC affirmative votes.
>>> 
>>> Thanks,
>>> Your friendly Release Manager
>>> 
>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12340533
>>> [2] http://people.apache.org/~aljoscha/flink-1.4.0-rc1/
>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1139
>>> [5] https://git-wip-us.apache.org/repos/asf?p=flink.git;a=tag;h=a0b322cf77851d3b8589812a0c8e443e9e320e67
>>> [6] https://github.com/apache/flink-web/pull/95
>>> [7] https://docs.google.com/document/d/16fU1cpxoYf3o9cCDyakj7ZDnUoJTj4_CEmMTpCkY81s/edit?usp=sharing
>>> 
>>> Pro-tip: you can create a settings.xml file with these contents:
>>> 
>>> <settings>
>>> <activeProfiles>
>>>   <activeProfile>flink-1.4.0</activeProfile>
>>> </activeProfiles>
>>> <profiles>
>>>   <profile>
>>>     <id>flink-1.4.0</id>
>>>     <repositories>
>>>       <repository>
>>>         <id>flink-1.4.0</id>
>>>         <url>https://repository.apache.org/content/repositories/orgapacheflink-1139/</url>
>>>       </repository>
>>>       <repository>
>>>         <id>archetype</id>
>>>         <url>https://repository.apache.org/content/repositories/orgapacheflink-1139/</url>
>>>       </repository>
>>>     </repositories>
>>>   </profile>
>>> </profiles>
>>> </settings>
>>> 
>>> And reference that in your Maven commands via --settings
>>> path/to/settings.xml. This is useful for creating a quickstart based on the
>>> staged release and for building against the staged jars.
> 



Re: Call for responses: Apache Flink user survey 2017

2017-11-27 Thread Till Rohrmann
Hi everyone,

As a reminder, the 2017 Apache Flink User Survey will be open for responses
through the end of the day today, Monday November 27. Thank you to everyone
who's participated already.

Cheers,
Till

On Tue, Nov 7, 2017 at 4:04 PM, Till Rohrmann  wrote:

> Hi everyone,
>
> data Artisans is running a second annual Apache Flink user survey [1] in
> order to understand Flink usage and the needs of the community. This survey
> will help to shape the Flink roadmap and make Flink the best that it can be
> for users.
>
> We'll publish a report with a summary of findings at the conclusion of the
> survey. All of your responses will remain confidential, and only aggregate
> statistics will be shared.
>
> We expect the survey to take 5-10 minutes, and all questions are
> optional--we appreciate any feedback that you're willing to provide. The
> survey will be open for responses until Monday, November 27.
>
> As a thank you, respondents will be entered in a drawing to win one of 10
> tickets to Flink Forward 2018 (your choice of the second-annual San
> Francisco event on April 9-10 or the Berlin event on September 3-5).
>
> We look forward to hearing from you.
>
> Cheers,
> Till
>
> [1] http://www.surveygizmo.com/s3/3166399/Apache-Flink-User-
> Survey-2ecff2d56551
>


[jira] [Created] (FLINK-8160) Extend OperatorHarness to expose metrics

2017-11-27 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-8160:
---

 Summary: Extend OperatorHarness to expose metrics
 Key: FLINK-8160
 URL: https://issues.apache.org/jira/browse/FLINK-8160
 Project: Flink
  Issue Type: Improvement
  Components: Metrics, Streaming
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.5.0


To better test interactions between operators and metrics the harness should 
expose the metrics registered by the operator.





[jira] [Created] (FLINK-8159) Add rich support for SelectWrapper and FlatSelectWrapper

2017-11-27 Thread Dian Fu (JIRA)
Dian Fu created FLINK-8159:
--

 Summary: Add rich support for SelectWrapper and FlatSelectWrapper
 Key: FLINK-8159
 URL: https://issues.apache.org/jira/browse/FLINK-8159
 Project: Flink
  Issue Type: Sub-task
  Components: CEP
Reporter: Dian Fu
Assignee: Dian Fu


{{SelectWrapper}} and {{FlatSelectWrapper}} should extend 
{{AbstractRichFunction}} and forward the rich-function lifecycle properly if 
the underlying functions extend RichFunction.
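
One way to picture the requested behaviour is a wrapper that forwards lifecycle calls only when the wrapped function supports them. This is a sketch under assumptions: SelectFunction and RichLifecycle are local stand-ins, not Flink's interfaces, and SelectWrapperSketch is a hypothetical name.

```java
// Local stand-ins for illustration; not Flink's real interfaces.
interface SelectFunction<IN, OUT> {
    OUT select(IN pattern);
}

interface RichLifecycle {
    void open();
    void close();
}

// Hypothetical wrapper: always has a lifecycle itself and delegates it
// to the wrapped function only if that function is "rich" as well.
class SelectWrapperSketch<IN, OUT> implements RichLifecycle {
    private final SelectFunction<IN, OUT> wrapped;

    SelectWrapperSketch(SelectFunction<IN, OUT> wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public void open() {
        if (wrapped instanceof RichLifecycle) {
            ((RichLifecycle) wrapped).open();
        }
    }

    @Override
    public void close() {
        if (wrapped instanceof RichLifecycle) {
            ((RichLifecycle) wrapped).close();
        }
    }

    OUT select(IN pattern) {
        return wrapped.select(pattern);
    }
}
```

A plain lambda passed as the wrapped function is simply invoked, while a function that also implements the lifecycle interface gets its open() and close() calls.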





[jira] [Created] (FLINK-8158) Rowtime window inner join emits late data

2017-11-27 Thread Hequn Cheng (JIRA)
Hequn Cheng created FLINK-8158:
--

 Summary: Rowtime window inner join emits late data
 Key: FLINK-8158
 URL: https://issues.apache.org/jira/browse/FLINK-8158
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Reporter: Hequn Cheng
Assignee: Hequn Cheng


When executing the join, the join operator needs to make sure that no late data 
is emitted. Currently, this is achieved by holding back watermarks. However, 
the window border is not handled correctly. For the SQL below: 
{quote}
val sqlQuery =
  """
|SELECT t2.key, t2.id, t1.id
|FROM T1 as t1 join T2 as t2 ON
|  t1.key = t2.key AND
|  t1.rt BETWEEN t2.rt - INTERVAL '5' SECOND AND
|t2.rt + INTERVAL '1' SECOND
|""".stripMargin

val data1 = new mutable.MutableList[(String, String, Long)]
// for boundary test
data1.+=(("A", "LEFT1", 6000L))

val data2 = new mutable.MutableList[(String, String, Long)]
data2.+=(("A", "RIGHT1", 6000L))
{quote}

The join will output a watermark with timestamp 1000. If the left side then 
receives another record ("A", "LEFT1", 1000L), the join will output a record 
with timestamp 1000, which equals the previously emitted watermark.
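
The boundary case can be checked with a few lines of arithmetic. This is only an illustrative sketch: it assumes an input watermark of 6000 and a hold-back of 5000 ms (taken from the INTERVAL '5' SECOND bound) to reproduce the 1000 mentioned above, and the class and method names are hypothetical.

```java
class WatermarkBoundary {
    // A watermark W asserts that no record with timestamp <= W will follow,
    // so a record whose timestamp equals the watermark is already late.
    static boolean isLate(long recordTimestamp, long watermark) {
        return recordTimestamp <= watermark;
    }

    // Holding back: forward the input watermark minus a fixed delay.
    static long holdBack(long inputWatermark, long delayMillis) {
        return inputWatermark - delayMillis;
    }

    public static void main(String[] args) {
        long emitted = holdBack(6000L, 5000L);       // 1000
        System.out.println(isLate(1000L, emitted));  // prints "true"
    }
}
```

A record exactly on the window border therefore carries a timestamp that the emitted watermark has already passed.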





[jira] [Created] (FLINK-8157) Create a "Flink Resources" page on the Apache Flink project site

2017-11-27 Thread Mike Winters (JIRA)
Mike Winters created FLINK-8157:
---

 Summary: Create a "Flink Resources" page on the Apache Flink project site
 Key: FLINK-8157
 URL: https://issues.apache.org/jira/browse/FLINK-8157
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Mike Winters
Assignee: Mike Winters
Priority: Minor


The Apache Flink project website does not currently provide a single, 
well-organized list of Flink resources for users who are new to the framework 
or who are looking for support on their projects. 

In an effort to make these resources easier to find, we can create a 
"Resources" page that will be linked to from the main navigation and will 
provide information about: 

• Mailing lists
• Stack Overflow
• Publicly-available training

...and more. 

In some cases, resources listed on this page will overlap with resources listed 
on other pages. For instance, mailing list info would also appear on "How to 
Contribute".

In other cases, we can move information from existing pages to this new page. 
For instance, the "Slides" section at the bottom of the "Community & Project 
Info" page could be more relevant on a "Resources" page than where it's 
currently located. 


