[jira] [Commented] (FLINK-7827) creating and starting a NetCat in test cases programmatically

2017-10-14 Thread bluejoe (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204991#comment-16204991
 ] 

bluejoe commented on FLINK-7827:


Yes, right.

> creating and starting a NetCat in test cases programmatically
> -
>
> Key: FLINK-7827
> URL: https://issues.apache.org/jira/browse/FLINK-7827
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: bluejoe
>  Labels: httpstreaming, netcat, streaming
>
> Hi all,
> I have written a MockNetCat class, which helps developers start a netcat 
> server programmatically, instead of manually launching the
> {{nc -lk }}
> command for testing.
> Developers can start a mock NetCat server in test cases and send data to the 
> server (simulating the text a user would type into the nc shell).
> Is this feature OK to submit as a PR to Flink?
> Use of MockNetCat is very simple, e.g.:
> {{var nc: MockNetCat = MockNetCat.start();}}
> This starts a NetCat server, and data can be generated using the following code:
> {{nc.writeData("hello\r\nworld\r\nbye\r\nworld\r\n");}}
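
For readers unfamiliar with the proposed API, here is a hedged Java sketch of how 
such a test could be wired against Flink's socketTextStream source. MockNetCat 
and its start()/writeData() methods are the API proposed above; the port number 
(9999) and the exact signatures are assumptions, not part of the proposal text.

{code}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MockNetCatExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Proposed API: starts a local netcat-like server (port assumed: 9999).
        MockNetCat nc = MockNetCat.start();

        // Standard Flink source that reads lines from a socket.
        DataStream<String> lines = env.socketTextStream("localhost", 9999);
        lines.print();

        // Simulates a user typing into the `nc` shell.
        // (In a real test, execute() blocks, so this would run on another thread.)
        nc.writeData("hello\r\nworld\r\nbye\r\nworld\r\n");

        env.execute("MockNetCat test");
    }
}
{code}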



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-5690) protobuf is not shaded properly

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5690.
---

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from this 
> class
> * deploy the job to a Flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.
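
To make the failure mode concrete, a minimal hedged sketch (hypothetical class, 
mirroring the reproduction steps above):

{code}
// TestClass deliberately lives in protobuf's package so it can reach the
// package-protected TextFormat.escapeText(String).
package com.google.protobuf;

public class TestClass {
    public static void main(String[] args) {
        // Package-private members are only accessible when caller and callee
        // share the same *runtime* package, i.e. the same package name AND the
        // same classloader. Here TestClass is loaded by
        // FlinkUserCodeClassLoader while TextFormat comes from the application
        // classloader, so the JVM throws java.lang.IllegalAccessError even
        // though the package names match.
        TextFormat.escapeText("some \n text");
    }
}
{code}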



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-5690) protobuf is not shaded properly

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5690.
-
   Resolution: Resolved
 Assignee: Aljoscha Krettek  (was: Robert Metzger)
Fix Version/s: 1.4.0
 Release Note: Fixed via FLINK-7810

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from this 
> class
> * deploy the job to a Flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-5690) protobuf is not shaded properly

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204878#comment-16204878
 ] 

Stephan Ewen commented on FLINK-5690:
-

Fixed by moving to a newer version of Akka which hides Protobuf in FLINK-7810

> protobuf is not shaded properly
> ---
>
> Key: FLINK-5690
> URL: https://issues.apache.org/jira/browse/FLINK-5690
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.4, 1.3.0
>Reporter: Andrey
>Assignee: Robert Metzger
>
> Currently the distribution contains the com/google/protobuf package. Without 
> proper shading, client code can fail with:
> {code}
> Caused by: java.lang.IllegalAccessError: tried to access method 
> com.google.protobuf.
> {code}
> Steps to reproduce:
> * create a job class "com.google.protobuf.TestClass"
> * call the com.google.protobuf.TextFormat.escapeText(String) method from this 
> class
> * deploy the job to a Flink cluster (using the web console, for example)
> * run the job. The logs show an IllegalAccessError.
> The issue is caused by a package-protected method and different classloaders: 
> TestClass is loaded by FlinkUserCodeClassLoader, but the TextFormat class is 
> loaded by sun.misc.Launcher$AppClassLoader.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-7809) Drop support for Scala 2.10

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-7809.
-
  Resolution: Done
Release Note: Done in a6039cab509b619efe0739d5f7d5e6099f954cb8

> Drop support for Scala 2.10
> ---
>
> Key: FLINK-7809
> URL: https://issues.apache.org/jira/browse/FLINK-7809
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>
> Dropping support for Scala 2.10 allows us to get rid of our custom Flakka 
> 2.3.x (with 2.4 backports) dependency and instead use a newer Akka.
> This is also a prerequisite for adding Scala 2.12 support, because supporting 
> three Scala versions would require too much fiddling.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7809) Drop support for Scala 2.10

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-7809.
---

> Drop support for Scala 2.10
> ---
>
> Key: FLINK-7809
> URL: https://issues.apache.org/jira/browse/FLINK-7809
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>
> Dropping support for Scala 2.10 allows us to get rid of our custom Flakka 
> 2.3.x (with 2.4 backports) dependency and instead use a newer Akka.
> This is also a prerequisite for adding Scala 2.12 support, because supporting 
> three Scala versions would require too much fiddling.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-7810) Switch from custom Flakka to Akka 2.4.x

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-7810.
-
  Resolution: Fixed
Release Note: 
Fixed via 5a5006ceb8d19bc0f3cc490451a18b8fc21197cb

Addendum in 1e03c6c46ef5706c05183435a3d3de616cc063ee

> Switch from custom Flakka to Akka 2.4.x
> ---
>
> Key: FLINK-7810
> URL: https://issues.apache.org/jira/browse/FLINK-7810
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7810) Switch from custom Flakka to Akka 2.4.x

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-7810.
---

> Switch from custom Flakka to Akka 2.4.x
> ---
>
> Key: FLINK-7810
> URL: https://issues.apache.org/jira/browse/FLINK-7810
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7842) Shade jackson (org.codehaus.jackson) in flink-shaded-hadoop2

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-7842.
---

> Shade jackson (org.codehaus.jackson) in flink-shaded-hadoop2
> -
>
> Key: FLINK-7842
> URL: https://issues.apache.org/jira/browse/FLINK-7842
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-7842) Shade jackson (org.codehaus.jackson) in flink-shaded-hadoop2

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-7842.
-
  Resolution: Fixed
Release Note: Fixed via 438ee260dea61d28968dd6ea5f02142ea2f7c533

> Shade jackson (org.codehaus.jackson) in flink-shaded-hadoop2
> -
>
> Key: FLINK-7842
> URL: https://issues.apache.org/jira/browse/FLINK-7842
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-7419) Shade jackson dependency in flink-avro

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-7419.
-
  Resolution: Fixed
Release Note: Fixed via dbaf262c6e2f8a5db4a40d4ecce42782dc7e2d2e

> Shade jackson dependency in flink-avro
> --
>
> Key: FLINK-7419
> URL: https://issues.apache.org/jira/browse/FLINK-7419
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Fang Yong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Avro uses {{org.codehaus.jackson}}, which also exists in multiple 
> incompatible versions. We should shade it to 
> {{org.apache.flink.shaded.avro.org.codehaus.jackson}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-7419) Shade jackson dependency in flink-avro

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-7419.
---

> Shade jackson dependency in flink-avro
> --
>
> Key: FLINK-7419
> URL: https://issues.apache.org/jira/browse/FLINK-7419
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Fang Yong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Avro uses {{org.codehaus.jackson}}, which also exists in multiple 
> incompatible versions. We should shade it to 
> {{org.apache.flink.shaded.avro.org.codehaus.jackson}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7810) Switch from custom Flakka to Akka 2.4.x

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204853#comment-16204853
 ] 

ASF GitHub Bot commented on FLINK-7810:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4807


> Switch from custom Flakka to Akka 2.4.x
> ---
>
> Key: FLINK-7810
> URL: https://issues.apache.org/jira/browse/FLINK-7810
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
> Fix For: 1.4.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4807: [FLINK-7810] Switch from custom Flakka to Akka 2.4...

2017-10-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4807


---


[jira] [Commented] (FLINK-7419) Shade jackson dependency in flink-avro

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204827#comment-16204827
 ] 

ASF GitHub Bot commented on FLINK-7419:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4524
  
Merging this...


> Shade jackson dependency in flink-avro
> --
>
> Key: FLINK-7419
> URL: https://issues.apache.org/jira/browse/FLINK-7419
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Fang Yong
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Avro uses {{org.codehaus.jackson}}, which also exists in multiple 
> incompatible versions. We should shade it to 
> {{org.apache.flink.shaded.avro.org.codehaus.jackson}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4524: [FLINK-7419] Shade jackson dependency in flink-avro

2017-10-14 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4524
  
Merging this...


---


[jira] [Assigned] (FLINK-7286) Flink Dashboard fails to display bytes/records received by sources

2017-10-14 Thread Hai Zhou UTC+8 (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou UTC+8 reassigned FLINK-7286:
-

Assignee: Hai Zhou UTC+8

> Flink Dashboard fails to display bytes/records received by sources
> --
>
> Key: FLINK-7286
> URL: https://issues.apache.org/jira/browse/FLINK-7286
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Affects Versions: 1.3.1
>Reporter: Elias Levy
>Assignee: Hai Zhou UTC+8
> Fix For: 1.4.0
>
>
> It appears Flink can't measure the number of bytes read or records produced 
> by a source (e.g. Kafka source). This is particularly problematic for simple 
> jobs where the job pipeline is chained, and in which there are no 
> measurements between operators. Thus, in the UI it appears that the job is 
> not consuming any data.
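
A hedged workaround sketch (not a fix for the dashboard itself): disabling 
operator chaining forces a network exchange between operators, giving the UI 
per-edge byte/record counts to display, at the cost of losing the chaining 
optimization.

{code}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Every operator now runs as its own task, so the edges between them are measured.
env.disableOperatorChaining();
{code}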



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7842) Shade jackson (org.codehaus.jackson) in flink-shaded-hadoop2

2017-10-14 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7842:
---

 Summary: Shade jackson (org.codehaus.jackson) in 
flink-shaded-hadoop2
 Key: FLINK-7842
 URL: https://issues.apache.org/jira/browse/FLINK-7842
 Project: Flink
  Issue Type: Sub-task
  Components: Build System
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6019) Some log4j messages do not have a loglevel field set, so they can't be suppressed

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204738#comment-16204738
 ] 

Stephan Ewen commented on FLINK-6019:
-

These are meant as progress reports for the client.

You should be able to deactivate this by 
{{env.getConfig().disableSysoutLogging()}}.

Does that solve the issue?
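
A minimal sketch of the suggestion above (Flink 1.x ExecutionConfig):

{code}
import org.apache.flink.api.java.ExecutionEnvironment;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// Suppresses the client-side progress reports ("... switched to RUNNING" etc.)
// that are printed to stdout without going through log4j.
env.getConfig().disableSysoutLogging();
{code}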

> Some log4j messages do not have a loglevel field set, so they can't be 
> suppressed
> -
>
> Key: FLINK-6019
> URL: https://issues.apache.org/jira/browse/FLINK-6019
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.0
> Environment: Linux
>Reporter: Luke Hutchison
>
> Some of the log messages do not appear to have a loglevel value set, so they 
> can't be suppressed by setting the log4j level to WARN. There's this line at 
> the beginning which doesn't even have a timestamp:
> {noformat}
> Connected to JobManager at Actor[akka://flink/user/jobmanager_1#1844933939]
> {noformat}
> And then there are numerous lines like this, missing an "INFO" field:
> {noformat}
> 03/10/2017 00:01:14   Job execution switched to status RUNNING.
> 03/10/2017 00:01:14   DataSource (at readTable(DBTableReader.java:165) 
> (org.apache.flink.api.java.io.PojoCsvInputFormat))(1/8) switched to SCHEDULED 
> 03/10/2017 00:01:14   DataSink (count())(1/8) switched to SCHEDULED 
> 03/10/2017 00:01:14   DataSink (count())(3/8) switched to DEPLOYING 
> 03/10/2017 00:01:15   DataSink (count())(3/8) switched to RUNNING 
> 03/10/2017 00:01:17   DataSink (count())(6/8) switched to FINISHED 
> 03/10/2017 00:01:17   DataSource (at readTable(DBTableReader.java:165) 
> (org.apache.flink.api.java.io.PojoCsvInputFormat))(6/8) switched to FINISHED 
> 03/10/2017 00:01:17   Job execution switched to status FINISHED.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-5706) Implement Flink's own S3 filesystem

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5706.
---

> Implement Flink's own S3 filesystem
> ---
>
> Key: FLINK-5706
> URL: https://issues.apache.org/jira/browse/FLINK-5706
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.4.0
>
>
> As part of the effort to make Flink completely independent from Hadoop, Flink 
> needs its own S3 filesystem implementation. Currently, Flink relies on 
> Hadoop's S3a and S3n file systems.
> A dedicated S3 file system can be implemented using the AWS SDK. As the basis 
> of the implementation, the Hadoop file system can be used (Apache-licensed, so 
> it should be okay to reuse some code as long as we do proper attribution).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-7841) Add docs for Flink's S3 support

2017-10-14 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7841:
---

 Summary: Add docs for Flink's S3 support
 Key: FLINK-7841
 URL: https://issues.apache.org/jira/browse/FLINK-7841
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.4.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-5706) Implement Flink's own S3 filesystem

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5706.
-
   Resolution: Fixed
Fix Version/s: 1.4.0
 Release Note: Implemented in 991af3652479f85f732cbbade46bed7df1c5d819

> Implement Flink's own S3 filesystem
> ---
>
> Key: FLINK-5706
> URL: https://issues.apache.org/jira/browse/FLINK-5706
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.4.0
>
>
> As part of the effort to make Flink completely independent from Hadoop, Flink 
> needs its own S3 filesystem implementation. Currently, Flink relies on 
> Hadoop's S3a and S3n file systems.
> A dedicated S3 file system can be implemented using the AWS SDK. As the basis 
> of the implementation, the Hadoop file system can be used (Apache-licensed, so 
> it should be okay to reuse some code as long as we do proper attribution).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-5706) Implement Flink's own S3 filesystem

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204737#comment-16204737
 ] 

ASF GitHub Bot commented on FLINK-5706:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4818


> Implement Flink's own S3 filesystem
> ---
>
> Key: FLINK-5706
> URL: https://issues.apache.org/jira/browse/FLINK-5706
> Project: Flink
>  Issue Type: New Feature
>  Components: filesystem-connector
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
>
> As part of the effort to make Flink completely independent from Hadoop, Flink 
> needs its own S3 filesystem implementation. Currently, Flink relies on 
> Hadoop's S3a and S3n file systems.
> A dedicated S3 file system can be implemented using the AWS SDK. As the basis 
> of the implementation, the Hadoop file system can be used (Apache-licensed, so 
> it should be okay to reuse some code as long as we do proper attribution).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4818: [FLINK-5706] [file systems] Add S3 file systems wi...

2017-10-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4818


---


[jira] [Created] (FLINK-7840) Shade Akka's Netty Dependency

2017-10-14 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-7840:
---

 Summary: Shade Akka's Netty Dependency
 Key: FLINK-7840
 URL: https://issues.apache.org/jira/browse/FLINK-7840
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.4.0


In order to avoid clashes between different Netty versions, we should shade 
Akka's Netty away.

These dependency version clashes manifest themselves in very subtle ways, such 
as occasional deadlocks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7691) Remove ClassTag in Scala DataSet API

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-7691:

Issue Type: Sub-task  (was: Improvement)
Parent: FLINK-3957

> Remove ClassTag in Scala DataSet API
> 
>
> Key: FLINK-7691
> URL: https://issues.apache.org/jira/browse/FLINK-7691
> Project: Flink
>  Issue Type: Sub-task
>  Components: DataSet API
>Reporter: Timo Walther
> Fix For: 2.0.0
>
>
> In the DataStream API a {{ClassTag}} is not required, which makes it possible 
> to pass {{TypeInformation}} manually if needed. In the DataSet API most 
> methods look like:
> {code}
> // DataSet API
> def fromElements[T: ClassTag : TypeInformation](data: T*): DataSet[T]
> // DataStream API
> def fromElements[T: TypeInformation](data: T*): DataStream[T]
> {code}
> I would propose removing the ClassTag, if possible. This would make it 
> easier, e.g., to supply TypeInformation for the {{Row}} type. Or is there an 
> easier way in Scala that I don't know of?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7691) Remove ClassTag in Scala DataSet API

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-7691:

Fix Version/s: 2.0.0

> Remove ClassTag in Scala DataSet API
> 
>
> Key: FLINK-7691
> URL: https://issues.apache.org/jira/browse/FLINK-7691
> Project: Flink
>  Issue Type: Improvement
>  Components: DataSet API
>Reporter: Timo Walther
> Fix For: 2.0.0
>
>
> In the DataStream API a {{ClassTag}} is not required, which makes it possible 
> to pass {{TypeInformation}} manually if needed. In the DataSet API most 
> methods look like:
> {code}
> // DataSet API
> def fromElements[T: ClassTag : TypeInformation](data: T*): DataSet[T]
> // DataStream API
> def fromElements[T: TypeInformation](data: T*): DataStream[T]
> {code}
> I would propose removing the ClassTag, if possible. This would make it 
> easier, e.g., to supply TypeInformation for the {{Row}} type. Or is there an 
> easier way in Scala that I don't know of?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7794) Link Broken in -- https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204734#comment-16204734
 ] 

Stephan Ewen commented on FLINK-7794:
-

Thanks for reporting this. Do you want to open a pull request to fix this issue?

> Link Broken in -- 
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html
> ---
>
> Key: FLINK-7794
> URL: https://issues.apache.org/jira/browse/FLINK-7794
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 1.3.0
>Reporter: Paul Wu
>Priority: Minor
>
> The "predefined data sources" URL link is broken on the page 
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/connectors/index.html.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7827) creating and starting a NetCat in test cases programmatically

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204732#comment-16204732
 ] 

Stephan Ewen commented on FLINK-7827:
-

Is this intended for example programs and tests?

> creating and starting a NetCat in test cases programmatically
> -
>
> Key: FLINK-7827
> URL: https://issues.apache.org/jira/browse/FLINK-7827
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: bluejoe
>  Labels: httpstreaming, netcat, streaming
>
> Hi all,
> I have written a MockNetCat class, which helps developers start a netcat 
> server programmatically, instead of manually launching the
> {{nc -lk }}
> command for testing.
> Developers can start a mock NetCat server in test cases and send data to the 
> server (simulating the text a user would type into the nc shell).
> Is this feature OK to submit as a PR to Flink?
> Use of MockNetCat is very simple, e.g.:
> {{var nc: MockNetCat = MockNetCat.start();}}
> This starts a NetCat server, and data can be generated using the following code:
> {{nc.writeData("hello\r\nworld\r\nbye\r\nworld\r\n");}}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7829) Remove (or at least deprecate) DataStream.writeToFile/Csv

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204731#comment-16204731
 ] 

Stephan Ewen commented on FLINK-7829:
-

I think these methods should exist on something like a "bounded stream" that is 
executed as one batch.

> Remove (or at least deprecate) DataStream.writeToFile/Csv
> -
>
> Key: FLINK-7829
> URL: https://issues.apache.org/jira/browse/FLINK-7829
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.4.0
>
>
> These methods are seductive for users but they should never actually use them 
> in a production streaming job. For those cases the {{BucketingSink}} should 
> be used.
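
A hedged sketch of the recommended alternative for production streaming jobs 
(Flink 1.x {{BucketingSink}}; the base path and the String element type are 
assumptions for illustration):

{code}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;

static void addProductionSink(DataStream<String> stream) {
    // BucketingSink writes time-bucketed part files and cooperates with
    // checkpointing, unlike the writeToFile/writeAsCsv convenience methods.
    stream.addSink(new BucketingSink<String>("hdfs:///flink/output"));
}
{code}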



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7837) AggregatingFunction does not work with immutable types

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204730#comment-16204730
 ] 

Stephan Ewen commented on FLINK-7837:
-

It should have been like that from the start; an oversight on my side...

> AggregatingFunction does not work with immutable types
> --
>
> Key: FLINK-7837
> URL: https://issues.apache.org/jira/browse/FLINK-7837
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently {{add()}} has this signature:
> {code}
> void add(IN value, ACC accumulator);
> {code}
> meaning that a value can only be added to an accumulator by modifying the 
> accumulator. This should be extended to:
> {code}
> ACC add(IN value, ACC accumulator);
> {code}
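
A hedged sketch of what the new contract enables: an {{AggregateFunction}} whose 
accumulator is treated as an immutable value and returned fresh from {{add()}} 
instead of being mutated in place. (Averaging longs is an illustrative choice, 
not taken from the issue.)

{code}
import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.api.java.tuple.Tuple2;

public class ImmutableAverage
        implements AggregateFunction<Long, Tuple2<Long, Long>, Double> {

    @Override
    public Tuple2<Long, Long> createAccumulator() {
        return Tuple2.of(0L, 0L);  // (sum, count)
    }

    @Override
    public Tuple2<Long, Long> add(Long value, Tuple2<Long, Long> acc) {
        // Returns a new accumulator and leaves the old one untouched, which is
        // exactly what the widened ACC return type makes possible.
        return Tuple2.of(acc.f0 + value, acc.f1 + 1);
    }

    @Override
    public Double getResult(Tuple2<Long, Long> acc) {
        return acc.f1 == 0 ? 0.0 : ((double) acc.f0) / acc.f1;
    }

    @Override
    public Tuple2<Long, Long> merge(Tuple2<Long, Long> a, Tuple2<Long, Long> b) {
        return Tuple2.of(a.f0 + b.f0, a.f1 + b.f1);
    }
}
{code}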



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7837) AggregatingFunction does not work with immutable types

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204729#comment-16204729
 ] 

Stephan Ewen commented on FLINK-7837:
-

+1 for this change

> AggregatingFunction does not work with immutable types
> --
>
> Key: FLINK-7837
> URL: https://issues.apache.org/jira/browse/FLINK-7837
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently {{add()}} has this signature:
> {code}
> void add(IN value, ACC accumulator);
> {code}
> meaning that a value can only be added to an accumulator by modifying the 
> accumulator. This should be extended to:
> {code}
> ACC add(IN value, ACC accumulator);
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-7340) Taskmanager hung after temporary DNS outage

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204715#comment-16204715
 ] 

Stephan Ewen edited comment on FLINK-7340 at 10/14/17 4:27 PM:
---

Thanks, that is a good pointer!

[~till.rohrmann] Can we take this into account in the Akka / HA address 
management?


was (Author: stephanewen):
Thanks, that is a good pointer!

[~till.rohrmann] Can we take this into account in the Akka / HA address 
management

> Taskmanager hung after temporary DNS outage
> ---
>
> Key: FLINK-7340
> URL: https://issues.apache.org/jira/browse/FLINK-7340
> Project: Flink
>  Issue Type: Bug
>  Components: Core, Distributed Coordination
>Affects Versions: 1.3.1
> Environment: Non-HA Flink running in Kubernetes.
>Reporter: Joshua Griffith
>
> After a Kubernetes node failure, several TaskManagers and the DNS system were 
> automatically restarted. One TaskManager was unable to connect to the 
> JobManager and continually logged the following errors:
> {quote}
> 2017-08-01 18:58:06.707 [flink-akka.actor.default-dispatcher-823] INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to register at 
> JobManager akka.tcp://flink@jobmanager:6123/user/jobmanager (attempt 595, 
> timeout: 3 milliseconds)
> 2017-08-01 18:58:06.713 [flink-akka.actor.default-dispatcher-834] INFO  
> Remoting flink-akka.remote.default-remote-dispatcher-240 - Quarantined 
> address [akka.tcp://flink@jobmanager:6123] is still unreachable or has not 
> been restarted. Keeping it quarantined.
> {quote}
> After exec'ing into the container, I was able to {{telnet jobmanager 6123}} 
> successfully and {{dig jobmanager}} showed the correct IP in DNS. I suspect 
> that the TaskManager cached a bad IP address for the JobManager when the DNS 
> system was restarting and it used that cached address rather than respecting 
> the 30s TTL and getting a new one for the next request. It may be a good idea 
> for the TaskManager to explicitly perform a DNS lookup after JobManager 
> connection failures.
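
A hedged sketch of the suggested behaviour (hypothetical helper, not Flink 
code). Note that the JVM itself caches successful lookups; the 
"networkaddress.cache.ttl" security property bounds that cache, and an explicit 
InetAddress.getByName() after a failure picks up records that changed while the 
DNS system was restarting:

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public final class DnsRefresh {
    static {
        // Cap the JVM's positive DNS cache (assumption: 30 seconds, matching
        // the TTL mentioned above) so DNS restarts are eventually noticed.
        java.security.Security.setProperty("networkaddress.cache.ttl", "30");
    }

    // Re-resolve the JobManager host after a connection failure instead of
    // reusing a stale cached address.
    static InetAddress resolveFresh(String jobManagerHost) throws UnknownHostException {
        return InetAddress.getByName(jobManagerHost);
    }
}
{code}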



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7340) Taskmanager hung after temporary DNS outage

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204715#comment-16204715
 ] 

Stephan Ewen commented on FLINK-7340:
-

Thanks, that is a good pointer!

[~till.rohrmann] Can we take this into account in the Akka / HA address 
management

> Taskmanager hung after temporary DNS outage
> ---
>
> Key: FLINK-7340
> URL: https://issues.apache.org/jira/browse/FLINK-7340
> Project: Flink
>  Issue Type: Bug
>  Components: Core, Distributed Coordination
>Affects Versions: 1.3.1
> Environment: Non-HA Flink running in Kubernetes.
>Reporter: Joshua Griffith
>
> After a Kubernetes node failure, several TaskManagers and the DNS system were 
> automatically restarted. One TaskManager was unable to connect to the 
> JobManager and continually logged the following errors:
> {quote}
> 2017-08-01 18:58:06.707 [flink-akka.actor.default-dispatcher-823] INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to register at 
> JobManager akka.tcp://flink@jobmanager:6123/user/jobmanager (attempt 595, 
> timeout: 3 milliseconds)
> 2017-08-01 18:58:06.713 [flink-akka.actor.default-dispatcher-834] INFO  
> Remoting flink-akka.remote.default-remote-dispatcher-240 - Quarantined 
> address [akka.tcp://flink@jobmanager:6123] is still unreachable or has not 
> been restarted. Keeping it quarantined.
> {quote}
> After exec'ing into the container, I was able to {{telnet jobmanager 6123}} 
> successfully and {{dig jobmanager}} showed the correct IP in DNS. I suspect 
> that the TaskManager cached a bad IP address for the JobManager when the DNS 
> system was restarting and it used that cached address rather than respecting 
> the 30s TTL and getting a new one for the next request. It may be a good idea 
> for the TaskManager to explicitly perform a DNS lookup after JobManager 
> connection failures.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-4377) akka.remote.OversizedPayloadException: Discarding oversized payload

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-4377.
---

> akka.remote.OversizedPayloadException: Discarding oversized payload
> ---
>
> Key: FLINK-4377
> URL: https://issues.apache.org/jira/browse/FLINK-4377
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API
>Affects Versions: 1.1.0
> Environment: Linux
>Reporter: Sajeev Ramakrishnan
>
> Dear Team,
>   I was trying to create a hash map with a size around 1 million. Then I 
> encountered the below issue. For smaller maps, I am not seeing any issues.
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to 
> Actor[akka.tcp://flink@localhost:58247/user/$d#458673459]: max allowed size 
> 10485760 bytes, actual size of encoded class 
> org.apache.flink.runtime.messages.JobManagerMessages$JobResultSuccess was 
> 175254213 bytes.
> Regards,
> Sajeev
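
For context, the 10485760-byte limit in the error corresponds to Flink's 
{{akka.framesize}} setting. A hedged workaround sketch (flink-conf.yaml; the 
concrete value is an assumption sized to the ~175 MB payload above), though 
shipping results that large through Akka is better avoided altogether, e.g. by 
writing them to a sink instead of collecting them to the client:

{code}
# flink-conf.yaml: raise the maximum Akka message size (default is 10485760b)
akka.framesize: 209715200b
{code}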



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-4377) akka.remote.OversizedPayloadException: Discarding oversized payload

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-4377.
-
  Resolution: Duplicate
Release Note: Duplicate of FLINK-4399

> akka.remote.OversizedPayloadException: Discarding oversized payload
> ---
>
> Key: FLINK-4377
> URL: https://issues.apache.org/jira/browse/FLINK-4377
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API
>Affects Versions: 1.1.0
> Environment: Linux
>Reporter: Sajeev Ramakrishnan
>
> Dear Team,
>   I was trying to create a hash map with a size around 1 million. Then I 
> encountered the below issue. For smaller maps, I am not seeing any issues.
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to 
> Actor[akka.tcp://flink@localhost:58247/user/$d#458673459]: max allowed size 
> 10485760 bytes, actual size of encoded class 
> org.apache.flink.runtime.messages.JobManagerMessages$JobResultSuccess was 
> 175254213 bytes.
> Regards,
> Sajeev



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6683) building with Scala 2.11 no longer uses change-scala-version.sh

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204713#comment-16204713
 ] 

Stephan Ewen commented on FLINK-6683:
-

I think this issue is not valid for 1.3 - as Greg mentioned, 1.3 actually still 
uses that script to change Scala versions.

The 1.4 docs are correct.

> building with Scala 2.11 no longer uses change-scala-version.sh
> ---
>
> Key: FLINK-6683
> URL: https://issues.apache.org/jira/browse/FLINK-6683
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Documentation
>Affects Versions: 1.3.0
>Reporter: David Anderson
> Fix For: 1.3.0
>
>
> FLINK-6414 eliminated change-scala-version.sh. The documentation 
> (setup/building.html) needs to be updated to match.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-6683) building with Scala 2.11 no longer uses change-scala-version.sh

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-6683.
-
Resolution: Invalid

> building with Scala 2.11 no longer uses change-scala-version.sh
> ---
>
> Key: FLINK-6683
> URL: https://issues.apache.org/jira/browse/FLINK-6683
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Documentation
>Affects Versions: 1.3.0
>Reporter: David Anderson
> Fix For: 1.3.0
>
>
> FLINK-6414 eliminated change-scala-version.sh. The documentation 
> (setup/building.html) needs to be updated to match.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-6683) building with Scala 2.11 no longer uses change-scala-version.sh

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-6683.
---

> building with Scala 2.11 no longer uses change-scala-version.sh
> ---
>
> Key: FLINK-6683
> URL: https://issues.apache.org/jira/browse/FLINK-6683
> Project: Flink
>  Issue Type: Sub-task
>  Components: Build System, Documentation
>Affects Versions: 1.3.0
>Reporter: David Anderson
> Fix For: 1.3.0
>
>
> FLINK-6414 eliminated change-scala-version.sh. The documentation 
> (setup/building.html) needs to be updated to match.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-2075) Shade akka and protobuf dependencies away

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2075.
---

> Shade akka and protobuf dependencies away
> -
>
> Key: FLINK-2075
> URL: https://issues.apache.org/jira/browse/FLINK-2075
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Till Rohrmann
> Fix For: 1.0.0, 1.4.0
>
>
> Lately, the Zeppelin project encountered the following problem: it includes 
> flink-runtime, which depends on akka_remote:2.3.7, which in turn depends on 
> protobuf-java:2.5.0. However, Zeppelin sets the protobuf-java version to 2.4.1 
> to make it build with YARN 2.2. Due to this, akka_remote finds the wrong 
> protobuf-java version and fails because of an incompatible change between 
> these versions.
> I propose to shade Flink's akka dependency and protobuf dependency away, so 
> that user projects depending on Flink are not forced to use a specific 
> akka/protobuf version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-2075) Shade akka and protobuf dependencies away

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2075.
-
   Resolution: Done
Fix Version/s: 1.4.0
 Release Note: Solved by upgrading to a newer Akka version in FLINK-7810

> Shade akka and protobuf dependencies away
> -
>
> Key: FLINK-2075
> URL: https://issues.apache.org/jira/browse/FLINK-2075
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Till Rohrmann
> Fix For: 1.4.0, 1.0.0
>
>
> Lately, the Zeppelin project encountered the following problem: it includes 
> flink-runtime, which depends on akka_remote:2.3.7, which in turn depends on 
> protobuf-java:2.5.0. However, Zeppelin sets the protobuf-java version to 2.4.1 
> to make it build with YARN 2.2. Due to this, akka_remote finds the wrong 
> protobuf-java version and fails because of an incompatible change between 
> these versions.
> I propose to shade Flink's akka dependency and protobuf dependency away, so 
> that user projects depending on Flink are not forced to use a specific 
> akka/protobuf version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (FLINK-5989) Protobuf in akka needs to be shaded

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-5989.
---

> Protobuf in akka needs to be shaded
> ---
>
> Key: FLINK-5989
> URL: https://issues.apache.org/jira/browse/FLINK-5989
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Wenlong Lyu
> Fix For: 1.4.0
>
>
> Currently Akka introduces a dependency on protobuf, which is a common jar used 
> in many systems. I think we need to use a shaded Akka, like we do for the 
> Hadoop dependency, to avoid version conflicts with user code.
> {code}
> [INFO] +- com.data-artisans:flakka-actor_2.10:jar:2.3-custom:compile
> [INFO] |  \- com.typesafe:config:jar:1.2.1:compile
> [INFO] +- com.data-artisans:flakka-remote_2.10:jar:2.3-custom:compile
> [INFO] |  +- io.netty:netty:jar:3.8.0.Final:compile
> [INFO] |  +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
> [INFO] |  \- org.uncommons.maths:uncommons-maths:jar:1.2.2a:compile
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (FLINK-5989) Protobuf in akka needs to be shaded

2017-10-14 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-5989.
-
   Resolution: Done
Fix Version/s: 1.4.0
 Release Note: Subsumed by issue FLINK-7810

> Protobuf in akka needs to be shaded
> ---
>
> Key: FLINK-5989
> URL: https://issues.apache.org/jira/browse/FLINK-5989
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Wenlong Lyu
> Fix For: 1.4.0
>
>
> Currently Akka introduces a dependency on protobuf, which is a common jar used 
> in many systems. I think we need to use a shaded Akka, like we do for the 
> Hadoop dependency, to avoid version conflicts with user code.
> {code}
> [INFO] +- com.data-artisans:flakka-actor_2.10:jar:2.3-custom:compile
> [INFO] |  \- com.typesafe:config:jar:1.2.1:compile
> [INFO] +- com.data-artisans:flakka-remote_2.10:jar:2.3-custom:compile
> [INFO] |  +- io.netty:netty:jar:3.8.0.Final:compile
> [INFO] |  +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
> [INFO] |  \- org.uncommons.maths:uncommons-maths:jar:1.2.2a:compile
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-5989) Protobuf in akka needs to be shaded

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204703#comment-16204703
 ] 

Stephan Ewen commented on FLINK-5989:
-

This issue should be solved through the Akka upgrade in FLINK-7810.

> Protobuf in akka needs to be shaded
> ---
>
> Key: FLINK-5989
> URL: https://issues.apache.org/jira/browse/FLINK-5989
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Wenlong Lyu
> Fix For: 1.4.0
>
>
> Currently Akka introduces a dependency on protobuf, which is a common jar used 
> in many systems. I think we need to use a shaded Akka, like we do for the 
> Hadoop dependency, to avoid version conflicts with user code.
> {code}
> [INFO] +- com.data-artisans:flakka-actor_2.10:jar:2.3-custom:compile
> [INFO] |  \- com.typesafe:config:jar:1.2.1:compile
> [INFO] +- com.data-artisans:flakka-remote_2.10:jar:2.3-custom:compile
> [INFO] |  +- io.netty:netty:jar:3.8.0.Final:compile
> [INFO] |  +- com.google.protobuf:protobuf-java:jar:2.5.0:compile
> [INFO] |  \- org.uncommons.maths:uncommons-maths:jar:1.2.2a:compile
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7021) Flink Task Manager hangs on startup if one Zookeeper node is unresolvable

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204700#comment-16204700
 ] 

ASF GitHub Bot commented on FLINK-7021:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4214
  
@skidder Do you want to follow up on this one? Otherwise, another 
contributor might take this over.
Getting this fix into 1.4 would be great...


> Flink Task Manager hangs on startup if one Zookeeper node is unresolvable
> -
>
> Key: FLINK-7021
> URL: https://issues.apache.org/jira/browse/FLINK-7021
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.0, 1.3.0, 1.2.1, 1.3.1
> Environment: Kubernetes cluster running:
> * Flink 1.3.0 Job Manager & Task Manager on Java 8u131
> * Zookeeper 3.4.10 cluster with 3 nodes
>Reporter: Scott Kidder
>Assignee: Scott Kidder
>
> h2. Problem
> Flink Task Manager will hang during startup if one of the Zookeeper nodes in 
> the Zookeeper connection string is unresolvable.
> h2. Expected Behavior
> Flink should retry name resolution & connection to Zookeeper nodes with 
> exponential back-off.
> h2. Environment Details
> We're running Flink and Zookeeper in Kubernetes on CoreOS. CoreOS can run in 
> a configuration that automatically detects and applies operating system 
> updates. We have a Zookeeper node running on the same CoreOS instance as 
> Flink. It's possible that the Zookeeper node will not yet be started when the 
> Flink components are started. This could cause hostname resolution of the 
> Zookeeper nodes to fail.
> h3. Flink Task Manager Logs
> {noformat}
> 2017-06-27 15:38:51,713 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Using 
> configured hostname/address for TaskManager: 10.2.45.11
> 2017-06-27 15:38:51,714 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager
> 2017-06-27 15:38:51,714 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager actor system at 10.2.45.11:6122.
> 2017-06-27 15:38:52,950 INFO  akka.event.slf4j.Slf4jLogger
>   - Slf4jLogger started
> 2017-06-27 15:38:53,079 INFO  Remoting
>   - Starting remoting
> 2017-06-27 15:38:53,573 INFO  Remoting
>   - Remoting started; listening on addresses 
> :[akka.tcp://flink@10.2.45.11:6122]
> 2017-06-27 15:38:53,576 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Starting 
> TaskManager actor
> 2017-06-27 15:38:53,660 INFO  
> org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig 
> [server address: /10.2.45.11, server port: 6121, ssl enabled: false, memory 
> segment size (bytes): 32768, transport type: NIO, number of server threads: 2 
> (manual), number of client threads: 2 (manual), server connect backlog: 0 
> (use Netty's default), client connect timeout (sec): 120, send/receive buffer 
> size (bytes): 0 (use Netty's default)]
> 2017-06-27 15:38:53,682 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration  - Messages 
> have a max timeout of 1 ms
> 2017-06-27 15:38:53,688 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerServices - Temporary 
> file directory '/tmp': total 49 GB, usable 42 GB (85.71% usable)
> 2017-06-27 15:38:54,071 INFO  
> org.apache.flink.runtime.io.network.buffer.NetworkBufferPool  - Allocated 96 
> MB for network buffer pool (number of memory segments: 3095, bytes per 
> segment: 32768).
> 2017-06-27 15:38:54,564 INFO  
> org.apache.flink.runtime.io.network.NetworkEnvironment- Starting the 
> network environment and its components.
> 2017-06-27 15:38:54,576 INFO  
> org.apache.flink.runtime.io.network.netty.NettyClient - Successful 
> initialization (took 4 ms).
> 2017-06-27 15:38:54,677 INFO  
> org.apache.flink.runtime.io.network.netty.NettyServer - Successful 
> initialization (took 101 ms). Listening on SocketAddress /10.2.45.11:6121.
> 2017-06-27 15:38:54,981 INFO  
> org.apache.flink.runtime.taskexecutor.TaskManagerServices - Limiting 
> managed memory to 0.7 of the currently free heap space (612 MB), memory will 
> be allocated lazily.
> 2017-06-27 15:38:55,050 INFO  
> org.apache.flink.runtime.io.disk.iomanager.IOManager  - I/O manager 
> uses directory /tmp/flink-io-ca01554d-f25e-4c17-a828-96d82b43d4a7 for spill 
> files.
> 2017-06-27 15:38:55,061 INFO  org.apache.flink.runtime.metrics.MetricRegistry 
>   - Configuring StatsDReporter with {interval=10 SECONDS, 
> port=8125, host=localhost, 
> class=org.apache.flink.metrics.statsd.StatsDReporter}.
> 2017-06-27 
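
A hedged sketch of the expected behaviour described above (hypothetical helper, 
not Flink or Curator code): retry hostname resolution with exponential back-off 
instead of hanging when one ZooKeeper node is unresolvable.

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public final class ResolveRetry {
    static InetAddress resolveWithBackoff(String host, int maxAttempts)
            throws InterruptedException {
        long delayMs = 100;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return InetAddress.getByName(host);
            } catch (UnknownHostException e) {
                Thread.sleep(delayMs);
                delayMs = Math.min(delayMs * 2, 30_000);  // cap the back-off at 30s
            }
        }
        return null;  // let the caller skip this node or fail startup explicitly
    }
}
{code}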

[GitHub] flink issue #4214: [FLINK-7021] Flink Task Manager hangs on startup if one Z...

2017-10-14 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4214
  
@skidder Do you want to follow up on this one? Otherwise, another 
contributor might take this over.
Getting this fix into 1.4 would be great...


---


[jira] [Commented] (FLINK-5685) Connection leak in Taskmanager

2017-10-14 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204695#comment-16204695
 ] 

Stephan Ewen commented on FLINK-5685:
-

Is there any update on this issue, or did increasing the number of file handles 
solve the issue?

We are also updating to a newer akka version for the next release, which may 
have an impact.

> Connection leak in Taskmanager
> --
>
> Key: FLINK-5685
> URL: https://issues.apache.org/jira/browse/FLINK-5685
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Reporter: Andrey
>Priority: Critical
>
> Steps to reproduce:
> * setup cluster with the following configuration: 1 job manager, 2 task 
> managers
> * job manager starts rejecting connection attempts from task manager.
> {code}
> 2017-01-30 03:24:42,908 INFO  
> org.apache.flink.runtime.taskmanager.TaskManager  - Trying to 
> register at JobManager akka.tcp://flink@ip:6123/user/jobmanager (attempt 
> 4326, timeout: 30 seconds)
> 2017-01-30 03:24:42,913 WARN  Remoting
>   - Tried to associate with unreachable remote address 
> [akka.tcp://flink@ip:6123]. Address is now gated for 5000 ms, all messages to 
> this
>  address will be delivered to dead letters. Reason: The remote system has 
> quarantined this system. No further associations to the remote system are 
> possible until this system is restarted.
> {code}
> * task manager tries multiple times (it looks like it doesn't close the 
> connection after failure)
> * the job manager is unable to process any messages. In the logs:
> {code}
> 2017-01-30 03:25:12,932 WARN  
> org.jboss.netty.channel.socket.nio.AbstractNioSelector- Failed to 
> accept a connection.
> java.io.IOException: Too many open files
> at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
> at 
> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
> at 
> sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
> at 
> org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
> at 
> org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7608) LatencyGauge change to histogram metric

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204672#comment-16204672
 ] 

ASF GitHub Bot commented on FLINK-7608:
---

Github user yew1eb commented on the issue:

https://github.com/apache/flink/pull/4826
  

[LatencyStatsJob.java](https://gist.github.com/yew1eb/3329239f866b691364f4d11a7f0a846a)
"Source1" and "Source2"  Source send a LatencyMarker per 200ms.
"Map-1-A" and "Map-2-B" Map Opreater random sleep a few milliseconds.

The following figure shows the latency statistics of **Map-2-B**:
https://user-images.githubusercontent.com/4133864/31576890-1c7d672c-b0ca-11e7-8abd-f761e072fccd.png


The following figure shows the latency statistics of **Sink: Print**:
https://user-images.githubusercontent.com/4133864/31576900-67086e72-b0ca-11e7-8247-1c8310e58cf0.png

R: @zentol  @rmetzger 



> LatencyGauge change to  histogram metric
> 
>
> Key: FLINK-7608
> URL: https://issues.apache.org/jira/browse/FLINK-7608
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Reporter: Hai Zhou UTC+8
>Assignee: Hai Zhou UTC+8
>Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
>
> I used the slf4jReporter [https://issues.apache.org/jira/browse/FLINK-4831] to 
> export metrics to the log file.
> I found:
> {noformat}
> -- Gauges 
> -
> ..
> zhouhai-mbp.taskmanager.f3fd3a269c8c3da4e8319c8f6a201a57.Flink Streaming 
> Job.Map.0.latency:
>  value={LatencySourceDescriptor{vertexID=1, subtaskIndex=-1}={p99=116.0, 
> p50=59.5, min=11.0, max=116.0, p95=116.0, mean=61.836}}
> zhouhai-mbp.taskmanager.f3fd3a269c8c3da4e8319c8f6a201a57.Flink Streaming 
> Job.Sink- Unnamed.0.latency: 
> value={LatencySourceDescriptor{vertexID=1, subtaskIndex=0}={p99=195.0, 
> p50=163.5, min=115.0, max=195.0, p95=195.0, mean=161.0}}
> ..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink issue #4826: [FLINK-7608][metric] Refactoring latency statistics metri...

2017-10-14 Thread yew1eb
Github user yew1eb commented on the issue:

https://github.com/apache/flink/pull/4826
  

[LatencyStatsJob.java](https://gist.github.com/yew1eb/3329239f866b691364f4d11a7f0a846a)
"Source1" and "Source2"  Source send a LatencyMarker per 200ms.
"Map-1-A" and "Map-2-B" Map Opreater random sleep a few milliseconds.

The following figure shows the latency statistics of **Map-2-B**:
https://user-images.githubusercontent.com/4133864/31576890-1c7d672c-b0ca-11e7-8abd-f761e072fccd.png


The following figure shows the latency statistics of **Sink: Print**:
https://user-images.githubusercontent.com/4133864/31576900-67086e72-b0ca-11e7-8247-1c8310e58cf0.png

R: @zentol  @rmetzger 



---


[jira] [Commented] (FLINK-7484) CaseClassSerializer.duplicate() does not perform proper deep copy

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204640#comment-16204640
 ] 

ASF GitHub Bot commented on FLINK-7484:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4822
  
Seems this is merged into master, please also merge this for 1.3.x
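
For context, a hedged sketch of the general fix pattern (names are hypothetical, 
not the actual CaseClassSerializer code): a composite serializer's duplicate() 
must deep-copy its nested serializers, otherwise concurrently running tasks end 
up sharing stateful internals such as Kryo instances.

{code}
@Override
public TypeSerializer<T> duplicate() {
    TypeSerializer<?>[] duplicated = new TypeSerializer<?>[fieldSerializers.length];
    boolean statefulNested = false;
    for (int i = 0; i < fieldSerializers.length; i++) {
        duplicated[i] = fieldSerializers[i].duplicate();
        if (duplicated[i] != fieldSerializers[i]) {
            statefulNested = true;  // at least one nested serializer is stateful
        }
    }
    // TypeSerializer#duplicate() may return 'this' only if fully stateless.
    return statefulNested ? new MyCompositeSerializer<>(duplicated) : this;
}
{code}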


> CaseClassSerializer.duplicate() does not perform proper deep copy
> -
>
> Key: FLINK-7484
> URL: https://issues.apache.org/jira/browse/FLINK-7484
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, DataStream API, Scala API
>Affects Versions: 1.3.2
> Environment: Flink 1.3.2 , Yarn Cluster, FsStateBackend
>Reporter: Shashank Agarwal
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.4.0
>
>
> I am using many CEPs and a global window. I am sometimes getting the following 
> error and the application crashes. I have checked and, logically, there is no 
> flaw in the program. Here ItemPojo is a POJO class and we are using 
> java.util.List[ItemPojo]. We are using the Scala DataStream API; please find 
> the attached logs.
> {code}
> 2017-08-17 10:04:12,814 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - TriggerWindow(GlobalWindows(), 
> ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@6d36aa3c},
>  co.thirdwatch.trigger.TransactionTrigger@5707c1cb, 
> WindowedStream.apply(WindowedStream.scala:582)) -> Flat Map -> Map -> Sink: 
> Saving CSV Features Sink (1/2) (06c0d4d231bc620ba9e7924b9b0da8d1) switched 
> from RUNNING to FAILED.
> com.esotericsoftware.kryo.KryoException: java.lang.IndexOutOfBoundsException: 
> Index: 7, Size: 5
> Serialization trace:
> category (co.thirdwatch.pojo.ItemPojo)
> underlying (scala.collection.convert.Wrappers$SeqWrapper)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
>   at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
>   at com.twitter.chill.TraversableSerializer.read(Traversable.scala:43)
>   at com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
>   at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:679)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>   at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
>   at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:657)
>   at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:190)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:101)
>   at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:32)
>   at 
> org.apache.flink.runtime.state.ArrayListSerializer.copy(ArrayListSerializer.java:74)
>   at 
> org.apache.flink.runtime.state.ArrayListSerializer.copy(ArrayListSerializer.java:34)
>   at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTable.get(CopyOnWriteStateTable.java:279)
>   at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTable.get(CopyOnWriteStateTable.java:296)
>   at 
> org.apache.flink.runtime.state.heap.HeapListState.add(HeapListState.java:77)
>   at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.processElement(WindowOperator.java:442)
>   at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:206)
>   at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:69)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 7, Size: 5
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>   at java.util.ArrayList.get(ArrayList.java:429)
>   at 
> com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
>   at com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:805)
>   at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:728)
>   at 
> com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:113)
>   ... 22 more
> 2017-08-17 10:04:12,816 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for TriggerWindow(GlobalWindows(), 
> ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@6d36aa3c},
>  

[GitHub] flink issue #4822: [FLINK-7484] Perform proper deep copy in CaseClassSeriali...

2017-10-14 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/4822
  
This seems to be merged into master; please also merge it for 1.3.x.
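
For context, the bug pattern behind this fix is a composite serializer whose 
duplicate() hands out copies that still share stateful nested (Kryo) 
serializers across threads. A self-contained toy sketch of the shallow-vs-deep 
distinction (these are illustrative types, not the actual Flink classes):

{code}
trait Ser[T] {
  def duplicate(): Ser[T]
}

// stands in for a stateful serializer such as KryoSerializer; sharing one
// instance across threads leads to corrupted reads like the trace above
final class StatefulSer[T] extends Ser[T] {
  override def duplicate(): Ser[T] = new StatefulSer[T]
}

final class CompositeSer[T](nested: Seq[Ser[_]]) extends Ser[T] {
  // a proper deep copy duplicates every nested serializer; returning `this`
  // or reusing the nested instances would silently share mutable state
  override def duplicate(): Ser[T] =
    new CompositeSer[T](nested.map(_.duplicate()))
}
{code}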


---


[jira] [Closed] (FLINK-7510) Move some connectors to Apache Bahir

2017-10-14 Thread Hai Zhou UTC+8 (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou UTC+8 closed FLINK-7510.
-
Resolution: Won't Fix

> Move some connectors to Apache Bahir
> 
>
> Key: FLINK-7510
> URL: https://issues.apache.org/jira/browse/FLINK-7510
> Project: Flink
>  Issue Type: Wish
>  Components: Streaming Connectors
>Reporter: Hai Zhou UTC+8
>Assignee: Hai Zhou UTC+8
>
> Hi Flink community:
> Flink is really a great stream processing framework, provides a number of 
> connectors that support multiple data sources and sinks.
> However, I suggest moving the unpopular connectors to Bahir (and keeping the 
> popular connectors in the main Flink codebase).
> E.g
> flink-connectors/flink-connector-twitter
> flink-contrib/flink-connector-wikiedits
> 
> I am willing to do the work. If you think it's acceptable, I will create new 
> sub-task issues and submit a PR soon.
> Regards,
> Hai Zhou
> the corresponding bahir issue: 
> [https://issues.apache.org/jira/browse/BAHIR-131]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-7758) Fix bug Kafka09Fetcher add offset metrics

2017-10-14 Thread Hai Zhou UTC+8 (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou UTC+8 updated FLINK-7758:
--
Component/s: Metrics

> Fix bug Kafka09Fetcher add offset metrics 
> ---
>
> Key: FLINK-7758
> URL: https://issues.apache.org/jira/browse/FLINK-7758
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector, Metrics
>Affects Versions: 1.3.2
>Reporter: Hai Zhou UTC+8
>Assignee: Hai Zhou UTC+8
> Fix For: 1.4.0
>
>
> In Kafka09Fetcher, the _KafkaConsumer_ kafkaMetricGroup is added without 
> checking that the useMetrics variable is true.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
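
For context, a minimal sketch of the guard this ticket asks for; the method 
and group names here are assumptions for illustration, not the actual fetcher 
code:

{code}
import org.apache.flink.metrics.MetricGroup

object KafkaMetricsGuard {
  // only register the Kafka consumer metric group when metrics are enabled,
  // so that disabling metrics really suppresses the per-consumer metrics
  def registerConsumerMetrics(useMetrics: Boolean, taskGroup: MetricGroup): Unit = {
    if (useMetrics) {
      val kafkaMetricGroup = taskGroup.addGroup("KafkaConsumer")
      // ... forward the offset and consumer metrics into kafkaMetricGroup ...
    }
  }
}
{code}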


[jira] [Commented] (FLINK-7608) LatencyGauge change to histogram metric

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204575#comment-16204575
 ] 

ASF GitHub Bot commented on FLINK-7608:
---

GitHub user yew1eb opened a pull request:

https://github.com/apache/flink/pull/4826

[FLINK-7608][metric] Refactoring latency statistics metric

For a detailed description of this PR, see [FLINK-7608: LatencyGauge change to 
histogram metric](https://issues.apache.org/jira/browse/FLINK-7608).

## Verifying this change

This change is already covered by existing tests.


## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yew1eb/flink FLINK-7608

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4826


commit 890616b828d979eed336d546300f442ac7f75b09
Author: yew1eb 
Date:   2017-10-14T09:19:49Z

refactoring latency statistics metric




> LatencyGauge change to histogram metric
> 
>
> Key: FLINK-7608
> URL: https://issues.apache.org/jira/browse/FLINK-7608
> Project: Flink
>  Issue Type: Bug
>  Components: Metrics
>Reporter: Hai Zhou UTC+8
>Assignee: Hai Zhou UTC+8
>Priority: Blocker
> Fix For: 1.4.0, 1.3.3
>
>
> I used slf4jReporter [https://issues.apache.org/jira/browse/FLINK-4831] to 
> export metrics to the log file.
> I found:
> {noformat}
> -- Gauges 
> -
> ..
> zhouhai-mbp.taskmanager.f3fd3a269c8c3da4e8319c8f6a201a57.Flink Streaming 
> Job.Map.0.latency:
>  value={LatencySourceDescriptor{vertexID=1, subtaskIndex=-1}={p99=116.0, 
> p50=59.5, min=11.0, max=116.0, p95=116.0, mean=61.836}}
> zhouhai-mbp.taskmanager.f3fd3a269c8c3da4e8319c8f6a201a57.Flink Streaming 
> Job.Sink- Unnamed.0.latency: 
> value={LatencySourceDescriptor{vertexID=1, subtaskIndex=0}={p99=195.0, 
> p50=163.5, min=115.0, max=195.0, p95=195.0, mean=161.0}}
> ..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4826: [FLINK-7608][metric] Refactoring latency statistic...

2017-10-14 Thread yew1eb
GitHub user yew1eb opened a pull request:

https://github.com/apache/flink/pull/4826

[FLINK-7608][metric] Refactoring latency statistics metric

For a detailed description of this PR, see [FLINK-7608: LatencyGauge change to 
histogram metric](https://issues.apache.org/jira/browse/FLINK-7608).

## Verifying this change

This change is already covered by existing tests.


## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yew1eb/flink FLINK-7608

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4826


commit 890616b828d979eed336d546300f442ac7f75b09
Author: yew1eb 
Date:   2017-10-14T09:19:49Z

refactoring latency statistics metric




---


[jira] [Commented] (FLINK-7799) Improve performance of windowed joins

2017-10-14 Thread Xingcan Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204554#comment-16204554
 ] 

Xingcan Cui commented on FLINK-7799:


Hi [~fhueske], I've reconsidered this problem and found a drawback of the time 
block solution. 

Specifically, if we merge all the rows belonging to the same time block into a 
single entry (whose value is either a {{Map}} or a {{List}}) of the 
{{MapState}}, the minimum unit of state access becomes a whole collection. 
That means every time we store or remove a single row, all the data in the 
same block must be rewritten as well, which brings considerable extra cost.

If that drawback cannot be eliminated, I wonder if we could improve the join 
performance from another point of view. Since the RocksDB backend should be 
widely used in real applications and the {{MapState}} entries are ordered in 
it, can we provide something like a hint mechanism in the state API, so that 
the join function can be aware of the ordering?
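
To make the trade-off concrete, a small sketch of the time-block idea under 
discussion; the block size and helper names are assumptions for illustration. 
Rows are registered under the start timestamp of their block, and a join for 
a row at time t with bounds [t - lb, t + ub] only has to look up the blocks 
covering that range:

{code}
object TimeBlocks {
  // assumed block size; as the issue notes, the right size still needs to be
  // identified and may depend on the window size
  val blockSizeMs: Long = 60 * 1000L

  // floorMod keeps the block start correct for negative timestamps as well
  def blockStart(ts: Long): Long = ts - Math.floorMod(ts, blockSizeMs)

  // start timestamps of all blocks that a join for a row at `ts` with bounds
  // [ts - lowerBound, ts + upperBound] must look up in the MapState
  def relevantBlocks(ts: Long, lowerBound: Long, upperBound: Long): Seq[Long] =
    blockStart(ts - lowerBound) to blockStart(ts + upperBound) by blockSizeMs
}
{code}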

> Improve performance of windowed joins
> -
>
> Key: FLINK-7799
> URL: https://issues.apache.org/jira/browse/FLINK-7799
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Critical
>
> The performance of windowed joins can be improved by changing the state 
> access patterns.
> Right now, rows are inserted into a MapState with their timestamp as key. 
> Since we use a time resolution of 1ms, this means that the full key space of 
> the state must be iterated and many map entries must be accessed when joining 
> or evicting rows. 
> A better strategy would be to block the time into larger intervals and 
> register the rows in their respective interval. Another benefit would be that 
> we can directly access the state entries because we know exactly which 
> timestamps to look up. Hence, we can limit the state access to the relevant 
> section during joining and state eviction. 
> The good size for intervals needs to be identified and might depend on the 
> size of the window.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (FLINK-7839) Creating Quickstart project for SNAPSHOT version fails

2017-10-14 Thread Michael Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204394#comment-16204394
 ] 

Michael Fong edited comment on FLINK-7839 at 10/14/17 7:56 AM:
---

From the above comment, I propose we put a note in the documentation 
addressing the following:
1. For users with Maven 3.0+
1.1 Get rid of '-DarchetypeCatalog'
1.2 Optionally define the snapshot repository in settings.xml if the latest 
snapshot is truly needed (e.g. for development); otherwise, Maven should 
locate the latest release repository by default.
2. For users with older Maven
2.1 Optionally provide '-DarchetypeCatalog', as Maven would attempt to 
retrieve metadata from the snapshot repo over the release repo.

[~rmetzger] What do you think?


was (Author: mcfongtw):
From the above comment, I propose we put a note in the documentation 
addressing the following:
1. For users with Maven 3.0+
1.1 Get rid of '-DarchetypeCatalog'
1.2 Optionally define the snapshot repository in settings.xml (only for dev, 
I assume); otherwise, Maven should locate the latest release repository by 
default.
2. For users with older Maven
2.1 Optionally provide '-DarchetypeCatalog', as Maven would attempt to 
retrieve metadata from the snapshot repo over the release repo.
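
For Maven 3.0+, the corrected invocation would simply drop the catalog 
argument from the command quoted below (a sketch; for a SNAPSHOT 
archetypeVersion the snapshot repository then has to be resolvable, e.g. via 
settings.xml):

{code}
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-quickstart-java \
  -DarchetypeVersion=1.4-SNAPSHOT
{code}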

> Creating Quickstart project for SNAPSHOT version fails
> --
>
> Key: FLINK-7839
> URL: https://issues.apache.org/jira/browse/FLINK-7839
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.4.0
>Reporter: Gary Yao
>Assignee: Michael Fong
>Priority: Blocker
>  Labels: documentation
>
> The documentation on creating quickstart projects is broken for SNAPSHOT 
> releases. For example, the documentation suggests to use the following 
> command to generate a Flink 1.4-SNAPSHOT project using maven archetypes:
> {code}
> mvn archetype:generate \
>   -DarchetypeGroupId=org.apache.flink  \
>   -DarchetypeArtifactId=flink-quickstart-java  \
>   
> -DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
>  \
>   -DarchetypeVersion=1.4-SNAPSHOT
> {code}
> The command fails with the error:
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-archetype-plugin:3.0.1:generate (default-cli) 
> on project flink-training-exercises: archetypeCatalog 
> 'https://repository.apache.org/content/repositories/snapshots/' is not 
> supported anymore. Please read the plugin documentation for details. -> [Help 
> 1]
> {code}
> This also affects the quickstart script.
> Since version 3.0.0, the archetype plugin does not allow specifying 
> repositories as command-line arguments. See 
> http://maven.apache.org/archetype/maven-archetype-plugin/archetype-repository.html



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7798) Add support for windowed joins to Table API

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204526#comment-16204526
 ] 

ASF GitHub Bot commented on FLINK-7798:
---

Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/4825#discussion_r144688874
  
--- Diff: docs/dev/table/tableApi.md ---
@@ -498,6 +499,23 @@ Table left = tableEnv.fromDataSet(ds1, "a, b, c");
 Table right = tableEnv.fromDataSet(ds2, "d, e, f");
 Table result = left.join(right).where("a = d").select("a, b, e");
 {% endhighlight %}
+Note: Currently, only time-windowed inner joins can be 
processed in a streaming fashion.
+
+A time-windowed join requires a special join condition that 
bounds the time on both sides. This can be done by two appropriate range 
predicates (<, <=, >=, >) that compare the time attributes of both input tables. 
The following rules apply for time predicates:
+  
+Time predicates must compare time attributes of both input 
tables.
+Time predicates must compare only time attributes of the 
same type, i.e., processing time with processing time or event time with event 
time.
+Only range predicates are valid time predicates.
+Non-time predicates must not access a time attribute.
--- End diff --

@fhueske, I think the last rule about time attribute access could be 
removed now, right?
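
For illustration, a time-windowed join in the Scala Table API that satisfies 
the rules listed in this snippet might look like the following. This is a 
hedged sketch based on the documentation being edited here (the feature 
itself is what this PR adds); the table and field names are made up:

{code}
import org.apache.flink.api.scala._
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._
import org.apache.flink.types.Row

object WindowedJoinSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
    val tEnv = TableEnvironment.getTableEnvironment(env)

    // toy data; event timestamps are taken from the third tuple field
    val left = env.fromElements((1, "left", 1000L))
      .assignAscendingTimestamps(_._3)
      .toTable(tEnv, 'a, 'b, 'ltime.rowtime)
    val right = env.fromElements((1, "right", 1500L))
      .assignAscendingTimestamps(_._3)
      .toTable(tEnv, 'd, 'e, 'rtime.rowtime)

    // both range predicates bound the time on both sides, and every time
    // predicate compares event-time attributes of the two inputs
    val result = left.join(right)
      .where('a === 'd && 'ltime >= 'rtime - 5.minutes && 'ltime < 'rtime + 10.minutes)
      .select('a, 'b, 'e, 'ltime)

    result.toAppendStream[Row].print()
    env.execute("windowed join sketch")
  }
}
{code}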


> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4825: [FLINK-7798] [table] Add support for stream window...

2017-10-14 Thread xccui
Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/4825#discussion_r144688874
  
--- Diff: docs/dev/table/tableApi.md ---
@@ -498,6 +499,23 @@ Table left = tableEnv.fromDataSet(ds1, "a, b, c");
 Table right = tableEnv.fromDataSet(ds2, "d, e, f");
 Table result = left.join(right).where("a = d").select("a, b, e");
 {% endhighlight %}
+Note: Currently, only time-windowed inner joins can be 
processed in a streaming fashion.
+
+A time-windowed join requires a special join condition that 
bounds the time on both sides. This can be done by two appropriate range 
predicates (<, <=, >=, >) that compare the time attributes of both input tables. 
The following rules apply for time predicates:
+  
+Time predicates must compare time attributes of both input 
tables.
+Time predicates must compare only time attributes of the 
same type, i.e., processing time with processing time or event time with event 
time.
+Only range predicates are valid time predicates.
+Non-time predicates must not access a time attribute.
--- End diff --

@fhueske, I think the last rule about time attribute access could be 
removed now, right?


---


[GitHub] flink pull request #4825: [FLINK-7798] [table] Add support for stream window...

2017-10-14 Thread xccui
Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/4825#discussion_r144688279
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/JoinITCase.scala
 ---
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.stream.table
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import org.apache.flink.table.api.scala._
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.table.runtime.utils.StreamITCase
+import org.apache.flink.types.Row
+import org.junit.Test
+
+/**
+  * Test if the table joins can be correctly translated.
+  */
+class JoinITCase extends StreamingMultipleProgramsTestBase {
--- End diff --

Thanks for pointing this out, @fhueske. I'll rework the test.


---


[jira] [Closed] (FLINK-7686) Add Flink Forward Berlin 2017 conference slides to the flink website

2017-10-14 Thread Hai Zhou UTC+8 (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hai Zhou UTC+8 closed FLINK-7686.
-
Resolution: Not A Problem

> Add Flink Forward Berlin 2017 conference slides to the flink website
> 
>
> Key: FLINK-7686
> URL: https://issues.apache.org/jira/browse/FLINK-7686
> Project: Flink
>  Issue Type: Wish
>  Components: Project Website
>Reporter: Hai Zhou UTC+8
>Priority: Trivial
>
> I recently watched [Flink Forward Berlin 
> 2017|https://berlin.flink-forward.org/sessions/] conference slides, the 
> content is very good.  
> I think we should add them to the [flink 
> website|http://flink.apache.org/community.html] for more people to know.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7798) Add support for windowed joins to Table API

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204515#comment-16204515
 ] 

ASF GitHub Bot commented on FLINK-7798:
---

Github user xccui commented on a diff in the pull request:

https://github.com/apache/flink/pull/4825#discussion_r144688279
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/JoinITCase.scala
 ---
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.stream.table
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import org.apache.flink.table.api.scala._
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.table.runtime.utils.StreamITCase
+import org.apache.flink.types.Row
+import org.junit.Test
+
+/**
+  * Test if the table joins can be correctly translated.
+  */
+class JoinITCase extends StreamingMultipleProgramsTestBase {
--- End diff --

Thanks for pointing this out, @fhueske. I'll rework the test.


> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-7798) Add support for windowed joins to Table API

2017-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204511#comment-16204511
 ] 

ASF GitHub Bot commented on FLINK-7798:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4825#discussion_r144688111
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/JoinITCase.scala
 ---
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.stream.table
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import org.apache.flink.table.api.scala._
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.table.runtime.utils.StreamITCase
+import org.apache.flink.types.Row
+import org.junit.Test
+
+/**
+  * Test if the table joins can be correctly translated.
+  */
+class JoinITCase extends StreamingMultipleProgramsTestBase {
--- End diff --

ITCases are quite expensive to run. That's why we try to avoid them if 
possible and if we add them, they should test as much as possible (i.e., result 
correctness) to justify the additional build time.

However, I don't think we need ITCases for this feature. The correct 
execution is already validated by harness tests and SQL ITCases. It is 
sufficient to add plan validation tests, i.e., tests that check the correct 
translation of Table API queries to execution plans (without executing them). 
Have a look at `org.apache.flink.table.api.stream.table.AggregateTest` as an 
example. Since SQL and Table API have a common execution layer, validating the 
plans is sufficient.


> Add support for windowed joins to Table API
> ---
>
> Key: FLINK-7798
> URL: https://issues.apache.org/jira/browse/FLINK-7798
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: Fabian Hueske
>Assignee: Xingcan Cui
>Priority: Blocker
> Fix For: 1.4.0
>
>
> Currently, windowed joins on streaming tables are only supported through SQL.
> The Table API should support these joins as well. For that, we have to adjust 
> the Table API validation and translate the API into the respective logical 
> plan. Since most of the code should already be there for the batch Table API 
> joins, this should be fairly straightforward.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4825: [FLINK-7798] [table] Add support for stream window...

2017-10-14 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/4825#discussion_r144688111
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/JoinITCase.scala
 ---
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.stream.table
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.TimeCharacteristic
+import org.apache.flink.table.api.scala._
+import org.apache.flink.streaming.api.scala.{DataStream, 
StreamExecutionEnvironment}
+import org.apache.flink.streaming.util.StreamingMultipleProgramsTestBase
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.table.runtime.utils.StreamITCase
+import org.apache.flink.types.Row
+import org.junit.Test
+
+/**
+  * Test if the table joins can be correctly translated.
+  */
+class JoinITCase extends StreamingMultipleProgramsTestBase {
--- End diff --

ITCases are quite expensive to run. That's why we try to avoid them if 
possible and if we add them, they should test as much as possible (i.e., result 
correctness) to justify the additional build time.

However, I don't think we need ITCases for this feature. The correct 
execution is already validated by harness tests and SQL ITCases. It is 
sufficient to add plan validation tests, i.e., tests that check the correct 
translation of Table API queries to execution plans (without executing them). 
Have a look at `org.apache.flink.table.api.stream.table.AggregateTest` as an 
example. Since SQL and Table API have a common execution layer, validating the 
plans is sufficient.
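
A rough skeleton of such a plan validation test, modeled on the AggregateTest 
pattern mentioned above. The utility names (TableTestBase, streamTestUtil, 
addTable, verifyTable) are assumptions taken from the existing test base, and 
the expected plan string is left as a placeholder:

{code}
import org.apache.flink.api.scala._
import org.apache.flink.table.api.scala._
import org.apache.flink.table.utils.TableTestBase
import org.junit.Test

class WindowJoinPlanTest extends TableTestBase {

  @Test
  def testRowTimeWindowInnerJoinPlan(): Unit = {
    val util = streamTestUtil()
    val left = util.addTable[(Int, String, Long)]("L", 'a, 'b, 'ltime.rowtime)
    val right = util.addTable[(Int, String, Long)]("R", 'd, 'e, 'rtime.rowtime)

    val result = left.join(right)
      .where('a === 'd && 'ltime >= 'rtime - 5.minutes && 'ltime < 'rtime + 10.minutes)
      .select('a, 'b, 'e)

    // the expected operator tree is elided; a real test builds it with the
    // unaryNode/binaryNode/term helpers from TableTestUtil
    val expected = "..."
    util.verifyTable(result, expected)
  }
}
{code}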


---