[GitHub] flink issue #3001: [FLINK-4821] [kinesis] Implement rescalable non-partition...

2016-12-14 Thread tony810430
Github user tony810430 commented on the issue:

https://github.com/apache/flink/pull/3001
  
@tzulitai as you wish =)




[jira] [Commented] (FLINK-4821) Implement rescalable non-partitioned state for Kinesis Connector

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750667#comment-15750667
 ] 

ASF GitHub Bot commented on FLINK-4821:
---

Github user tony810430 commented on the issue:

https://github.com/apache/flink/pull/3001
  
@tzulitai as you wish =)


> Implement rescalable non-partitioned state for Kinesis Connector
> 
>
> Key: FLINK-4821
> URL: https://issues.apache.org/jira/browse/FLINK-4821
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Wei-Che Wei
> Fix For: 1.2.0
>
>
> FLINK-4379 added the rescalable non-partitioned state feature, along with the 
> implementation for the Kafka connector.
> The AWS Kinesis connector will benefit from the feature and should implement 
> it too. This ticket tracks progress for this.





[jira] [Commented] (FLINK-4523) Allow Kinesis Consumer to start from specific timestamp / Date

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750659#comment-15750659
 ] 

ASF GitHub Bot commented on FLINK-4523:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2916
  
Thanks for addressing the final comments. I'll add the docs and merge this 
by the end of the day :)


> Allow Kinesis Consumer to start from specific timestamp / Date
> --
>
> Key: FLINK-4523
> URL: https://issues.apache.org/jira/browse/FLINK-4523
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Wei-Che Wei
> Fix For: 1.2.0
>
>
> We had a Kinesis user requesting this feature in an offline chat.
> To be specific, we let all initial Kinesis shards be iterated starting from
> records at the given timestamp.
> The AWS Java SDK we're using already provides API for this, so we can add 
> this functionality with fairly low overhead: 
> http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/kinesis/model/GetShardIteratorRequest.html#setTimestamp-java.util.Date-
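For reference, a minimal sketch of the SDK call mentioned above (stream name, shard ID, and timestamp are placeholder values, not from this ticket):

{code}
import java.util.Date;

import com.amazonaws.services.kinesis.AmazonKinesisClient;
import com.amazonaws.services.kinesis.model.GetShardIteratorRequest;
import com.amazonaws.services.kinesis.model.ShardIteratorType;

AmazonKinesisClient kinesis = new AmazonKinesisClient();

// Ask for an iterator positioned at the first record at or after the timestamp.
GetShardIteratorRequest request = new GetShardIteratorRequest()
	.withStreamName("my-stream")                  // placeholder
	.withShardId("shardId-000000000000")          // placeholder
	.withShardIteratorType(ShardIteratorType.AT_TIMESTAMP)
	.withTimestamp(new Date(1481673600000L));     // placeholder timestamp

// The returned iterator is then used with GetRecords as usual.
String shardIterator = kinesis.getShardIterator(request).getShardIterator();
{code}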





[GitHub] flink issue #2916: [FLINK-4523] [kinesis] Allow Kinesis Consumer to start fr...

2016-12-14 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2916
  
Thanks for addressing the final comments. I'll add the docs and merge this 
by the end of the day :)




[GitHub] flink issue #2985: [FLINK-5104] Bipartite graph validation

2016-12-14 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2985
  
@greghogan Thank you for the review.
Can you give me an example of how you document joins?




[jira] [Commented] (FLINK-5104) Implement BipartiteGraph validator

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750656#comment-15750656
 ] 

ASF GitHub Bot commented on FLINK-5104:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2985
  
@greghogan Thank you for the review.
Can you give me an example of how you document joins?


> Implement BipartiteGraph validator
> --
>
> Key: FLINK-5104
> URL: https://issues.apache.org/jira/browse/FLINK-5104
> Project: Flink
>  Issue Type: Sub-task
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>
> BipartiteGraph should have a validator similar to GraphValidator for Graph 
> class.





[jira] [Commented] (FLINK-4647) Implement BipartiteGraph reader

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750651#comment-15750651
 ] 

ASF GitHub Bot commented on FLINK-4647:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2987
  
Hi @greghogan ,

I've rebased this PR on top of master.



> Implement BipartiteGraph reader
> ---
>
> Key: FLINK-4647
> URL: https://issues.apache.org/jira/browse/FLINK-4647
> Project: Flink
>  Issue Type: Sub-task
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>
> Implement reading bipartite graph from a CSV. Should be similar to how 
> regular graph is read from a file.





[GitHub] flink issue #2987: [FLINK-4647] Read bipartite graph

2016-12-14 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2987
  
Hi @greghogan ,

I've rebased this PR on top of master.





[jira] [Commented] (FLINK-4821) Implement rescalable non-partitioned state for Kinesis Connector

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750647#comment-15750647
 ] 

ASF GitHub Bot commented on FLINK-4821:
---

Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3001#discussion_r92554905
  
--- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java ---
@@ -294,11 +296,18 @@ public void close() throws Exception {
 				lastStateSnapshot.toString(), checkpointId, checkpointTimestamp);
 		}
 
-		return lastStateSnapshot;
+		List<Tuple2<KinesisStreamShard, SequenceNumber>> listState = new ArrayList<>(lastStateSnapshot.size());
+		for (Map.Entry<KinesisStreamShard, SequenceNumber> entry: lastStateSnapshot.entrySet()) {
+			listState.add(Tuple2.of(entry.getKey(), entry.getValue()));
+		}
+		return listState;
 	}
 
 	@Override
-	public void restoreState(HashMap<KinesisStreamShard, SequenceNumber> restoredState) throws Exception {
-		sequenceNumsToRestore = restoredState;
+	public void restoreState(List<Tuple2<KinesisStreamShard, SequenceNumber>> state) throws Exception {
+		sequenceNumsToRestore = new HashMap<>();
+		for (Tuple2<KinesisStreamShard, SequenceNumber> subState: state) {
--- End diff --

formatting nit: need empty space before colon ":"


> Implement rescalable non-partitioned state for Kinesis Connector
> 
>
> Key: FLINK-4821
> URL: https://issues.apache.org/jira/browse/FLINK-4821
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Wei-Che Wei
> Fix For: 1.2.0
>
>
> FLINK-4379 added the rescalable non-partitioned state feature, along with the 
> implementation for the Kafka connector.
> The AWS Kinesis connector will benefit from the feature and should implement 
> it too. This ticket tracks progress for this.





[jira] [Commented] (FLINK-4821) Implement rescalable non-partitioned state for Kinesis Connector

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750648#comment-15750648
 ] 

ASF GitHub Bot commented on FLINK-4821:
---

Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3001#discussion_r92555340
  
--- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java ---
@@ -294,11 +296,18 @@ public void close() throws Exception {
 				lastStateSnapshot.toString(), checkpointId, checkpointTimestamp);
 		}
 
-		return lastStateSnapshot;
+		List<Tuple2<KinesisStreamShard, SequenceNumber>> listState = new ArrayList<>(lastStateSnapshot.size());
+		for (Map.Entry<KinesisStreamShard, SequenceNumber> entry: lastStateSnapshot.entrySet()) {
+			listState.add(Tuple2.of(entry.getKey(), entry.getValue()));
+		}
+		return listState;
 	}
 
 	@Override
-	public void restoreState(HashMap<KinesisStreamShard, SequenceNumber> restoredState) throws Exception {
-		sequenceNumsToRestore = restoredState;
+	public void restoreState(List<Tuple2<KinesisStreamShard, SequenceNumber>> state) throws Exception {
+		sequenceNumsToRestore = new HashMap<>();
+		for (Tuple2<KinesisStreamShard, SequenceNumber> subState: state) {
--- End diff --

We should probably do a null check here for `state`.
From the looks of #3005, I don't think restored state will ever be null 
(will be empty list), but it'd be good to make the code here independent of 
that.
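
A sketch of what such a guard could look like, reusing the shard and sequence-number types from the diff above (illustrative, not the PR's final code):

{code}
@Override
public void restoreState(List<Tuple2<KinesisStreamShard, SequenceNumber>> state) throws Exception {
	sequenceNumsToRestore = new HashMap<>();
	if (state == null) {
		// Defensive: treat a null handover the same as "nothing to restore".
		return;
	}
	for (Tuple2<KinesisStreamShard, SequenceNumber> subState : state) {
		sequenceNumsToRestore.put(subState.f0, subState.f1);
	}
}
{code}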


> Implement rescalable non-partitioned state for Kinesis Connector
> 
>
> Key: FLINK-4821
> URL: https://issues.apache.org/jira/browse/FLINK-4821
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Wei-Che Wei
> Fix For: 1.2.0
>
>
> FLINK-4379 added the rescalable non-partitioned state feature, along with the 
> implementation for the Kafka connector.
> The AWS Kinesis connector will benefit from the feature and should implement 
> it too. This ticket tracks progress for this.





[jira] [Commented] (FLINK-4821) Implement rescalable non-partitioned state for Kinesis Connector

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750646#comment-15750646
 ] 

ASF GitHub Bot commented on FLINK-4821:
---

Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3001#discussion_r92554828
  
--- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java ---
@@ -294,11 +296,18 @@ public void close() throws Exception {
 				lastStateSnapshot.toString(), checkpointId, checkpointTimestamp);
 		}
 
-		return lastStateSnapshot;
+		List<Tuple2<KinesisStreamShard, SequenceNumber>> listState = new ArrayList<>(lastStateSnapshot.size());
+		for (Map.Entry<KinesisStreamShard, SequenceNumber> entry: lastStateSnapshot.entrySet()) {
--- End diff --

formatting nit: need empty space before colon `:`


> Implement rescalable non-partitioned state for Kinesis Connector
> 
>
> Key: FLINK-4821
> URL: https://issues.apache.org/jira/browse/FLINK-4821
> Project: Flink
>  Issue Type: New Feature
>  Components: Kinesis Connector
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Wei-Che Wei
> Fix For: 1.2.0
>
>
> FLINK-4379 added the rescalable non-partitioned state feature, along with the 
> implementation for the Kafka connector.
> The AWS Kinesis connector will benefit from the feature and should implement 
> it too. This ticket tracks progress for this.





[GitHub] flink pull request #3001: [FLINK-4821] [kinesis] Implement rescalable non-pa...

2016-12-14 Thread tzulitai
Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3001#discussion_r92554905
  
--- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java ---
@@ -294,11 +296,18 @@ public void close() throws Exception {
 				lastStateSnapshot.toString(), checkpointId, checkpointTimestamp);
 		}
 
-		return lastStateSnapshot;
+		List<Tuple2<KinesisStreamShard, SequenceNumber>> listState = new ArrayList<>(lastStateSnapshot.size());
+		for (Map.Entry<KinesisStreamShard, SequenceNumber> entry: lastStateSnapshot.entrySet()) {
+			listState.add(Tuple2.of(entry.getKey(), entry.getValue()));
+		}
+		return listState;
 	}
 
 	@Override
-	public void restoreState(HashMap<KinesisStreamShard, SequenceNumber> restoredState) throws Exception {
-		sequenceNumsToRestore = restoredState;
+	public void restoreState(List<Tuple2<KinesisStreamShard, SequenceNumber>> state) throws Exception {
+		sequenceNumsToRestore = new HashMap<>();
+		for (Tuple2<KinesisStreamShard, SequenceNumber> subState: state) {
--- End diff --

formatting nit: need empty space before colon ":"




[GitHub] flink pull request #3001: [FLINK-4821] [kinesis] Implement rescalable non-pa...

2016-12-14 Thread tzulitai
Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3001#discussion_r92554828
  
--- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java ---
@@ -294,11 +296,18 @@ public void close() throws Exception {
 				lastStateSnapshot.toString(), checkpointId, checkpointTimestamp);
 		}
 
-		return lastStateSnapshot;
+		List<Tuple2<KinesisStreamShard, SequenceNumber>> listState = new ArrayList<>(lastStateSnapshot.size());
+		for (Map.Entry<KinesisStreamShard, SequenceNumber> entry: lastStateSnapshot.entrySet()) {
--- End diff --

formatting nit: need empty space before colon `:`




[GitHub] flink pull request #3001: [FLINK-4821] [kinesis] Implement rescalable non-pa...

2016-12-14 Thread tzulitai
Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3001#discussion_r92555340
  
--- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java ---
@@ -294,11 +296,18 @@ public void close() throws Exception {
 				lastStateSnapshot.toString(), checkpointId, checkpointTimestamp);
 		}
 
-		return lastStateSnapshot;
+		List<Tuple2<KinesisStreamShard, SequenceNumber>> listState = new ArrayList<>(lastStateSnapshot.size());
+		for (Map.Entry<KinesisStreamShard, SequenceNumber> entry: lastStateSnapshot.entrySet()) {
+			listState.add(Tuple2.of(entry.getKey(), entry.getValue()));
+		}
+		return listState;
 	}
 
 	@Override
-	public void restoreState(HashMap<KinesisStreamShard, SequenceNumber> restoredState) throws Exception {
-		sequenceNumsToRestore = restoredState;
+	public void restoreState(List<Tuple2<KinesisStreamShard, SequenceNumber>> state) throws Exception {
+		sequenceNumsToRestore = new HashMap<>();
+		for (Tuple2<KinesisStreamShard, SequenceNumber> subState: state) {
--- End diff --

We should probably do a null check here for `state`.
From the looks of #3005, I don't think restored state will ever be null 
(will be empty list), but it'd be good to make the code here independent of 
that.




[jira] [Commented] (FLINK-5324) JVM Options will work for both YarnApplicationMasterRunner and YarnTaskManager in YARN mode

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750634#comment-15750634
 ] 

ASF GitHub Bot commented on FLINK-5324:
---

Github user hzyuemeng1 commented on the issue:

https://github.com/apache/flink/pull/2994
  
@wuchong, can you check this issue?


> JVM Options will work for both YarnApplicationMasterRunner and
> YarnTaskManager in YARN mode
> 
>
> Key: FLINK-5324
> URL: https://issues.apache.org/jira/browse/FLINK-5324
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.1.3
>Reporter: yuemeng
>Priority: Critical
> Attachments: 
> 0001-FLINK-5324-yarn-JVM-Opitons-will-work-for-both-YarnA.patch
>
>
> YarnApplicationMasterRunner and YarnTaskManager both use the following code to
> get their JVM options:
> {code}
> final String javaOpts = 
> flinkConfiguration.getString(ConfigConstants.FLINK_JVM_OPTIONS, "")
> {code}
> So when we add JVM options for one of them, they are applied to both.
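
A sketch of one way to split this up, assuming hypothetical role-specific keys that fall back to the shared {{env.java.opts}} (the actual fix is in the attached patch):

{code}
// "env.java.opts.jobmanager" / "env.java.opts.taskmanager" are hypothetical keys.
final String sharedOpts = flinkConfiguration.getString(ConfigConstants.FLINK_JVM_OPTIONS, "");

// In YarnApplicationMasterRunner:
final String amJvmOpts = flinkConfiguration.getString("env.java.opts.jobmanager", sharedOpts);

// In YarnTaskManager:
final String tmJvmOpts = flinkConfiguration.getString("env.java.opts.taskmanager", sharedOpts);
{code}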





[jira] [Commented] (FLINK-5324) JVM Options will work for both YarnApplicationMasterRunner and YarnTaskManager in YARN mode

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750635#comment-15750635
 ] 

ASF GitHub Bot commented on FLINK-5324:
---

Github user hzyuemeng1 closed the pull request at:

https://github.com/apache/flink/pull/2994


> JVM Options will work for both YarnApplicationMasterRunner and
> YarnTaskManager in YARN mode
> 
>
> Key: FLINK-5324
> URL: https://issues.apache.org/jira/browse/FLINK-5324
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.1.3
>Reporter: yuemeng
>Priority: Critical
> Attachments: 
> 0001-FLINK-5324-yarn-JVM-Opitons-will-work-for-both-YarnA.patch
>
>
> YarnApplicationMasterRunner and YarnTaskManager both use the following code to
> get their JVM options:
> {code}
> final String javaOpts = 
> flinkConfiguration.getString(ConfigConstants.FLINK_JVM_OPTIONS, "")
> {code}
> So when we add JVM options for one of them, they are applied to both.





[GitHub] flink pull request #2994: [FLINK-5324] [yarn] JVM Options will work for both...

2016-12-14 Thread hzyuemeng1
Github user hzyuemeng1 closed the pull request at:

https://github.com/apache/flink/pull/2994




[GitHub] flink issue #2994: [FLINK-5324] [yarn] JVM Options will work for both YarnAp...

2016-12-14 Thread hzyuemeng1
Github user hzyuemeng1 commented on the issue:

https://github.com/apache/flink/pull/2994
  
@wuchong, can you check this issue?




[jira] [Updated] (FLINK-5342) Setting the parallelism automatically for operators based on cost model

2016-12-14 Thread godfrey he (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-5342:
--
Description: On the Flink Table API, a query will be translated to operators
without parallelism. And users do not know, or even care about, the target
operators translated from the query. So it's better to set the parallelism
automatically for each operator based on the cost model.  (was: On Flink table
API, a query will be translated to operators without parallelism. And user do
not know even do not care the target operators translated from query, and can
not set parallelism. So it's better to set the parallelism automatically for
each operator base on cost model.)

> Setting the parallelism automatically for operators based on cost model
> --
>
> Key: FLINK-5342
> URL: https://issues.apache.org/jira/browse/FLINK-5342
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>
> On the Flink Table API, a query will be translated to operators without
> parallelism. And users do not know, or even care about, the target operators
> translated from the query. So it's better to set the parallelism
> automatically for each operator based on the cost model.





[jira] [Updated] (FLINK-5342) Setting the parallelism automatically for operators based on cost model

2016-12-14 Thread godfrey he (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-5342:
--
Description: On the Flink Table API, a query will be translated to operators
without parallelism. And users do not know, or even care about, the target
operators translated from the query, and cannot set the parallelism. So it's
better to set the parallelism automatically for each operator based on the
cost model.  (was: On Flink table API, a query will be translated to operators
without parallelism. And user do not know the target operators translated from
query, and can not set parallelism. So it's better to set the parallelism
automatically for each operator base on cost model.)

> Setting the parallelism automatically for operators based on cost model
> --
>
> Key: FLINK-5342
> URL: https://issues.apache.org/jira/browse/FLINK-5342
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>
> On the Flink Table API, a query will be translated to operators without
> parallelism. And users do not know, or even care about, the target operators
> translated from the query, and cannot set the parallelism. So it's better to
> set the parallelism automatically for each operator based on the cost model.





[jira] [Updated] (FLINK-5342) Setting the parallelism automatically for Operators based on cost model

2016-12-14 Thread godfrey he (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-5342:
--
Summary: Setting the parallelism automatically for Operators based on cost
model  (was: Setting the parallelism automatically for DataSet Operators base
on cost model)

> Setting the parallelism automatically for Operators based on cost model
> --
>
> Key: FLINK-5342
> URL: https://issues.apache.org/jira/browse/FLINK-5342
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>
> On the Flink Table API, a query will be translated to operators without
> parallelism. And users do not know the target operators translated from the
> query, and cannot set the parallelism. So it's better to set the parallelism
> automatically for each operator based on the cost model.





[jira] [Updated] (FLINK-5342) Setting the parallelism automatically for operators based on cost model

2016-12-14 Thread godfrey he (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-5342:
--
Summary: Setting the parallelism automatically for operators based on cost
model  (was: Setting the parallelism automatically for Operators base on cost
model)

> Setting the parallelism automatically for operators based on cost model
> --
>
> Key: FLINK-5342
> URL: https://issues.apache.org/jira/browse/FLINK-5342
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>
> On the Flink Table API, a query will be translated to operators without
> parallelism. And users do not know the target operators translated from the
> query, and cannot set the parallelism. So it's better to set the parallelism
> automatically for each operator based on the cost model.





[jira] [Created] (FLINK-5342) Setting the parallelism automatically for DataSet Operators based on cost model

2016-12-14 Thread godfrey he (JIRA)
godfrey he created FLINK-5342:
-

 Summary: Setting the parallelism automatically for DataSet 
Operators based on cost model
 Key: FLINK-5342
 URL: https://issues.apache.org/jira/browse/FLINK-5342
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: godfrey he


On the Flink Table API, a query will be translated to operators without
parallelism. And users do not know the target operators translated from the
query, and cannot set the parallelism. So it's better to set the parallelism
automatically for each operator based on the cost model.





[jira] [Updated] (FLINK-5342) Setting the parallelism automatically for DataSet Operators based on cost model

2016-12-14 Thread godfrey he (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

godfrey he updated FLINK-5342:
--
Description: On the Flink Table API, a query will be translated to operators
without parallelism. And users do not know the target operators translated
from the query, and cannot set the parallelism. So it's better to set the
parallelism automatically for each operator based on the cost model.  (was: On
Flink table API, a query will be translate to operators without parallelism.
And user do not know the target operators translated from query, and can not
set parallelism. So it's better to set the parallelism automatically for each
operator base on cost model.)

> Setting the parallelism automatically for DataSet Operators based on cost model
> --
>
> Key: FLINK-5342
> URL: https://issues.apache.org/jira/browse/FLINK-5342
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>
> On the Flink Table API, a query will be translated to operators without
> parallelism. And users do not know the target operators translated from the
> query, and cannot set the parallelism. So it's better to set the parallelism
> automatically for each operator based on the cost model.





[jira] [Commented] (FLINK-5280) Extend TableSource to support nested data

2016-12-14 Thread Jark Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750273#comment-15750273
 ] 

Jark Wu commented on FLINK-5280:


Hi [~ivan.mushketyk], thanks for your detailed and clear proposal.

Regarding the new argument {{fieldMappings}} in {{FlinkTable}}, I think it
plays the same role as {{fieldIndexes}}. Actually, {{fieldIndexes}} is the
{{inputPojoFieldMapping}} in {{CodeGenerator}} when converting. In the case of
a POJO, {{fieldIndexes}} is a field mapping; in other cases, it is an array of
{{0~n}}.

Regarding {{getNumberOfFields}} in {{TableSource}}: yes, it is rarely used and
can be replaced by {{getFieldsNames.length}}, provided {{getFieldsNames}}
still returns the first-level attributes.

Hi [~fhueske], I agree with the {{RowTypeInfo}} approach, which I think is
similar to Calcite's way. But we should support custom names in
{{RowTypeInfo}} first.
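
For illustration, the kind of construction this would enable (a sketch: the name-taking {{RowTypeInfo}} constructor is the proposed addition, not the API at the time of writing):

{code}
// Hypothetical: RowTypeInfo(TypeInformation<?>[], String[]) with custom field names.
RowTypeInfo addressType = new RowTypeInfo(
	new TypeInformation<?>[] { BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.INT_TYPE_INFO },
	new String[] { "street", "zip" });

// Nesting a RowTypeInfo keeps the first-level attributes while typing nested fields.
RowTypeInfo personType = new RowTypeInfo(
	new TypeInformation<?>[] { BasicTypeInfo.STRING_TYPE_INFO, addressType },
	new String[] { "name", "address" });
{code}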

> Extend TableSource to support nested data
> -
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.2.0
>Reporter: Fabian Hueske
>Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.





[jira] [Commented] (FLINK-4391) Provide support for asynchronous operations over streams

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750207#comment-15750207
 ] 

ASF GitHub Bot commented on FLINK-4391:
---

Github user bjlovegithub commented on a diff in the pull request:

https://github.com/apache/flink/pull/2629#discussion_r92536062
  
--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/async/RichAsyncFunction.java ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions.async;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.functions.AbstractRichFunction;
+import org.apache.flink.api.common.functions.IterationRuntimeContext;
+import org.apache.flink.api.common.functions.RichFunction;
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;
+
+/**
+ * Rich variant of the {@link AsyncFunction}. As a {@link RichFunction}, it gives access to the
+ * {@link org.apache.flink.api.common.functions.RuntimeContext} and provides setup and teardown methods:
+ * {@link RichFunction#open(org.apache.flink.configuration.Configuration)} and
+ * {@link RichFunction#close()}.
+ *
+ * <p>
+ * {@link RichAsyncFunction#getRuntimeContext()} and {@link RichAsyncFunction#getIterationRuntimeContext()} are
+ * not supported because the key may get changed while accessing states in the working thread.
+ *
+ * @param <IN> The type of the input elements.
+ * @param <OUT> The type of the returned elements.
+ */
+
+@PublicEvolving
+public abstract class RichAsyncFunction<IN, OUT> extends AbstractRichFunction
+	implements AsyncFunction<IN, OUT> {
+
+	@Override
+	public abstract void asyncInvoke(IN input, AsyncCollector<OUT> collector) throws Exception;
+
+	@Override
+	public RuntimeContext getRuntimeContext() {
--- End diff --

I agree. It will not miss anything except state access :-)
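
For context, a minimal sketch of how a `RichAsyncFunction` from this PR is meant to be used; `DatabaseClient` and its callback-style `lookup` are made-up stand-ins for an async client:

{code}
public class EnrichFunction extends RichAsyncFunction<String, Tuple2<String, String>> {

	private transient DatabaseClient client;   // hypothetical async client

	@Override
	public void open(Configuration parameters) throws Exception {
		client = new DatabaseClient();
	}

	@Override
	public void asyncInvoke(String key, AsyncCollector<Tuple2<String, String>> collector) throws Exception {
		// The callback runs on the client's worker thread, not the task thread;
		// this is exactly why keyed state cannot safely be accessed here.
		client.lookup(key, value ->
			collector.collect(Collections.singletonList(Tuple2.of(key, value))));
	}
}
{code}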


> Provide support for asynchronous operations over streams
> 
>
> Key: FLINK-4391
> URL: https://issues.apache.org/jira/browse/FLINK-4391
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Jamie Grier
>Assignee: david.wang
>
> Many Flink users need to do asynchronous processing driven by data from a 
> DataStream.  The classic example would be joining against an external 
> database in order to enrich a stream with extra information.
> It would be nice to add general support for this type of operation in the 
> Flink API.  Ideally this could simply take the form of a new operator that 
> manages async operations, keeps so many of them in flight, and then emits 
> results to downstream operators as the async operations complete.





[GitHub] flink pull request #2629: [FLINK-4391] Provide support for asynchronous oper...

2016-12-14 Thread bjlovegithub
Github user bjlovegithub commented on a diff in the pull request:

https://github.com/apache/flink/pull/2629#discussion_r92536062
  
--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/async/RichAsyncFunction.java ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions.async;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.functions.AbstractRichFunction;
+import org.apache.flink.api.common.functions.IterationRuntimeContext;
+import org.apache.flink.api.common.functions.RichFunction;
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;
+
+/**
+ * Rich variant of the {@link AsyncFunction}. As a {@link RichFunction}, it gives access to the
+ * {@link org.apache.flink.api.common.functions.RuntimeContext} and provides setup and teardown methods:
+ * {@link RichFunction#open(org.apache.flink.configuration.Configuration)} and
+ * {@link RichFunction#close()}.
+ *
+ * <p>
+ * {@link RichAsyncFunction#getRuntimeContext()} and {@link RichAsyncFunction#getIterationRuntimeContext()} are
+ * not supported because the key may get changed while accessing states in the working thread.
+ *
+ * @param <IN> The type of the input elements.
+ * @param <OUT> The type of the returned elements.
+ */
+
+@PublicEvolving
+public abstract class RichAsyncFunction<IN, OUT> extends AbstractRichFunction
+	implements AsyncFunction<IN, OUT> {
+
+	@Override
+	public abstract void asyncInvoke(IN input, AsyncCollector<OUT> collector) throws Exception;
+
+	@Override
+	public RuntimeContext getRuntimeContext() {
--- End diff --

I agree. It will not miss anything except state access :-)




[jira] [Commented] (FLINK-5144) Error while applying rule AggregateJoinTransposeRule

2016-12-14 Thread Kurt Young (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750108#comment-15750108
 ] 

Kurt Young commented on FLINK-5144:
---

I'm interested in solving this problem :)

> Error while applying rule AggregateJoinTransposeRule
> 
>
> Key: FLINK-5144
> URL: https://issues.apache.org/jira/browse/FLINK-5144
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Kurt Young
>
> AggregateJoinTransposeRule seems to cause errors. We have to investigate if 
> this is a Flink or Calcite error. Here a simplified example:
> {code}
> select
>   sum(l_extendedprice)
> from
>   lineitem,
>   part
> where
>   p_partkey = l_partkey
>   and l_quantity < (
> select
>   avg(l_quantity)
> from
>   lineitem
> where
>   l_partkey = p_partkey
>   )
> {code}
> Exception:
> {code}
> Exception in thread "main" java.lang.AssertionError: Internal error: Error 
> occurred while applying rule AggregateJoinTransposeRule
>   at org.apache.calcite.util.Util.newInternal(Util.java:792)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:225)
>   at 
> org.apache.calcite.rel.rules.AggregateJoinTransposeRule.onMatch(AggregateJoinTransposeRule.java:342)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
>   at 
> org.apache.flink.api.table.BatchTableEnvironment.optimize(BatchTableEnvironment.scala:251)
>   at 
> org.apache.flink.api.table.BatchTableEnvironment.translate(BatchTableEnvironment.scala:286)
>   at 
> org.apache.flink.api.scala.table.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:139)
>   at 
> org.apache.flink.api.scala.table.package$.table2RowDataSet(package.scala:77)
>   at 
> org.apache.flink.api.scala.sql.tpch.TPCHQueries$.runQ17(TPCHQueries.scala:826)
>   at 
> org.apache.flink.api.scala.sql.tpch.TPCHQueries$.main(TPCHQueries.scala:57)
>   at 
> org.apache.flink.api.scala.sql.tpch.TPCHQueries.main(TPCHQueries.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> Caused by: java.lang.AssertionError: Type mismatch:
> rowtype of new rel:
> RecordType(BIGINT l_partkey, BIGINT p_partkey) NOT NULL
> rowtype of set:
> RecordType(BIGINT p_partkey) NOT NULL
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:1838)
>   at org.apache.calcite.plan.volcano.RelSubset.add(RelSubset.java:273)
>   at org.apache.calcite.plan.volcano.RelSet.add(RelSet.java:148)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1820)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1766)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:1032)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1052)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1942)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:136)
>   ... 17 more
> {code}





[jira] [Assigned] (FLINK-5144) Error while applying rule AggregateJoinTransposeRule

2016-12-14 Thread Kurt Young (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young reassigned FLINK-5144:
-

Assignee: Kurt Young

> Error while applying rule AggregateJoinTransposeRule
> 
>
> Key: FLINK-5144
> URL: https://issues.apache.org/jira/browse/FLINK-5144
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Kurt Young
>
> AggregateJoinTransposeRule seems to cause errors. We have to investigate if 
> this is a Flink or Calcite error. Here a simplified example:
> {code}
> select
>   sum(l_extendedprice)
> from
>   lineitem,
>   part
> where
>   p_partkey = l_partkey
>   and l_quantity < (
> select
>   avg(l_quantity)
> from
>   lineitem
> where
>   l_partkey = p_partkey
>   )
> {code}
> Exception:
> {code}
> Exception in thread "main" java.lang.AssertionError: Internal error: Error 
> occurred while applying rule AggregateJoinTransposeRule
>   at org.apache.calcite.util.Util.newInternal(Util.java:792)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:225)
>   at 
> org.apache.calcite.rel.rules.AggregateJoinTransposeRule.onMatch(AggregateJoinTransposeRule.java:342)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:334)
>   at 
> org.apache.flink.api.table.BatchTableEnvironment.optimize(BatchTableEnvironment.scala:251)
>   at 
> org.apache.flink.api.table.BatchTableEnvironment.translate(BatchTableEnvironment.scala:286)
>   at 
> org.apache.flink.api.scala.table.BatchTableEnvironment.toDataSet(BatchTableEnvironment.scala:139)
>   at 
> org.apache.flink.api.scala.table.package$.table2RowDataSet(package.scala:77)
>   at 
> org.apache.flink.api.scala.sql.tpch.TPCHQueries$.runQ17(TPCHQueries.scala:826)
>   at 
> org.apache.flink.api.scala.sql.tpch.TPCHQueries$.main(TPCHQueries.scala:57)
>   at 
> org.apache.flink.api.scala.sql.tpch.TPCHQueries.main(TPCHQueries.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> Caused by: java.lang.AssertionError: Type mismatch:
> rowtype of new rel:
> RecordType(BIGINT l_partkey, BIGINT p_partkey) NOT NULL
> rowtype of set:
> RecordType(BIGINT p_partkey) NOT NULL
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:1838)
>   at org.apache.calcite.plan.volcano.RelSubset.add(RelSubset.java:273)
>   at org.apache.calcite.plan.volcano.RelSet.add(RelSet.java:148)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1820)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1766)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:1032)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1052)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1942)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:136)
>   ... 17 more
> {code}





[jira] [Commented] (FLINK-4883) Prevent UDFs implementations through Scala singleton objects

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750067#comment-15750067
 ] 

ASF GitHub Bot commented on FLINK-4883:
---

Github user Renkai commented on the issue:

https://github.com/apache/flink/pull/2729
  
Hi @aljoscha, I'm wondering how the warning could be "big". Should it have a
different color in the console, or pop up an alert window when people submit
apps from the web?


> Prevent UDFs implementations through Scala singleton objects
> 
>
> Key: FLINK-4883
> URL: https://issues.apache.org/jira/browse/FLINK-4883
> Project: Flink
>  Issue Type: Bug
>Reporter: Stefan Richter
>Assignee: Renkai Ge
>
> Currently, users can create and use UDFs in Scala like this:
> {code}
> object FlatMapper extends RichCoFlatMapFunction[Long, String, (Long, String)] 
> {
> ...
> }
> {code}
> However, this is problematic because the UDF is then a singleton that Flink
> could use across several operator instances, which leads to job failures. We
> should detect and prevent the usage of singleton UDFs.
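
One plausible detection approach (a sketch, not necessarily what this ticket will end up doing): Scala compiles {{object Foo}} to a class named {{Foo$}} with a static {{MODULE$}} field, which can be checked reflectively on job submission:

{code}
static boolean isScalaSingletonObject(Object udf) {
	Class<?> clazz = udf.getClass();
	if (!clazz.getName().endsWith("$")) {
		return false;
	}
	try {
		// Scala singleton objects expose their instance via a static MODULE$ field.
		java.lang.reflect.Field module = clazz.getField("MODULE$");
		return java.lang.reflect.Modifier.isStatic(module.getModifiers())
			&& module.get(null) == udf;
	} catch (ReflectiveOperationException e) {
		return false;
	}
}
{code}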





[GitHub] flink issue #2729: [FLINK-4883]Prevent UDFs implementations through Scala si...

2016-12-14 Thread Renkai
Github user Renkai commented on the issue:

https://github.com/apache/flink/pull/2729
  
Hi @aljoscha, I'm wondering how the warning could be "big". Should it have a
different color in the console, or pop up an alert window when people submit
apps from the web?




[GitHub] flink pull request #2962: [FLINK-5051] Backwards compatibility for serialize...

2016-12-14 Thread StefanRRichter
Github user StefanRRichter closed the pull request at:

https://github.com/apache/flink/pull/2962




[jira] [Commented] (FLINK-5051) Backwards compatibility for serializers in backend state

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749873#comment-15749873
 ] 

ASF GitHub Bot commented on FLINK-5051:
---

Github user StefanRRichter closed the pull request at:

https://github.com/apache/flink/pull/2962


> Backwards compatibility for serializers in backend state
> 
>
> Key: FLINK-5051
> URL: https://issues.apache.org/jira/browse/FLINK-5051
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> When a new state is registered, e.g. in a keyed backend via
> {{getPartitionedState}}, the caller has to provide all type serializers
> required for the persistence of state components. Explicitly passing the
> serializers on state creation already allows for potential version upgrades
> of serializers.
> However, those serializers are currently not part of any snapshot and are
> only provided at runtime, when the state is newly registered or restored. For
> backwards compatibility, this has strong implications: checkpoints are not
> self-contained, in that state is currently a black box without knowledge
> about its corresponding serializers. Most cases where we would need to
> restructure the state are basically lost. We could only convert it lazily at
> runtime, and only once the user registers the concrete state, which might
> happen at unpredictable points.
> I suggest to adapt our solution as follows:
> - As now, all states are registered with their set of serializers.
> - Unlike now, all serializers are written to the snapshot. This makes
> savepoints self-contained and also allows building inspection tools for
> savepoints at some point in the future.
> - Introduce an interface {{Versioned}} with {{long getVersion()}} and
> {{boolean isCompatible(Versioned v)}} which is then implemented by
> serializers. Compatible serializers must ensure that they can deserialize
> older versions, and can then serialize them in their new format. This is how
> we upgrade.
> We need to find the right tradeoff in how many places we need to store the
> serializers. I suggest to write them once per parallel operator instance for
> each state, i.e. we have a map with state_name -> tuple3<serializer,
> serializer, serializer>. This could go right at the head of the file, before
> all key-groups are written. Then, for each file we see on restore, we can
> first read the serializer map from the head of the stream, then go through
> the key groups by offset.
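
Transcribed as code, the proposed interface would look roughly like this (javadoc wording is mine):

{code}
public interface Versioned {

	/** Version of the format this serializer writes. */
	long getVersion();

	/**
	 * Whether this instance can read data written by v. Compatible serializers
	 * must be able to deserialize older versions and then serialize them in
	 * their new format.
	 */
	boolean isCompatible(Versioned v);
}
{code}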





[jira] [Created] (FLINK-5341) Add a metric exposing how many times a job has been restarted

2016-12-14 Thread Dan Bress (JIRA)
Dan Bress created FLINK-5341:


 Summary: Add a metric exposing how many times a job has been 
restarted
 Key: FLINK-5341
 URL: https://issues.apache.org/jira/browse/FLINK-5341
 Project: Flink
  Issue Type: New Feature
  Components: Core
Reporter: Dan Bress
Priority: Minor


I would like the job manager to expose a metric indicating how many times each
job has been restarted. This way I can grab this number and measure whether or
not my job is healthy.





[jira] [Updated] (FLINK-5340) Add a metric exposing jobs uptimes

2016-12-14 Thread Dan Bress (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Bress updated FLINK-5340:
-
Summary: Add a metric exposing jobs uptimes  (was: I would like a metric 
exposing job uptime)

> Add a metric exposing jobs uptimes
> --
>
> Key: FLINK-5340
> URL: https://issues.apache.org/jira/browse/FLINK-5340
> Project: Flink
>  Issue Type: New Feature
>  Components: Core
>Reporter: Dan Bress
>Priority: Minor
>
> I would like the job manager to expose a metric indicating how long each job 
> has been up.  This way I can grab this number and measure the health of my 
> job.





[jira] [Created] (FLINK-5340) I would like a metric exposing job uptime

2016-12-14 Thread Dan Bress (JIRA)
Dan Bress created FLINK-5340:


 Summary: I would like a metric exposing job uptime
 Key: FLINK-5340
 URL: https://issues.apache.org/jira/browse/FLINK-5340
 Project: Flink
  Issue Type: New Feature
  Components: Core
Reporter: Dan Bress


I would like the job manager to expose a metric indicating how long each job 
has been up.  This way I can grab this number and measure the health of my job.





[jira] [Updated] (FLINK-5340) I would like a metric exposing job uptime

2016-12-14 Thread Dan Bress (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Bress updated FLINK-5340:
-
Priority: Minor  (was: Major)

> I would like a metric exposing job uptime
> -
>
> Key: FLINK-5340
> URL: https://issues.apache.org/jira/browse/FLINK-5340
> Project: Flink
>  Issue Type: New Feature
>  Components: Core
>Reporter: Dan Bress
>Priority: Minor
>
> I would like the job manager to expose a metric indicating how long each job 
> has been up.  This way I can grab this number and measure the health of my 
> job.





[jira] [Commented] (FLINK-5320) Fix result TypeInformation in WindowedStream.fold(ACC, FoldFunction, WindowFunction)

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749732#comment-15749732
 ] 

ASF GitHub Bot commented on FLINK-5320:
---

Github user ymarzougui commented on the issue:

https://github.com/apache/flink/pull/2995
  
@StephanEwen @aljoscha, I added a test for `WindowedStream.fold()` similar 
to the one in 
[EventTimeAllWindowCheckpointingITCase](https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/checkpointing/EventTimeAllWindowCheckpointingITCase.java#L285).
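
For reference, the affected call shape (a sketch; `input` is an assumed `DataStream<Tuple2<String, Long>>` of (key, amount) pairs). The result type must be inferred from the `WindowFunction`, not from the fold accumulator:

{code}
DataStream<Tuple2<String, Long>> sums = input
	.keyBy(0)
	.timeWindow(Time.seconds(10))
	.fold(
		0L,   // ACC: initial accumulator
		new FoldFunction<Tuple2<String, Long>, Long>() {
			@Override
			public Long fold(Long acc, Tuple2<String, Long> value) {
				return acc + value.f1;
			}
		},
		new WindowFunction<Long, Tuple2<String, Long>, Tuple, TimeWindow>() {
			@Override
			public void apply(Tuple key, TimeWindow window, Iterable<Long> folded,
					Collector<Tuple2<String, Long>> out) {
				// Emits a different type than the accumulator; it is this result
				// TypeInformation whose inference FLINK-5320 fixes.
				String k = key.getField(0);
				out.collect(Tuple2.of(k, folded.iterator().next()));
			}
		});
{code}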


> Fix result TypeInformation in WindowedStream.fold(ACC, FoldFunction, 
> WindowFunction)
> 
>
> Key: FLINK-5320
> URL: https://issues.apache.org/jira/browse/FLINK-5320
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.2.0
>Reporter: Yassine Marzougui
>Assignee: Yassine Marzougui
>Priority: Blocker
>
> The WindowedStream.fold(ACC, FoldFunction, WindowFunction) does not correctly 
> infer the resultType.





[GitHub] flink issue #2995: [FLINK-5320] Fix result TypeInformation in WindowedStream...

2016-12-14 Thread ymarzougui
Github user ymarzougui commented on the issue:

https://github.com/apache/flink/pull/2995
  
@StephanEwen @aljoscha, I added a test for `WindowedStream.fold()` similar 
to the one in 
[EventTimeAllWindowCheckpointingITCase](https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/checkpointing/EventTimeAllWindowCheckpointingITCase.java#L285).




[GitHub] flink issue #2985: [FLINK-5104] Bipartite graph validation

2016-12-14 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2985
  
Sorry for the failing build. I've removed the unused imports so it should 
be fine now.




[jira] [Commented] (FLINK-5104) Implement BipartiteGraph validator

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749656#comment-15749656
 ] 

ASF GitHub Bot commented on FLINK-5104:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2985
  
Sorry for the failing build. I've removed the unused imports so it should 
be fine now.


> Implement BipartiteGraph validator
> --
>
> Key: FLINK-5104
> URL: https://issues.apache.org/jira/browse/FLINK-5104
> Project: Flink
>  Issue Type: Sub-task
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>
> BipartiteGraph should have a validator similar to GraphValidator for Graph 
> class.





[GitHub] flink issue #2987: [FLINK-4647] Read bipartite graph

2016-12-14 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2987
  
#2832 has been committed to master. Please rebase.




[jira] [Commented] (FLINK-4647) Implement BipartiteGraph reader

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749561#comment-15749561
 ] 

ASF GitHub Bot commented on FLINK-4647:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2987
  
#2832 has been committed to master. Please rebase.


> Implement BipartiteGraph reader
> ---
>
> Key: FLINK-4647
> URL: https://issues.apache.org/jira/browse/FLINK-4647
> Project: Flink
>  Issue Type: Sub-task
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>
> Implement reading bipartite graph from a CSV. Should be similar to how 
> regular graph is read from a file.





[jira] [Updated] (FLINK-4936) Operator names for Gelly inputs

2016-12-14 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan updated FLINK-4936:
--
Fix Version/s: 1.2.0

> Operator names for Gelly inputs
> ---
>
> Key: FLINK-4936
> URL: https://issues.apache.org/jira/browse/FLINK-4936
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
> Fix For: 1.2.0
>
>
> Provide descriptive operator names for Gelly's {{Graph}} and 
> {{GraphCsvReader}}. Also, condense multiple type conversion maps into a 
> single mapper.





[jira] [Closed] (FLINK-4936) Operator names for Gelly inputs

2016-12-14 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-4936.
-
Resolution: Implemented

Implemented in 09e0817306e8c077aeea96252a7c98fbd0f9747b

> Operator names for Gelly inputs
> ---
>
> Key: FLINK-4936
> URL: https://issues.apache.org/jira/browse/FLINK-4936
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
> Fix For: 1.2.0
>
>
> Provide descriptive operator names for Gelly's {{Graph}} and 
> {{GraphCsvReader}}. Also, condense multiple type conversion maps into a 
> single mapper.





[jira] [Commented] (FLINK-4936) Operator names for Gelly inputs

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749496#comment-15749496
 ] 

ASF GitHub Bot commented on FLINK-4936:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2832


> Operator names for Gelly inputs
> ---
>
> Key: FLINK-4936
> URL: https://issues.apache.org/jira/browse/FLINK-4936
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
>
> Provide descriptive operator names for Gelly's {{Graph}} and 
> {{GraphCsvReader}}. Also, condense multiple type conversion maps into a 
> single mapper.





[GitHub] flink pull request #2832: [FLINK-4936] [gelly] Operator names for Gelly inpu...

2016-12-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2832




[jira] [Commented] (FLINK-5104) Implement BipartiteGraph validator

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749452#comment-15749452
 ] 

ASF GitHub Bot commented on FLINK-5104:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2985
  
@mushketyk the build is failing due to unused imports (IntelliJ can be 
configured to automatically remove these).

There are so many ways to implement the validator. Projections would reduce 
the data to be transmitted and sorted. I like to document when we are 
performing a different kind of join (here, an anti-join) for the day when these 
are available in Flink. I don't know how counting in the `FlatJoinFunction` 
followed by two `DiscardingOutputFormat` compares with the current 
implementation of a union and count (which is a dummy `OutputFormat`).

We'll need to rebase this PR once FLINK-5311 has been committed.

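For readers following along, an anti-join over DataSets can be sketched as a left outer join that keeps only the unmatched left elements. The tuple types and names below are illustrative stand-ins, not the actual BipartiteGraph code:

{code}
import org.apache.flink.api.common.functions.FlatJoinFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple1;
import org.apache.flink.util.Collector;

public class AntiJoinSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple1<Long>> vertexIds = env.fromElements(new Tuple1<>(1L), new Tuple1<>(2L));
        DataSet<Tuple1<Long>> edgeEndpoints = env.fromElements(new Tuple1<>(1L), new Tuple1<>(3L));

        // Anti-join: keep endpoints that have no matching vertex id; a
        // non-empty result means the graph references unknown vertices.
        DataSet<Tuple1<Long>> dangling = edgeEndpoints
                .leftOuterJoin(vertexIds)
                .where(0)
                .equalTo(0)
                .with(new FlatJoinFunction<Tuple1<Long>, Tuple1<Long>, Tuple1<Long>>() {
                    @Override
                    public void join(Tuple1<Long> endpoint, Tuple1<Long> vertex, Collector<Tuple1<Long>> out) {
                        if (vertex == null) { // no match on the vertex side
                            out.collect(endpoint);
                        }
                    }
                });

        System.out.println("dangling endpoints: " + dangling.count());
    }
}
{code}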

> Implement BipartiteGraph validator
> --
>
> Key: FLINK-5104
> URL: https://issues.apache.org/jira/browse/FLINK-5104
> Project: Flink
>  Issue Type: Sub-task
>  Components: Gelly
>Reporter: Ivan Mushketyk
>Assignee: Ivan Mushketyk
>
> BipartiteGraph should have a validator similar to GraphValidator for Graph 
> class.





[GitHub] flink issue #2985: [FLINK-5104] Bipartite graph validation

2016-12-14 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2985
  
@mushketyk the build is failing due to unused imports (IntelliJ can be 
configured to automatically remove these).

There are so many ways to implement the validator. Projections would reduce 
the data to be transmitted and sorted. I like to document when we are 
performing a different kind of join (here, an anti-join) for the day when these 
are available in Flink. I don't know how counting in the `FlatJoinFunction` 
followed by two `DiscardingOutputFormat` compares with the current 
implementation of a union and count (which is a dummy `OutputFormat`).

We'll need to rebase this PR once FLINK-5311 has been committed.




[jira] [Commented] (FLINK-4936) Operator names for Gelly inputs

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749308#comment-15749308
 ] 

ASF GitHub Bot commented on FLINK-4936:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2832
  
That's odd, your last comment was not propagated to JIRA @vasia. Might have 
been an outage at that time. Will merge this ...


> Operator names for Gelly inputs
> ---
>
> Key: FLINK-4936
> URL: https://issues.apache.org/jira/browse/FLINK-4936
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
>
> Provide descriptive operator names for Gelly's {{Graph}} and 
> {{GraphCsvReader}}. Also, condense multiple type conversion maps into a 
> single mapper.





[GitHub] flink issue #2832: [FLINK-4936] [gelly] Operator names for Gelly inputs

2016-12-14 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2832
  
That's odd, your last comment was not propagated to JIRA @vasia. Might have 
been an outage at that time. Will merge this ...




[jira] [Commented] (FLINK-4861) Package optional project artifacts

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749204#comment-15749204
 ] 

ASF GitHub Bot commented on FLINK-4861:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3000
  
`flink-avro` is included in the distribution.

Will adding a Scala suffix to `flink-hcatalog` break API compatibility?

I see now that we can use `combine.self="override"` to allow configuration 
defined in the parent pom to be overwritten in child poms.

I am thinking that we need to take a more nuanced approach to copying or 
creating jars for `opt/`. What if we scale this initial PR to only include 
metrics, cep, ml, and gelly?

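For reference, the Maven attribute mentioned above is written on the configuration element in the child pom (illustrative fragment, not the actual Flink pom):

{code}
<configuration combine.self="override">
  <!-- the child pom's settings here fully replace, rather than merge with,
       the parent pom's <configuration> for this plugin -->
</configuration>
{code}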

> Package optional project artifacts
> --
>
> Key: FLINK-4861
> URL: https://issues.apache.org/jira/browse/FLINK-4861
> Project: Flink
>  Issue Type: New Feature
>  Components: Build System
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.2.0
>
>
> Per the mailing list 
> [discussion|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Additional-project-downloads-td13223.html],
>  package the Flink libraries and connectors into subdirectories of a new 
> {{opt}} directory in the release/snapshot tarballs.





[GitHub] flink issue #3000: [FLINK-4861] [build] Package optional project artifacts

2016-12-14 Thread greghogan
Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3000
  
`flink-avro` is included in the distribution.

Will adding a Scala suffix to `flink-hcatalog` break API compatibility?

I see now that we can use `combine.self="override"` to allow configuration 
defined in the parent pom to be overwritten in child poms.

I am thinking that we need to take a more nuanced approach to copying or 
creating jars for `opt/`. What if we scale this initial PR to only include 
metrics, cep, ml, and gelly?




[jira] [Commented] (FLINK-4782) log4j & slf4j-log4j are present in jar-with-dependencies artifacts

2016-12-14 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749108#comment-15749108
 ] 

Greg Hogan commented on FLINK-4782:
---

flink-metrics-dropwizard-1.2-SNAPSHOT-jar-with-dependencies.jar
flink-metrics-ganglia-1.2-SNAPSHOT-jar-with-dependencies.jar
flink-metrics-graphite-1.2-SNAPSHOT-jar-with-dependencies.jar

Shouldn't log4j and slf4j-log4j12 be marked as scope provided in the parent pom?

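For context, marking such a dependency as provided in the parent pom would look roughly like this (illustrative fragment, with a made-up version property):

{code}
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>${slf4j.version}</version>
  <!-- provided: on the compile classpath, but excluded from
       jar-with-dependencies artifacts -->
  <scope>provided</scope>
</dependency>
{code}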
> log4j & slf4j-log4j are present in jar-with-dependencies artifacts
> --
>
> Key: FLINK-4782
> URL: https://issues.apache.org/jira/browse/FLINK-4782
> Project: Flink
>  Issue Type: Bug
>  Components: Build System, Metrics
>Affects Versions: 1.1.2
>Reporter: Maciej Prochniak
>Priority: Minor
>
> This makes it more difficult to use dropwizard metrics with e.g. logback
> It's because they are declared in the parent pom as compile scope dependencies. 
> I don't clearly understand why it is this way - I'd rather have them provided 
> (as they are still specified in dist) but maybe there are some good reasons 
> for that?





[GitHub] flink issue #2962: [FLINK-5051] Backwards compatibility for serializers in b...

2016-12-14 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2962
  
I merged this.

Thanks a lot for your work. 👍 Could you please close this PR.




[jira] [Commented] (FLINK-5051) Backwards compatibility for serializers in backend state

2016-12-14 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748860#comment-15748860
 ] 

Aljoscha Krettek commented on FLINK-5051:
-

I think this is only partially solved by your recent PR, right, [~srichter]? 
Should we leave this open, or create a new issue for follow-up work?

> Backwards compatibility for serializers in backend state
> 
>
> Key: FLINK-5051
> URL: https://issues.apache.org/jira/browse/FLINK-5051
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> When a new state is registered, e.g. in a keyed backend via 
> `getPartitionedState`, the caller has to provide all type serializers 
> required for the persistence of state components. Explicitly passing the 
> serializers on state creation already allows for potential version upgrades 
> of serializers.
> However, those serializers are currently not part of any snapshot and are 
> only provided at runtime, when the state is newly registered or restored. For 
> backwards compatibility, this has strong implications: checkpoints are not 
> self-contained, in that state is currently a black box without knowledge about 
> its corresponding serializers. Most cases where we would need to restructure 
> the state are basically lost. We could only convert them lazily at runtime, 
> and only once the user registers the concrete state, which might happen 
> at unpredictable points.
> I suggest adapting our solution as follows:
> - As now, all states are registered with their set of serializers.
> - Unlike now, all serializers are written to the snapshot. This makes 
> savepoints self-contained and also allows building inspection tools for 
> savepoints at some point in the future.
> - Introduce an interface {{Versioned}} with {{long getVersion()}} and 
> {{boolean isCompatible(Versioned v)}} which is then implemented by 
> serializers. Compatible serializers must ensure that they can deserialize 
> older versions, and can then serialize them in their new format. This is how 
> we upgrade.
> We need to find the right tradeoff in how many places we need to store the 
> serializers. I suggest writing them once per parallel operator instance for 
> each state, i.e. we have a map with state_name -> 
> tuple3<serializer, serializer, serializer>. This could go before all 
> key-groups are written, right at the head of the file. Then, for each file we 
> see on restore, we can first read the serializer map from the head of the 
> stream, then go through the key groups by offset.

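A minimal sketch of the proposed interface, with a hypothetical serializer as an example implementation (the class name and version numbers below are illustrative, not from any PR):

{code}
public interface Versioned {
    long getVersion();
    boolean isCompatible(Versioned v);
}

// Hypothetical serializer: version 2 can still read the version 1 format.
class MyPojoSerializer implements Versioned {

    @Override
    public long getVersion() {
        return 2L;
    }

    @Override
    public boolean isCompatible(Versioned other) {
        return other instanceof MyPojoSerializer && other.getVersion() >= 1L;
    }
}
{code}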




[jira] [Closed] (FLINK-5034) Don't Write StateDescriptor to RocksDB Snapshot

2016-12-14 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed FLINK-5034.
---
Resolution: Fixed

Solved by 21d1d8b49337e734ce3defb5f1b9344f748cb49e which was addressing 
FLINK-5051.

> Don't Write StateDescriptor to RocksDB Snapshot
> ---
>
> Key: FLINK-5034
> URL: https://issues.apache.org/jira/browse/FLINK-5034
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.2.0
>
>
> The {{StateDescriptor}} contains user code, which means that we possibly have 
> problems with updates to user-code. Also the {{StateDescriptor}} is not 
> really necessary when restoring.





[jira] [Commented] (FLINK-5051) Backwards compatibility for serializers in backend state

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748852#comment-15748852
 ] 

ASF GitHub Bot commented on FLINK-5051:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2962
  
I merged this.

Thanks a lot for your work.  Could you please close this PR.


> Backwards compatibility for serializers in backend state
> 
>
> Key: FLINK-5051
> URL: https://issues.apache.org/jira/browse/FLINK-5051
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> When a new state is registered, e.g. in a keyed backend via 
> `getPartitionedState`, the caller has to provide all type serializers 
> required for the persistence of state components. Explicitly passing the 
> serializers on state creation already allows for potential version upgrades 
> of serializers.
> However, those serializers are currently not part of any snapshot and are 
> only provided at runtime, when the state is newly registered or restored. For 
> backwards compatibility, this has strong implications: checkpoints are not 
> self-contained, in that state is currently a black box without knowledge about 
> its corresponding serializers. Most cases where we would need to restructure 
> the state are basically lost. We could only convert them lazily at runtime, 
> and only once the user registers the concrete state, which might happen 
> at unpredictable points.
> I suggest adapting our solution as follows:
> - As now, all states are registered with their set of serializers.
> - Unlike now, all serializers are written to the snapshot. This makes 
> savepoints self-contained and also allows building inspection tools for 
> savepoints at some point in the future.
> - Introduce an interface {{Versioned}} with {{long getVersion()}} and 
> {{boolean isCompatible(Versioned v)}} which is then implemented by 
> serializers. Compatible serializers must ensure that they can deserialize 
> older versions, and can then serialize them in their new format. This is how 
> we upgrade.
> We need to find the right tradeoff in how many places we need to store the 
> serializers. I suggest writing them once per parallel operator instance for 
> each state, i.e. we have a map with state_name -> 
> tuple3<serializer, serializer, serializer>. This could go before all 
> key-groups are written, right at the head of the file. Then, for each file we 
> see on restore, we can first read the serializer map from the head of the 
> stream, then go through the key groups by offset.





[jira] [Commented] (FLINK-4391) Provide support for asynchronous operations over streams

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748829#comment-15748829
 ] 

ASF GitHub Bot commented on FLINK-4391:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2629#discussion_r92436971
  
--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/async/RichAsyncFunction.java ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions.async;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.functions.AbstractRichFunction;
+import org.apache.flink.api.common.functions.IterationRuntimeContext;
+import org.apache.flink.api.common.functions.RichFunction;
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;
+
+/**
+ * Rich variant of the {@link AsyncFunction}. As a {@link RichFunction}, it gives access to the
+ * {@link org.apache.flink.api.common.functions.RuntimeContext} and provides setup and teardown methods:
+ * {@link RichFunction#open(org.apache.flink.configuration.Configuration)} and
+ * {@link RichFunction#close()}.
+ *
+ * <p>
+ * {@link RichAsyncFunction#getRuntimeContext()} and {@link RichAsyncFunction#getIterationRuntimeContext()} are
+ * not supported because the key may get changed while accessing states in the working thread.
+ *
+ * @param <IN> The type of the input elements.
+ * @param <OUT> The type of the returned elements.
+ */
+
+@PublicEvolving
+public abstract class RichAsyncFunction<IN, OUT> extends AbstractRichFunction
+   implements AsyncFunction<IN, OUT> {
+
+   @Override
+   public abstract void asyncInvoke(IN input, AsyncCollector<OUT> collector) throws Exception;
+
+   @Override
+   public RuntimeContext getRuntimeContext() {
--- End diff --

Maybe we could wrap the `RuntimeContext` which will throw an 
`UnsupportedOperationException` if we try to access state. Everything else will 
be forwarded to the underlying `RuntimeContext`?

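To make the suggestion concrete, such a wrapper could block state access while delegating everything else. A sketch with one representative method of each kind (the class name is illustrative; a real wrapper would cover the full RuntimeContext interface):

{code}
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;

class StateBlockingRuntimeContext {
    private final RuntimeContext wrapped;

    StateBlockingRuntimeContext(RuntimeContext wrapped) {
        this.wrapped = wrapped;
    }

    // State access fails fast: the key may change under the async thread.
    public <T> ValueState<T> getState(ValueStateDescriptor<T> descriptor) {
        throw new UnsupportedOperationException("State is not supported in RichAsyncFunction.");
    }

    // Everything else is forwarded to the underlying context.
    public String getTaskName() {
        return wrapped.getTaskName();
    }
}
{code}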

> Provide support for asynchronous operations over streams
> 
>
> Key: FLINK-4391
> URL: https://issues.apache.org/jira/browse/FLINK-4391
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Jamie Grier
>Assignee: david.wang
>
> Many Flink users need to do asynchronous processing driven by data from a 
> DataStream.  The classic example would be joining against an external 
> database in order to enrich a stream with extra information.
> It would be nice to add general support for this type of operation in the 
> Flink API.  Ideally this could simply take the form of a new operator that 
> manages async operations, keeps only so many of them in flight, and then emits 
> results to downstream operators as the async operations complete.

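As a plain-JDK illustration of the pattern described above — a bounded number of async lookups in flight, with results emitted as each completes (all names are stand-ins, not Flink API):

{code}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class BoundedAsyncSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Semaphore inFlight = new Semaphore(8); // at most 8 lookups in flight

        for (int i = 0; i < 100; i++) {
            inFlight.acquire(); // wait for a free slot before issuing the next lookup
            final int key = i;
            CompletableFuture
                    .supplyAsync(() -> "enriched-" + key, pool) // stands in for the external lookup
                    .whenComplete((result, error) -> {
                        inFlight.release();
                        System.out.println(result); // "emit" downstream
                    });
        }
        pool.shutdown();
    }
}
{code}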




[GitHub] flink pull request #2629: [FLINK-4391] Provide support for asynchronous oper...

2016-12-14 Thread tillrohrmann
Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/2629#discussion_r92436971
  
--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/async/RichAsyncFunction.java ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.functions.async;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.functions.AbstractRichFunction;
+import org.apache.flink.api.common.functions.IterationRuntimeContext;
+import org.apache.flink.api.common.functions.RichFunction;
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.streaming.api.functions.async.collector.AsyncCollector;
+
+/**
+ * Rich variant of the {@link AsyncFunction}. As a {@link RichFunction}, it gives access to the
+ * {@link org.apache.flink.api.common.functions.RuntimeContext} and provides setup and teardown methods:
+ * {@link RichFunction#open(org.apache.flink.configuration.Configuration)} and
+ * {@link RichFunction#close()}.
+ *
+ * <p>
+ * {@link RichAsyncFunction#getRuntimeContext()} and {@link RichAsyncFunction#getIterationRuntimeContext()} are
+ * not supported because the key may get changed while accessing states in the working thread.
+ *
+ * @param <IN> The type of the input elements.
+ * @param <OUT> The type of the returned elements.
+ */
+
+@PublicEvolving
+public abstract class RichAsyncFunction<IN, OUT> extends AbstractRichFunction
+   implements AsyncFunction<IN, OUT> {
+
+   @Override
+   public abstract void asyncInvoke(IN input, AsyncCollector<OUT> collector) throws Exception;
+
+   @Override
+   public RuntimeContext getRuntimeContext() {
--- End diff --

Maybe we could wrap the `RuntimeContext` which will throw an 
`UnsupportedOperationException` if we try to access state. Everything else will 
be forwarded to the underlying `RuntimeContext`?




[jira] [Commented] (FLINK-5323) CheckpointNotifier should be removed from docs

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748737#comment-15748737
 ] 

ASF GitHub Bot commented on FLINK-5323:
---

Github user abhishsi commented on the issue:

https://github.com/apache/flink/pull/3006
  
@rmetzger - not sure why CI failed. How do I rerun it?


> CheckpointNotifier should be removed from docs
> --
>
> Key: FLINK-5323
> URL: https://issues.apache.org/jira/browse/FLINK-5323
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.3
>Reporter: Abhishek Singh
>Priority: Trivial
>
> I was following the official documentation: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/state.html
> Looks like this is the right one to be using: import 
> org.apache.flink.runtime.state.CheckpointListener;
> -Abhishek-
> On Dec 9, 2016, at 4:30 PM, Abhishek R. Singh 
>  wrote:
> I can’t seem to find CheckpointNotifier. Appreciate help !
> CheckpointNotifier is not a member of package 
> org.apache.flink.streaming.api.checkpoint
> From my pom.xml:
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-scala_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-streaming-scala_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-clients_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>

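For reference, the listener interface the reporter points to is implemented like this (a hypothetical sink, shown only to illustrate the callback):

{code}
import org.apache.flink.runtime.state.CheckpointListener;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Hypothetical sink illustrating the CheckpointListener callback.
public class CommitOnCheckpointSink extends RichSinkFunction<String> implements CheckpointListener {

    @Override
    public void invoke(String value) {
        // buffer the value until the next completed checkpoint (elided)
    }

    @Override
    public void notifyCheckpointComplete(long checkpointId) {
        // invoked once checkpoint `checkpointId` has been completed
        System.out.println("commit up to checkpoint " + checkpointId);
    }
}
{code}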




[GitHub] flink issue #3006: [FLINK-5323] CheckpointNotifier should be removed from do...

2016-12-14 Thread abhishsi
Github user abhishsi commented on the issue:

https://github.com/apache/flink/pull/3006
  
@rmetzger - not sure why CI failed. How do I rerun it?




[GitHub] flink issue #3006: [FLINK-5323] CheckpointNotifier should be removed from do...

2016-12-14 Thread abhishsi
Github user abhishsi commented on the issue:

https://github.com/apache/flink/pull/3006
  
retest this please




[jira] [Commented] (FLINK-5323) CheckpointNotifier should be removed from docs

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748733#comment-15748733
 ] 

ASF GitHub Bot commented on FLINK-5323:
---

Github user abhishsi commented on the issue:

https://github.com/apache/flink/pull/3006
  
retest this please


> CheckpointNotifier should be removed from docs
> --
>
> Key: FLINK-5323
> URL: https://issues.apache.org/jira/browse/FLINK-5323
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.3
>Reporter: Abhishek Singh
>Priority: Trivial
>
> I was following the official documentation: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/state.html
> Looks like this is the right one to be using: import 
> org.apache.flink.runtime.state.CheckpointListener;
> -Abhishek-
> On Dec 9, 2016, at 4:30 PM, Abhishek R. Singh 
>  wrote:
> I can’t seem to find CheckpointNotifier. Appreciate help !
> CheckpointNotifier is not a member of package 
> org.apache.flink.streaming.api.checkpoint
> From my pom.xml:
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-scala_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-streaming-scala_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-clients_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>





[GitHub] flink pull request #2997: [FLINK-5240][tests] ensure state backends are prop...

2016-12-14 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2997




[jira] [Commented] (FLINK-5240) Properly Close StateBackend in StreamTask when closing/canceling

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748676#comment-15748676
 ] 

ASF GitHub Bot commented on FLINK-5240:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2997


> Properly Close StateBackend in StreamTask when closing/canceling
> 
>
> Key: FLINK-5240
> URL: https://issues.apache.org/jira/browse/FLINK-5240
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.2.0
>
>
> Right now, the {{StreamTask}} never calls {{close()}} on the state backend.





[jira] [Resolved] (FLINK-5240) Properly Close StateBackend in StreamTask when closing/canceling

2016-12-14 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved FLINK-5240.
---
Resolution: Fixed
  Assignee: Maximilian Michels

Added a test case to verify closing of the StateBackend.

Commit: bf2874e22a41ae195ea162f4e9c31e90a42a4c1a

> Properly Close StateBackend in StreamTask when closing/canceling
> 
>
> Key: FLINK-5240
> URL: https://issues.apache.org/jira/browse/FLINK-5240
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>Assignee: Maximilian Michels
>Priority: Blocker
> Fix For: 1.2.0
>
>
> Right now, the {{StreamTask}} never calls {{close()}} on the state backend.





[jira] [Commented] (FLINK-4973) Flakey Yarn tests due to recently added latency marker

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748686#comment-15748686
 ] 

ASF GitHub Bot commented on FLINK-4973:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/3008

[FLINK-4973] Let LatencyMarksEmitter use StreamTask's ProcessingTimeService

The LatencyMarksEmitter class now uses the StreamTask's ProcessingTimeService to 
schedule latency mark emission. For that, the ProcessingTimeService was extended 
with the method scheduleAtFixedRate for scheduling repeated tasks; latency mark 
emission is such a repeated task.

cc @rmetzger.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixLatencyMarksEmitter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3008


commit c06b3457f2b98969f444da07569656ac07727166
Author: Till Rohrmann 
Date:   2016-12-14T13:53:11Z

[FLINK-4973] Let LatencyMarksEmitter use StreamTask's ProcessingTimeService

The LatencyMarksEmitter class now uses the StreamTask's ProcessingTimeService to 
schedule latency mark emission. For that, the ProcessingTimeService was extended 
with the method scheduleAtFixedRate for scheduling repeated tasks; latency mark 
emission is such a repeated task.

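As a rough illustration of the fixed-rate semantics the change relies on, with the JDK scheduler standing in for Flink's ProcessingTimeService:

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class RepeatedEmitterSketch {
    public static void main(String[] args) throws Exception {
        ScheduledExecutorService timeService = Executors.newSingleThreadScheduledExecutor();

        // Repeated task: fires every 100 ms until cancelled.
        ScheduledFuture<?> emission = timeService.scheduleAtFixedRate(
                () -> System.out.println("emit latency marker"),
                0L, 100L, TimeUnit.MILLISECONDS);

        Thread.sleep(350L);

        // When the task shuts down, stopping the shared time service cancels
        // the repeated task instead of letting it race against teardown.
        emission.cancel(false);
        timeService.shutdownNow();
    }
}
{code}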



> Flakey Yarn tests due to recently added latency marker
> --
>
> Key: FLINK-4973
> URL: https://issues.apache.org/jira/browse/FLINK-4973
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.2.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.2.0
>
>
> The newly introduced {{LatencyMarksEmitter}} emits latency marker on the 
> {{Output}}. This can still happen after the underlying {{BufferPool}} has 
> been destroyed. The occurring exception is then logged:
> {code}
> 2016-10-29 15:00:48,088 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Source: Custom File Source (1/1) switched to FINISHED
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Freeing task resources for Source: Custom File Source (1/1)
> 2016-10-29 15:00:48,089 INFO  org.apache.flink.yarn.YarnTaskManager   
>   - Un-registering task and sending final execution state 
> FINISHED to JobManager for task Source: Custom File Source 
> (8fe0f817fa6d960ea33f6e57e0c3891c)
> 2016-10-29 15:00:48,101 WARN  
> org.apache.flink.streaming.api.operators.AbstractStreamOperator  - Error 
> while emitting latency marker
> java.lang.RuntimeException: Buffer pool is destroyed.
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:99)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.emitLatencyMarker(AbstractStreamOperator.java:734)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource$LatencyMarksEmitter$1.run(StreamSource.java:134)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalStateException: Buffer pool is destroyed.
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:144)
>   at 
> org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:118)
>   at 
> org.apache.flink.runtime.io.network.api.writer.RecordWriter.randomEmit(RecordWriter.java:103)
>   at 
> org.apache.flink.streaming.runtime.io.StreamRecordWriter.randomEmit(StreamRecordWriter.java:104)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.emitLatencyMarker(RecordWriterOutput.java:96)
>   ... 9 more
> {code}
> This 

[GitHub] flink pull request #3008: [FLINK-4973] Let LatencyMarksEmitter use StreamTas...

2016-12-14 Thread tillrohrmann
GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/3008

[FLINK-4973] Let LatencyMarksEmitter use StreamTask's ProcessingTimeService

The LatencyMarksEmitter class now uses the StreamTask's ProcessingTimeService to 
schedule latency mark emission. For that, the ProcessingTimeService was extended 
with the method scheduleAtFixedRate for scheduling repeated tasks; latency mark 
emission is such a repeated task.

cc @rmetzger.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixLatencyMarksEmitter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3008


commit c06b3457f2b98969f444da07569656ac07727166
Author: Till Rohrmann 
Date:   2016-12-14T13:53:11Z

[FLINK-4973] Let LatencyMarksEmitter use StreamTask's ProcessingTimeService

The LatencyMarksEmitter class now uses the StreamTask's ProcessingTimeService to 
schedule latency mark emission. For that, the ProcessingTimeService was extended 
with the method scheduleAtFixedRate for scheduling repeated tasks; latency mark 
emission is such a repeated task.






[jira] [Commented] (FLINK-5240) Properly Close StateBackend in StreamTask when closing/canceling

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748669#comment-15748669
 ] 

ASF GitHub Bot commented on FLINK-5240:
---

Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2997
  
Thanks for the review, will merge then.


> Properly Close StateBackend in StreamTask when closing/canceling
> 
>
> Key: FLINK-5240
> URL: https://issues.apache.org/jira/browse/FLINK-5240
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.2.0
>
>
> Right now, the {{StreamTask}} never calls {{close()}} on the state backend.





[GitHub] flink issue #2997: [FLINK-5240][tests] ensure state backends are properly cl...

2016-12-14 Thread mxm
Github user mxm commented on the issue:

https://github.com/apache/flink/pull/2997
  
Thanks for the review, will merge then.




[jira] [Commented] (FLINK-5240) Properly Close StateBackend in StreamTask when closing/canceling

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748632#comment-15748632
 ] 

ASF GitHub Bot commented on FLINK-5240:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2997
  
+1, please go ahead and merge.

And thanks for fixing.


> Properly Close StateBackend in StreamTask when closing/canceling
> 
>
> Key: FLINK-5240
> URL: https://issues.apache.org/jira/browse/FLINK-5240
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.2.0
>
>
> Right now, the {{StreamTask}} never calls {{close()}} on the state backend.





[GitHub] flink issue #2997: [FLINK-5240][tests] ensure state backends are properly cl...

2016-12-14 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2997
  
+1, please go ahead and merge.

And thanks for fixing. 👍 😃




[jira] [Commented] (FLINK-4574) Strengthen fetch interval implementation in Kinesis consumer

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748589#comment-15748589
 ] 

ASF GitHub Bot commented on FLINK-4574:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2925
  
I'm definitely planning to look at this over the next few days :) I'm just 
quite overwhelmed right now.

Thanks for all your recent work on the Kinesis connector @tony810430, and 
very sorry for the late reviews. Please bear with me for a little while, I'll 
get back to the PRs soon ;)


> Strengthen fetch interval implementation in Kinesis consumer
> 
>
> Key: FLINK-4574
> URL: https://issues.apache.org/jira/browse/FLINK-4574
> Project: Flink
>  Issue Type: Improvement
>  Components: Kinesis Connector
>Affects Versions: 1.1.0
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Wei-Che Wei
> Fix For: 1.2.0
>
>
> As pointed out by [~rmetzger], right now the fetch interval implementation in 
> the {{ShardConsumer}} class of the Kinesis consumer can lead to much longer 
> interval times than specified by the user: say the specified fetch 
> interval is {{f}}, it takes {{x}} to complete a {{getRecords()}} call, and 
> {{y}} to complete processing the fetched records for emitting; then the 
> actual interval between each fetch is {{f+x+y}}.
> The main problem with this is that we can never guarantee how much time has 
> passed since the last {{getRecords}} call, and thus cannot guarantee that returned 
> shard iterators will not have expired the next time we use them, even if we 
> limit the user-given value for {{f}} to not be longer than the iterator 
> expiry time.
> I propose to improve this by, per {{ShardConsumer}}, using a 
> {{ScheduledExecutorService}} / {{Timer}} to do the fixed-interval fetching, 
> and a separate blocking queue that collects the fetched records for emitting.

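A minimal sketch of the proposed shape, with placeholder stand-ins for the {{getRecords()}} call and record emission:

{code}
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FixedIntervalFetchSketch {
    private final ScheduledExecutorService fetchTimer = Executors.newSingleThreadScheduledExecutor();
    private final BlockingQueue<String> fetchedRecords = new ArrayBlockingQueue<>(10000);

    // The fetch fires every f milliseconds regardless of how long fetching (x)
    // or emitting (y) takes, because emission runs on a separate thread.
    public void run(long fetchIntervalMillis) throws InterruptedException {
        fetchTimer.scheduleAtFixedRate(() -> {
            for (String record : getRecords()) {
                fetchedRecords.offer(record); // overflow policy elided in this sketch
            }
        }, 0L, fetchIntervalMillis, TimeUnit.MILLISECONDS);

        while (!Thread.currentThread().isInterrupted()) {
            emit(fetchedRecords.take());
        }
    }

    private List<String> getRecords() { return Collections.singletonList("record"); } // placeholder
    private void emit(String record) { System.out.println(record); } // placeholder
}
{code}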




[jira] [Commented] (FLINK-3551) Sync Scala and Java Streaming Examples

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748587#comment-15748587
 ] 

ASF GitHub Bot commented on FLINK-3551:
---

Github user ch33hau commented on the issue:

https://github.com/apache/flink/pull/2761
  
Hi @thvasilo, thanks for the help, it's ok, I know that everyone is quite 
busy =)


> Sync Scala and Java Streaming Examples
> --
>
> Key: FLINK-3551
> URL: https://issues.apache.org/jira/browse/FLINK-3551
> Project: Flink
>  Issue Type: Sub-task
>  Components: Examples
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: Lim Chee Hau
> Fix For: 1.2.0
>
>
> The Scala examples lag behind the Java examples





[GitHub] flink issue #2925: [FLINK-4574] [kinesis] Strengthen fetch interval implemen...

2016-12-14 Thread tzulitai
Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2925
  
I'm definitely planning to look at this over the next few days :) I'm just 
quite overwhelmed right now.

Thanks for all your recent work on the Kinesis connector @tony810430, and 
very sorry for the late reviews. Please bear with me for a little while, I'll 
get back to the PRs soon ;)




[GitHub] flink issue #2761: [FLINK-3551] [examples] Sync Scala Streaming Examples

2016-12-14 Thread ch33hau
Github user ch33hau commented on the issue:

https://github.com/apache/flink/pull/2761
  
Hi @thvasilo, thanks for the help, it's ok, I know that everyone is quite 
busy =)




[jira] [Commented] (FLINK-3551) Sync Scala and Java Streaming Examples

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748576#comment-15748576
 ] 

ASF GitHub Bot commented on FLINK-3551:
---

Github user thvasilo commented on the issue:

https://github.com/apache/flink/pull/2761
  
Hello @ch33hau, sorry for the late reply, I've been at a conference the 
past week. With the latest changes this LGTM, I've edited the fix version in 
JIRA to 1.2.0 to give this more visibility for the upcoming release, since it's 
very useful to have more examples.

Hopefully some committer can take a look soon, I'll ping @fhueske here, 
maybe he can shepherd the PR or assign somebody.


> Sync Scala and Java Streaming Examples
> --
>
> Key: FLINK-3551
> URL: https://issues.apache.org/jira/browse/FLINK-3551
> Project: Flink
>  Issue Type: Sub-task
>  Components: Examples
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: Lim Chee Hau
> Fix For: 1.2.0
>
>
> The Scala examples lag behind the Java examples





[GitHub] flink issue #2761: [FLINK-3551] [examples] Sync Scala Streaming Examples

2016-12-14 Thread thvasilo
Github user thvasilo commented on the issue:

https://github.com/apache/flink/pull/2761
  
Hello @ch33hau, sorry for the late reply, I've been at a conference the 
past week. With the latest changes this LGTM, I've edited the fix version in 
JIRA to 1.2.0 to give this more visibility for the upcoming release, since it's 
very useful to have more examples.

Hopefully some committer can take a look soon, I'll ping @fhueske here, 
maybe he can shepherd the PR or assign somebody.




[jira] [Resolved] (FLINK-4429) Move Redis Sink from Flink to Bahir

2016-12-14 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-4429.
---
   Resolution: Fixed
Fix Version/s: 1.2.0

Redis is now in Bahir, so I deleted it from Flink in 
http://git-wip-us.apache.org/repos/asf/flink/commit/8038ae4c 

> Move Redis Sink from Flink to Bahir
> ---
>
> Key: FLINK-4429
> URL: https://issues.apache.org/jira/browse/FLINK-4429
> Project: Flink
>  Issue Type: Task
>  Components: Streaming Connectors
>Reporter: Robert Metzger
>Assignee: Robert Metzger
> Fix For: 1.2.0
>
>
> As per [1] the Flink community decided to move the Redis connector from Flink 
> to Bahir.
> [1] 
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Move-Redis-and-Flume-connectors-to-Apache-Bahir-and-redirect-contributions-there-td13102.html





[jira] [Updated] (FLINK-3551) Sync Scala and Java Streaming Examples

2016-12-14 Thread Theodore Vasiloudis (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Theodore Vasiloudis updated FLINK-3551:
---
Fix Version/s: (was: 1.0.1)
   1.2.0

> Sync Scala and Java Streaming Examples
> --
>
> Key: FLINK-3551
> URL: https://issues.apache.org/jira/browse/FLINK-3551
> Project: Flink
>  Issue Type: Sub-task
>  Components: Examples
>Affects Versions: 1.0.0
>Reporter: Stephan Ewen
>Assignee: Lim Chee Hau
> Fix For: 1.2.0
>
>
> The Scala examples lag behind the Java examples





[jira] [Commented] (FLINK-4704) Move Table API to org.apache.flink.table

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748549#comment-15748549
 ] 

ASF GitHub Bot commented on FLINK-4704:
---

Github user ex00 commented on the issue:

https://github.com/apache/flink/pull/2958
  
PR has been updated.


> Move Table API to org.apache.flink.table
> 
>
> Key: FLINK-4704
> URL: https://issues.apache.org/jira/browse/FLINK-4704
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Anton Mushin
>Priority: Blocker
> Fix For: 1.2.0
>
>
> This would be a large change. But maybe now is still a good time to do it. 
> Otherwise we will never fix this.
> Actually, the Table API is in the wrong package. At the moment it is in 
> {{org.apache.flink.api.table}} and the actual Scala/Java APIs are in 
> {{org.apache.flink.api.java/scala.table}}. All other APIs such as Python, 
> Gelly, Flink ML do not use the {{org.apache.flink.api}} namespace.
> I suggest the following packages:
> {code}
> org.apache.flink.table
> org.apache.flink.table.api.java
> org.apache.flink.table.api.scala
> {code}
> What do you think?





[GitHub] flink issue #2958: [FLINK-4704] Move Table API to org.apache.flink.table

2016-12-14 Thread ex00
Github user ex00 commented on the issue:

https://github.com/apache/flink/pull/2958
  
PR has been updated.




[jira] [Commented] (FLINK-5323) CheckpointNotifier should be removed from docs

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748525#comment-15748525
 ] 

ASF GitHub Bot commented on FLINK-5323:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/3006
  
+1 to merge


> CheckpointNotifier should be removed from docs
> --
>
> Key: FLINK-5323
> URL: https://issues.apache.org/jira/browse/FLINK-5323
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.1.3
>Reporter: Abhishek Singh
>Priority: Trivial
>
> I was following the official documentation: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/streaming/state.html
> Looks like this is the right one to be using: import 
> org.apache.flink.runtime.state.CheckpointListener;
> -Abhishek-
> On Dec 9, 2016, at 4:30 PM, Abhishek R. Singh 
>  wrote:
> I can’t seem to find CheckpointNotifier. Appreciate help !
> CheckpointNotifier is not a member of package 
> org.apache.flink.streaming.api.checkpoint
> From my pom.xml:
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-scala_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-streaming-scala_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-clients_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>
> <dependency>
> <groupId>org.apache.flink</groupId>
> <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
> <version>1.1.3</version>
> </dependency>





[GitHub] flink issue #3006: [FLINK-5323] CheckpointNotifier should be removed from do...

2016-12-14 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/3006
  
+1 to merge




[jira] [Commented] (FLINK-5319) ClassCastException when reusing an inherited method reference as KeySelector for different classes

2016-12-14 Thread Alexander Chermenin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748508#comment-15748508
 ] 

Alexander Chermenin commented on FLINK-5319:


Well, we can change the signature of {{map}} methods (and others) from 
{code}public <R> MapOperator<T, R> map(MapFunction<T, R> mapper){code} to 
{code}public <R> MapOperator<T, R> map(MapFunction<? super T, R> mapper){code} 
to make code such as the following possible:
{code}DataSet<Integer> intDataSet = env.fromElements(1, 2, 3);
DataSet<Long> longDataSet = env.fromElements(1L, 2L, 3L);

MapFunction<Number, Double> function = Number::doubleValue;

List<Double> intToDoubles = intDataSet.map(function).collect();
List<Double> longToDoubles = longDataSet.map(function).collect();{code}
What do you think about it?

> ClassCastException when reusing an inherited method reference as KeySelector 
> for different classes
> --
>
> Key: FLINK-5319
> URL: https://issues.apache.org/jira/browse/FLINK-5319
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.0
>Reporter: Alexander Chermenin
>Assignee: Timo Walther
>
> Code sample:
> {code}static abstract class A {
> int id;
> A(int id) {this.id = id; }
> int getId() { return id; }
> }
> static class B extends A { B(int id) { super(id % 3); } }
> static class C extends A { C(int id) { super(id % 2); } }
> private static B b(int id) { return new B(id); }
> private static C c(int id) { return new C(id); }
> /**
>  * Main method.
>  */
> public static void main(String[] args) throws Exception {
> StreamExecutionEnvironment environment =
> StreamExecutionEnvironment.getExecutionEnvironment();
> B[] bs = IntStream.range(0, 10).mapToObj(Test::b).toArray(B[]::new);
> C[] cs = IntStream.range(0, 10).mapToObj(Test::c).toArray(C[]::new);
> DataStreamSource<B> bStream = environment.fromElements(bs);
> DataStreamSource<C> cStream = environment.fromElements(cs);
> bStream.keyBy((KeySelector) A::getId).print();
> cStream.keyBy((KeySelector) A::getId).print();
> environment.execute();
> }
> {code}
> This code throws the following exception:
> {code}Exception in thread "main" 
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:901)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:844)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:844)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.RuntimeException: Could not extract key from 
> org.sample.flink.examples.Test$C@5e1a8111
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:75)
>   at 
> org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:39)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:746)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:724)
>   at 
> org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:84)
>   at 
> org.apache.flink.streaming.api.functions.source.FromElementsFunction.run(FromElementsFunction.java:127)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:75)
>   at 
> org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
>   at 
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:56)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:269)
>   at 

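A possible workaround, sketched here as an assumption rather than anything from
the ticket: give each stream its own explicitly typed KeySelector instead of
casting a shared method reference, so that type extraction sees the concrete
input type:

{code}
bStream.keyBy(new KeySelector<B, Integer>() {
    @Override
    public Integer getKey(B value) {
        return value.getId(); // inherited from A
    }
}).print();

cStream.keyBy(new KeySelector<C, Integer>() {
    @Override
    public Integer getKey(C value) {
        return value.getId();
    }
}).print();
{code}
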
[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748493#comment-15748493
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92399079
  
--- Diff: docs/quickstart/java_api_quickstart.md ---
@@ -46,39 +46,79 @@ Use one of the following commands to __create a 
project__:
 {% highlight bash %}
 $ mvn archetype:generate   \
   -DarchetypeGroupId=org.apache.flink  \
-  -DarchetypeArtifactId=flink-quickstart-java  \
+  -DarchetypeArtifactId=flink-quickstart-java  \{% unless 
site.is_stable %}
+  
-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
 \{% endunless %}
   -DarchetypeVersion={{site.version}}
 {% endhighlight %}
 This allows you to name your newly created 
project. It will interactively ask you for the groupId, artifactId, 
and package name.
 
 
 {% highlight bash %}
+{% if site.is_stable %}
 $ curl https://flink.apache.org/q/quickstart.sh | bash
+{% else %}
+$ curl https://flink.apache.org/q/quickstart-SNAPSHOT.sh | bash
+{% endif %}
 {% endhighlight %}
 
 
 
 ## Inspect Project
 
-There will be a new directory in your working directory. If you've used 
the _curl_ approach, the directory is called `quickstart`. Otherwise, it has 
the name of your artifactId.
+There will be a new directory in your working directory. If you've used
+the _curl_ approach, the directory is called `quickstart`. Otherwise,
+it has the name of your `artifactId`:
+
+{% highlight bash %}
+$ tree quickstart/
+quickstart/
+├── pom.xml
+└── src
+    └── main
+        ├── java
+        │   └── org
+        │       └── myorg
+        │           └── quickstart
+        │               ├── BatchJob.java
+        │               ├── SocketTextStreamWordCount.java
+        │               ├── StreamingJob.java
+        │               └── WordCount.java
+        └── resources
+            └── log4j.properties
+{% endhighlight %}
 
 The sample project is a __Maven project__, which contains four classes. 
_StreamingJob_ and _BatchJob_ are basic skeleton programs, 
_SocketTextStreamWordCount_ is a working streaming example and _WordCountJob_ 
is a working batch example. Please note that the _main_ method of all classes 
allow you to start Flink in a development/testing mode.
 
-We recommend you __import this project into your IDE__ to develop and test 
it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/) allows to 
[import Maven 
projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
 Some Eclipse bundles include that plugin by default, others require you to 
install it manually. The IntelliJ IDE also supports Maven projects out of the 
box.
+We recommend you __import this project into your IDE__ to develop and
+test it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/)
+allows to [import Maven 
projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
+Some Eclipse bundles include that plugin by default, others require you
+to install it manually. The IntelliJ IDE supports Maven projects out of
+the box.
 
 
-A note to Mac OS X users: The default JVM heapsize for Java is too small 
for Flink. You have to manually increase it. Choose "Run Configurations" -> 
Arguments and write into the "VM Arguments" box: "-Xmx800m" in Eclipse.
+*A note to Mac OS X users*: The default JVM heapsize for Java is too
+small for Flink. You have to manually increase it. In Eclipse, choose
--- End diff --

Is that only the case for Eclipse users? 
If yes, it should be mentioned. If not, we should give instructions to 
increase the heapsize for IntelliJ as well.
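
(For reference, and as an assumption rather than anything verified in this PR:
the IntelliJ equivalent would be `Run -> Edit Configurations... -> VM options`,
with the same flag.)

{code}
-Xmx800m
{code}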


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748494#comment-15748494
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92400684
  
--- Diff: docs/quickstart/scala_api_quickstart.md ---
@@ -136,49 +136,82 @@ Use one of the following commands to __create a 
project__:
 {% highlight bash %}
 $ mvn archetype:generate   \
   -DarchetypeGroupId=org.apache.flink  \
-  -DarchetypeArtifactId=flink-quickstart-scala \
+  -DarchetypeArtifactId=flink-quickstart-scala \{% unless 
site.is_stable %}
+  
-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
 \{% endunless %}
   -DarchetypeVersion={{site.version}}
 {% endhighlight %}
 This allows you to name your newly created project. 
It will interactively ask you for the groupId, artifactId, and package name.
 
 
 {% highlight bash %}
-$ curl https://flink.apache.org/q/quickstart-scala.sh | bash
+{% if site.is_stable %}
+$ curl https://flink.apache.org/q/quickstart-scala.sh | bash
+{% else %}
+$ curl https://flink.apache.org/q/quickstart-scala-SNAPSHOT.sh | bash
+{% endif %}
 {% endhighlight %}
 
 
 
 
 ### Inspect Project
 
-There will be a new directory in your working directory. If you've used 
the _curl_ approach, the directory is called `quickstart`. Otherwise, it has 
the name of your artifactId.
+There will be a new directory in your working directory. If you've used
+the _curl_ approach, the directory is called `quickstart`. Otherwise,
--- End diff --

"approach" -> "shortcut"?


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748488#comment-15748488
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92400994
  
--- Diff: docs/quickstart/scala_api_quickstart.md ---
@@ -194,7 +227,9 @@ data 1
 is 1
 ~~~
 
-The following code shows the WordCount implementation from the Quickstart 
which processes some text lines with two operators (FlatMap and Reduce), and 
prints the resulting words and counts to std-out.
+The following code shows the `WordCount` implementation from the
+Quickstart which processes some text lines with two operators (FlatMap
+and Reduce), and prints the resulting words and counts to std-out.
--- End diff --

sum / reduce: See comment above


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748495#comment-15748495
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92400095
  
--- Diff: docs/quickstart/java_api_quickstart.md ---
@@ -94,7 +134,9 @@ data 1
 is 1
 ~~~
 
-The following code shows the WordCount implementation from the Quickstart 
which processes some text lines with two operators (FlatMap and Reduce), and 
prints the resulting words and counts to std-out.
+The following code shows the `WordCount` implementation from the
+Quickstart which processes some text lines with two operators (FlatMap
+and Reduce), and prints the resulting words and counts to std-out.
--- End diff --

The code does not show a `Reduce` but only `sum`, which is executed as a
`Reduce`. It would be good to clarify this; otherwise it might be confusing for
readers.
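
(To make the sum-vs-Reduce point concrete, a minimal quickstart-style WordCount
sketch; `Tokenizer` is an assumed name, and the comments mark where `sum` is
executed as a Reduce.)

{code}
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class WordCountSketch {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> text = env.fromElements("to be or not to be");

        text.flatMap(new Tokenizer()) // FlatMap operator: emits (word, 1) pairs
            .groupBy(0)               // group by the word field
            .sum(1)                   // "sum" aggregation, executed as a Reduce
            .print();
    }

    public static final class Tokenizer
            implements FlatMapFunction<String, Tuple2<String, Integer>> {
        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        }
    }
}
{code}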


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748491#comment-15748491
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92398312
  
--- Diff: docs/quickstart/java_api_quickstart.md ---
@@ -46,39 +46,79 @@ Use one of the following commands to __create a 
project__:
 {% highlight bash %}
 $ mvn archetype:generate   \
   -DarchetypeGroupId=org.apache.flink  \
-  -DarchetypeArtifactId=flink-quickstart-java  \
+  -DarchetypeArtifactId=flink-quickstart-java  \{% unless 
site.is_stable %}
+  
-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
 \{% endunless %}
   -DarchetypeVersion={{site.version}}
 {% endhighlight %}
 This allows you to name your newly created 
project. It will interactively ask you for the groupId, artifactId, 
and package name.
 
 
 {% highlight bash %}
+{% if site.is_stable %}
 $ curl https://flink.apache.org/q/quickstart.sh | bash
+{% else %}
+$ curl https://flink.apache.org/q/quickstart-SNAPSHOT.sh | bash
+{% endif %}
 {% endhighlight %}
 
 
 
 ## Inspect Project
 
-There will be a new directory in your working directory. If you've used 
the _curl_ approach, the directory is called `quickstart`. Otherwise, it has 
the name of your artifactId.
+There will be a new directory in your working directory. If you've used
+the _curl_ approach, the directory is called `quickstart`. Otherwise,
--- End diff --

"the _curl_ approach" -> "the _curl_ shortcut"?


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748486#comment-15748486
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92399626
  
--- Diff: docs/quickstart/java_api_quickstart.md ---
@@ -46,39 +46,79 @@ Use one of the following commands to __create a 
project__:
 {% highlight bash %}
 $ mvn archetype:generate   \
   -DarchetypeGroupId=org.apache.flink  \
-  -DarchetypeArtifactId=flink-quickstart-java  \
+  -DarchetypeArtifactId=flink-quickstart-java  \{% unless 
site.is_stable %}
+  
-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
 \{% endunless %}
   -DarchetypeVersion={{site.version}}
 {% endhighlight %}
 This allows you to name your newly created 
project. It will interactively ask you for the groupId, artifactId, 
and package name.
 
 
 {% highlight bash %}
+{% if site.is_stable %}
 $ curl https://flink.apache.org/q/quickstart.sh | bash
+{% else %}
+$ curl https://flink.apache.org/q/quickstart-SNAPSHOT.sh | bash
+{% endif %}
 {% endhighlight %}
 
 
 
 ## Inspect Project
 
-There will be a new directory in your working directory. If you've used 
the _curl_ approach, the directory is called `quickstart`. Otherwise, it has 
the name of your artifactId.
+There will be a new directory in your working directory. If you've used
+the _curl_ approach, the directory is called `quickstart`. Otherwise,
+it has the name of your `artifactId`:
+
+{% highlight bash %}
+$ tree quickstart/
+quickstart/
+├── pom.xml
+└── src
+    └── main
+        ├── java
+        │   └── org
+        │       └── myorg
+        │           └── quickstart
+        │               ├── BatchJob.java
+        │               ├── SocketTextStreamWordCount.java
+        │               ├── StreamingJob.java
+        │               └── WordCount.java
+        └── resources
+            └── log4j.properties
+{% endhighlight %}
 
 The sample project is a __Maven project__, which contains four classes. 
_StreamingJob_ and _BatchJob_ are basic skeleton programs, 
_SocketTextStreamWordCount_ is a working streaming example and _WordCountJob_ 
is a working batch example. Please note that the _main_ method of all classes 
allow you to start Flink in a development/testing mode.
 
-We recommend you __import this project into your IDE__ to develop and test 
it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/) allows to 
[import Maven 
projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
 Some Eclipse bundles include that plugin by default, others require you to 
install it manually. The IntelliJ IDE also supports Maven projects out of the 
box.
+We recommend you __import this project into your IDE__ to develop and
+test it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/)
+allows to [import Maven 
projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
+Some Eclipse bundles include that plugin by default, others require you
+to install it manually. The IntelliJ IDE supports Maven projects out of
+the box.
 
 
-A note to Mac OS X users: The default JVM heapsize for Java is too small 
for Flink. You have to manually increase it. Choose "Run Configurations" -> 
Arguments and write into the "VM Arguments" box: "-Xmx800m" in Eclipse.
+*A note to Mac OS X users*: The default JVM heapsize for Java is too
+small for Flink. You have to manually increase it. In Eclipse, choose
+`Run Configurations -> Arguments` and write into the `VM Arguments`
+box: `-Xmx800m`.
 
 ## Build Project
 
-If you want to __build your project__, go to your project directory and 
issue the `mvn clean install -Pbuild-jar` command. You will __find a jar__ that 
runs on every Flink cluster in __target/your-artifact-id-{{ site.version 
}}.jar__. There is also a fat-jar,  __target/your-artifact-id-{{ site.version 
}}-flink-fat-jar.jar__. This
+If you want to __build your project__, go to your project directory and
+issue the `mvn clean install -Pbuild-jar` command. You will
+__find a jar__ that runs on every Flink cluster in
--- End diff --

"every Flink cluster" -> "every Flink cluster with a compatible version"? 
Or rephrase the sentence.


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> 

[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748485#comment-15748485
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92392183
  
--- Diff: README.md ---
@@ -104,25 +104,11 @@ Check out our [Setting up 
IntelliJ](https://github.com/apache/flink/blob/master/
 
 ### Eclipse Scala IDE
 
-For Eclipse users, we recommend using Scala IDE 3.0.3, based on Eclipse 
Kepler. While this is a slightly older version,
-we found it to be the version that works most robustly for a complex 
project like Flink.
-
-Further details, and a guide to newer Scala IDE versions can be found in 
the
-[How to setup 
Eclipse](https://github.com/apache/flink/blob/master/docs/internals/ide_setup.md#eclipse)
 docs.
-
-**Note:** Before following this setup, make sure to run the build from the 
command line once
-(`mvn clean install -DskipTests`, see above)
-
-1. Download the Scala IDE (preferred) or install the plugin to Eclipse 
Kepler. See 
-   [How to setup 
Eclipse](https://github.com/apache/flink/blob/master/docs/internals/ide_setup.md#eclipse)
 for download links and instructions.
-2. Add the "macroparadise" compiler plugin to the Scala compiler.
-   Open "Window" -> "Preferences" -> "Scala" -> "Compiler" -> "Advanced" 
and put into the "Xplugin" field the path to
-   the *macroparadise* jar file (typically 
"/home/*-your-user-*/.m2/repository/org/scalamacros/paradise_2.10.4/2.0.1/paradise_2.10.4-2.0.1.jar").
-   Note: If you do not have the jar file, you probably did not run the 
command line build.
-3. Import the Flink Maven projects ("File" -> "Import" -> "Maven" -> 
"Existing Maven Projects") 
-4. During the import, Eclipse will ask to automatically install additional 
Maven build helper plugins.
-5. Close the "flink-java8" project. Since Eclipse Kepler does not support 
Java 8, you cannot develop this project.
+**NOTE:** From our experience, this setup does not work with Flink
--- End diff --

In the beginning of this paragraph it says "The Flink committers use
IntelliJ IDEA and Eclipse IDE to develop the Flink codebase." This should be
changed as well, since we can no longer help users get a working Eclipse setup.


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748501#comment-15748501
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92398867
  
--- Diff: docs/quickstart/java_api_quickstart.md ---
@@ -46,39 +46,79 @@ Use one of the following commands to __create a 
project__:
 {% highlight bash %}
 $ mvn archetype:generate   \
   -DarchetypeGroupId=org.apache.flink  \
-  -DarchetypeArtifactId=flink-quickstart-java  \
+  -DarchetypeArtifactId=flink-quickstart-java  \{% unless 
site.is_stable %}
+  
-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
 \{% endunless %}
   -DarchetypeVersion={{site.version}}
 {% endhighlight %}
 This allows you to name your newly created 
project. It will interactively ask you for the groupId, artifactId, 
and package name.
 
 
 {% highlight bash %}
+{% if site.is_stable %}
 $ curl https://flink.apache.org/q/quickstart.sh | bash
+{% else %}
+$ curl https://flink.apache.org/q/quickstart-SNAPSHOT.sh | bash
+{% endif %}
 {% endhighlight %}
 
 
 
 ## Inspect Project
 
-There will be a new directory in your working directory. If you've used 
the _curl_ approach, the directory is called `quickstart`. Otherwise, it has 
the name of your artifactId.
+There will be a new directory in your working directory. If you've used
+the _curl_ approach, the directory is called `quickstart`. Otherwise,
+it has the name of your `artifactId`:
+
+{% highlight bash %}
+$ tree quickstart/
+quickstart/
+├── pom.xml
+└── src
+    └── main
+        ├── java
+        │   └── org
+        │       └── myorg
+        │           └── quickstart
+        │               ├── BatchJob.java
+        │               ├── SocketTextStreamWordCount.java
+        │               ├── StreamingJob.java
+        │               └── WordCount.java
+        └── resources
+            └── log4j.properties
+{% endhighlight %}
 
 The sample project is a __Maven project__, which contains four classes. 
_StreamingJob_ and _BatchJob_ are basic skeleton programs, 
_SocketTextStreamWordCount_ is a working streaming example and _WordCountJob_ 
is a working batch example. Please note that the _main_ method of all classes 
allow you to start Flink in a development/testing mode.
 
-We recommend you __import this project into your IDE__ to develop and test 
it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/) allows to 
[import Maven 
projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
 Some Eclipse bundles include that plugin by default, others require you to 
install it manually. The IntelliJ IDE also supports Maven projects out of the 
box.
+We recommend you __import this project into your IDE__ to develop and
+test it. If you use Eclipse, the [m2e plugin](http://www.eclipse.org/m2e/)
+allows to [import Maven 
projects](http://books.sonatype.com/m2eclipse-book/reference/creating-sect-importing-projects.html#fig-creating-import).
+Some Eclipse bundles include that plugin by default, others require you
+to install it manually. The IntelliJ IDE supports Maven projects out of
+the box.
 
 
-A note to Mac OS X users: The default JVM heapsize for Java is too small 
for Flink. You have to manually increase it. Choose "Run Configurations" -> 
Arguments and write into the "VM Arguments" box: "-Xmx800m" in Eclipse.
+*A note to Mac OS X users*: The default JVM heapsize for Java is too
--- End diff --

Isn't the latest version called "macOS"?


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748489#comment-15748489
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92395568
  
--- Diff: docs/internals/ide_setup.md ---
@@ -25,104 +25,56 @@ under the License.
 * Replaced by the TOC
 {:toc}
 
-## Eclipse
-
-A brief guide how to set up Eclipse for development of the Flink core.
-Flink uses mixed Scala/Java projects, which pose a challenge to some IDEs.
-Below is the setup guide that works best from our personal experience.
-
-For Eclipse users, we currently recomment the Scala IDE 3.0.3, as the most 
robust solution.
-
-
-### Eclipse Scala IDE 3.0.3
-
-**NOTE:** While this version of the Scala IDE is not the newest, we have 
found it to be the most reliably working
-version for complex projects like Flink. One restriction is, though, that 
it works only with Java 7, not with Java 8.
-
-**Note:** Before following this setup, make sure to run the build from the 
command line once
-(`mvn clean package -DskipTests`)
-
-1. Download the Scala IDE (preferred) or install the plugin to Eclipse 
Kepler. See section below for download links
-   and instructions.
-2. Add the "macroparadise" compiler plugin to the Scala compiler.
-   Open "Window" -> "Preferences" -> "Scala" -> "Compiler" -> "Advanced" 
and put into the "Xplugin" field the path to
-   the *macroparadise* jar file (typically 
"/home/*-your-user-*/.m2/repository/org/scalamacros/paradise_2.10.4/2.0.1/paradise_2.10.4-2.0.1.jar").
-   Note: If you do not have the jar file, you probably did not ran the 
command line build.
-3. Import the Flink Maven projects ("File" -> "Import" -> "Maven" -> 
"Existing Maven Projects")
-4. During the import, Eclipse will ask to automatically install additional 
Maven build helper plugins.
-5. Close the "flink-java8" project. Since Eclipse Kepler does not support 
Java 8, you cannot develop this project.
-
-
-#### Download links for Scala IDE 3.0.3
-
-The Scala IDE 3.0.3 is a previous stable release, and download links are a 
bit hidden.
-
-The pre-packaged Scala IDE can be downloaded from the following links:
-
-* [Linux (64 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-linux.gtk.x86_64.tar.gz)
-* [Linux (32 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-linux.gtk.x86.tar.gz)
-* [MaxOS X Cocoa (64 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-macosx.cocoa.x86_64.zip)
-* [MaxOS X Cocoa (32 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-macosx.cocoa.x86.zip)
-* [Windows (64 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-win32.win32.x86_64.zip)
-* [Windows (32 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-win32.win32.x86.zip)
-
-Alternatively, you can download Eclipse Kepler from 
[https://eclipse.org/downloads/packages/release/Kepler/SR2](https://eclipse.org/downloads/packages/release/Kepler/SR2)
-and manually add the Scala and Maven plugins by plugin site at 
[http://scala-ide.org/download/prev-stable.html](http://scala-ide.org/download/prev-stable.html).
+The sections below describe how to import the Flink project into an IDE
+for the development of Flink itself. For writing Flink programs, please
+refer to the [Java API]({{ site.baseurl 
}}/quickstart/java_api_quickstart.html)
+and the [Scala API]({{ site.baseurl 
}}/quickstart/scala_api_quickstart.html)
+quickstart guides.
 
-* Either use the update site to install the plugin ("Help" -> "Install new 
Software")
-* Or download the [zip 
file](http://download.scala-ide.org/sdk/helium/e38/scala211/stable/update-site.zip),
 unpack it, and move the contents of the
-  "plugins" and "features" folders into the equally named folders of the 
Eclipse root directory
-
-**NOTE:** It might happen that some modules do not build in Eclipse 
correctly (even if the maven build succeeds).
-To fix this, right-click in the corresponding Eclipse project and choose 
"Properties" and than "Maven".
-Uncheck the box labeled "Resolve dependencies from Workspace projects", 
click "Apply" and then "OK". "
-
-
-### Eclipse Scala IDE 4.0.0
-
-**NOTE: From personal experience, the use of the Scala IDE 4.0.0 performs 
worse than previous versions for complex projects like Flink.**
-**Version 4.0.0 does not handle mixed Java/Scala projects as robustly 

[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748490#comment-15748490
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92405057
  
--- Diff: docs/quickstart/setup_quickstart.md ---
@@ -55,7 +69,10 @@ Check the __JobManager's web frontend__ at 
[http://localhost:8081](http://localh
 
 ## Run Example
 
-Now, we are going to run the [SocketWindowWordCount 
example](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java)
 and read text from a socket and count the number of distinct words.
+Now, we are going to run the
+[SocketWindowWordCount 
example](https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/socket/SocketWindowWordCount.java)
+and read text from a socket and count the number of distinct words per 
sliding
--- End diff --

Sliding window is changed to Tumbling window below
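
(For context, a sketch contrasting the two window variants; the data is made
up, and only the one- vs. two-argument `timeWindow` call is the point.)

{code}
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowVariantsSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Tuple2<String, Integer>> words =
                env.fromElements(Tuple2.of("flink", 1), Tuple2.of("flink", 1));

        // tumbling: non-overlapping 5-second processing-time windows
        words.keyBy(0).timeWindow(Time.seconds(5)).sum(1).print();

        // sliding: 5-second windows evaluated every second
        words.keyBy(0).timeWindow(Time.seconds(5), Time.seconds(1)).sum(1).print();

        env.execute("window variants");
    }
}
{code}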


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748496#comment-15748496
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92400530
  
--- Diff: docs/quickstart/run_example_quickstart.md ---
@@ -90,23 +92,23 @@ use it in our program. Edit the `dependencies` section 
so that it looks like this:
 
 
 <groupId>org.apache.flink</groupId>
-<artifactId>flink-streaming-java_2.10</artifactId>
+<artifactId>flink-streaming-java_2.11</artifactId>
--- End diff --

Why is this changed to Scala 2.11? Is this the default of the quickstarts?


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748498#comment-15748498
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92392722
  
--- Diff: docs/internals/ide_setup.md ---
@@ -25,104 +25,56 @@ under the License.
 * Replaced by the TOC
 {:toc}
 
-## Eclipse
-
-A brief guide how to set up Eclipse for development of the Flink core.
-Flink uses mixed Scala/Java projects, which pose a challenge to some IDEs.
-Below is the setup guide that works best from our personal experience.
-
-For Eclipse users, we currently recomment the Scala IDE 3.0.3, as the most 
robust solution.
-
-
-### Eclipse Scala IDE 3.0.3
-
-**NOTE:** While this version of the Scala IDE is not the newest, we have 
found it to be the most reliably working
-version for complex projects like Flink. One restriction is, though, that 
it works only with Java 7, not with Java 8.
-
-**Note:** Before following this setup, make sure to run the build from the 
command line once
-(`mvn clean package -DskipTests`)
-
-1. Download the Scala IDE (preferred) or install the plugin to Eclipse 
Kepler. See section below for download links
-   and instructions.
-2. Add the "macroparadise" compiler plugin to the Scala compiler.
-   Open "Window" -> "Preferences" -> "Scala" -> "Compiler" -> "Advanced" 
and put into the "Xplugin" field the path to
-   the *macroparadise* jar file (typically 
"/home/*-your-user-*/.m2/repository/org/scalamacros/paradise_2.10.4/2.0.1/paradise_2.10.4-2.0.1.jar").
-   Note: If you do not have the jar file, you probably did not ran the 
command line build.
-3. Import the Flink Maven projects ("File" -> "Import" -> "Maven" -> 
"Existing Maven Projects")
-4. During the import, Eclipse will ask to automatically install additional 
Maven build helper plugins.
-5. Close the "flink-java8" project. Since Eclipse Kepler does not support 
Java 8, you cannot develop this project.
-
-
-#### Download links for Scala IDE 3.0.3
-
-The Scala IDE 3.0.3 is a previous stable release, and download links are a 
bit hidden.
-
-The pre-packaged Scala IDE can be downloaded from the following links:
-
-* [Linux (64 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-linux.gtk.x86_64.tar.gz)
-* [Linux (32 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-linux.gtk.x86.tar.gz)
-* [MaxOS X Cocoa (64 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-macosx.cocoa.x86_64.zip)
-* [MaxOS X Cocoa (32 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-macosx.cocoa.x86.zip)
-* [Windows (64 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-win32.win32.x86_64.zip)
-* [Windows (32 
bit)](http://downloads.typesafe.com/scalaide-pack/3.0.3.vfinal-210-20140327/scala-SDK-3.0.3-2.10-win32.win32.x86.zip)
-
-Alternatively, you can download Eclipse Kepler from 
[https://eclipse.org/downloads/packages/release/Kepler/SR2](https://eclipse.org/downloads/packages/release/Kepler/SR2)
-and manually add the Scala and Maven plugins by plugin site at 
[http://scala-ide.org/download/prev-stable.html](http://scala-ide.org/download/prev-stable.html).
+The sections below describe how to import the Flink project into an IDE
+for the development of Flink itself. For writing Flink programs, please
+refer to the [Java API]({{ site.baseurl 
}}/quickstart/java_api_quickstart.html)
+and the [Scala API]({{ site.baseurl 
}}/quickstart/scala_api_quickstart.html)
+quickstart guides.
 
-* Either use the update site to install the plugin ("Help" -> "Install new 
Software")
-* Or download the [zip 
file](http://download.scala-ide.org/sdk/helium/e38/scala211/stable/update-site.zip),
 unpack it, and move the contents of the
-  "plugins" and "features" folders into the equally named folders of the 
Eclipse root directory
-
-**NOTE:** It might happen that some modules do not build in Eclipse 
correctly (even if the maven build succeeds).
-To fix this, right-click in the corresponding Eclipse project and choose 
"Properties" and than "Maven".
-Uncheck the box labeled "Resolve dependencies from Workspace projects", 
click "Apply" and then "OK". "
-
-
-### Eclipse Scala IDE 4.0.0
-
-**NOTE: From personal experience, the use of the Scala IDE 4.0.0 performs 
worse than previous versions for complex projects like Flink.**
-**Version 4.0.0 does not handle mixed Java/Scala projects as robustly 

[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748487#comment-15748487
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92400903
  
--- Diff: docs/quickstart/scala_api_quickstart.md ---
@@ -136,49 +136,82 @@ Use one of the following commands to __create a 
project__:
 {% highlight bash %}
 $ mvn archetype:generate   \
   -DarchetypeGroupId=org.apache.flink  \
-  -DarchetypeArtifactId=flink-quickstart-scala \
+  -DarchetypeArtifactId=flink-quickstart-scala \{% unless 
site.is_stable %}
+  
-DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/
 \{% endunless %}
   -DarchetypeVersion={{site.version}}
 {% endhighlight %}
 This allows you to name your newly created project. 
It will interactively ask you for the groupId, artifactId, and package name.
 
 
 {% highlight bash %}
-$ curl https://flink.apache.org/q/quickstart-scala.sh | bash
+{% if site.is_stable %}
+$ curl https://flink.apache.org/q/quickstart-scala.sh | bash
+{% else %}
+$ curl https://flink.apache.org/q/quickstart-scala-SNAPSHOT.sh | bash
+{% endif %}
 {% endhighlight %}
 
 
 
 
 ### Inspect Project
 
-There will be a new directory in your working directory. If you've used 
the _curl_ approach, the directory is called `quickstart`. Otherwise, it has 
the name of your artifactId.
+There will be a new directory in your working directory. If you've used
+the _curl_ approach, the directory is called `quickstart`. Otherwise,
+it has the name of your `artifactId`:
+
+{% highlight bash %}
+$ tree quickstart/
+quickstart/
+├── pom.xml
+└── src
+    └── main
+        ├── resources
+        │   └── log4j.properties
+        └── scala
+            └── org
+                └── myorg
+                    └── quickstart
+                        ├── BatchJob.scala
+                        ├── SocketTextStreamWordCount.scala
+                        ├── StreamingJob.scala
+                        └── WordCount.scala
+{% endhighlight %}
 
 The sample project is a __Maven project__, which contains four classes. 
_StreamingJob_ and _BatchJob_ are basic skeleton programs, 
_SocketTextStreamWordCount_ is a working streaming example and _WordCountJob_ 
is a working batch example. Please note that the _main_ method of all classes 
allow you to start Flink in a development/testing mode.
 
 We recommend you __import this project into your IDE__. For Eclipse, you 
need the following plugins, which you can install from the provided Eclipse 
Update Sites:
 
 * _Eclipse 4.x_
-  * [Scala IDE](http://download.scala-ide.org/sdk/e38/scala210/stable/site)
+  * [Scala 
IDE](http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site)
   * [m2eclipse-scala](http://alchim31.free.fr/m2e-scala/update-site)
-  * [Build Helper Maven 
Plugin](https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.15.0/N/0.15.0.201206251206/)
-* _Eclipse 3.7_
-  * [Scala IDE](http://download.scala-ide.org/sdk/e37/scala210/stable/site)
+  * [Build Helper Maven 
Plugin](https://repo1.maven.org/maven2/.m2e/connectors/m2eclipse-buildhelper/0.15.0/N/0.15.0.201207090124/)
+* _Eclipse 3.8_
+  * [Scala IDE for Scala 
2.11](http://download.scala-ide.org/sdk/helium/e38/scala211/stable/site) or 
[Scala IDE for Scala 
2.10](http://download.scala-ide.org/sdk/helium/e38/scala210/stable/site)
   * [m2eclipse-scala](http://alchim31.free.fr/m2e-scala/update-site)
   * [Build Helper Maven 
Plugin](https://repository.sonatype.org/content/repositories/forge-sites/m2e-extras/0.14.0/N/0.14.0.201109282148/)
 
-The IntelliJ IDE also supports Maven and offers a plugin for Scala 
development.
+The IntelliJ IDE supports Maven out of the box and offers a plugin for
+Scala development.
 
 
 ### Build Project
 
-If you want to __build your project__, go to your project directory and 
issue the `mvn clean package -Pbuild-jar` command. You will __find a jar__ that 
runs on every Flink cluster in __target/your-artifact-id-{{ site.version 
}}.jar__. There is also a fat-jar,  __target/your-artifact-id-{{ site.version 
}}-flink-fat-jar.jar__. This
+If you want to __build your project__, go to your project directory and
+issue the `mvn clean package -Pbuild-jar` command. You will
+__find a jar__ that runs on every Flink cluster in
--- End diff --

every Flink cluster (see above)


> Update quickstart documentation
> 

[jira] [Commented] (FLINK-5008) Update quickstart documentation

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748500#comment-15748500
 ] 

ASF GitHub Bot commented on FLINK-5008:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2764#discussion_r92404579
  
--- Diff: docs/quickstart/setup_quickstart.md ---
@@ -88,7 +111,9 @@ Now, we are going to run the [SocketWindowWordCount 
example](https://github.com/
 
   
 
-* Counts are printed to `stdout`. Monitor the JobManager's output file and 
write some text in `nc`:
+* Words are counted in time windows of 5 seconds (processing time, tumbling
+  windows) and are printed to `stdout`. Monitor the JobManager's output 
file
+  and write some text in `nc` (input is sent per line after hitting
+  `<RETURN>`):
--- End diff --

"input is sent per line" -> "input is sent to Flink line by line"


> Update quickstart documentation
> ---
>
> Key: FLINK-5008
> URL: https://issues.apache.org/jira/browse/FLINK-5008
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Minor
>
> * The IDE setup documentation of Flink is outdated in both parts: IntelliJ 
> IDEA was based on an old version and Eclipse/Scala IDE does not work at all 
> anymore.
> * The example in the "Quickstart: Setup" is outdated and requires "." to be 
> in the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

