[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543734#comment-14543734
 ] 

Alexander Alexandrov commented on FLINK-1731:
-

I would go with a {{DataSet}} for the centroids as well. That said, we can 
reduce syntax at the client side by providing either

- an implicit converter that {{Seq\[A\] = DataSet\[A\]}} (needs to be part of 
the Flink Scala API, could be already there), or
- an overloaded {{setCentroids(Seq\[A\])}} setter.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1743) Add multinomial logistic regression to machine learning library

2015-05-14 Thread Alexander Alexandrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Alexandrov updated FLINK-1743:

Assignee: (was: Alexander Alexandrov)

 Add multinomial logistic regression to machine learning library
 ---

 Key: FLINK-1743
 URL: https://issues.apache.org/jira/browse/FLINK-1743
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
  Labels: ML

 Multinomial logistic regression [1] would be good first classification 
 algorithm which can classify multiple classes. 
 Resources:
 [1] [http://en.wikipedia.org/wiki/Multinomial_logistic_regression]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Alexander Alexandrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Alexandrov updated FLINK-1731:

Assignee: (was: Alexander Alexandrov)

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543670#comment-14543670
 ] 

Peter Schrott commented on FLINK-1731:
--

Hi flink people,

as we now figured out how to pass in the initial centroids (via ParameterMap) 
there is still the open question, if we should use a Seqence or DataSet.
As I already mentioned before, I am not sure about the side effects regarding 
parallelism using the DataSet type.

- thanks for advices.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543643#comment-14543643
 ] 

Alexander Alexandrov commented on FLINK-1731:
-

[~peedeeX21] for some reason I cannot assign this to you directly. I cleared 
the assignee field so you can assign the issue to yourself. 

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Alexander Alexandrov (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543734#comment-14543734
 ] 

Alexander Alexandrov edited comment on FLINK-1731 at 5/14/15 2:35 PM:
--

I would go with a {{DataSet}} for the centroids as well. That said, we can 
reduce syntax at the client side by providing either

- an overloaded {{setCentroids(Seq\[A\])}} setter, or
- an implicit converter of type {{Seq\[A\] = DataSet\[A\]}} (needs to be part 
of the Flink Scala API, could be already there) which allows to pass a 
{{Seq\[A\]}} argument to a {{setCentroids(DataSet\[A\])}} setter.


was (Author: aalexandrov):
I would go with a {{DataSet}} for the centroids as well. That said, we can 
reduce syntax at the client side by providing either

- an implicit converter that {{Seq\[A\] = DataSet\[A\]}} (needs to be part of 
the Flink Scala API, could be already there), or
- an overloaded {{setCentroids(Seq\[A\])}} setter.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2013) Create generalized linear model framework

2015-05-14 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2013:
--

 Summary: Create generalized linear model framework
 Key: FLINK-2013
 URL: https://issues.apache.org/jira/browse/FLINK-2013
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Theodore Vasiloudis
Assignee: Theodore Vasiloudis


[Generalized linear 
models|http://en.wikipedia.org/wiki/Generalized_linear_model] (GLMs) provide an 
abstraction for many learning models that can be used for regression and 
classification tasks.

Some example GLMs are linear regression, logistic regression, LASSO and ridge 
regression.

The goal for this JIRA is to provide interfaces for the set of common 
properties and functions these models share. 
The goal would be to have a design pattern similar to the one that 
[sklearn|http://scikit-learn.org/stable/modules/linear_model.html] and 
[MLlib|http://spark.apache.org/docs/1.3.0/mllib-linear-methods.html] uses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2015) Add ridge regression

2015-05-14 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2015:
--

 Summary: Add ridge regression
 Key: FLINK-2015
 URL: https://issues.apache.org/jira/browse/FLINK-2015
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Theodore Vasiloudis
Priority: Minor


Ridge regression is a linear regression model that imposes penalties on the 
size of the coefficients.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Theodore Vasiloudis (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543711#comment-14543711
 ] 

Theodore Vasiloudis commented on FLINK-1731:


Since the centroids will have to be broadcast to all task managers, that means 
that they will have to be placed inside a DataSet eventually.

One approach is to use a Sequence which you then convert into a DataSet inside 
the algorithm, or require that the user provides a DataSet as a parameter.

In GradientDescent we are using the second option, i.e. we expect a DataSet of 
weights, you can do the same with centroids.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543647#comment-14543647
 ] 

Robert Metzger commented on FLINK-1731:
---

[~aalexandrov]: only users with Contributor permissions can be assigned to 
issues.
I made [~peedeeX21] a contributor and assigned him.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Peter Schrott (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schrott reassigned FLINK-1731:


Assignee: Peter Schrott  (was: Alexander Alexandrov)

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Peter Schrott (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543648#comment-14543648
 ] 

Peter Schrott commented on FLINK-1731:
--

Great! Thanks!

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Alexander Alexandrov
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2013) Create generalized linear model framework

2015-05-14 Thread Theodore Vasiloudis (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Theodore Vasiloudis updated FLINK-2013:
---
Description: 
[Generalized linear 
models|http://en.wikipedia.org/wiki/Generalized_linear_model] (GLMs) provide an 
abstraction for many learning models that can be used for regression and 
classification tasks.

Some example GLMs are linear regression, logistic regression, LASSO and ridge 
regression.

The goal for this JIRA is to provide interfaces for the set of common 
properties and functions these models share. 
In order to achieve that, a design pattern similar to the one that 
[sklearn|http://scikit-learn.org/stable/modules/linear_model.html] and 
[MLlib|http://spark.apache.org/docs/1.3.0/mllib-linear-methods.html] employ 
will be used.

  was:
[Generalized linear 
models|http://en.wikipedia.org/wiki/Generalized_linear_model] (GLMs) provide an 
abstraction for many learning models that can be used for regression and 
classification tasks.

Some example GLMs are linear regression, logistic regression, LASSO and ridge 
regression.

The goal for this JIRA is to provide interfaces for the set of common 
properties and functions these models share. 
The goal would be to have a design pattern similar to the one that 
[sklearn|http://scikit-learn.org/stable/modules/linear_model.html] and 
[MLlib|http://spark.apache.org/docs/1.3.0/mllib-linear-methods.html] uses.


 Create generalized linear model framework
 -

 Key: FLINK-2013
 URL: https://issues.apache.org/jira/browse/FLINK-2013
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Theodore Vasiloudis
Assignee: Theodore Vasiloudis
  Labels: ML

 [Generalized linear 
 models|http://en.wikipedia.org/wiki/Generalized_linear_model] (GLMs) provide 
 an abstraction for many learning models that can be used for regression and 
 classification tasks.
 Some example GLMs are linear regression, logistic regression, LASSO and ridge 
 regression.
 The goal for this JIRA is to provide interfaces for the set of common 
 properties and functions these models share. 
 In order to achieve that, a design pattern similar to the one that 
 [sklearn|http://scikit-learn.org/stable/modules/linear_model.html] and 
 [MLlib|http://spark.apache.org/docs/1.3.0/mllib-linear-methods.html] employ 
 will be used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2004] Fix memory leak in presense of fa...

2015-05-14 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/674

[FLINK-2004] Fix memory leak in presense of failed checkpoints for Kafka 
Source



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink2004

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/674.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #674


commit 27f11822b7db2716f3484def8ad350eb7e0b0893
Author: Robert Metzger rmetz...@apache.org
Date:   2015-05-14T09:45:30Z

[FLINK-2004] Fix memory leak in presense of failed checkpoints in Kafka 
source

commit 36cb4758c200713a97858989ac73f117186ed9dc
Author: Robert Metzger rmetz...@apache.org
Date:   2015-05-14T13:57:18Z

unused imports




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2004) Memory leak in presence of failed checkpoints in KafkaSource

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543782#comment-14543782
 ] 

ASF GitHub Bot commented on FLINK-2004:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/674

[FLINK-2004] Fix memory leak in presense of failed checkpoints for Kafka 
Source



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink2004

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/674.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #674


commit 27f11822b7db2716f3484def8ad350eb7e0b0893
Author: Robert Metzger rmetz...@apache.org
Date:   2015-05-14T09:45:30Z

[FLINK-2004] Fix memory leak in presense of failed checkpoints in Kafka 
source

commit 36cb4758c200713a97858989ac73f117186ed9dc
Author: Robert Metzger rmetz...@apache.org
Date:   2015-05-14T13:57:18Z

unused imports




 Memory leak in presence of failed checkpoints in KafkaSource
 

 Key: FLINK-2004
 URL: https://issues.apache.org/jira/browse/FLINK-2004
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Robert Metzger
Priority: Critical
 Fix For: 0.9


 Checkpoints that fail never send a commit message to the tasks.
 Maintaining a map of all pending checkpoints introduces a memory leak, as 
 entries for failed checkpoints will never be removed.
 Approaches to fix this:
   - The source cleans up entries from older checkpoints once a checkpoint is 
 committed (simple implementation in a linked hash map)
   - The commit message could include the optional state handle (source needs 
 not maintain the map)
   - The checkpoint coordinator could send messages for failed checkpoints?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2008] Fix broker failure test case

2015-05-14 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/675

[FLINK-2008] Fix broker failure test case



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink stephan_kafka

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/675.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #675


commit ad449cfd308559734daa493b34d5d40305972c82
Author: Robert Metzger rmetz...@apache.org
Date:   2015-05-13T07:34:37Z

[FLINK-2008] Fix broker failure test case




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543858#comment-14543858
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330756
  
--- Diff: flink-optimizer/pom.xml ---
@@ -58,6 +58,12 @@ under the License.
artifactIdguava/artifactId
version${guava.version}/version
/dependency
+   dependency
+   groupIdorg.apache.hadoop/groupId
+   
artifactIdhadoop-mapreduce-client-jobclient/artifactId
+   version2.2.0/version
+   scopetest/scope
+   /dependency
--- End diff --

why is this dependency necessary in the optimizer?


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1990] [staging table] Support upper cas...

2015-05-14 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-102076699
  
Can you post a link to the failed run? Or is it on your local machine? Some 
of the streaming test cases fail sometimes right now, this is a known problem. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1990) Uppercase AS keyword not allowed in select expression

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543871#comment-14543871
 ] 

ASF GitHub Bot commented on FLINK-1990:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/667#issuecomment-102076699
  
Can you post a link to the failed run? Or is it on your local machine? Some 
of the streaming test cases fail sometimes right now, this is a known problem. 


 Uppercase AS keyword not allowed in select expression
 ---

 Key: FLINK-1990
 URL: https://issues.apache.org/jira/browse/FLINK-1990
 Project: Flink
  Issue Type: Bug
  Components: Table API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Aljoscha Krettek
Priority: Minor
 Fix For: 0.9


 Table API select expressions do not allow an uppercase AS keyword.
 The following expression fails with an {{ExpressionException}}:
  {{table.groupBy(request).select(request, request.count AS cnt)}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1525]Introduction of a small input para...

2015-05-14 Thread hsaputra
Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/flink/pull/664#discussion_r30342062
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/utils/ParameterToolTest.java 
---
@@ -0,0 +1,220 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.utils;
+
+import org.apache.flink.api.java.ClosureCleaner;
+import org.apache.flink.configuration.Configuration;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.util.Map;
+import java.util.Properties;
+
+public class ParameterToolTest {
+
+   @Rule
+   public TemporaryFolder tmp = new TemporaryFolder();
+
+
+   // - Parser tests -
+
+   @Test(expected = RuntimeException.class)
+   public void testIllegalArgs() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{berlin});
+   Assert.assertEquals(0, parameter.getNumberOfParameters());
+   }
+
+   @Test
+   public void testNoVal() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{-berlin});
+   Assert.assertEquals(1, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(berlin));
+   }
+
+   @Test
+   public void testNoValDouble() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--berlin});
+   Assert.assertEquals(1, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(berlin));
+   }
+
+   @Test
+   public void testMultipleNoVal() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, --b, --c, --d, --e, --f});
+   Assert.assertEquals(6, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   Assert.assertTrue(parameter.has(c));
+   Assert.assertTrue(parameter.has(d));
+   Assert.assertTrue(parameter.has(e));
+   Assert.assertTrue(parameter.has(f));
+   }
+
+   @Test
+   public void testMultipleNoValMixed() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, -b, -c, -d, --e, --f});
+   Assert.assertEquals(6, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   Assert.assertTrue(parameter.has(c));
+   Assert.assertTrue(parameter.has(d));
+   Assert.assertTrue(parameter.has(e));
+   Assert.assertTrue(parameter.has(f));
+   }
+
+   @Test(expected = IllegalArgumentException.class)
+   public void testEmptyVal() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, -b, --});
+   Assert.assertEquals(2, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   }
+
+   @Test(expected = IllegalArgumentException.class)
+   public void testEmptyValShort() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, -b, -});
+   Assert.assertEquals(2, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   }
+
+
+
+   /*@Test
--- End diff --

Do you want to keep this test as optional? If yes then it is better to have 
comments on why and when to uncomment this test. Otherwise, let's just remove 
it for now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature

[jira] [Commented] (FLINK-1727) Add decision tree to machine learning library

2015-05-14 Thread Sachin Goel (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544095#comment-14544095
 ] 

Sachin Goel commented on FLINK-1727:


The approach in [1] seems the most generic to implement. The major optimization 
in terms of time is going to come in terms of the number of splits we perform 
for each attribute, which I think really depends on the data. But from previous 
experience, a histogram size of 1000 works okay. We can provide some sort of 
cross validation later on to decide on the size perhaps?

 Add decision tree to machine learning library
 -

 Key: FLINK-1727
 URL: https://issues.apache.org/jira/browse/FLINK-1727
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Mikio Braun
  Labels: ML

 Decision trees are widely used for classification and regression tasks. Thus, 
 it would be worthwhile to add support for them to Flink's machine learning 
 library. 
 A streaming parallel decision tree learning algorithm has been proposed by 
 Ben-Haim and Tom-Tov [1]. This can maybe adapted to a batch use case as well. 
 [2] contains an overview of different techniques of how to scale inductive 
 learning algorithms up. A presentation of Spark's MLlib decision tree 
 implementation can be found in [3].
 Resources:
 [1] [http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf]
 [2] 
 [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46.8226rep=rep1type=pdf]
 [3] 
 [http://spark-summit.org/wp-content/uploads/2014/07/Scalable-Distributed-Decision-Trees-in-Spark-Made-Das-Sparks-Talwalkar.pdf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library

2015-05-14 Thread Theodore Vasiloudis (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544164#comment-14544164
 ] 

Theodore Vasiloudis commented on FLINK-1731:


Yeah that might be the better option. The optimization framework is more 
developer oriented, but since Kmeans is mostly aimed at practitioners it would 
be better to abstract away the complexity.

 Add kMeans clustering algorithm to machine learning library
 ---

 Key: FLINK-1731
 URL: https://issues.apache.org/jira/browse/FLINK-1731
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Peter Schrott
  Labels: ML

 The Flink repository already contains a kMeans implementation but it is not 
 yet ported to the machine learning library. I assume that only the used data 
 types have to be adapted and then it can be more or less directly moved to 
 flink-ml.
 The kMeans++ [1] and the kMeans|| [2] algorithm constitute a better 
 implementation because the improve the initial seeding phase to achieve near 
 optimal clustering. It might be worthwhile to implement kMeans||.
 Resources:
 [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
 [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544069#comment-14544069
 ] 

ASF GitHub Bot commented on FLINK-1525:
---

Github user hsaputra commented on a diff in the pull request:

https://github.com/apache/flink/pull/664#discussion_r30342062
  
--- Diff: 
flink-java/src/test/java/org/apache/flink/api/java/utils/ParameterToolTest.java 
---
@@ -0,0 +1,220 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.utils;
+
+import org.apache.flink.api.java.ClosureCleaner;
+import org.apache.flink.configuration.Configuration;
+import org.junit.Assert;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.util.Map;
+import java.util.Properties;
+
+public class ParameterToolTest {
+
+   @Rule
+   public TemporaryFolder tmp = new TemporaryFolder();
+
+
+   // - Parser tests -
+
+   @Test(expected = RuntimeException.class)
+   public void testIllegalArgs() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{berlin});
+   Assert.assertEquals(0, parameter.getNumberOfParameters());
+   }
+
+   @Test
+   public void testNoVal() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{-berlin});
+   Assert.assertEquals(1, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(berlin));
+   }
+
+   @Test
+   public void testNoValDouble() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--berlin});
+   Assert.assertEquals(1, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(berlin));
+   }
+
+   @Test
+   public void testMultipleNoVal() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, --b, --c, --d, --e, --f});
+   Assert.assertEquals(6, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   Assert.assertTrue(parameter.has(c));
+   Assert.assertTrue(parameter.has(d));
+   Assert.assertTrue(parameter.has(e));
+   Assert.assertTrue(parameter.has(f));
+   }
+
+   @Test
+   public void testMultipleNoValMixed() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, -b, -c, -d, --e, --f});
+   Assert.assertEquals(6, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   Assert.assertTrue(parameter.has(c));
+   Assert.assertTrue(parameter.has(d));
+   Assert.assertTrue(parameter.has(e));
+   Assert.assertTrue(parameter.has(f));
+   }
+
+   @Test(expected = IllegalArgumentException.class)
+   public void testEmptyVal() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, -b, --});
+   Assert.assertEquals(2, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   }
+
+   @Test(expected = IllegalArgumentException.class)
+   public void testEmptyValShort() {
+   ParameterTool parameter = ParameterTool.fromArgs(new 
String[]{--a, -b, -});
+   Assert.assertEquals(2, parameter.getNumberOfParameters());
+   Assert.assertTrue(parameter.has(a));
+   Assert.assertTrue(parameter.has(b));
+   }
+
+
+
+   /*@Test
--- End diff --

Do you want to keep this test as optional? If yes then it is better to have 

[GitHub] flink pull request: [FLINK-1525]Introduction of a small input para...

2015-05-14 Thread hsaputra
Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/664#issuecomment-102111410
  
Hi @rmetzger, just did a pass and other than comments about unused test and 
broken Travis due to check style I think this PR is ready to go. +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1525) Provide utils to pass -D parameters to UDFs

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544074#comment-14544074
 ] 

ASF GitHub Bot commented on FLINK-1525:
---

Github user hsaputra commented on the pull request:

https://github.com/apache/flink/pull/664#issuecomment-102111410
  
Hi @rmetzger, just did a pass and other than comments about unused test and 
broken Travis due to check style I think this PR is ready to go. +1


 Provide utils to pass -D parameters to UDFs 
 

 Key: FLINK-1525
 URL: https://issues.apache.org/jira/browse/FLINK-1525
 Project: Flink
  Issue Type: Improvement
  Components: flink-contrib
Reporter: Robert Metzger
Assignee: Robert Metzger
  Labels: starter

 Hadoop users are used to setting job configuration through -D on the 
 command line.
 Right now, Flink users have to manually parse command line arguments and pass 
 them to the methods.
 It would be nice to provide a standard args parser with is taking care of 
 such stuff.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1727) Add decision tree to machine learning library

2015-05-14 Thread Sachin Goel (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544091#comment-14544091
 ] 

Sachin Goel commented on FLINK-1727:


I would like to work on this. I'm already half way through the implementation.

 Add decision tree to machine learning library
 -

 Key: FLINK-1727
 URL: https://issues.apache.org/jira/browse/FLINK-1727
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Till Rohrmann
Assignee: Mikio Braun
  Labels: ML

 Decision trees are widely used for classification and regression tasks. Thus, 
 it would be worthwhile to add support for them to Flink's machine learning 
 library. 
 A streaming parallel decision tree learning algorithm has been proposed by 
 Ben-Haim and Tom-Tov [1]. This can maybe adapted to a batch use case as well. 
 [2] contains an overview of different techniques of how to scale inductive 
 learning algorithms up. A presentation of Spark's MLlib decision tree 
 implementation can be found in [3].
 Resources:
 [1] [http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf]
 [2] 
 [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46.8226rep=rep1type=pdf]
 [3] 
 [http://spark-summit.org/wp-content/uploads/2014/07/Scalable-Distributed-Decision-Trees-in-Spark-Made-Das-Sparks-Talwalkar.pdf]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2014) Add LASSO regression

2015-05-14 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2014:
--

 Summary: Add LASSO regression
 Key: FLINK-2014
 URL: https://issues.apache.org/jira/browse/FLINK-2014
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Theodore Vasiloudis


LASSO is a linear model that uses L1 regularization



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2008) PersistentKafkaSource is sometimes emitting tuples multiple times

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543824#comment-14543824
 ] 

ASF GitHub Bot commented on FLINK-2008:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/675

[FLINK-2008] Fix broker failure test case



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink stephan_kafka

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/675.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #675


commit ad449cfd308559734daa493b34d5d40305972c82
Author: Robert Metzger rmetz...@apache.org
Date:   2015-05-13T07:34:37Z

[FLINK-2008] Fix broker failure test case




 PersistentKafkaSource is sometimes emitting tuples multiple times
 -

 Key: FLINK-2008
 URL: https://issues.apache.org/jira/browse/FLINK-2008
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector, Streaming
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger

 The PersistentKafkaSource is expected to emit records exactly once.
 Two test cases of the KafkaITCase are sporadically failing because records 
 are emitted multiple times.
 Affected tests:
 {{testPersistentSourceWithOffsetUpdates()}}, after the offsets have been 
 changed manually in ZK:
 {code}
 java.lang.RuntimeException: Expected v to be 3, but was 4 on element 0 
 array=[4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 
 2]
 {code}
 {{brokerFailureTest()}} also fails:
 {code}
 05/13/2015 08:13:16   Custom source - Stream Sink(1/1) switched to FAILED 
 java.lang.AssertionError: Received tuple with value 21 twice
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at org.junit.Assert.assertFalse(Assert.java:64)
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$15.invoke(KafkaITCase.java:877)
   at 
 org.apache.flink.streaming.connectors.kafka.KafkaITCase$15.invoke(KafkaITCase.java:859)
   at 
 org.apache.flink.streaming.api.operators.StreamSink.callUserFunction(StreamSink.java:39)
   at 
 org.apache.flink.streaming.api.operators.StreamOperator.callUserFunctionAndLogException(StreamOperator.java:137)
   at 
 org.apache.flink.streaming.api.operators.ChainableStreamOperator.collect(ChainableStreamOperator.java:54)
   at 
 org.apache.flink.streaming.api.collector.CollectorWrapper.collect(CollectorWrapper.java:39)
   at 
 org.apache.flink.streaming.connectors.kafka.api.persistent.PersistentKafkaSource.run(PersistentKafkaSource.java:173)
   at 
 org.apache.flink.streaming.api.operators.StreamSource.callUserFunction(StreamSource.java:40)
   at 
 org.apache.flink.streaming.api.operators.StreamOperator.callUserFunctionAndLogException(StreamOperator.java:137)
   at 
 org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:34)
   at 
 org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:139)
   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
   at java.lang.Thread.run(Thread.java:745)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331086
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/java/org.apache.flink/api/java/ScalaShellRemoteEnvironment.java
 ---
@@ -0,0 +1,84 @@
+
--- End diff --

missing apache header


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1398] Introduce extractSingleField() in...

2015-05-14 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/308#issuecomment-102077233
  
I think the consensus was that we don't want to have such a method in the 
DataSet API. We can, however, put a utility for this in flink-contrib. This 
utility should work for both batch and streaming? Any other opinions? Please 
correct me if I'm wrong.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543876#comment-14543876
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331698
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/scala/org.apache.flink/api/scala/FlinkShell.scala
 ---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.scala
+
+
+import scala.tools.nsc.Settings
+
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.runtime.minicluster.LocalFlinkMiniCluster
+
+/**
+ * Created by Nikolaas Steenbergen on 22-4-15.
--- End diff --

we don't put author's names into the comments (shared code ownership)


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543875#comment-14543875
 ] 

ASF GitHub Bot commented on FLINK-1398:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/308#issuecomment-102077233
  
I think the consensus was that we don't want to have such a method in the 
DataSet API. We can, however, put a utility for this in flink-contrib. This 
utility should work for both batch and streaming? Any other opinions? Please 
correct me if I'm wrong.


 A new DataSet function: extractElementFromTuple
 ---

 Key: FLINK-1398
 URL: https://issues.apache.org/jira/browse/FLINK-1398
 Project: Flink
  Issue Type: Wish
Reporter: Felix Neutatz
Assignee: Felix Neutatz
Priority: Minor

 This is the use case:
 {code:xml}
 DataSetTuple2Integer, Double data =  env.fromElements(new 
 Tuple2Integer, Double(1,2.0));
 
 data.map(new ElementFromTuple());
 
 }
 public static final class ElementFromTuple implements 
 MapFunctionTuple2Integer, Double, Double {
 @Override
 public Double map(Tuple2Integer, Double value) {
 return value.f1;
 }
 }
 {code}
 It would be awesome if we had something like this:
 {code:xml}
 data.extractElement(1);
 {code}
 This means that we implement a function for DataSet which extracts a certain 
 element from a given Tuple.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331698
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/scala/org.apache.flink/api/scala/FlinkShell.scala
 ---
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.api.scala
+
+
+import scala.tools.nsc.Settings
+
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.runtime.minicluster.LocalFlinkMiniCluster
+
+/**
+ * Created by Nikolaas Steenbergen on 22-4-15.
--- End diff --

we don't put author's names into the comments (shared code ownership)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331377
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/scala/org.apache.flink/api/scala/FlinkILoop.scala
 ---
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import java.io.{BufferedReader, File, FileOutputStream}
+
+import scala.tools.nsc.Settings
+import scala.tools.nsc.interpreter._
+
+import org.apache.flink.api.java.ScalaShellRemoteEnvironment
+import org.apache.flink.util.AbstractID
+
+/**
+ * Created by Nikolaas Steenbergen on 16-4-15.
+ */
+class FlinkILoop(val host: String,
+ val port: Int,
+ in0: Option[BufferedReader],
+ out0: JPrintWriter)
+  extends ILoop(in0, out0) {
+
+  def this(host:String, port:Int, in0: BufferedReader, out: JPrintWriter){
+this(host:String, port:Int, Some(in0), out)
+  }
+
+  def this(host:String, port:Int){
+this(host:String,port: Int,None, new JPrintWriter(Console.out, true))
+  }
+  // remote environment
+  private val remoteEnv: ScalaShellRemoteEnvironment = {
+val remoteEnv = new ScalaShellRemoteEnvironment(host, port, this)
+remoteEnv
+  }
+
+  // local environment
+  val scalaEnv: ExecutionEnvironment = {
+val scalaEnv = new ExecutionEnvironment(remoteEnv)
+scalaEnv
+  }
+
+
+  /**
+   * we override the process (initialization) method to
+   * insert Flink related stuff for not using a file for input.
+   */
+
+  /** Create a new interpreter. */
+  override def createInterpreter() {
+if (addedClasspath != )
+{
+  settings.classpath append addedClasspath
+}
+intp = new ILoopInterpreter
+intp.quietRun(import org.apache.flink.api.scala._)
+intp.quietRun(import org.apache.flink.api.common.functions._)
+intp.bind(env, this.scalaEnv)
+  }
+
+
+
+  /**
+   * creates a temporary directory to store compiled console files
+   */
+  private val tmpDirBase: File = {
+// get unique temporary folder:
+val abstractID: String = new AbstractID().toString
+val tmpDir: File = new File(
+  System.getProperty(java.io.tmpdir),
+  scala_shell_tmp- + abstractID)
+if (!tmpDir.exists) {
+  tmpDir.mkdir
+}
+tmpDir
+  }
+
+  // scala_shell commands
+  private val tmpDirShell: File = {
+new File(tmpDirBase, scala_shell_commands)
+  }
+
+  // scala shell jar file name
+  private val tmpJarShell: File = {
+new File(tmpDirBase, scala_shell_commands.jar)
+  }
+
+
+  /**
+   * writes contents of the compiled lines that have been executed in the 
shell into a
+   * physical directory: creates a unique temporary directory
+   */
+  def writeFilesToDisk(): Unit = {
+val vd = intp.virtualDirectory
+
+var vdIt = vd.iterator
+
+for (fi - vdIt) {
+  if (fi.isDirectory) {
+
+var fiIt = fi.iterator
+
+for (f - fiIt) {
+
+  // directory for compiled line
+  val lineDir = new File(tmpDirShell.getAbsolutePath, fi.name)
+  lineDir.mkdirs()
+
+  // compiled classes for commands from shell
+  val writeFile = new File(lineDir.getAbsolutePath, f.name)
+  val outputStream = new FileOutputStream(writeFile)
+  val inputStream = f.input
+
+  // copy file contents
+  org.apache.commons.io.IOUtils.copy(inputStream, outputStream)
+
+  inputStream.close()
+  outputStream.close()
+}
+  }
+}
+  }
+
+  /**
+   * CUSTOM START METHODS OVERRIDE:
+   */
+  override def prompt = Scala-Flink 
+
+  /**
+   * custom welcome message
+   

[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543869#comment-14543869
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331377
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/scala/org.apache.flink/api/scala/FlinkILoop.scala
 ---
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import java.io.{BufferedReader, File, FileOutputStream}
+
+import scala.tools.nsc.Settings
+import scala.tools.nsc.interpreter._
+
+import org.apache.flink.api.java.ScalaShellRemoteEnvironment
+import org.apache.flink.util.AbstractID
+
+/**
+ * Created by Nikolaas Steenbergen on 16-4-15.
+ */
+class FlinkILoop(val host: String,
+ val port: Int,
+ in0: Option[BufferedReader],
+ out0: JPrintWriter)
+  extends ILoop(in0, out0) {
+
+  def this(host:String, port:Int, in0: BufferedReader, out: JPrintWriter){
+this(host:String, port:Int, Some(in0), out)
+  }
+
+  def this(host:String, port:Int){
+this(host:String,port: Int,None, new JPrintWriter(Console.out, true))
+  }
+  // remote environment
+  private val remoteEnv: ScalaShellRemoteEnvironment = {
+val remoteEnv = new ScalaShellRemoteEnvironment(host, port, this)
+remoteEnv
+  }
+
+  // local environment
+  val scalaEnv: ExecutionEnvironment = {
+val scalaEnv = new ExecutionEnvironment(remoteEnv)
+scalaEnv
+  }
+
+
+  /**
+   * we override the process (initialization) method to
+   * insert Flink related stuff for not using a file for input.
+   */
+
+  /** Create a new interpreter. */
+  override def createInterpreter() {
+if (addedClasspath != )
+{
+  settings.classpath append addedClasspath
+}
+intp = new ILoopInterpreter
+intp.quietRun(import org.apache.flink.api.scala._)
+intp.quietRun(import org.apache.flink.api.common.functions._)
+intp.bind(env, this.scalaEnv)
+  }
+
+
+
+  /**
+   * creates a temporary directory to store compiled console files
+   */
+  private val tmpDirBase: File = {
+// get unique temporary folder:
+val abstractID: String = new AbstractID().toString
+val tmpDir: File = new File(
+  System.getProperty(java.io.tmpdir),
+  scala_shell_tmp- + abstractID)
+if (!tmpDir.exists) {
+  tmpDir.mkdir
+}
+tmpDir
+  }
+
+  // scala_shell commands
+  private val tmpDirShell: File = {
+new File(tmpDirBase, scala_shell_commands)
+  }
+
+  // scala shell jar file name
+  private val tmpJarShell: File = {
+new File(tmpDirBase, scala_shell_commands.jar)
+  }
+
+
+  /**
+   * writes contents of the compiled lines that have been executed in the 
shell into a
+   * physical directory: creates a unique temporary directory
+   */
+  def writeFilesToDisk(): Unit = {
+val vd = intp.virtualDirectory
+
+var vdIt = vd.iterator
+
+for (fi - vdIt) {
+  if (fi.isDirectory) {
+
+var fiIt = fi.iterator
+
+for (f - fiIt) {
+
+  // directory for compiled line
+  val lineDir = new File(tmpDirShell.getAbsolutePath, fi.name)
+  lineDir.mkdirs()
+
+  // compiled classes for commands from shell
+  val writeFile = new File(lineDir.getAbsolutePath, f.name)
+  val outputStream = new FileOutputStream(writeFile)
+  val inputStream = f.input
+
+  // copy file contents
+  org.apache.commons.io.IOUtils.copy(inputStream, outputStream)
+
+  

[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/672#issuecomment-102069667
  
Amazing! I'm super excited to see this finally implemented.

I'll soon review the changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543825#comment-14543825
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/672#issuecomment-102069667
  
Amazing! I'm super excited to see this finally implemented.

I'll soon review the changes.


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543829#comment-14543829
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330062
  
--- Diff: docs/scala_shell_quickstart.md ---
@@ -0,0 +1,72 @@
+---
+title: Quickstart: Scala Shell
+---
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+
+* This will be replaced by the TOC
+{:toc}
+
+Start working on your Flink Scala program in a few simple steps.
+
+## Startup Flink interactive Scala shell
+
+Flink has an integrated interactive scala shell.
+It can be used in a local setup as well as in a cluster setup.
+
+To use it in a local setup just execute:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh 
+~~~
+
+And it will initialize a local JobManager by itself.
+
+To use it in a cluster setup you can supply the host and port of the 
JobManager with:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh -host hostname -port portnumber
+~~~
+
+
+## Usage
+
+The shell will prebind the ExecutionEnvironment as env, so far only 
batch mode is supported.
+
+The following example will execute the wordcount program in the scala 
shell:
+
+~~~scala
+Flink-Shell val text = env.fromElements(To be, or not to be,--that is 
the question:--,Whether 'tis nobler in the mind to suffer, The slings and 
arrows of outrageous fortune,Or to take arms against a sea of troubles,)
+Flink-Shell val counts = text.flatMap { _.toLowerCase.split(\\W+) }.map 
{ (_, 1) }.groupBy(0).sum(1)
+Flink-Shell counts.print()
+~~~
+
+
+The print() command will automatically send the specified tasks to the 
JobManager for execution and will show the result of the computation in the 
terminal.
+
+It is possbile to write results to a file, like in the standard Scala api. 
However, in this case you need to call, to run your program:
+
+~~~scala
+Flink-Shell env.execute(MyProgram)
+~~~
+
+The Flink Shell comes with command history and autocompletion.
--- End diff --

The file is using `Scala` in upper and lowercase variants. I would make 
them all uppercase.


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330465
  
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java 
---
@@ -1337,11 +1336,16 @@ public long count() throws Exception {
/**
 * Writes a DataSet to the standard output stream (stdout).br/
 * For each element of the DataSet the result of {@link 
Object#toString()} is written.
-* 
-*  @return The DataSink that writes the DataSet.
 */
-   public DataSinkT print() {
-   return output(new PrintingOutputFormatT(false));
+   public void print() {
+   try {
+   ListT elements = this.collect();
+   for (T e: elements) {
+   System.out.println(e);
+   }
+   } catch (Exception e) {
+   System.out.println(Could not retrieve values for 
printing:  + e);
--- End diff --

This prints only the message. I would suggest to stringify the exception 
because the error messages coming from `collect()` might come from the remote 
cluster


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543859#comment-14543859
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330874
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala ---
@@ -1327,8 +1327,8 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
* Writes a DataSet to the standard output stream (stdout). This uses 
[[AnyRef.toString]] on
* each element.
--- End diff --

the scaladoc here should also mention that an execution is triggered


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543862#comment-14543862
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330912
  
--- Diff: flink-staging/flink-scala-shell/pom.xml ---
@@ -0,0 +1,246 @@
+?xml version=1.0 encoding=UTF-8?
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+project xmlns=http://maven.apache.org/POM/4.0.0; 
xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance;
+   xsi:schemaLocation=http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/maven-v4_0_0.xsd;
+
+   modelVersion4.0.0/modelVersion
+
+   parent
+   groupIdorg.apache.flink/groupId
+   artifactIdflink-staging/artifactId
+   version0.9-SNAPSHOT/version
+   relativePath../relativePath
+   /parent
+
+   artifactIdflink-scala-shell/artifactId
+   nameflink-scala-shell/name
+
+   packagingjar/packaging
+
+   dependencies
+
+   !-- scala command line parsing --
+   dependency
+   groupIdcom.github.scopt/groupId
+artifactIdscopt_${scala.binary.version}/artifactId
+   /dependency
--- End diff --

space / tab mixed indentation.
Please use tabs.


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330874
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala ---
@@ -1327,8 +1327,8 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
* Writes a DataSet to the standard output stream (stdout). This uses 
[[AnyRef.toString]] on
* each element.
--- End diff --

the scaladoc here should also mention that an execution is triggered


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330912
  
--- Diff: flink-staging/flink-scala-shell/pom.xml ---
@@ -0,0 +1,246 @@
+?xml version=1.0 encoding=UTF-8?
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+project xmlns=http://maven.apache.org/POM/4.0.0; 
xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance;
+   xsi:schemaLocation=http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/maven-v4_0_0.xsd;
+
+   modelVersion4.0.0/modelVersion
+
+   parent
+   groupIdorg.apache.flink/groupId
+   artifactIdflink-staging/artifactId
+   version0.9-SNAPSHOT/version
+   relativePath../relativePath
+   /parent
+
+   artifactIdflink-scala-shell/artifactId
+   nameflink-scala-shell/name
+
+   packagingjar/packaging
+
+   dependencies
+
+   !-- scala command line parsing --
+   dependency
+   groupIdcom.github.scopt/groupId
+artifactIdscopt_${scala.binary.version}/artifactId
+   /dependency
--- End diff --

space / tab mixed indentation.
Please use tabs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread nikste
Github user nikste commented on the pull request:

https://github.com/apache/flink/pull/672#issuecomment-102086481
  
thanks for the comments Robert, I'll fix the stuff tomorrow!
Indeed, the Scala shell itself is not so much code, most of the changes are 
caused by the change of  ```print()```.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543936#comment-14543936
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user nikste commented on the pull request:

https://github.com/apache/flink/pull/672#issuecomment-102086481
  
thanks for the comments Robert, I'll fix the stuff tomorrow!
Indeed, the Scala shell itself is not so much code, most of the changes are 
caused by the change of  ```print()```.


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1711] - Converted all usages of Commons...

2015-05-14 Thread lokeshrajaram
GitHub user lokeshrajaram reopened a pull request:

https://github.com/apache/flink/pull/673

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lokeshrajaram/flink all_guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit 04e1695d3b8414616216264a5b0972d762664ec7
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-10T01:57:36Z

converted all usages of Commons Validate to Guava Checks

commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T02:29:03Z

converted all usages of commons validate to guava checks(for Java classes), 
scala predef require(for scala classes)

commit 1ecf70952a75728a2e2b9ae70e8f2c66ca9d337a
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T14:43:03Z

added guava dependency for flink-spargel module




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1711] - Converted all usages of Commons...

2015-05-14 Thread lokeshrajaram
Github user lokeshrajaram closed the pull request at:

https://github.com/apache/flink/pull/673


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-1949) YARNSessionFIFOITCase sometimes fails to detect when the detached session finishes

2015-05-14 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1949.
---
   Resolution: Pending Closed
Fix Version/s: 0.9

http://git-wip-us.apache.org/repos/asf/flink/commit/bd7d8679

 YARNSessionFIFOITCase sometimes fails to detect when the detached session 
 finishes
 --

 Key: FLINK-1949
 URL: https://issues.apache.org/jira/browse/FLINK-1949
 Project: Flink
  Issue Type: Bug
  Components: Tests, YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 {code}
 10:32:24,393 INFO  org.apache.flink.yarn.YARNSessionFIFOITCase
- CLI Frontend has returned, so the job is running
 10:32:24,398 INFO  org.apache.flink.yarn.YARNSessionFIFOITCase
- waiting for the job with appId application_1430130687160_0003 to finish
 10:32:24,629 INFO  org.apache.flink.yarn.YARNSessionFIFOITCase
- The job has finished. TaskManager output file found 
 /home/travis/build/tillrohrmann/flink/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1430130687160_0003/container_1430130687160_0003_01_02/taskmanager-stdout.log
 10:32:24,630 WARN  org.apache.flink.yarn.YARNSessionFIFOITCase
- Error while detached yarn session was running
 java.lang.AssertionError: Expected string '(all,2)' not found in string ''
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at 
 org.apache.flink.yarn.YARNSessionFIFOITCase.testDetachedPerJobYarnClusterInternal(YARNSessionFIFOITCase.java:504)
   at 
 org.apache.flink.yarn.YARNSessionFIFOITCase.testDetachedPerJobYarnClusterWithStreamingJob(YARNSessionFIFOITCase.java:563)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
 {code}
 https://flink.a.o.uce.east.s3.amazonaws.com/travis-artifacts/tillrohrmann/flink/442/442.5.tar.gz



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543827#comment-14543827
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30329898
  
--- Diff: docs/scala_shell_quickstart.md ---
@@ -0,0 +1,72 @@
+---
+title: Quickstart: Scala Shell
+---
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+
+* This will be replaced by the TOC
+{:toc}
+
+Start working on your Flink Scala program in a few simple steps.
+
+## Startup Flink interactive Scala shell
+
+Flink has an integrated interactive scala shell.
+It can be used in a local setup as well as in a cluster setup.
+
+To use it in a local setup just execute:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh 
+~~~
+
+And it will initialize a local JobManager by itself.
+
+To use it in a cluster setup you can supply the host and port of the 
JobManager with:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh -host hostname -port portnumber
--- End diff --

wouldn't it be easier to just pass `hostname:port` to the script?

Are there any other flags available?


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30329898
  
--- Diff: docs/scala_shell_quickstart.md ---
@@ -0,0 +1,72 @@
+---
+title: Quickstart: Scala Shell
+---
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+
+* This will be replaced by the TOC
+{:toc}
+
+Start working on your Flink Scala program in a few simple steps.
+
+## Startup Flink interactive Scala shell
+
+Flink has an integrated interactive scala shell.
+It can be used in a local setup as well as in a cluster setup.
+
+To use it in a local setup just execute:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh 
+~~~
+
+And it will initialize a local JobManager by itself.
+
+To use it in a cluster setup you can supply the host and port of the 
JobManager with:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh -host hostname -port portnumber
--- End diff --

wouldn't it be easier to just pass `hostname:port` to the script?

Are there any other flags available?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543835#comment-14543835
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330465
  
--- Diff: flink-java/src/main/java/org/apache/flink/api/java/DataSet.java 
---
@@ -1337,11 +1336,16 @@ public long count() throws Exception {
/**
 * Writes a DataSet to the standard output stream (stdout).br/
 * For each element of the DataSet the result of {@link 
Object#toString()} is written.
-* 
-*  @return The DataSink that writes the DataSet.
 */
-   public DataSinkT print() {
-   return output(new PrintingOutputFormatT(false));
+   public void print() {
+   try {
+   ListT elements = this.collect();
+   for (T e: elements) {
+   System.out.println(e);
+   }
+   } catch (Exception e) {
+   System.out.println(Could not retrieve values for 
printing:  + e);
--- End diff --

This prints only the message. I would suggest to stringify the exception 
because the error messages coming from `collect()` might come from the remote 
cluster


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543917#comment-14543917
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/672#issuecomment-102082153
  
The change looks good, I had some easy to resolve comments.
Sadly, it seems that most of the changes are caused by the semantics change 
of `print()` ... the scalashell code doesn't seem to be so much.

We certainly need to update the documentation as well. A lot of code 
examples are doing print() and execute() together.
The docs should explain what print() is doing.



 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/672#issuecomment-102082153
  
The change looks good, I had some easy to resolve comments.
Sadly, it seems that most of the changes are caused by the semantics change 
of `print()` ... the scalashell code doesn't seem to be so much.

We certainly need to update the documentation as well. A lot of code 
examples are doing print() and execute() together.
The docs should explain what print() is doing.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330062
  
--- Diff: docs/scala_shell_quickstart.md ---
@@ -0,0 +1,72 @@
+---
+title: Quickstart: Scala Shell
+---
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+
+* This will be replaced by the TOC
+{:toc}
+
+Start working on your Flink Scala program in a few simple steps.
+
+## Startup Flink interactive Scala shell
+
+Flink has an integrated interactive scala shell.
+It can be used in a local setup as well as in a cluster setup.
+
+To use it in a local setup just execute:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh 
+~~~
+
+And it will initialize a local JobManager by itself.
+
+To use it in a cluster setup you can supply the host and port of the 
JobManager with:
+
+__Sample Input__:
+~~~bash
+flink/bin/start-scala-shell.sh -host hostname -port portnumber
+~~~
+
+
+## Usage
+
+The shell will prebind the ExecutionEnvironment as env, so far only 
batch mode is supported.
+
+The following example will execute the wordcount program in the scala 
shell:
+
+~~~scala
+Flink-Shell val text = env.fromElements(To be, or not to be,--that is 
the question:--,Whether 'tis nobler in the mind to suffer, The slings and 
arrows of outrageous fortune,Or to take arms against a sea of troubles,)
+Flink-Shell val counts = text.flatMap { _.toLowerCase.split(\\W+) }.map 
{ (_, 1) }.groupBy(0).sum(1)
+Flink-Shell counts.print()
+~~~
+
+
+The print() command will automatically send the specified tasks to the 
JobManager for execution and will show the result of the computation in the 
terminal.
+
+It is possbile to write results to a file, like in the standard Scala api. 
However, in this case you need to call, to run your program:
+
+~~~scala
+Flink-Shell env.execute(MyProgram)
+~~~
+
+The Flink Shell comes with command history and autocompletion.
--- End diff --

The file is using `Scala` in upper and lowercase variants. I would make 
them all uppercase.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543866#comment-14543866
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331151
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/java/org.apache.flink/api/java/ScalaShellRemoteEnvironment.java
 ---
@@ -0,0 +1,84 @@
+
+package org.apache.flink.api.java;
--- End diff --

we have the license header first, then the package 


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543867#comment-14543867
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331192
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/java/org.apache.flink/api/java/ScalaShellRemoteEnvironment.java
 ---
@@ -0,0 +1,84 @@
+
+package org.apache.flink.api.java;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.flink.api.common.JobExecutionResult;
+import org.apache.flink.api.common.Plan;
+import org.apache.flink.api.common.PlanExecutor;
+
+import org.apache.flink.api.scala.FlinkILoop;
+
+import java.io.File;
+
+
+/**
+ * Created by Nikolaas Steenbergen on 23-4-15.
--- End diff --

Can you replace this my a short description of what the class does?


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-1949) YARNSessionFIFOITCase sometimes fails to detect when the detached session finishes

2015-05-14 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger closed FLINK-1949.
-

 YARNSessionFIFOITCase sometimes fails to detect when the detached session 
 finishes
 --

 Key: FLINK-1949
 URL: https://issues.apache.org/jira/browse/FLINK-1949
 Project: Flink
  Issue Type: Bug
  Components: Tests, YARN Client
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: 0.9


 {code}
 10:32:24,393 INFO  org.apache.flink.yarn.YARNSessionFIFOITCase
- CLI Frontend has returned, so the job is running
 10:32:24,398 INFO  org.apache.flink.yarn.YARNSessionFIFOITCase
- waiting for the job with appId application_1430130687160_0003 to finish
 10:32:24,629 INFO  org.apache.flink.yarn.YARNSessionFIFOITCase
- The job has finished. TaskManager output file found 
 /home/travis/build/tillrohrmann/flink/flink-yarn-tests/../flink-yarn-tests/target/flink-yarn-tests-fifo/flink-yarn-tests-fifo-logDir-nm-0_0/application_1430130687160_0003/container_1430130687160_0003_01_02/taskmanager-stdout.log
 10:32:24,630 WARN  org.apache.flink.yarn.YARNSessionFIFOITCase
- Error while detached yarn session was running
 java.lang.AssertionError: Expected string '(all,2)' not found in string ''
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.assertTrue(Assert.java:41)
   at 
 org.apache.flink.yarn.YARNSessionFIFOITCase.testDetachedPerJobYarnClusterInternal(YARNSessionFIFOITCase.java:504)
   at 
 org.apache.flink.yarn.YARNSessionFIFOITCase.testDetachedPerJobYarnClusterWithStreamingJob(YARNSessionFIFOITCase.java:563)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:483)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
 {code}
 https://flink.a.o.uce.east.s3.amazonaws.com/travis-artifacts/tillrohrmann/flink/442/442.5.tar.gz



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543865#comment-14543865
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331086
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/java/org.apache.flink/api/java/ScalaShellRemoteEnvironment.java
 ---
@@ -0,0 +1,84 @@
+
--- End diff --

missing apache header


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331151
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/java/org.apache.flink/api/java/ScalaShellRemoteEnvironment.java
 ---
@@ -0,0 +1,84 @@
+
+package org.apache.flink.api.java;
--- End diff --

we have the license header first, then the package 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30331192
  
--- Diff: 
flink-staging/flink-scala-shell/src/main/java/org.apache.flink/api/java/ScalaShellRemoteEnvironment.java
 ---
@@ -0,0 +1,84 @@
+
+package org.apache.flink.api.java;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.flink.api.common.JobExecutionResult;
+import org.apache.flink.api.common.Plan;
+import org.apache.flink.api.common.PlanExecutor;
+
+import org.apache.flink.api.scala.FlinkILoop;
+
+import java.io.File;
+
+
+/**
+ * Created by Nikolaas Steenbergen on 23-4-15.
--- End diff --

Can you replace this my a short description of what the class does?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330524
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/RemoteEnvironment.java ---
@@ -85,4 +87,14 @@ public String toString() {
return Remote Environment ( + this.host + : + this.port +  
- parallelism =  +
(getParallelism() == -1 ? default : 
getParallelism()) + ) :  + getIdString();
}
+
+
+   // needed to call execute on ScalaShellRemoteEnvironment
+   public int getPort() {
+   return(this.port);
--- End diff --

why are there parentheses around the return value?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543838#comment-14543838
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30330524
  
--- Diff: 
flink-java/src/main/java/org/apache/flink/api/java/RemoteEnvironment.java ---
@@ -85,4 +87,14 @@ public String toString() {
return Remote Environment ( + this.host + : + this.port +  
- parallelism =  +
(getParallelism() == -1 ? default : 
getParallelism()) + ) :  + getIdString();
}
+
+
+   // needed to call execute on ScalaShellRemoteEnvironment
+   public int getPort() {
+   return(this.port);
--- End diff --

why are there parentheses around the return value?


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1711) Replace all usages off commons.Validate with guava.check

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543938#comment-14543938
 ] 

ASF GitHub Bot commented on FLINK-1711:
---

Github user lokeshrajaram closed the pull request at:

https://github.com/apache/flink/pull/673


 Replace all usages off commons.Validate with guava.check
 

 Key: FLINK-1711
 URL: https://issues.apache.org/jira/browse/FLINK-1711
 Project: Flink
  Issue Type: Improvement
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Lokesh Rajaram
Priority: Minor
  Labels: easyfix, starter
 Fix For: 0.9


 Per discussion on the mailing list, we decided to increase homogeneity. One 
 part is to consistently use the Guava methods {{checkNotNull}} and 
 {{checkArgument}}, rather than Apache Commons Lang3 {{Validate}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1711) Replace all usages off commons.Validate with guava.check

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543940#comment-14543940
 ] 

ASF GitHub Bot commented on FLINK-1711:
---

GitHub user lokeshrajaram reopened a pull request:

https://github.com/apache/flink/pull/673

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lokeshrajaram/flink all_guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit 04e1695d3b8414616216264a5b0972d762664ec7
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-10T01:57:36Z

converted all usages of Commons Validate to Guava Checks

commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T02:29:03Z

converted all usages of commons validate to guava checks(for Java classes), 
scala predef require(for scala classes)

commit 1ecf70952a75728a2e2b9ae70e8f2c66ca9d337a
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T14:43:03Z

added guava dependency for flink-spargel module




 Replace all usages off commons.Validate with guava.check
 

 Key: FLINK-1711
 URL: https://issues.apache.org/jira/browse/FLINK-1711
 Project: Flink
  Issue Type: Improvement
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Lokesh Rajaram
Priority: Minor
  Labels: easyfix, starter
 Fix For: 0.9


 Per discussion on the mailing list, we decided to increase homogeneity. One 
 part is to consistently use the Guava methods {{checkNotNull}} and 
 {{checkArgument}}, rather than Apache Commons Lang3 {{Validate}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2016) Add elastic net regression

2015-05-14 Thread Theodore Vasiloudis (JIRA)
Theodore Vasiloudis created FLINK-2016:
--

 Summary: Add elastic net regression
 Key: FLINK-2016
 URL: https://issues.apache.org/jira/browse/FLINK-2016
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Theodore Vasiloudis
Priority: Minor


[Elastic net|http://en.wikipedia.org/wiki/Elastic_net_regularization] is a 
linear regression method that combines L2 and L1 regularization.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1907] Scala shell

2015-05-14 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30329764
  
--- Diff: docs/scala_shell_quickstart.md ---
@@ -0,0 +1,72 @@
+---
+title: Quickstart: Scala Shell
+---
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+
+* This will be replaced by the TOC
+{:toc}
+
+Start working on your Flink Scala program in a few simple steps.
+
+## Startup Flink interactive Scala shell
+
+Flink has an integrated interactive scala shell.
--- End diff --

scala uppercase


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1907) Scala Interactive Shell

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543826#comment-14543826
 ] 

ASF GitHub Bot commented on FLINK-1907:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/672#discussion_r30329764
  
--- Diff: docs/scala_shell_quickstart.md ---
@@ -0,0 +1,72 @@
+---
+title: Quickstart: Scala Shell
+---
+!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+License); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+--
+
+* This will be replaced by the TOC
+{:toc}
+
+Start working on your Flink Scala program in a few simple steps.
+
+## Startup Flink interactive Scala shell
+
+Flink has an integrated interactive scala shell.
--- End diff --

scala uppercase


 Scala Interactive Shell
 ---

 Key: FLINK-1907
 URL: https://issues.apache.org/jira/browse/FLINK-1907
 Project: Flink
  Issue Type: New Feature
  Components: Scala API
Reporter: Nikolaas Steenbergen
Assignee: Nikolaas Steenbergen
Priority: Minor

 Build an interactive Shell for the Scala api.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Pluggable state backend added

2015-05-14 Thread gyfora
GitHub user gyfora opened a pull request:

https://github.com/apache/flink/pull/676

Pluggable state backend added

This PR introduces the pluggable state backends using StateHandleProviders 
and extends the StateHandle interface with a discard method for cleaning up the 
unnecessary checkpoints.

It also adds a statehandle/provider implementation for storing checkpoints 
in any flink supported file system such as HDFS or Tachyon.

The checkpoint coordinator has been modified to properly discard user state 
handles using the following logic:
 - If a pending checkpoint expires (by the timer thread) it discards the 
user state
 - When a successful checkpoint expires (by acquiring following successful 
ones) it discards user states
 - When the checkpoint coordinator is shut down it discards all pending and 
successful states

Travis error:
I modified the recovery IT case to use the local file system for storing 
the checkpoints. Afterwards it checks whether the directory is empty. The test 
passes all the time ran locally, but it seems to fail on travis for no apparent 
reason. Usually a couple of files (2-5) remain in the checkpoint directory, 
meaning that almost all of them had been deleted but those. Also the 
checkpointing and recovery logic runs fine without test failure.

I would appreciate some help figuring this out somehow, or trying to 
reproduce it locally. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gyfora/flink statehandle

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/676.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #676


commit 22b5996e046cf83612fac2cb5aa02f2fd76a7e7b
Author: Gyula Fora gyf...@apache.org
Date:   2015-05-07T12:29:23Z

[streaming] ByteStream and File state handle added

commit fed66675e2e824eee00b88197c5a73882415c919
Author: Gyula Fora gyf...@apache.org
Date:   2015-05-14T09:49:30Z

[streaming] Discard method added to state handle

commit 36474aafe74be9b61a89b5240bbc39f47226da77
Author: Gyula Fora gyf...@apache.org
Date:   2015-05-14T19:23:42Z

[streaming] StateHandleProvider added for configurable state backend




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-297) Redesign GUI client-server model

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544451#comment-14544451
 ] 

ASF GitHub Bot commented on FLINK-297:
--

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/677#issuecomment-102184072
  
A simple way to try this out is to execute the class `TestRunner` in the 
`flink-runtime-web` project. It starts a mini cluster, starts the new web 
server and runs three jobs (to have some jobs in the history to serve).


 Redesign GUI client-server model
 

 Key: FLINK-297
 URL: https://issues.apache.org/jira/browse/FLINK-297
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
  Labels: github-import
 Fix For: pre-apache


 Factor out job manager status information as REST service running inside the 
 same process. Implement visualization server as a separate web application 
 that runs on the client-side and renders data fetched from from the job 
 manager RESTful API.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/297
 Created by: [aalexandrov|https://github.com/aalexandrov]
 Labels: enhancement, gui, 
 Created at: Tue Nov 26 14:54:53 CET 2013
 State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-297) Redesign GUI client-server model

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1457#comment-1457
 ] 

ASF GitHub Bot commented on FLINK-297:
--

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/677

[FLINK-297] [web frontend] First part of JobManager runtime monitor REST AP

This pull requests is the first step towards the new JobManager monitoring 
web frontend.

The code for the new web server that handles the requests is in 
`flink-runtime-web`. That way, we keep
the core runtime project free of the fat dependencies that come with some 
web frameworks.

The new webserver runs side by side the old one for now.
You can activate the new web server by adding `jobmanager.new-web-frontend: 
true` to the config.

By default, the server listens at `http://localhost:8082`.

The implementation uses almost pure netty, which is fast and lightweight 
(dependency wise), and we are using netty anyways in the network stack for data 
exchange.

The server currently answers the following requests:

http://localhost:8082/overview
http://localhost:8082/jobs
http://localhost:8082/jobs/job-id
http://localhost:8082/jobs/job-id/vertices
http://localhost:8082/jobs/job-id/plan

Here, job-id refers to a the ID of a current or archived job.

All requests respond with JSON.

I am working with someone that helps me draft a frontend (HTML5 + 
angular.js) that renders the information and issues the requests against the 
given URLs. I'll share more as soon as we have something that is worth sharing.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink web_frontend_2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/677.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #677


commit 482d12f155a66e22120f5e0a9993a5b3e56503a5
Author: Stephan Ewen se...@apache.org
Date:   2015-04-06T16:27:26Z

[FLINK-297] [web frontend] First part of JobManager runtime monitor REST API

 - Adds a separate Maven project for easier maintenance. Also allows users 
to refer to runtime without web libraries.
 - Simple HTTP server based on netty http (slim dependency, since we use 
netty anyways)
 - REST URL parsing via jauter netty router
 - Abstract stubs for handlers that deal with errors and request/response
 - First set of URL request handlers that produce JSON responses




 Redesign GUI client-server model
 

 Key: FLINK-297
 URL: https://issues.apache.org/jira/browse/FLINK-297
 Project: Flink
  Issue Type: Improvement
Reporter: GitHub Import
  Labels: github-import
 Fix For: pre-apache


 Factor out job manager status information as REST service running inside the 
 same process. Implement visualization server as a separate web application 
 that runs on the client-side and renders data fetched from from the job 
 manager RESTful API.
  Imported from GitHub 
 Url: https://github.com/stratosphere/stratosphere/issues/297
 Created by: [aalexandrov|https://github.com/aalexandrov]
 Labels: enhancement, gui, 
 Created at: Tue Nov 26 14:54:53 CET 2013
 State: open



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-297] [web frontend] First part of JobMa...

2015-05-14 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/677#issuecomment-102184072
  
A simple way to try this out is to execute the class `TestRunner` in the 
`flink-runtime-web` project. It starts a mini cluster, starts the new web 
server and runs three jobs (to have some jobs in the history to serve).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-297] [web frontend] First part of JobMa...

2015-05-14 Thread StephanEwen
GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/677

[FLINK-297] [web frontend] First part of JobManager runtime monitor REST AP

This pull requests is the first step towards the new JobManager monitoring 
web frontend.

The code for the new web server that handles the requests is in 
`flink-runtime-web`. That way, we keep
the core runtime project free of the fat dependencies that come with some 
web frameworks.

The new webserver runs side by side the old one for now.
You can activate the new web server by adding `jobmanager.new-web-frontend: 
true` to the config.

By default, the server listens at `http://localhost:8082`.

The implementation uses almost pure netty, which is fast and lightweight 
(dependency wise), and we are using netty anyways in the network stack for data 
exchange.

The server currently answers the following requests:

http://localhost:8082/overview
http://localhost:8082/jobs
http://localhost:8082/jobs/job-id
http://localhost:8082/jobs/job-id/vertices
http://localhost:8082/jobs/job-id/plan

Here, job-id refers to a the ID of a current or archived job.

All requests respond with JSON.

I am working with someone that helps me draft a frontend (HTML5 + 
angular.js) that renders the information and issues the requests against the 
given URLs. I'll share more as soon as we have something that is worth sharing.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink web_frontend_2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/677.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #677


commit 482d12f155a66e22120f5e0a9993a5b3e56503a5
Author: Stephan Ewen se...@apache.org
Date:   2015-04-06T16:27:26Z

[FLINK-297] [web frontend] First part of JobManager runtime monitor REST API

 - Adds a separate Maven project for easier maintenance. Also allows users 
to refer to runtime without web libraries.
 - Simple HTTP server based on netty http (slim dependency, since we use 
netty anyways)
 - REST URL parsing via jauter netty router
 - Abstract stubs for handlers that deal with errors and request/response
 - First set of URL request handlers that produce JSON responses




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1976) Add ForwardedFields* hints for the optimizer

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543390#comment-14543390
 ] 

ASF GitHub Bot commented on FLINK-1976:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/663#issuecomment-101972007
  
Guys, 

Maybe it makes sense to merge this :)
It's been here for a while. 


 Add ForwardedFields* hints for the  optimizer
 -

 Key: FLINK-1976
 URL: https://issues.apache.org/jira/browse/FLINK-1976
 Project: Flink
  Issue Type: Wish
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Assignee: Andra Lungu

 Some classes in Graph.java can be improved by adding ForwardedFields* 
 annotations. For instance, EmitOneEdgePerNode, 
 EmitOneVertexWithEdgeValuePerNode, EmitOneEdgeWithNeighborPerNode, 
 ProjectEdgeWithNeighbor, etc. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1976][gelly] Added ForwardedFields Anno...

2015-05-14 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/663#issuecomment-101972007
  
Guys, 

Maybe it makes sense to merge this :)
It's been here for a while. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1523) Vertex-centric iteration extensions

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543408#comment-14543408
 ] 

ASF GitHub Bot commented on FLINK-1523:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/537#issuecomment-101978637
  
Hi @vasia, 

I had a look at the new branch. The changes look good, degrees are no 
longer exposed to the user and the current approach removes the need to 
subclass Vertex. :+1: 

The only small remark/comment I have comes from a user perspective: 
- let's say that, by mistake, I forgot to set the degrees option;
- let's also say I was too busy to read the manual :)
- result: I will get -1 instead of the expected number of degrees per vertex

I understand why you had to pass -1 there; it should be of the same type as 
the degrees. However, maybe we can come up with some way to hint users that 
they should not forget to set the corresponding options.  Adding an extra line 
in the documentation might not suffice.  


 Vertex-centric iteration extensions
 ---

 Key: FLINK-1523
 URL: https://issues.apache.org/jira/browse/FLINK-1523
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Andra Lungu

 We would like to make the following extensions to the vertex-centric 
 iterations of Gelly:
 - allow vertices to access their in/out degrees and the total number of 
 vertices of the graph, inside the iteration.
 - allow choosing the neighborhood type (in/out/all) over which to run the 
 vertex-centric iteration. Now, the model uses the updates of the in-neighbors 
 to calculate state and send messages to out-neighbors. We could add a 
 parameter with value in/out/all to the {{VertexUpdateFunction}} and 
 {{MessagingFunction}}, that would indicate the type of neighborhood.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1523][gelly] Vertex centric iteration e...

2015-05-14 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/537#issuecomment-101978637
  
Hi @vasia, 

I had a look at the new branch. The changes look good, degrees are no 
longer exposed to the user and the current approach removes the need to 
subclass Vertex. :+1: 

The only small remark/comment I have comes from a user perspective: 
- let's say that, by mistake, I forgot to set the degrees option;
- let's also say I was too busy to read the manual :)
- result: I will get -1 instead of the expected number of degrees per vertex

I understand why you had to pass -1 there; it should be of the same type as 
the degrees. However, maybe we can come up with some way to hint users that 
they should not forget to set the corresponding options.  Adding an extra line 
in the documentation might not suffice.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-2012) addVertices, addEdges, removeVertices, removeEdges methods

2015-05-14 Thread Andra Lungu (JIRA)
Andra Lungu created FLINK-2012:
--

 Summary: addVertices, addEdges, removeVertices, removeEdges methods
 Key: FLINK-2012
 URL: https://issues.apache.org/jira/browse/FLINK-2012
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Assignee: Andra Lungu
Priority: Minor


Currently, Gelly only allows the addition/deletion of one vertex/edge at a 
time. If a user would want to add two (or more) vertices, he/she would need to 
add a vertex- create a new graph; then add another vertex - another graph 
etc.  

It would be nice to also have addVertices, addEdges, removeVertices, 
removeEdges methods. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1398) A new DataSet function: extractElementFromTuple

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543419#comment-14543419
 ] 

ASF GitHub Bot commented on FLINK-1398:
---

Github user FelixNeutatz commented on the pull request:

https://github.com/apache/flink/pull/308#issuecomment-101981739
  
So, what is the final decision here?

  this seems like a lot of code for something that can be achieved using a 
simple mapper
-- that is exactly the reason why I would want this functionality - it is 
too much code for a simple thing :)


 A new DataSet function: extractElementFromTuple
 ---

 Key: FLINK-1398
 URL: https://issues.apache.org/jira/browse/FLINK-1398
 Project: Flink
  Issue Type: Wish
Reporter: Felix Neutatz
Assignee: Felix Neutatz
Priority: Minor

 This is the use case:
 {code:xml}
 DataSetTuple2Integer, Double data =  env.fromElements(new 
 Tuple2Integer, Double(1,2.0));
 
 data.map(new ElementFromTuple());
 
 }
 public static final class ElementFromTuple implements 
 MapFunctionTuple2Integer, Double, Double {
 @Override
 public Double map(Tuple2Integer, Double value) {
 return value.f1;
 }
 }
 {code}
 It would be awesome if we had something like this:
 {code:xml}
 data.extractElement(1);
 {code}
 This means that we implement a function for DataSet which extracts a certain 
 element from a given Tuple.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1398] Introduce extractSingleField() in...

2015-05-14 Thread FelixNeutatz
Github user FelixNeutatz commented on the pull request:

https://github.com/apache/flink/pull/308#issuecomment-101981739
  
So, what is the final decision here?

  this seems like a lot of code for something that can be achieved using a 
simple mapper
-- that is exactly the reason why I would want this functionality - it is 
too much code for a simple thing :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1707) Add an Affinity Propagation Library Method

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543449#comment-14543449
 ] 

ASF GitHub Bot commented on FLINK-1707:
---

Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-101989242
  
@joey001, 

How did you rebase this? It should not normally contain everyone's 
commits... Right now it looks like 18 people participated in the implementation 
of Affinity Propagation :)

Also a common practice is to write a PR updated comment after adding 
something to your pull request. That way people will be notified and will 
review your changes asap. If you don't comment, you run the risk of having your 
PR hang in here longer ;)  


 Add an Affinity Propagation Library Method
 --

 Key: FLINK-1707
 URL: https://issues.apache.org/jira/browse/FLINK-1707
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: joey
Priority: Minor

 This issue proposes adding the an implementation of the Affinity Propagation 
 algorithm as a Gelly library method and a corresponding example.
 The algorithm is described in paper [1] and a description of a vertex-centric 
 implementation can be found is [2].
 [1]: http://www.psi.toronto.edu/affinitypropagation/FreyDueckScience07.pdf
 [2]: http://event.cwi.nl/grades2014/00-ching-slides.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-1707][WIP]Add an Affinity Propagation L...

2015-05-14 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/649#issuecomment-101989242
  
@joey001, 

How did you rebase this? It should not normally contain everyone's 
commits... Right now it looks like 18 people participated in the implementation 
of Affinity Propagation :)

Also a common practice is to write a PR updated comment after adding 
something to your pull request. That way people will be notified and will 
review your changes asap. If you don't comment, you run the risk of having your 
PR hang in here longer ;)  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1711] - Converted all usages of Commons...

2015-05-14 Thread lokeshrajaram
Github user lokeshrajaram closed the pull request at:

https://github.com/apache/flink/pull/673


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-1711] - Converted all usages of Commons...

2015-05-14 Thread lokeshrajaram
GitHub user lokeshrajaram reopened a pull request:

https://github.com/apache/flink/pull/673

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

[FLINK-1711] - Converted all usages of Commons Validate to Guava Checks(for 
Java classes), Scala predef require(for Scala classes)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lokeshrajaram/flink all_guava

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/673.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #673


commit 04e1695d3b8414616216264a5b0972d762664ec7
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-10T01:57:36Z

converted all usages of Commons Validate to Guava Checks

commit 4f68d03d50d0fab47f5067906ec805f4a8b93cfa
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T02:29:03Z

converted all usages of commons validate to guava checks(for Java classes), 
scala predef require(for scala classes)

commit 1ecf70952a75728a2e2b9ae70e8f2c66ca9d337a
Author: lrajaram lokesh_raja...@intuit.com
Date:   2015-05-14T14:43:03Z

added guava dependency for flink-spargel module




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1711) Replace all usages off commons.Validate with guava.check

2015-05-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544863#comment-14544863
 ] 

ASF GitHub Bot commented on FLINK-1711:
---

Github user lokeshrajaram closed the pull request at:

https://github.com/apache/flink/pull/673


 Replace all usages off commons.Validate with guava.check
 

 Key: FLINK-1711
 URL: https://issues.apache.org/jira/browse/FLINK-1711
 Project: Flink
  Issue Type: Improvement
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Lokesh Rajaram
Priority: Minor
  Labels: easyfix, starter
 Fix For: 0.9


 Per discussion on the mailing list, we decided to increase homogeneity. One 
 part is to consistently use the Guava methods {{checkNotNull}} and 
 {{checkArgument}}, rather than Apache Commons Lang3 {{Validate}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)