[jira] [Commented] (FLINK-3586) Risk of data overflow while use sum/count to calculate AVG value

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299487#comment-15299487
 ] 

ASF GitHub Bot commented on FLINK-3586:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2024#issuecomment-221477372
  
I would like to merge this later today.


> Risk of data overflow while use sum/count to calculate AVG value
> 
>
> Key: FLINK-3586
> URL: https://issues.apache.org/jira/browse/FLINK-3586
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API
>Reporter: Chengxiang Li
>Assignee: Fabian Hueske
>Priority: Minor
>
> Now, we use {{(sum: Long, count: Long)}} to store the AVG partial aggregate 
> data, which carries a risk of data overflow. We should use an unbounded data 
> type (such as BigInteger) to store the sum for the affected input types.
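The overflow concern can be sketched outside of Flink. The class below is an illustrative accumulator only (not the Table API's actual aggregate implementation): it keeps the running sum as a BigInteger, so summing many large longs cannot wrap.

```java
import java.math.BigInteger;

// Illustrative sketch only: an AVG accumulator that stores the partial sum
// as a BigInteger instead of a long, so adding many large values cannot
// overflow. This is not Flink's actual aggregate class.
class BigIntAvgAccumulator {
    private BigInteger sum = BigInteger.ZERO;
    private long count = 0L;

    void add(long value) {
        // BigInteger addition is unbounded, unlike long + long
        sum = sum.add(BigInteger.valueOf(value));
        count++;
    }

    long average() {
        // Integer average of all added values; the true average of longs
        // always fits back into a long, so longValueExact cannot throw here
        return sum.divide(BigInteger.valueOf(count)).longValueExact();
    }
}
```

With a plain `long` sum, adding `Long.MAX_VALUE` twice would already wrap to a negative value; here the average comes back exact.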



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3586] Fix potential overflow of Long AV...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2024#issuecomment-221477372
  
I would like to merge this later today.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-3968) Cancel button on a running job not available on smaller screens

2016-05-24 Thread Lokesh Ravindranathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Ravindranathan updated FLINK-3968:
-
Description: I am using a Mac with Chrome as my browser. I run a Flink job 
and when I look at the dashboard, there is no cancel button and I cannot scroll 
the page to the right on my Mac display(13"). But when I connect to a second 
monitor (24"), the button is visible. I presume this should be a rendering 
problem. The two screenshots one having the cancel button and the other missing 
the same - http://imgur.com/a/1nTSg.  (was: I am using a Mac with Chrome as my 
browser. I run a Flink job and when I look at the dashboard, there is no cancel 
button and I cannot scroll the page to the right on my Mac display(13"). But 
when I connect to a second monitor (24"), i can see the button visible. I 
presume this should be a rendering problem. The two screenshots one having the 
cancel button and the other missing the same - http://imgur.com/a/1nTSg.)

> Cancel button on a running job not available on smaller screens
> ---
>
> Key: FLINK-3968
> URL: https://issues.apache.org/jira/browse/FLINK-3968
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Affects Versions: 1.0.1
>Reporter: Lokesh Ravindranathan
>Priority: Minor
>
> I am using a Mac with Chrome as my browser. When I run a Flink job and look 
> at the dashboard, there is no cancel button and I cannot scroll the page to 
> the right on my 13" Mac display. But when I connect to a second 24" monitor, 
> the button is visible. I presume this is a rendering problem. Two 
> screenshots, one with the cancel button and one without: 
> http://imgur.com/a/1nTSg.





[jira] [Updated] (FLINK-3968) Cancel button on a running job not available on smaller screens

2016-05-24 Thread Lokesh Ravindranathan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Ravindranathan updated FLINK-3968:
-
Description: I am using a Mac with Chrome as my browser. I run a Flink job 
and when I look at the dashboard, there is no cancel button and I cannot scroll 
the page to the right on my Mac display(13"). But when I connect to a second 
monitor (24"), i can see the button visible. I presume this should be a 
rendering problem. The two screenshots one having the cancel button and the 
other missing the same - http://imgur.com/a/1nTSg.  (was: I am using a Mac with 
Chrome as my browser. I run a Flink job and when I look at the dashboard, there 
is no cancel button and I cannot scroll the page to the right on my Mac 
display(13"). But when I connect to a second monitor (24th), i can see the 
button visible. I presume this should be a rendering problem. The two 
screenshots one having the cancel button and the other missing the same - 
http://imgur.com/a/1nTSg.)

> Cancel button on a running job not available on smaller screens
> ---
>
> Key: FLINK-3968
> URL: https://issues.apache.org/jira/browse/FLINK-3968
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Affects Versions: 1.0.1
>Reporter: Lokesh Ravindranathan
>Priority: Minor
>
> I am using a Mac with Chrome as my browser. When I run a Flink job and look 
> at the dashboard, there is no cancel button and I cannot scroll the page to 
> the right on my 13" Mac display. But when I connect to a second 24" monitor, 
> the button is visible. I presume this is a rendering problem. Two 
> screenshots, one with the cancel button and one without: 
> http://imgur.com/a/1nTSg.





[jira] [Created] (FLINK-3968) Cancel button on a running job not available on smaller screens

2016-05-24 Thread Lokesh Ravindranathan (JIRA)
Lokesh Ravindranathan created FLINK-3968:


 Summary: Cancel button on a running job not available on smaller 
screens
 Key: FLINK-3968
 URL: https://issues.apache.org/jira/browse/FLINK-3968
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: 1.0.1
Reporter: Lokesh Ravindranathan
Priority: Minor


I am using a Mac with Chrome as my browser. When I run a Flink job and look 
at the dashboard, there is no cancel button and I cannot scroll the page to 
the right on my 13" Mac display. But when I connect to a second 24" monitor, 
the button is visible. I presume this is a rendering problem. Two screenshots, 
one with the cancel button and one without: http://imgur.com/a/1nTSg.





[jira] [Commented] (FLINK-2044) Implementation of Gelly HITS Algorithm

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299312#comment-15299312
 ] 

ASF GitHub Bot commented on FLINK-2044:
---

Github user gallenvara commented on the pull request:

https://github.com/apache/flink/pull/1956#issuecomment-221452144
  
@vasia fixed!


> Implementation of Gelly HITS Algorithm
> --
>
> Key: FLINK-2044
> URL: https://issues.apache.org/jira/browse/FLINK-2044
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Ahamd Javid
>Assignee: GaoLun
>Priority: Minor
>
> Implementation of the HITS algorithm in the Gelly API using Java. The feature 
> branch can be found here: https://github.com/JavidMayar/flink/commits/HITS





[GitHub] flink pull request: [FLINK-2044] [gelly] Implementation of Gelly H...

2016-05-24 Thread gallenvara
Github user gallenvara commented on the pull request:

https://github.com/apache/flink/pull/1956#issuecomment-221452144
  
@vasia fixed!




[jira] [Commented] (FLINK-3967) Provide RethinkDB Sink for Flink

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299296#comment-15299296
 ] 

ASF GitHub Bot commented on FLINK-3967:
---

GitHub user mans2singh opened a pull request:

https://github.com/apache/flink/pull/2031

FLINK-3967 - Flink Sink for Rethink Db

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mans2singh/flink FLINK-3967

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2031.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2031


commit 98507a314587890f20b888a21df0b4b574f168a7
Author: mans2singh 
Date:   2016-05-25T00:54:44Z

FLINK-3967 - Flink Sink for Rethink Db




> Provide RethinkDB Sink for Flink
> 
>
> Key: FLINK-3967
> URL: https://issues.apache.org/jira/browse/FLINK-3967
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming, Streaming Connectors
>Affects Versions: 1.0.3
> Environment: All
>Reporter: Mans Singh
>Assignee: Mans Singh
>Priority: Minor
>  Labels: features
> Fix For: 1.1.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Provide a sink to stream data from Flink to RethinkDB.
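Neither Flink's SinkFunction interface nor the RethinkDB driver is shown in this thread, so the sketch below is a self-contained, hypothetical stand-in for the per-record buffering such a sink typically does. `BufferingSink` and its batch size are assumptions for illustration; a real sink would replace the list append in `flush()` with a RethinkDB insert.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a streaming sink: records are buffered and
// written out in batches. A real RethinkDB sink would issue a database
// insert in flush() instead of appending to an in-memory list.
class BufferingSink<T> {
    private final int batchSize;
    private final List<T> buffer = new ArrayList<>();
    private final List<List<T>> writes = new ArrayList<>(); // stands in for db writes

    BufferingSink(int batchSize) { this.batchSize = batchSize; }

    // Called once per incoming stream record
    void invoke(T record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Write out the current batch; a real sink would also flush on close()
    void flush() {
        if (!buffer.isEmpty()) {
            writes.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    List<List<T>> completedWrites() { return writes; }
}
```

Batching like this trades per-record write latency for fewer round trips to the database, which is the usual design choice for connector sinks.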





[GitHub] flink pull request: FLINK-3967 - Flink Sink for Rethink Db

2016-05-24 Thread mans2singh
GitHub user mans2singh opened a pull request:

https://github.com/apache/flink/pull/2031

FLINK-3967 - Flink Sink for Rethink Db

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mans2singh/flink FLINK-3967

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2031.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2031


commit 98507a314587890f20b888a21df0b4b574f168a7
Author: mans2singh 
Date:   2016-05-25T00:54:44Z

FLINK-3967 - Flink Sink for Rethink Db






[GitHub] flink pull request: FLINK-3967 - Flink Streaming Sink for Rethink ...

2016-05-24 Thread mans4singh
Github user mans4singh closed the pull request at:

https://github.com/apache/flink/pull/2030




[jira] [Commented] (FLINK-3967) Provide RethinkDB Sink for Flink

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299295#comment-15299295
 ] 

ASF GitHub Bot commented on FLINK-3967:
---

Github user mans4singh closed the pull request at:

https://github.com/apache/flink/pull/2030


> Provide RethinkDB Sink for Flink
> 
>
> Key: FLINK-3967
> URL: https://issues.apache.org/jira/browse/FLINK-3967
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming, Streaming Connectors
>Affects Versions: 1.0.3
> Environment: All
>Reporter: Mans Singh
>Assignee: Mans Singh
>Priority: Minor
>  Labels: features
> Fix For: 1.1.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Provide a sink to stream data from Flink to RethinkDB.





[jira] [Commented] (FLINK-3967) Provide RethinkDB Sink for Flink

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299282#comment-15299282
 ] 

ASF GitHub Bot commented on FLINK-3967:
---

GitHub user mans4singh opened a pull request:

https://github.com/apache/flink/pull/2030

FLINK-3967 - Flink Streaming Sink for Rethink Db

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mans2singh/flink FLINK-3967

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2030.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2030


commit 98507a314587890f20b888a21df0b4b574f168a7
Author: mans2singh 
Date:   2016-05-25T00:54:44Z

FLINK-3967 - Flink Sink for Rethink Db




> Provide RethinkDB Sink for Flink
> 
>
> Key: FLINK-3967
> URL: https://issues.apache.org/jira/browse/FLINK-3967
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming, Streaming Connectors
>Affects Versions: 1.0.3
> Environment: All
>Reporter: Mans Singh
>Assignee: Mans Singh
>Priority: Minor
>  Labels: features
> Fix For: 1.1.0
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Provide a sink to stream data from Flink to RethinkDB.





[GitHub] flink pull request: FLINK-3967 - Flink Streaming Sink for Rethink ...

2016-05-24 Thread mans4singh
GitHub user mans4singh opened a pull request:

https://github.com/apache/flink/pull/2030

FLINK-3967 - Flink Streaming Sink for Rethink Db

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mans2singh/flink FLINK-3967

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2030.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2030


commit 98507a314587890f20b888a21df0b4b574f168a7
Author: mans2singh 
Date:   2016-05-25T00:54:44Z

FLINK-3967 - Flink Sink for Rethink Db






[jira] [Created] (FLINK-3967) Provide RethinkDB Sink for Flink

2016-05-24 Thread Mans Singh (JIRA)
Mans Singh created FLINK-3967:
-

 Summary: Provide RethinkDB Sink for Flink
 Key: FLINK-3967
 URL: https://issues.apache.org/jira/browse/FLINK-3967
 Project: Flink
  Issue Type: New Feature
  Components: Streaming, Streaming Connectors
Affects Versions: 1.0.3
 Environment: All
Reporter: Mans Singh
Assignee: Mans Singh
Priority: Minor
 Fix For: 1.1.0


Provide a sink to stream data from Flink to RethinkDB.





[jira] [Commented] (FLINK-2044) Implementation of Gelly HITS Algorithm

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298855#comment-15298855
 ] 

ASF GitHub Bot commented on FLINK-2044:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1956#issuecomment-221394386
  
Hey @gallenvara,
I had a private chat with @greghogan about this PR. We think we should change 
the label type to a boolean instead of a String. It should make a difference 
for large graph inputs. After this last change we'll go ahead and finally 
merge this :)
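The memory argument can be made concrete with a toy vertex value. The class below is hypothetical (not Gelly's actual types): a primitive boolean flag marks hub vs. authority in place of a per-vertex String label, avoiding one String object per vertex on large graphs.

```java
// Hypothetical vertex value for illustration only (not Gelly's API):
// a primitive boolean marks hub vs. authority, so large graphs avoid
// storing a String object such as "hub"/"authority" per vertex.
class HitsVertexValue {
    final boolean isHub;  // true = hub, false = authority; replaces a String label
    final double score;

    HitsVertexValue(boolean isHub, double score) {
        this.isHub = isHub;
        this.score = score;
    }
}
```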


> Implementation of Gelly HITS Algorithm
> --
>
> Key: FLINK-2044
> URL: https://issues.apache.org/jira/browse/FLINK-2044
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Reporter: Ahamd Javid
>Assignee: GaoLun
>Priority: Minor
>
> Implementation of the HITS algorithm in the Gelly API using Java. The feature 
> branch can be found here: https://github.com/JavidMayar/flink/commits/HITS





[GitHub] flink pull request: [FLINK-2044] [gelly] Implementation of Gelly H...

2016-05-24 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1956#issuecomment-221394386
  
Hey @gallenvara,
I had a private chat with @greghogan about this PR. We think we should change 
the label type to a boolean instead of a String. It should make a difference 
for large graph inputs. After this last change we'll go ahead and finally 
merge this :)




[jira] [Commented] (FLINK-2814) DeltaIteration: DualInputPlanNode cannot be cast to SingleInputPlanNode

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298739#comment-15298739
 ] 

ASF GitHub Bot commented on FLINK-2814:
---

GitHub user rekhajoshm opened a pull request:

https://github.com/apache/flink/pull/2029

[FLINK-2814] Fix for DualInputPlanNode cannot be cast to SingleInputPlanNode

[FLINK-2814] Fix for DeltaIteration: DualInputPlanNode cannot be cast to 
SingleInputPlanNode

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/flink FLINK-2814

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2029.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2029


commit 166407b6e35ffb104d11d8b0c41ace34bfbb0ecd
Author: Joshi 
Date:   2016-05-24T19:14:15Z

[FLINK-2814] DeltaIteration: DualInputPlanNode cannot be cast to 
SingleInputPlanNode




> DeltaIteration: DualInputPlanNode cannot be cast to SingleInputPlanNode
> ---
>
> Key: FLINK-2814
> URL: https://issues.apache.org/jira/browse/FLINK-2814
> Project: Flink
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 0.10.0
>Reporter: Greg Hogan
>Assignee: Rekha Joshi
>
> A delta iteration that closes with a solution set which is a {{JoinOperator}} 
> throws the following exception:
> {noformat}
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:444)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:345)
>   at org.apache.flink.client.program.Client.runBlocking(Client.java:289)
>   at 
> org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:669)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:324)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:969)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1019)
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.optimizer.plan.DualInputPlanNode cannot be cast to 
> org.apache.flink.optimizer.plan.SingleInputPlanNode
>   at 
> org.apache.flink.optimizer.dag.WorksetIterationNode.instantiate(WorksetIterationNode.java:432)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.addLocalCandidates(TwoInputNode.java:557)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:478)
>   at 
> org.apache.flink.optimizer.dag.DataSinkNode.getAlternativePlans(DataSinkNode.java:204)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:309)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:308)
>   at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:500)
>   at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:402)
>   at 
> org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:271)
>   at 
> org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:543)
>   at org.apache.flink.client.program.Client.runBlocking(Client.java:350)
>   at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:64)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:796)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:424)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1365)
>   at Driver.main(Driver.java:366)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:429)
>   ... 6 more
> {noformat}
> A temporary fix is to attach an identity mapper.
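The workaround amounts to inserting a no-op map between the join and the solution set, so the optimizer sees a single-input operator rather than the JoinOperator itself. Flink's classes are not reproduced here; this self-contained sketch shows the identity mapper as a plain function, where in a real DataSet program the equivalent would be a `MapFunction<T, T>` that returns its input unchanged.

```java
import java.util.function.UnaryOperator;

// Illustrative identity mapper: passes every element through unchanged.
// In a delta iteration, chaining such a map after the join means the
// solution-set delta is produced by a single-input operator.
class IdentityMapper {
    static <T> UnaryOperator<T> identity() {
        return element -> element; // no-op transformation
    }
}
```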





[GitHub] flink pull request: [FLINK-2814] Fix for DualInputPlanNode cannot ...

2016-05-24 Thread rekhajoshm
GitHub user rekhajoshm opened a pull request:

https://github.com/apache/flink/pull/2029

[FLINK-2814] Fix for DualInputPlanNode cannot be cast to SingleInputPlanNode

[FLINK-2814] Fix for DeltaIteration: DualInputPlanNode cannot be cast to 
SingleInputPlanNode

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/flink FLINK-2814

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2029.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2029


commit 166407b6e35ffb104d11d8b0c41ace34bfbb0ecd
Author: Joshi 
Date:   2016-05-24T19:14:15Z

[FLINK-2814] DeltaIteration: DualInputPlanNode cannot be cast to 
SingleInputPlanNode






[jira] [Assigned] (FLINK-2814) DeltaIteration: DualInputPlanNode cannot be cast to SingleInputPlanNode

2016-05-24 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi reassigned FLINK-2814:
--

Assignee: Rekha Joshi

> DeltaIteration: DualInputPlanNode cannot be cast to SingleInputPlanNode
> ---
>
> Key: FLINK-2814
> URL: https://issues.apache.org/jira/browse/FLINK-2814
> Project: Flink
>  Issue Type: Bug
>  Components: Optimizer
>Affects Versions: 0.10.0
>Reporter: Greg Hogan
>Assignee: Rekha Joshi
>
> A delta iteration that closes with a solution set which is a {{JoinOperator}} 
> throws the following exception:
> {noformat}
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:444)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:345)
>   at org.apache.flink.client.program.Client.runBlocking(Client.java:289)
>   at 
> org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:669)
>   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:324)
>   at 
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:969)
>   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1019)
> Caused by: java.lang.ClassCastException: 
> org.apache.flink.optimizer.plan.DualInputPlanNode cannot be cast to 
> org.apache.flink.optimizer.plan.SingleInputPlanNode
>   at 
> org.apache.flink.optimizer.dag.WorksetIterationNode.instantiate(WorksetIterationNode.java:432)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.addLocalCandidates(TwoInputNode.java:557)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:478)
>   at 
> org.apache.flink.optimizer.dag.DataSinkNode.getAlternativePlans(DataSinkNode.java:204)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:309)
>   at 
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:308)
>   at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:500)
>   at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:402)
>   at 
> org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:271)
>   at 
> org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:543)
>   at org.apache.flink.client.program.Client.runBlocking(Client.java:350)
>   at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:64)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:796)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:424)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1365)
>   at Driver.main(Driver.java:366)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:429)
>   ... 6 more
> {noformat}
> A temporary fix is to attach an identity mapper.





[GitHub] flink pull request: [FLINK-2771] Fix for IterateTest.testSimpleIte...

2016-05-24 Thread rekhajoshm
GitHub user rekhajoshm opened a pull request:

https://github.com/apache/flink/pull/2028

[FLINK-2771] Fix for IterateTest.testSimpleIteration fail on Travis

[FLINK-2771] Fix for IterateTest.testSimpleIteration fail on Travis

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/flink FLINK-2771

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2028


commit d7df832c17592d840d1419e4b6bc1764664c2e75
Author: Joshi 
Date:   2016-05-24T18:46:40Z

[FLINK-2771] Fix for IterateTest.testSimpleIteration fail on Travis






[jira] [Commented] (FLINK-2771) IterateTest.testSimpleIteration fails on Travis

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298692#comment-15298692
 ] 

ASF GitHub Bot commented on FLINK-2771:
---

GitHub user rekhajoshm opened a pull request:

https://github.com/apache/flink/pull/2028

[FLINK-2771] Fix for IterateTest.testSimpleIteration fail on Travis

[FLINK-2771] Fix for IterateTest.testSimpleIteration fail on Travis

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/flink FLINK-2771

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2028


commit d7df832c17592d840d1419e4b6bc1764664c2e75
Author: Joshi 
Date:   2016-05-24T18:46:40Z

[FLINK-2771] Fix for IterateTest.testSimpleIteration fail on Travis




> IterateTest.testSimpleIteration fails on Travis
> ---
>
> Key: FLINK-2771
> URL: https://issues.apache.org/jira/browse/FLINK-2771
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Till Rohrmann
>Assignee: Rekha Joshi
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> The {{IterateTest.testSimpleIteration}} failed on Travis with
> {code}
> Failed tests: 
>   IterateTest.testSimpleIteration:384 null
> {code}
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/81986242/log.txt 





[jira] [Assigned] (FLINK-2771) IterateTest.testSimpleIteration fails on Travis

2016-05-24 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi reassigned FLINK-2771:
--

Assignee: Rekha Joshi

> IterateTest.testSimpleIteration fails on Travis
> ---
>
> Key: FLINK-2771
> URL: https://issues.apache.org/jira/browse/FLINK-2771
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Till Rohrmann
>Assignee: Rekha Joshi
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.0.0
>
>
> The {{IterateTest.testSimpleIteration}} failed on Travis with
> {code}
> Failed tests: 
>   IterateTest.testSimpleIteration:384 null
> {code}
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/81986242/log.txt 





[jira] [Created] (FLINK-3966) AvroTypeInfo does not work with readonly avro specific records

2016-05-24 Thread Matthew Burghoffer (JIRA)
Matthew Burghoffer created FLINK-3966:
-

 Summary: AvroTypeInfo does not work with readonly avro specific 
records
 Key: FLINK-3966
 URL: https://issues.apache.org/jira/browse/FLINK-3966
 Project: Flink
  Issue Type: Bug
  Components: Avro Support
Affects Versions: 1.0.2, 1.0.3
Reporter: Matthew Burghoffer
Priority: Minor


When generating Avro code, users often supply createSetters=false and 
fieldVisibility=private for Avro specific records, making them effectively 
immutable. As a result, TypeExtractor.isValidPojoField rejects the generated 
class as an invalid POJO (even though it is a perfectly valid Avro object) and 
the TypeInformation factory fails.

Specific records are usually created through builders, so Flink could perhaps 
use the builder mechanism whenever it needs to create a new record or mutate an 
existing one.
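The failing check can be sketched roughly as follows. This is a simplified, hypothetical model of the rule TypeExtractor.isValidPojoField applies, not Flink's actual code; `PojoCheckSketch` and `ImmutableRecord` are illustrative names. A non-public field only counts if both a getter and a setter are reachable, which createSetters=false breaks:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class PojoCheckSketch {
    // Simplified version of the POJO-field rule: a field is usable only if it
    // is public, or reachable via both a getter AND a setter.
    static boolean isValidPojoField(Class<?> clazz, Field f) {
        if (Modifier.isPublic(f.getModifiers())) {
            return true;
        }
        String name = f.getName();
        String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        boolean hasGetter = hasMethod(clazz, "get" + suffix) || hasMethod(clazz, "is" + suffix);
        boolean hasSetter = hasMethod(clazz, "set" + suffix);
        return hasGetter && hasSetter; // createSetters=false breaks this condition
    }

    static boolean hasMethod(Class<?> clazz, String name) {
        for (java.lang.reflect.Method m : clazz.getMethods()) {
            if (m.getName().equals(name)) return true;
        }
        return false;
    }

    // Shape of a record generated with fieldVisibility=private, createSetters=false.
    public static class ImmutableRecord {
        private String id;
        public String getId() { return id; }
        // no setId(...)
    }

    public static void main(String[] args) throws Exception {
        Field id = ImmutableRecord.class.getDeclaredField("id");
        System.out.println(isValidPojoField(ImmutableRecord.class, id)); // false
    }
}
```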





[jira] [Commented] (FLINK-2915) JobManagerProcessFailureBatchRecoveryITCase

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298665#comment-15298665
 ] 

ASF GitHub Bot commented on FLINK-2915:
---

GitHub user rekhajoshm opened a pull request:

https://github.com/apache/flink/pull/2027

[FLINK-2915] Fix for JobManagerProcessFailureBatchRecoveryITCase Test

[FLINK-2915] Fix for JobManagerProcessFailureBatchRecoveryITCase Test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/flink FLINK-2915

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2027


commit c62499f32db284cb0b81db19c918065fc39db1e5
Author: Joshi 
Date:   2016-05-24T18:19:26Z

[FLINK-2915] Fix for JobManagerProcessFailureBatchRecoveryITCase Test




> JobManagerProcessFailureBatchRecoveryITCase
> ---
>
> Key: FLINK-2915
> URL: https://issues.apache.org/jira/browse/FLINK-2915
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Assignee: Rekha Joshi
>Priority: Critical
>  Labels: test-stability
>
> https://travis-ci.org/apache/flink/jobs/87193692
> {noformat}
> Failed tests:   
> JobManagerProcessFailureBatchRecoveryITCase>AbstractJobManagerProcessFailureRecoveryITCase.testJobManagerProcessFailure:259
>  JobManager did not start up within 291736881301 nanoseconds.
> {noformat}





[GitHub] flink pull request: [FLINK-2915] Fix for JobManagerProcessFailureB...

2016-05-24 Thread rekhajoshm
GitHub user rekhajoshm opened a pull request:

https://github.com/apache/flink/pull/2027

[FLINK-2915] Fix for JobManagerProcessFailureBatchRecoveryITCase Test

[FLINK-2915] Fix for JobManagerProcessFailureBatchRecoveryITCase Test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rekhajoshm/flink FLINK-2915

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2027


commit c62499f32db284cb0b81db19c918065fc39db1e5
Author: Joshi 
Date:   2016-05-24T18:19:26Z

[FLINK-2915] Fix for JobManagerProcessFailureBatchRecoveryITCase Test






[jira] [Assigned] (FLINK-2915) JobManagerProcessFailureBatchRecoveryITCase

2016-05-24 Thread Rekha Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rekha Joshi reassigned FLINK-2915:
--

Assignee: Rekha Joshi

> JobManagerProcessFailureBatchRecoveryITCase
> ---
>
> Key: FLINK-2915
> URL: https://issues.apache.org/jira/browse/FLINK-2915
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Assignee: Rekha Joshi
>Priority: Critical
>  Labels: test-stability
>
> https://travis-ci.org/apache/flink/jobs/87193692
> {noformat}
> Failed tests:   
> JobManagerProcessFailureBatchRecoveryITCase>AbstractJobManagerProcessFailureRecoveryITCase.testJobManagerProcessFailure:259
>  JobManager did not start up within 291736881301 nanoseconds.
> {noformat}





[jira] [Created] (FLINK-3965) Delegating GraphAlgorithm

2016-05-24 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3965:
-

 Summary: Delegating GraphAlgorithm
 Key: FLINK-3965
 URL: https://issues.apache.org/jira/browse/FLINK-3965
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 1.1.0
Reporter: Greg Hogan
Assignee: Greg Hogan
 Fix For: 1.1.0


Complex and related algorithms often overlap in computation of data. Two such 
examples are:
1) the local and global clustering coefficients each use a listing of triangles
2) the local clustering coefficient joins on vertex degree, and the underlying 
triangle listing annotates edge degree which uses vertex degree

We can reuse and rewrite algorithm output by creating a {{ProxyObject}} as a 
delegate for method calls to the {{DataSet}} returned by the algorithm.
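The delegation idea can be sketched with a standard java.lang.reflect.Proxy. This is illustrative only; `Result` and `lazyResult` are hypothetical names, not Gelly's actual {{ProxyObject}} implementation:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

public class DelegateSketch {
    interface Result { int count(); }

    // Hypothetical delegate: method calls are forwarded to a lazily created
    // target, so two algorithms sharing a computation can hand out proxies
    // immediately and swap in the (possibly shared) real result later.
    static Result lazyResult(Supplier<Result> factory) {
        return (Result) Proxy.newProxyInstance(
            Result.class.getClassLoader(),
            new Class<?>[]{Result.class},
            new InvocationHandler() {
                private Result target; // created on first method call
                @Override
                public Object invoke(Object proxy, Method m, Object[] args) throws Exception {
                    if (target == null) target = factory.get();
                    return m.invoke(target, args);
                }
            });
    }

    public static void main(String[] args) {
        Result r = lazyResult(() -> () -> 42);
        System.out.println(r.count()); // 42
    }
}
```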





[jira] [Commented] (FLINK-3761) Introduce key group state backend

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298371#comment-15298371
 ] 

ASF GitHub Bot commented on FLINK-3761:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221312885
  
Yes definitely: better be safe than sorry. Will remove the 
`createPartitionedStateBackend` method from `AbstractStateBackend`.


> Introduce key group state backend
> -
>
> Key: FLINK-3761
> URL: https://issues.apache.org/jira/browse/FLINK-3761
> Project: Flink
>  Issue Type: Sub-task
>  Components: state backends
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> After an off-line discussion with [~aljoscha], we came to the conclusion that 
> it would be beneficial to reflect the differences between a keyed and a 
> non-keyed stream also in the state backends. A state backend which is used 
> for a keyed stream offers a value, list, folding and reducing state and has to 
> group its keys into key groups. 
> A state backend for non-keyed streams can only offer a union state to make it 
> work with dynamic scaling. A union state is a state which is broadcasted to 
> all tasks in case of a recovery. The state backends can then select what 
> information they need to recover from the whole state (formerly distributed).
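For context, the key-group idea referenced above can be sketched as follows. This is a simplified model; Flink's real assignment additionally applies a murmur hash to the key's hashCode, and the method names here are illustrative:

```java
public class KeyGroupSketch {
    // Key groups are the unit of state redistribution: a key hashes into one
    // of maxParallelism groups, and each parallel subtask owns a contiguous
    // range of groups, so rescaling moves whole groups rather than single keys.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    static int operatorFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        // With maxParallelism = 128 and parallelism = 4, the lowest and
        // highest key groups land on the first and last subtasks.
        System.out.println(operatorFor(0, 128, 4));   // 0
        System.out.println(operatorFor(127, 128, 4)); // 3
    }
}
```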





[GitHub] flink pull request: [FLINK-3761] Introduction of key groups

2016-05-24 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221312885
  
Yes definitely: better be safe than sorry. Will remove the 
`createPartitionedStateBackend` method from `AbstractStateBackend`.




[jira] [Commented] (FLINK-3761) Introduce key group state backend

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298360#comment-15298360
 ] 

ASF GitHub Bot commented on FLINK-3761:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221311349
  
Yep, this last paragraph with the Factory is what I hinted at with my 
comment.  This may be somewhat academic but if there is a method 
`getPartitionedStateBackend` the likelihood of it being wrongly used is 
somewhat high.


> Introduce key group state backend
> -
>
> Key: FLINK-3761
> URL: https://issues.apache.org/jira/browse/FLINK-3761
> Project: Flink
>  Issue Type: Sub-task
>  Components: state backends
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> After an off-line discussion with [~aljoscha], we came to the conclusion that 
> it would be beneficial to reflect the differences between a keyed and a 
> non-keyed stream also in the state backends. A state backend which is used 
> for a keyed stream offers a value, list, folding and reducing state and has to 
> group its keys into key groups. 
> A state backend for non-keyed streams can only offer a union state to make it 
> work with dynamic scaling. A union state is a state which is broadcasted to 
> all tasks in case of a recovery. The state backends can then select what 
> information they need to recover from the whole state (formerly distributed).





[GitHub] flink pull request: [FLINK-3761] Introduction of key groups

2016-05-24 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221311349
  
Yep, this last paragraph with the Factory is what I hinted at with my 
comment. 😃 This may be somewhat academic but if there is a method 
`getPartitionedStateBackend` the likelihood of it being wrongly used is 
somewhat high.




[jira] [Commented] (FLINK-3761) Introduce key group state backend

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298355#comment-15298355
 ] 

ASF GitHub Bot commented on FLINK-3761:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221309557
  
Thanks for the initial feedback @aljoscha :-)

The introduction of `PartitionedState` is indeed not strictly necessary for 
this PR. The idea was that we will have partitioned and non-partitioned state 
in the future. `PartitionedState` is the key-value state backed by the 
`PartitionedStateBackend` whereas non-partitioned state is backed by the 
`AbstractStateBackend`. The first non-partitioned state (apart from the state 
serialized via `CheckpointStateOutputStream`) could be the redistributable 
non-partitioned state necessary for the `KafkaSources`, for example. Thus, the 
`PartitionedState` is more of a logical separation and it lays the foundation 
so that also non-keyed stream operators can use a proper state abstraction. But 
I can revert it if you deem it redundant or premature.

It is true that the `PartitionedStateBackend` and the 
`KeyGroupStateBackend` have **almost** the same signature. However, the changes 
you've mentioned are imho crucial and made the whole refactoring of the state 
backends necessary in the first place. The difference is that the 
`KeyGroupStateBackend` is aware of the key groups and, consequently, is able to 
snapshot and restore each key group individually. Trying to work around this 
would mean that the `PartitionedStateBackend` always has a single key group 
associated. But for that, it would have to know the sub task index of the 
enclosing `StreamOperator` to assign a sensible key group index. Furthermore, 
it wouldn't make sense to use any other `PartitionedStateBackend` than the 
`KeyGroupStateBackend` (given that it respects the `KeyGroupAssigner`) for the 
`AbstractStreamOperator`, because the data is shuffled according to the key 
group assignments. In general, I think the notion of key groups touches 
too many parts of the Flink runtime, so it no longer makes sense to try to 
unify the `KeyGroupStateBackends` and `PartitionedStateBackends`. The state 
backends used by the `AbstractStreamOperator` have to be aware of that notion.

You can regard the `PartitionedStateBackend` as an internal class which was 
introduced to reuse the existing state backend implementations via the 
`GenericKeyGroupStateBackend`. In the future it might make sense to directly 
implement the `KeyGroupStateBackend` interface to decrease the key group 
overhead. It's just unfortunate that Java does not allow to specify package 
private methods. Otherwise, I would have declared the 
`createPartitionedStateBackend` as package private. But since the 
`GenericKeyGroupStateBackend` resides in a sub-package of 
`o.a.f.runtime.state`, it cannot access this method. But I think we could 
refactor it the following way: Remove `createPartitionedStateBackend`, make 
`createKeyGroupStateBackend` abstract, let the implementations of 
`AbstractStateBackend` implement a `PartitionedStateBackendFactory` interface 
and define the `createKeyGroupStateBackend` method for all 
`AbstractStateBackend` implementations by creating a 
`GenericKeyGroupStateBackend`, which requires a 
`PartitionedStateBackendFactory`. That would probably be a better design.
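The proposed refactoring could look roughly like this. This is a sketch under the assumption that the factory hands out one per-key-group backend; all signatures are guesses from the discussion, not Flink's actual API:

```java
public class FactoryDesignSketch {

    interface PartitionedStateBackend {
        void snapshot();
    }

    // Backend implementations expose the per-key-group building block...
    interface PartitionedStateBackendFactory {
        PartitionedStateBackend createPartitionedStateBackend(int keyGroupIndex);
    }

    // ...while the only externally visible entry point creates the
    // key-group-aware backend from that factory.
    static abstract class AbstractStateBackend implements PartitionedStateBackendFactory {
        public GenericKeyGroupStateBackend createKeyGroupStateBackend(int numKeyGroups) {
            return new GenericKeyGroupStateBackend(this, numKeyGroups);
        }
    }

    static class GenericKeyGroupStateBackend {
        private final PartitionedStateBackend[] perGroup;

        GenericKeyGroupStateBackend(PartitionedStateBackendFactory factory, int numKeyGroups) {
            perGroup = new PartitionedStateBackend[numKeyGroups];
            for (int i = 0; i < numKeyGroups; i++) {
                perGroup[i] = factory.createPartitionedStateBackend(i);
            }
        }

        // Each key group can be snapshotted individually.
        void snapshotKeyGroup(int keyGroup) { perGroup[keyGroup].snapshot(); }
        int numKeyGroups() { return perGroup.length; }
    }

    public static void main(String[] args) {
        AbstractStateBackend backend = new AbstractStateBackend() {
            public PartitionedStateBackend createPartitionedStateBackend(int i) {
                return () -> System.out.println("snapshot group " + i);
            }
        };
        GenericKeyGroupStateBackend kg = backend.createKeyGroupStateBackend(4);
        System.out.println(kg.numKeyGroups()); // 4
        kg.snapshotKeyGroup(2);                // prints "snapshot group 2"
    }
}
```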


> Introduce key group state backend
> -
>
> Key: FLINK-3761
> URL: https://issues.apache.org/jira/browse/FLINK-3761
> Project: Flink
>  Issue Type: Sub-task
>  Components: state backends
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> After an off-line discussion with [~aljoscha], we came to the conclusion that 
> it would be beneficial to reflect the differences between a keyed and a 
> non-keyed stream also in the state backends. A state backend which is used 
> for a keyed stream offers a value, list, folding and reducing state and has to 
> group its keys into key groups. 
> A state backend for non-keyed streams can only offer a union state to make it 
> work with dynamic scaling. A union state is a state which is broadcasted to 
> all tasks in case of a recovery. The state backends can then select what 
> information they need to recover from the whole state (formerly distributed).





[GitHub] flink pull request: [FLINK-3761] Introduction of key groups

2016-05-24 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221309557
  
Thanks for the initial feedback @aljoscha :-)

The introduction of `PartitionedState` is indeed not strictly necessary for 
this PR. The idea was that we will have partitioned and non-partitioned state 
in the future. `PartitionedState` is the key-value state backed by the 
`PartitionedStateBackend` whereas non-partitioned state is backed by the 
`AbstractStateBackend`. The first non-partitioned state (apart from the state 
serialized via `CheckpointStateOutputStream`) could be the redistributable 
non-partitioned state necessary for the `KafkaSources`, for example. Thus, the 
`PartitionedState` is more of a logical separation and it lays the foundation 
so that also non-keyed stream operators can use a proper state abstraction. But 
I can revert it if you deem it redundant or premature.

It is true that the `PartitionedStateBackend` and the 
`KeyGroupStateBackend` have **almost** the same signature. However, the changes 
you've mentioned are imho crucial and made the whole refactoring of the state 
backends necessary in the first place. The difference is that the 
`KeyGroupStateBackend` is aware of the key groups and, consequently, is able to 
snapshot and restore each key group individually. Trying to work around this 
would mean that the `PartitionedStateBackend` always has a single key group 
associated. But for that, it would have to know the sub task index of the 
enclosing `StreamOperator` to assign a sensible key group index. Furthermore, 
it wouldn't make sense to use any other `PartitionedStateBackend` than the 
`KeyGroupStateBackend` (given that it respects the `KeyGroupAssigner`) for the 
`AbstractStreamOperator`, because the data is shuffled according to the key 
group assignments. In general, I think the notion of key groups touches 
too many parts of the Flink runtime, so it no longer makes sense to try to unify the 
`KeyGroupStateBackends` and `PartitionedStateBackends`. The state backends used 
by the `AbstractStreamOperator` have to be aware of that notion.

You can regard the `PartitionedStateBackend` as an internal class which was 
introduced to reuse the existing state backend implementations via the 
`GenericKeyGroupStateBackend`. In the future it might make sense to directly 
implement the `KeyGroupStateBackend` interface to decrease the key group 
overhead. It's just unfortunate that Java does not allow to specify package 
private methods. Otherwise, I would have declared the 
`createPartitionedStateBackend` as package private. But since the 
`GenericKeyGroupStateBackend` resides in a sub-package of 
`o.a.f.runtime.state`, it cannot access this method. But I think we could 
refactor it the following way: Remove `createPartitionedStateBackend`, make 
`createKeyGroupStateBackend` abstract, let the implementations of 
`AbstractStateBackend` implement a `PartitionedStateBackendFactory` interface 
and define the `createKeyGroupStateBackend` method for all 
`AbstractStateBackend` implementations by creating a 
`GenericKeyGroupStateBackend`, which requires a `PartitionedStateBackendFactory`. 
That would probably be a better design.




[GitHub] flink pull request: [Flink-2971][table] Add outer joins to the Tab...

2016-05-24 Thread dawidwys
Github user dawidwys commented on the pull request:

https://github.com/apache/flink/pull/1981#issuecomment-221307901
  
@fhueske I have uploaded the updated PR. Unfortunately there are some 
strange VM crashes that I think are not related to the changes.




[jira] [Updated] (FLINK-3239) Support for Kerberos enabled Kafka 0.9.0.0

2016-05-24 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3239:
---
Assignee: Vijay Srinivasaraghavan  (was: Stefano Baghino)

> Support for Kerberos enabled Kafka 0.9.0.0
> --
>
> Key: FLINK-3239
> URL: https://issues.apache.org/jira/browse/FLINK-3239
> Project: Flink
>  Issue Type: New Feature
>Reporter: Niels Basjes
>Assignee: Vijay Srinivasaraghavan
> Attachments: flink3239-prototype.patch
>
>
> In Kafka 0.9.0.0 support for Kerberos has been created ( KAFKA-1686 ).
> Request: Allow Flink to forward/manage the Kerberos tickets for Kafka 
> correctly so that we can use Kafka in a secured environment.
> I expect the needed changes to be similar to FLINK-2977 which implements the 
> same support for HBase.





[GitHub] flink pull request: [FLINK-3761] Introduction of key groups

2016-05-24 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221289448
  
I started looking into it, but man this is one big change... 😃 

I have some first remarks about API and internals:

What's the reason for the introduction of `PartitionedState`? The Javadoc 
for `State` already says that it is the base class for partitioned state and 
that it is only usable on a `KeyedStream`.

The signature of `KeyGroupedStateBackend` and `PartitionedStateBackend` is 
exactly the same. `AbstractStateBackend` has both methods, 
`createPartitionedStateBackend` and `createKeyGroupStateBackend`. Users of an 
`AbstractStateBackend` should only ever call the latter while the former is 
reserved for internal use by the default implementation for 
`KeyGroupedStateBackend` which is `GenericKeyGroupStateBackend`. Also, 
`AbstractStreamOperator` has the new method `getKeyGroupStateBackend` that 
should be used by operators such as the `WindowOperator` to deal with 
partitioned state. Now, where am I going with this? What I think is that the 
`AbstractStateBackend` should only have a method 
`createPartitionedStateBackend` that is externally visible. This would be used 
by the `AbstractStreamOperator` to create a state backend and users of the 
interface, i.e. `WindowOperator` would also deal just with 
`PartitionedStateBackend`, which they get from 
`AbstractStreamOperator.getPartitionedStateBackend`. The fact that there are 
these key groups should not be visible to 
users of a state backend. Internally, state backends would use the 
`GenericKeyGroupStateBackend`, they could provide an interface to it for 
creating non-key-grouped backends.

Above, "exactly the same" is not 100 % correct, since the snapshot/restore 
methods differ slightly but I think this could be worked around. Also, I found 
it quite hard to express what I actually mean but I hope you get my point. 😅 




[jira] [Commented] (FLINK-3761) Introduce key group state backend

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298250#comment-15298250
 ] 

ASF GitHub Bot commented on FLINK-3761:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1988#issuecomment-221289448
  
I started looking into it, but man this is one big change...  

I have some first remarks about API and internals:

What's the reason for the introduction of `PartitionedState`? The Javadoc 
for `State` already says that it is the base class for partitioned state and 
that it is only usable on a `KeyedStream`.

The signature of `KeyGroupedStateBackend` and `PartitionedStateBackend` is 
exactly the same. `AbstractStateBackend` has both methods, 
`createPartitionedStateBackend` and `createKeyGroupStateBackend`. Users of an 
`AbstractStateBackend` should only ever call the latter while the former is 
reserved for internal use by the default implementation for 
`KeyGroupedStateBackend` which is `GenericKeyGroupStateBackend`. Also, 
`AbstractStreamOperator` has the new method `getKeyGroupStateBackend` that 
should be used by operators such as the `WindowOperator` to deal with 
partitioned state. Now, where am I going with this? What I think is that the 
`AbstractStateBackend` should only have a method 
`createPartitionedStateBackend` that is externally visible. This would be used 
by the `AbstractStreamOperator` to create a state backend and users of the 
interface, i.e. `WindowOperator` would also deal just with 
`PartitionedStateBackend`, which they get from 
`AbstractStreamOperator.getPartitionedStateBackend`. The fact that there are 
these key groups should not be visible to users of a state backend. Internally, 
state backends would use the `GenericKeyGroupStateBackend`, they could provide 
an interface to it for creating non-key-grouped backends.

Above, "exactly the same" is not 100 % correct, since the snapshot/restore 
methods differ slightly but I think this could be worked around. Also, I found 
it quite hard to express what I actually mean but I hope you get my point.  


> Introduce key group state backend
> -
>
> Key: FLINK-3761
> URL: https://issues.apache.org/jira/browse/FLINK-3761
> Project: Flink
>  Issue Type: Sub-task
>  Components: state backends
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> After an off-line discussion with [~aljoscha], we came to the conclusion that 
> it would be beneficial to reflect the differences between a keyed and a 
> non-keyed stream also in the state backends. A state backend which is used 
> for a keyed stream offers a value, list, folding and reducing state and has to 
> group its keys into key groups. 
> A state backend for non-keyed streams can only offer a union state to make it 
> work with dynamic scaling. A union state is a state which is broadcasted to 
> all tasks in case of a recovery. The state backends can then select what 
> information they need to recover from the whole state (formerly distributed).





[jira] [Commented] (FLINK-3922) Infinite recursion on TypeExtractor

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298241#comment-15298241
 ] 

ASF GitHub Bot commented on FLINK-3922:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/2011#issuecomment-221288209
  
Does this also prevent infinite recursion on non-transitive recursive types?

Something like:
```java
public class A {
public B field;
}

public class B {
public A field;
}
```
Or (recursion via generics):
```java
public class Container<T> {
public T field;
}

public class MyType extends Container<MyType> {}
```

Can these be handled correctly right now?


> Infinite recursion on TypeExtractor
> ---
>
> Key: FLINK-3922
> URL: https://issues.apache.org/jira/browse/FLINK-3922
> Project: Flink
>  Issue Type: Bug
>  Components: Type Serialization System
>Affects Versions: 1.0.2
>Reporter: Flavio Pompermaier
>Assignee: Timo Walther
>Priority: Critical
>
> This program cause a StackOverflow (infinite recursion) in the TypeExtractor:
> {code:title=TypeSerializerStackOverflowOnRecursivePojo.java|borderStyle=solid}
> public class TypeSerializerStackOverflowOnRecursivePojo {
>   public static class RecursivePojo<K, V> implements Serializable {
>   private static final long serialVersionUID = 1L;
>   
>   private RecursivePojo parent;
>   public RecursivePojo(){}
>   public RecursivePojo(K k, V v) {
>   }
>   public RecursivePojo getParent() {
>   return parent;
>   }
>   public void setParent(RecursivePojo parent) {
>   this.parent = parent;
>   }
>   
>   }
>   public static class TypedTuple extends Tuple3<String, String, RecursivePojo<String, Map<String, String>>> {
>   private static final long serialVersionUID = 1L;
>   }
>   
>   public static void main(String[] args) throws Exception {
>   ExecutionEnvironment env = 
> ExecutionEnvironment.getExecutionEnvironment();
>   env.fromCollection(Arrays.asList(new RecursivePojo<String, Map<String, String>>("test", new HashMap<String, String>())))
>   .map(t -> {TypedTuple ret = new TypedTuple(); ret.setFields("1", "1", t); return ret;}).returns(TypedTuple.class)
>   .print();
>   }
>   
> }
> {code}
> The thrown Exception is the following:
> {code:title=Exception thrown}
> Exception in thread "main" java.lang.StackOverflowError
>   at 
> sun.reflect.generics.parser.SignatureParser.parsePackageNameAndSimpleClassTypeSignature(SignatureParser.java:328)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseClassTypeSignature(SignatureParser.java:310)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseFieldTypeSignature(SignatureParser.java:289)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseFieldTypeSignature(SignatureParser.java:283)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseTypeSignature(SignatureParser.java:485)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseReturnType(SignatureParser.java:627)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseMethodTypeSignature(SignatureParser.java:577)
>   at 
> sun.reflect.generics.parser.SignatureParser.parseMethodSig(SignatureParser.java:171)
>   at 
> sun.reflect.generics.repository.ConstructorRepository.parse(ConstructorRepository.java:55)
>   at 
> sun.reflect.generics.repository.ConstructorRepository.parse(ConstructorRepository.java:43)
>   at 
> sun.reflect.generics.repository.AbstractRepository.<init>(AbstractRepository.java:74)
>   at 
> sun.reflect.generics.repository.GenericDeclRepository.<init>(GenericDeclRepository.java:49)
>   at 
> sun.reflect.generics.repository.ConstructorRepository.<init>(ConstructorRepository.java:51)
>   at 
> sun.reflect.generics.repository.MethodRepository.<init>(MethodRepository.java:46)
>   at 
> sun.reflect.generics.repository.MethodRepository.make(MethodRepository.java:59)
>   at java.lang.reflect.Method.getGenericInfo(Method.java:102)
>   at java.lang.reflect.Method.getGenericReturnType(Method.java:255)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.isValidPojoField(TypeExtractor.java:1610)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.analyzePojo(TypeExtractor.java:1671)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1559)
>   at 
> org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:732)

[GitHub] flink pull request: [FLINK-3922] [types] Infinite recursion on Typ...

2016-05-24 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/2011#issuecomment-221288209
  
Does this also prevent infinite recursion on non transitive recursive types?

Something like:
```java
public class A {
public B field;
}

public class B {
public A field;
}
```
Or (recursion via generics):
```java
public class Container {
public T field;
}

public class MyType extends Container {}
```

Can these be handled correctly right now?


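A minimal sketch of the kind of cycle guard the question asks about: track the types currently on the traversal stack and stop when one reappears, which catches both self recursion and the mutual `A -> B -> A` case. This is plain Java reflection under assumed names, NOT Flink's actual `TypeExtractor`, and it deliberately ignores the generics case (`Container<T>`) from the second example.

```java
import java.lang.reflect.Field;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative cycle guard: walk the public fields of a POJO type and stop
// when a type already on the traversal stack reappears. Class and method
// names here are hypothetical.
public class RecursionGuard {

    public static boolean isRecursive(Class<?> clazz) {
        return visit(clazz, new ArrayDeque<Class<?>>());
    }

    private static boolean visit(Class<?> clazz, Deque<Class<?>> stack) {
        if (clazz.isPrimitive() || clazz.getName().startsWith("java.")) {
            return false; // leaf types cannot recurse back into user POJOs
        }
        if (stack.contains(clazz)) {
            return true;  // cycle detected: this type is already being visited
        }
        stack.push(clazz);
        try {
            for (Field field : clazz.getFields()) {
                if (visit(field.getType(), stack)) {
                    return true;
                }
            }
        } finally {
            stack.pop();
        }
        return false;
    }

    // The mutually recursive pair from the comment above.
    public static class A { public B field; }
    public static class B { public A field; }

    public static void main(String[] args) {
        System.out.println(isRecursive(A.class)); // true
    }
}
```

A visited-set keyed on the raw class, as above, would not distinguish different generic instantiations of the same container, which is why the `Container<T>` case needs extra care.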


[jira] [Closed] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3963.
-
Resolution: Fixed

Fixed via 5b9872492394026f3e6ac31b9937141ebedb1481 by replacing Netty's 
ConcurrentHashMap with Java's.

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Reopened] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-3963:
---

The import is still not correct and makes the Hadoop 1 builds fail.

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Commented] (FLINK-2314) Make Streaming File Sources Persistent

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298209#comment-15298209
 ] 

ASF GitHub Bot commented on FLINK-2314:
---

Github user kl0u commented on a diff in the pull request:

https://github.com/apache/flink/pull/2020#discussion_r64395572
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/checkpointing/StreamFaultToleranceTestBase.java
 ---
@@ -104,8 +106,25 @@ public void runCheckpointedProgram() {
postSubmit();
}
catch (Exception e) {
+   Throwable th = e;
+   int depth = 0;
+
+   for (; depth < 20; depth++) {
+   if (th instanceof SuccessException) {
+   try {
+   postSubmit();
+   } catch (Exception e1) {
+   e1.printStackTrace();
--- End diff --

Thanks @aljoscha ! Done.


> Make Streaming File Sources Persistent
> --
>
> Key: FLINK-2314
> URL: https://issues.apache.org/jira/browse/FLINK-2314
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Kostas Kloudas
>
> Streaming File sources should participate in the checkpointing. They should 
> track the bytes they read from the file and checkpoint it.
> One can look at the sequence generating source function for an example of a 
> checkpointed source.



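The idea described in the issue can be sketched in plain Java: the source tracks how far it has read, and that position is the checkpoint state. This is only an illustration of the offset-tracking idea with hypothetical names, not Flink's actual source/checkpointing API.

```java
// Minimal sketch of a checkpointed source: the reader tracks how far into
// its input it has read, and that position is the checkpoint state.
public class CheckpointedReader {
    private final String data; // stands in for the file contents
    private int offset;        // position already emitted

    public CheckpointedReader(String data) {
        this.data = data;
    }

    // Emit the next record and advance the tracked position.
    public String readNext() {
        if (offset >= data.length()) {
            return null;
        }
        return String.valueOf(data.charAt(offset++));
    }

    // Checkpoint: the state to persist is just the current offset.
    public int snapshotState() {
        return offset;
    }

    // Recovery: a fresh reader over the same input resumes from the offset.
    public void restoreState(int checkpointedOffset) {
        this.offset = checkpointedOffset;
    }
}
```

On recovery, a new reader over the same file restored with the last snapshot continues exactly where the checkpoint was taken, so no record before the checkpoint is re-emitted.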


[GitHub] flink pull request: [FLINK-2314] Make Streaming File Sources Persi...

2016-05-24 Thread kl0u
Github user kl0u commented on a diff in the pull request:

https://github.com/apache/flink/pull/2020#discussion_r64395572
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/checkpointing/StreamFaultToleranceTestBase.java
 ---
@@ -104,8 +106,25 @@ public void runCheckpointedProgram() {
postSubmit();
}
catch (Exception e) {
+   Throwable th = e;
+   int depth = 0;
+
+   for (; depth < 20; depth++) {
+   if (th instanceof SuccessException) {
+   try {
+   postSubmit();
+   } catch (Exception e1) {
+   e1.printStackTrace();
--- End diff --

Thanks @aljoscha ! Done.




[GitHub] flink pull request: [FLINK-2314] Make Streaming File Sources Persi...

2016-05-24 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/2020#discussion_r64392589
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/checkpointing/StreamFaultToleranceTestBase.java
 ---
@@ -104,8 +106,25 @@ public void runCheckpointedProgram() {
postSubmit();
}
catch (Exception e) {
+   Throwable th = e;
+   int depth = 0;
+
+   for (; depth < 20; depth++) {
+   if (th instanceof SuccessException) {
+   try {
+   postSubmit();
+   } catch (Exception e1) {
+   e1.printStackTrace();
--- End diff --

Should we not forward the exception here? You introduced this block so that 
`postSubmit()` also runs when the `SuccessException` was thrown, right?


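The pattern under discussion, unwrapping the cause chain to find the test harness's success marker, can be sketched like this. `SuccessException` here is a stand-in class, and the depth limit mirrors the diff above as a guard against pathological cause chains:

```java
// Walk the cause chain of a caught exception and report whether the
// harness's success marker is inside. The caller swallows the exception
// (and runs postSubmit()) only in that case; otherwise it should rethrow.
public class CauseUnwrap {

    public static class SuccessException extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    public static boolean containsSuccess(Throwable e) {
        Throwable th = e;
        // Depth limit guards against cyclic or very deep cause chains.
        for (int depth = 0; th != null && depth < 20; depth++) {
            if (th instanceof SuccessException) {
                return true;
            }
            th = th.getCause();
        }
        return false; // genuine failure: the caller should rethrow
    }
}
```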


[jira] [Commented] (FLINK-2314) Make Streaming File Sources Persistent

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298185#comment-15298185
 ] 

ASF GitHub Bot commented on FLINK-2314:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/2020#discussion_r64392589
  
--- Diff: 
flink-tests/src/test/java/org/apache/flink/test/checkpointing/StreamFaultToleranceTestBase.java
 ---
@@ -104,8 +106,25 @@ public void runCheckpointedProgram() {
postSubmit();
}
catch (Exception e) {
+   Throwable th = e;
+   int depth = 0;
+
+   for (; depth < 20; depth++) {
+   if (th instanceof SuccessException) {
+   try {
+   postSubmit();
+   } catch (Exception e1) {
+   e1.printStackTrace();
--- End diff --

Should we not forward the exception here? You introduced this block so that 
`postSubmit()` also runs when the `SuccessException` was thrown, right?


> Make Streaming File Sources Persistent
> --
>
> Key: FLINK-2314
> URL: https://issues.apache.org/jira/browse/FLINK-2314
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Affects Versions: 0.9
>Reporter: Stephan Ewen
>Assignee: Kostas Kloudas
>
> Streaming File sources should participate in the checkpointing. They should 
> track the bytes they read from the file and checkpoint it.
> One can look at the sequence generating source function for an example of a 
> checkpointed source.





[jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298114#comment-15298114
 ] 

ASF GitHub Bot commented on FLINK-3941:
---

Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64383299
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala
 ---
@@ -69,16 +73,23 @@ class DataSetUnion(
   rows + metadata.getRowCount(child)
 }
 
-planner.getCostFactory.makeCost(rowCnt, 0, 0)
+planner.getCostFactory.makeCost(
+  rowCnt,
+  if (all) 0 else rowCnt,
+  if (all) 0 else rowCnt)
   }
 
   override def translateToPlan(
   tableEnv: BatchTableEnvironment,
   expectedType: Option[TypeInformation[Any]]): DataSet[Any] = {
 
-val leftDataSet = 
left.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
-val rightDataSet = 
right.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
-leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
+val leftDataSet = 
left.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
+val rightDataSet = 
right.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
--- End diff --

`expectedType` is passed down to `Union`'s children, which enables a possible 
conversion to `Row` enforced by `Aggregate`.


> Add support for UNION (with duplicate elimination)
> --
>
> Key: FLINK-3941
> URL: https://issues.apache.org/jira/browse/FLINK-3941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Yijie Shen
>Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.



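The SQL semantics being added can be illustrated with plain Java collections: UNION ALL keeps duplicates, while UNION is the same concatenation followed by duplicate elimination, which is exactly why a union plus `DataSet.distinct()` on all fields implements it. This is an illustration only, not the Table API itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// UNION ALL vs. UNION on plain Java lists.
public class UnionSemantics {

    public static List<Integer> unionAll(List<Integer> left, List<Integer> right) {
        // UNION ALL: concatenation, duplicates preserved
        return Stream.concat(left.stream(), right.stream())
                .collect(Collectors.toList());
    }

    public static List<Integer> union(List<Integer> left, List<Integer> right) {
        // UNION: "union all" followed by duplicate elimination
        return Stream.concat(left.stream(), right.stream())
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2);
        List<Integer> b = Arrays.asList(2, 3);
        System.out.println(unionAll(a, b)); // [1, 2, 2, 3]
        System.out.println(union(a, b));    // [1, 2, 3]
    }
}
```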


[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread yjshen
Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64383299
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala
 ---
@@ -69,16 +73,23 @@ class DataSetUnion(
   rows + metadata.getRowCount(child)
 }
 
-planner.getCostFactory.makeCost(rowCnt, 0, 0)
+planner.getCostFactory.makeCost(
+  rowCnt,
+  if (all) 0 else rowCnt,
+  if (all) 0 else rowCnt)
   }
 
   override def translateToPlan(
   tableEnv: BatchTableEnvironment,
   expectedType: Option[TypeInformation[Any]]): DataSet[Any] = {
 
-val leftDataSet = 
left.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
-val rightDataSet = 
right.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
-leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
+val leftDataSet = 
left.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
+val rightDataSet = 
right.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
--- End diff --

`expectedType` is passed down to `Union`'s children, which enables a possible 
conversion to `Row` enforced by `Aggregate`.




[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread yjshen
Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64382835
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/DataSetUnion.scala
 ---
@@ -69,16 +73,23 @@ class DataSetUnion(
   rows + metadata.getRowCount(child)
 }
 
-planner.getCostFactory.makeCost(rowCnt, 0, 0)
+planner.getCostFactory.makeCost(
+  rowCnt,
+  if (all) 0 else rowCnt,
+  if (all) 0 else rowCnt)
   }
 
   override def translateToPlan(
   tableEnv: BatchTableEnvironment,
   expectedType: Option[TypeInformation[Any]]): DataSet[Any] = {
 
-val leftDataSet = 
left.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
-val rightDataSet = 
right.asInstanceOf[DataSetRel].translateToPlan(tableEnv)
-leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
+val leftDataSet = 
left.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
+val rightDataSet = 
right.asInstanceOf[DataSetRel].translateToPlan(tableEnv, expectedType)
+if (all) {
+  leftDataSet.union(rightDataSet).asInstanceOf[DataSet[Any]]
+} else {
+  leftDataSet.union(rightDataSet).distinct().asInstanceOf[DataSet[Any]]
--- End diff --

In `DATASET_OPT_RULES`, `UnionToDistinctRule` substitutes `Union` with 
`UnionAll` followed by an `Aggregate`, so this branch is never actually 
executed.




[jira] [Updated] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2155:
--
Fix Version/s: 1.1.0

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0, 1.1.0
>
>
> Add an additional check-style validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport



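Based on the `IllegalImport` module linked in the description, such a rule could look roughly like this inside the `TreeWalker` section of `tools/maven/checkstyle.xml`. This is a sketch only; the exact configuration Flink ended up with may differ:

```xml
<!-- Sketch: fail the build on imports of relocated (shaded) classes. -->
<module name="IllegalImport">
  <property name="illegalPkgs" value="org.apache.flink.shaded"/>
</module>
```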


[jira] [Commented] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Kostas Kloudas (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298086#comment-15298086
 ] 

Kostas Kloudas commented on FLINK-2155:
---

Yes of course! Will do that later in the day.

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0, 1.1.0
>
>
> Add an additional check-style validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[jira] [Updated] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-2155:
--
Affects Version/s: 1.1.0

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0, 1.1.0
>
>
> Add an additional check-style validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[jira] [Commented] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298085#comment-15298085
 ] 

ASF GitHub Bot commented on FLINK-2155:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/2026#issuecomment-221251824
  
Follow-up issue is here: https://issues.apache.org/jira/browse/FLINK-2155


> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0
>
>
> Add an additional check-style validation for illegal imports.
> To begin with, the following two package imports are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[jira] [Closed] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels closed FLINK-3963.
-
Resolution: Fixed

Fixed with cbee4ef20431be9d934a25ba89a801b16b4f85dd

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[GitHub] flink pull request: [hotfix] Removed shaded import

2016-05-24 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/2026#issuecomment-221251824
  
Follow-up issue is here: https://issues.apache.org/jira/browse/FLINK-2155




[jira] [Reopened] (FLINK-2155) Add an additional checkstyle validation for illegal imports

2016-05-24 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reopened FLINK-2155:
---
  Assignee: Kostas Kloudas  (was: Lokesh Rajaram)

The Checkstyle rule for checking for shaded imports doesn't seem to work 
correctly (see FLINK-3963). [~kkl0u] Could you take a look?

> Add an additional checkstyle validation for illegal imports
> ---
>
> Key: FLINK-2155
> URL: https://issues.apache.org/jira/browse/FLINK-2155
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Lokesh Rajaram
>Assignee: Kostas Kloudas
> Fix For: 0.10.0
>
>
> Add an additional check-style validation for illegal imports.
> To begin with the following two package import are marked as illegal:
>  1. org.apache.commons.lang3.Validate
>  2. org.apache.flink.shaded.*
> Implementation based on: 
> http://checkstyle.sourceforge.net/config_imports.html#IllegalImport





[GitHub] flink pull request: [hotfix] Removed shaded import

2016-05-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2026




[jira] [Commented] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Kostas Kloudas (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298082#comment-15298082
 ] 

Kostas Kloudas commented on FLINK-3963:
---

Thanks a lot [~mxm] for opening this. The PR is 
https://github.com/apache/flink/pull/2026

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[GitHub] flink pull request: [hotfix] Removed shaded import

2016-05-24 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/2026#issuecomment-221250699
  
+1 going to merge. Let's fix the Checkstyle rule in a separate issue/pull 
request.




[jira] [Commented] (FLINK-3758) Add possibility to register accumulators in custom triggers

2016-05-24 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298079#comment-15298079
 ] 

Aljoscha Krettek commented on FLINK-3758:
-

The issue for the monitoring is this one: FLINK-456

> Add possibility to register accumulators in custom triggers
> ---
>
> Key: FLINK-3758
> URL: https://issues.apache.org/jira/browse/FLINK-3758
> Project: Flink
>  Issue Type: Improvement
>Reporter: Konstantin Knauf
>Assignee: Konstantin Knauf
>Priority: Minor
>
> For monitoring purposes it would be nice to be able to use accumulators in 
> custom trigger functions. 
> Basically, the trigger context could just expose {{getAccumulator}} of 
> {{RuntimeContext}} or does this create problems I am not aware of?
> Adding accumulators in a trigger function is more difficult, I think, but 
> that's not really necessary as the accumulator could just be added in some 
> other upstream operator.





[jira] [Commented] (FLINK-3758) Add possibility to register accumulators in custom triggers

2016-05-24 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298074#comment-15298074
 ] 

Aljoscha Krettek commented on FLINK-3758:
-

[~knaufk], we might have to reconsider how this is going to work. I assume you 
wanted to give the Triggers access to accumulators to do monitoring of some 
sort? If this is correct, we should probably integrate this with the new 
monitoring features that [~Zentol] is currently working on.


> Add possibility to register accumulators in custom triggers
> ---
>
> Key: FLINK-3758
> URL: https://issues.apache.org/jira/browse/FLINK-3758
> Project: Flink
>  Issue Type: Improvement
>Reporter: Konstantin Knauf
>Assignee: Konstantin Knauf
>Priority: Minor
>
> For monitoring purposes it would be nice to be able to use accumulators in 
> custom trigger functions. 
> Basically, the trigger context could just expose {{getAccumulator}} of 
> {{RuntimeContext}} or does this create problems I am not aware of?
> Adding accumulators in a trigger function is more difficult, I think, but 
> that's not really necessary as the accumulator could just be added in some 
> other upstream operator.





[GitHub] flink pull request: [hotfix] Removed shaded import

2016-05-24 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/2026#issuecomment-221248085
  
Yes, give me a sec.




[GitHub] flink pull request: [hotfix] Removed shaded import

2016-05-24 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/2026#issuecomment-221247552
  
The fix looks okay.
However, can you add a rule to the checkstyle config in 
"tools/maven/checkstyle.xml" to ensure that the build will fail if people use 
wrong imports?




[jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298073#comment-15298073
 ] 

ASF GitHub Bot commented on FLINK-3941:
---

Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64377174
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -44,7 +44,7 @@ class AggregateMapFunction[IN, OUT](
 
   override def map(value: IN): OUT = {
 
-val input = value.asInstanceOf[Row]
+val input = value.asInstanceOf[Product]
--- End diff --

Ah, I think I get why this happens, will fix this.


> Add support for UNION (with duplicate elimination)
> --
>
> Key: FLINK-3941
> URL: https://issues.apache.org/jira/browse/FLINK-3941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Yijie Shen
>Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.





[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread yjshen
Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64377174
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -44,7 +44,7 @@ class AggregateMapFunction[IN, OUT](
 
   override def map(value: IN): OUT = {
 
-val input = value.asInstanceOf[Row]
+val input = value.asInstanceOf[Product]
--- End diff --

Ah, I think I get why this happens, will fix this.




[GitHub] flink pull request: [hotfix] Removed shaded import

2016-05-24 Thread kl0u
GitHub user kl0u opened a pull request:

https://github.com/apache/flink/pull/2026

[hotfix] Removed shaded import



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kl0u/flink hotfix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2026.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2026


commit 04d5a067fd3234c2e353f16ff21ba98494098307
Author: kl0u 
Date:   2016-05-24T11:56:05Z

[hotfix] Removed shaded import






[jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298057#comment-15298057
 ] 

ASF GitHub Bot commented on FLINK-3941:
---

Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64371728
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -44,7 +44,7 @@ class AggregateMapFunction[IN, OUT](
 
   override def map(value: IN): OUT = {
 
-val input = value.asInstanceOf[Row]
+val input = value.asInstanceOf[Product]
--- End diff --

In `Execution mode = COLLECTION, Table config = EFFICIENT` for `testUnion`, 
the `value` is of `scala.Tuple3` type, so the cast does not work as expected?


> Add support for UNION (with duplicate elimination)
> --
>
> Key: FLINK-3941
> URL: https://issues.apache.org/jira/browse/FLINK-3941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Yijie Shen
>Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a union() method.





[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread yjshen
Github user yjshen commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64371728
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -44,7 +44,7 @@ class AggregateMapFunction[IN, OUT](
 
   override def map(value: IN): OUT = {
 
-val input = value.asInstanceOf[Row]
+val input = value.asInstanceOf[Product]
--- End diff --

In `Execution mode = COLLECTION, Table config = EFFICIENT` for `testUnion`, 
the `value` is of `scala.Tuple3` type, so the cast does not work as expected?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-3964) Job submission times out with recursive.file.enumeration

2016-05-24 Thread Juho Autio (JIRA)
Juho Autio created FLINK-3964:
-

 Summary: Job submission times out with recursive.file.enumeration
 Key: FLINK-3964
 URL: https://issues.apache.org/jira/browse/FLINK-3964
 Project: Flink
  Issue Type: Bug
Reporter: Juho Autio


When using "recursive.file.enumeration" with a large enough folder structure to 
list, the Flink batch job fails right at the beginning because of a timeout.

h2. Problem details

We get this error: {{Communication with JobManager failed: Job submission to 
the JobManager timed out}}.

The code we have is basically this:

{code}
val env = ExecutionEnvironment.getExecutionEnvironment

val parameters = new Configuration

// set the recursive enumeration parameter
parameters.setBoolean("recursive.file.enumeration", true)

val parameter = ParameterTool.fromArgs(args)

val input_data_path : String = parameter.get("input_data_path", null )

val data : DataSet[(Text,Text)] = env.readSequenceFile(classOf[Text], 
classOf[Text], input_data_path)
.withParameters(parameters)

data.first(10).print
{code}

If we set {{input_data_path}} parameter to {{s3n://bucket/path/date=*/}} it 
times out. If we use a more restrictive pattern like 
{{s3n://bucket/path/date=20160523/}}, it doesn't time out.

To me it seems that the time taken to list files shouldn't cause any timeouts at 
the job submission level.

For us this was "fixed" by adding {{akka.client.timeout: 600 s}} in 
{{flink-conf.yaml}}, but I wonder whether the timeout would still occur if we had 
even more files to list.



P.S. Is there any way to set {{akka.client.timeout}} when calling {{bin/flink 
run}} instead of editing {{flink-conf.yaml}}? I tried to add it as a {{-yD}} 
flag but couldn't get it working.





[jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298039#comment-15298039
 ] 

ASF GitHub Bot commented on FLINK-3941:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2025#issuecomment-221232111
  
Changes look good. Can you also please update the supported feature set in 
the docs (`docs/apis/table.md`)?

Should be good to merge once that is done. 
Thanks, Fabian


> Add support for UNION (with duplicate elimination)
> --
>
> Key: FLINK-3941
> URL: https://issues.apache.org/jira/browse/FLINK-3941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Yijie Shen
>Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extend the Table API with union() method.





[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2025#issuecomment-221232111
  
Changes look good. Can you also please update the supported feature set in 
the docs (`docs/apis/table.md`)?

Should be good to merge once that is done. 
Thanks, Fabian




[jira] [Commented] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298035#comment-15298035
 ] 

Maximilian Michels commented on FLINK-3963:
---

Thanks for reporting [~kkl0u]. Do you want to open a pull request?

> AbstractReporter uses shaded dependency
> ---
>
> Key: FLINK-3963
> URL: https://issues.apache.org/jira/browse/FLINK-3963
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> This fails our Hadoop 1 build on Travis.





[jira] [Created] (FLINK-3963) AbstractReporter uses shaded dependency

2016-05-24 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3963:
-

 Summary: AbstractReporter uses shaded dependency
 Key: FLINK-3963
 URL: https://issues.apache.org/jira/browse/FLINK-3963
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Kostas Kloudas
 Fix For: 1.1.0


This fails our Hadoop 1 build on Travis.





[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64367480
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -44,7 +44,7 @@ class AggregateMapFunction[IN, OUT](
 
   override def map(value: IN): OUT = {
 
-val input = value.asInstanceOf[Row]
+val input = value.asInstanceOf[Product]
--- End diff --

Currently, aggregates only support `Row` because the aggregate code is 
not generated yet.
`DataSetAggregate` enforces `Row` as the input type (see `DataSetAggregate` 
line 99), so `value.asInstanceOf[Row]` should be safe. 

Did you observe a problem with this cast?
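
To illustrate the point under discussion, here is a minimal plain-Java analogue (with hypothetical stand-in classes, not Flink's actual `Row` or Scala's `Tuple3`): the cast compiles fine but fails at runtime whenever the value's actual type differs.

```java
public class CastDemo {
    // Hypothetical stand-ins for Flink's Row and a Scala tuple type
    public static class Row {}
    public static class Tuple3 {}

    public static void main(String[] args) {
        Object value = new Tuple3(); // runtime type is Tuple3, not Row
        try {
            Row input = (Row) value; // corresponds to value.asInstanceOf[Row]
            System.out.println("cast succeeded");
        } catch (ClassCastException e) {
            // The cast only fails at runtime, when the enforced input
            // type (Row) does not match the actual value's type.
            System.out.println("ClassCastException: value is not a Row");
        }
    }
}
```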




[jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298031#comment-15298031
 ] 

ASF GitHub Bot commented on FLINK-3941:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/2025#discussion_r64367480
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/runtime/aggregate/AggregateMapFunction.scala
 ---
@@ -44,7 +44,7 @@ class AggregateMapFunction[IN, OUT](
 
   override def map(value: IN): OUT = {
 
-val input = value.asInstanceOf[Row]
+val input = value.asInstanceOf[Product]
--- End diff --

Currently, aggregates only support `Row` because the aggregate code is 
not generated yet.
`DataSetAggregate` enforces `Row` as the input type (see `DataSetAggregate` 
line 99), so `value.asInstanceOf[Row]` should be safe. 

Did you observe a problem with this cast?


> Add support for UNION (with duplicate elimination)
> --
>
> Key: FLINK-3941
> URL: https://issues.apache.org/jira/browse/FLINK-3941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Yijie Shen
>Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extend the Table API with union() method.





[GitHub] flink pull request: [FLINK-2314] Make Streaming File Sources Persi...

2016-05-24 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/2020#issuecomment-221227518
  
CC: @StephanEwen 

By the way, it might not look like it, but the only additional methods this 
introduces on `StreamExecutionEnvironment` are these three:

```
public <OUT> DataStreamSource<OUT> readFile(FileInputFormat<OUT> inputFormat,
                                            String filePath,
                                            WatchType watchType,
                                            long interval)

public <OUT> DataStreamSource<OUT> readFile(FileInputFormat<OUT> inputFormat,
                                            String filePath,
                                            WatchType watchType,
                                            long interval,
                                            FilePathFilter filter)

public <OUT> DataStreamSource<OUT> readFile(FileInputFormat<OUT> inputFormat,
                                            String filePath,
                                            WatchType watchType,
                                            long interval,
                                            FilePathFilter filter,
                                            TypeInformation<OUT> typeInformation)
```

Unfortunately, the rest are public methods that we can't remove, even 
though some of them probably should be.




[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread yjshen
Github user yjshen commented on the pull request:

https://github.com/apache/flink/pull/2025#issuecomment-221225446
  
Hi @fhueske, thanks for the review! The changes are:
- Remove the whole `java/batch/table/UnionITCase.java` 
- Remove `testUnionWithFilter`, `testUnionWithJoin` and 
`testUnionWithAggregate`
- Add SQL Union Test.

Looks better now?




[jira] [Commented] (FLINK-3960) EventTimeWindowCheckpointingITCase fails with a segmentation fault

2016-05-24 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298010#comment-15298010
 ] 

Maximilian Michels commented on FLINK-3960:
---

Merged temporary fix which should be reverted once we have fixed this issue: 
98a939552e12fc699ff39111bbe877e112460ceb

> EventTimeWindowCheckpointingITCase fails with a segmentation fault
> --
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 
> 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  
> rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, 
> rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.





[jira] [Commented] (FLINK-3962) JMXReporter doesn't properly register/deregister metrics

2016-05-24 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298008#comment-15298008
 ] 

Maximilian Michels commented on FLINK-3962:
---

[~Zentol] I assigned you because you had been working on the metrics reporting.
[~StephanEwen] said he probably knows why this occurs.

> JMXReporter doesn't properly register/deregister metrics
> 
>
> Key: FLINK-3962
> URL: https://issues.apache.org/jira/browse/FLINK-3962
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Chesnay Schepler
> Fix For: 1.1.0
>
>
> The following fails our YARN tests, which check for errors in the 
> jobmanager/taskmanager logs:
> {noformat}
> 2016-05-23 19:20:02,349 ERROR org.apache.flink.metrics.reporter.JMXReporter   
>   - A metric with the name 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
>  was already registered.
> javax.management.InstanceAlreadyExistsException: 
> org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
>   at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>   at 
> org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
>   at 
> org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
>   at 
> org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
>   at 
> org.apache.flink.metrics.groups.IOMetricGroup.<init>(IOMetricGroup.java:40)
>   at 
> org.apache.flink.metrics.groups.TaskMetricGroup.<init>(TaskMetricGroup.java:68)
>   at 
> org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
>   at 
> org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1092)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:441)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:283)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
>   at 
> scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
>   at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
>   at 
> org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
>   at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
>   at 
> org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>   at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
>   at akka.dispatch.Mailbox.run(Mailbox.scala:221)
>   at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
>   at 

[jira] [Created] (FLINK-3962) JMXReporter doesn't properly register/deregister metrics

2016-05-24 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3962:
-

 Summary: JMXReporter doesn't properly register/deregister metrics
 Key: FLINK-3962
 URL: https://issues.apache.org/jira/browse/FLINK-3962
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Chesnay Schepler
 Fix For: 1.1.0


The following fails our YARN tests, which check for errors in the 
jobmanager/taskmanager logs:

{noformat}
2016-05-23 19:20:02,349 ERROR org.apache.flink.metrics.reporter.JMXReporter 
- A metric with the name 
org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
 was already registered.
javax.management.InstanceAlreadyExistsException: 
org.apache.flink.metrics:key0=testing-worker-linux-docker-05a6b382-3386-linux-4,key1=taskmanager,key2=9398ca9392af615e9d1896d0bd7ff52a,key3=Flink_Java_Job_at_Mon_May_23_19-20-00_UTC_2016,key4=,name=numBytesIn
at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
at 
org.apache.flink.metrics.reporter.JMXReporter.notifyOfAddedMetric(JMXReporter.java:76)
at 
org.apache.flink.metrics.MetricRegistry.register(MetricRegistry.java:177)
at 
org.apache.flink.metrics.groups.AbstractMetricGroup.addMetric(AbstractMetricGroup.java:191)
at 
org.apache.flink.metrics.groups.AbstractMetricGroup.counter(AbstractMetricGroup.java:144)
at 
org.apache.flink.metrics.groups.IOMetricGroup.<init>(IOMetricGroup.java:40)
at 
org.apache.flink.metrics.groups.TaskMetricGroup.<init>(TaskMetricGroup.java:68)
at 
org.apache.flink.metrics.groups.JobMetricGroup.addTask(JobMetricGroup.java:74)
at 
org.apache.flink.metrics.groups.TaskManagerMetricGroup.addTaskForJob(TaskManagerMetricGroup.java:86)
at 
org.apache.flink.runtime.taskmanager.TaskManager.submitTask(TaskManager.scala:1092)
at 
org.apache.flink.runtime.taskmanager.TaskManager.org$apache$flink$runtime$taskmanager$TaskManager$$handleTaskMessage(TaskManager.scala:441)
at 
org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:283)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:36)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at 
scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
at 
org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at 
org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at 
org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:124)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)
at akka.dispatch.Mailbox.run(Mailbox.scala:221)
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
{noformat}
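
The exception above is standard JMX behavior: registering a second MBean under an {{ObjectName}} that is already taken throws {{InstanceAlreadyExistsException}}. A minimal, Flink-independent reproduction using only the JDK (the metric name below is illustrative, not Flink's actual naming scheme):

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceAlreadyExistsException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxDuplicateDemo {
    // Standard MBean naming convention: interface "XMBean" for class "X"
    public interface CounterMBean { long getCount(); }
    public static class Counter implements CounterMBean {
        public long getCount() { return 0L; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example.metrics:name=numBytesIn");

        server.registerMBean(new Counter(), name); // first registration succeeds
        try {
            server.registerMBean(new Counter(), name); // same name again
        } catch (InstanceAlreadyExistsException e) {
            // This is the exception seen in the log above; a reporter must
            // deregister the old metric (or use a unique name) first.
            System.out.println("duplicate registration rejected");
        }
    }
}
```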





[jira] [Commented] (FLINK-3960) EventTimeWindowCheckpointingITCase fails with a segmentation fault

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297998#comment-15297998
 ] 

ASF GitHub Bot commented on FLINK-3960:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2022


> EventTimeWindowCheckpointingITCase fails with a segmentation fault
> --
>
> Key: FLINK-3960
> URL: https://issues.apache.org/jira/browse/FLINK-3960
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming, Tests
>Affects Versions: 1.1.0
>Reporter: Maximilian Michels
>Assignee: Aljoscha Krettek
>  Labels: test-stability
> Fix For: 1.1.0
>
>
> As a follow-up issue of FLINK-3909, our tests fail with the following. I 
> believe [~aljoscha] is working on a fix.
> {noformat}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7fae0c62a264, pid=72720, tid=140385528268544
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_76-b13) (build 
> 1.7.0_76-b13)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.76-b04 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [librocksdbjni78704726610339516..so+0x13c264]  
> rocksdb_iterator_helper(rocksdb::DB*, rocksdb::ReadOptions, 
> rocksdb::ColumnFamilyHandle*)+0x4
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core 
> dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /home/travis/build/mxm/flink/flink-tests/target/hs_err_pid72720.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> Aborted (core dumped)
> {noformat}
> I propose to disable the test case in the meantime because it is blocking our 
> test execution which we need for pull requests.





[GitHub] flink pull request: [FLINK-3960] ignore EventTimeWindowCheckpointi...

2016-05-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2022




[jira] [Closed] (FLINK-3955) Change Table.toSink() to Table.writeToSink()

2016-05-24 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-3955.

   Resolution: Implemented
Fix Version/s: 1.1.0

Implemented with 829c75c49531c64b5c73acd199d3d2a87388d54f

> Change Table.toSink() to Table.writeToSink()
> 
>
> Key: FLINK-3955
> URL: https://issues.apache.org/jira/browse/FLINK-3955
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
> Fix For: 1.1.0
>
>
> Currently, a {{Table}} can be emitted to a {{TableSink}} using the 
> {{Table.toSink()}} method.
> However, the name of the method indicates that the {{Table}} is converted 
> into a {{Sink}}.
> Therefore, I propose to change the method to {{Table.writeToSink()}}.





[jira] [Closed] (FLINK-3728) Throw meaningful exceptions for unsupported SQL features

2016-05-24 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-3728.

   Resolution: Implemented
Fix Version/s: 1.1.0

Implemented with 78f551194618bc9f47130e5bbf3dfa9ec5cd8362

> Throw meaningful exceptions for unsupported SQL features
> 
>
> Key: FLINK-3728
> URL: https://issues.apache.org/jira/browse/FLINK-3728
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Vasia Kalavri
>Assignee: Fabian Hueske
> Fix For: 1.1.0
>
>
> We must explicitly exclude unsupported SQL features such as Grouping Sets 
> from being translated to Flink programs. 
> Otherwise, the resulting program will compute invalid results.
> For that we must restrict the Calcite rules that translate Logical 
> {{RelNodes}} into {{DataSetRel}} or {{DataStreamRel}} nodes.
> We may only translate to {{DataSetRel}} or {{DataStreamRel}} nodes if these 
> support the semantics of the {{RelNode}}.
> Not translating a {{RelNode}} will yield a Calcite {{CannotPlanException}} 
> that we should catch and enrich with a meaningful error message.
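
The catch-and-enrich approach described above can be sketched in plain Java; the exception classes and the {{translate}} method below are hypothetical stand-ins, not Calcite's or Flink's actual types:

```java
public class EnrichErrorDemo {
    // Hypothetical stand-ins for Calcite's CannotPlanException and
    // a user-facing Table API exception type
    public static class CannotPlanException extends RuntimeException {
        CannotPlanException(String msg) { super(msg); }
    }
    public static class TableException extends RuntimeException {
        TableException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Simulates a planner that finds no DataSetRel rule for the query
    public static void translate(String sql) {
        throw new CannotPlanException("no implementation found");
    }

    public static void main(String[] args) {
        try {
            translate("SELECT a FROM t GROUP BY GROUPING SETS (a)");
        } catch (CannotPlanException e) {
            // Wrap the low-level planner error in a meaningful message,
            // keeping the original exception as the cause
            TableException enriched = new TableException(
                "SQL feature is not supported yet. Cause: " + e.getMessage(), e);
            System.out.println(enriched.getMessage());
        }
    }
}
```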





[GitHub] flink pull request: [FLINK-3728] [tableAPI] Improve error message ...

2016-05-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2018




[GitHub] flink pull request: [FLINK-3955] [tableAPI] Rename Table.toSink() ...

2016-05-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2023




[jira] [Commented] (FLINK-3955) Change Table.toSink() to Table.writeToSink()

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297924#comment-15297924
 ] 

ASF GitHub Bot commented on FLINK-3955:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2023


> Change Table.toSink() to Table.writeToSink()
> 
>
> Key: FLINK-3955
> URL: https://issues.apache.org/jira/browse/FLINK-3955
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> Currently, a {{Table}} can be emitted to a {{TableSink}} using the 
> {{Table.toSink()}} method.
> However, the name of the method indicates that the {{Table}} is converted 
> into a {{Sink}}.
> Therefore, I propose to change the method to {{Table.writeToSink()}}.





[jira] [Commented] (FLINK-3728) Throw meaningful exceptions for unsupported SQL features

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297923#comment-15297923
 ] 

ASF GitHub Bot commented on FLINK-3728:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2018


> Throw meaningful exceptions for unsupported SQL features
> 
>
> Key: FLINK-3728
> URL: https://issues.apache.org/jira/browse/FLINK-3728
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Vasia Kalavri
>Assignee: Fabian Hueske
>
> We must explicitly exclude unsupported SQL features such as Grouping Sets 
> from being translated to Flink programs. 
> Otherwise, the resulting program will compute invalid results.
> For that we must restrict the Calcite rules that translate Logical 
> {{RelNodes}} into {{DataSetRel}} or {{DataStreamRel}} nodes.
> We may only translate to {{DataSetRel}} or {{DataStreamRel}} nodes if these 
> support the semantics of the {{RelNode}}.
> Not translating a {{RelNode}} will yield a Calcite {{CannotPlanException}} 
> that we should catch and enrich with a meaningful error message.





[jira] [Commented] (FLINK-3699) Allow per-job Kerberos authentication

2016-05-24 Thread Stefano Baghino (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297875#comment-15297875
 ] 

Stefano Baghino commented on FLINK-3699:


Hi Eron, I concur with your opinion. Thanks for taking the time to organize the 
work needed to improve this aspect of Flink. Unfortunately, I'm not able to work 
on this issue right now, so I'm switching it to unassigned. This issue can be 
used to track overall progress toward this goal while the much finer-grained 
tasks you reported are being worked on.

> Allow per-job Kerberos authentication 
> --
>
> Key: FLINK-3699
> URL: https://issues.apache.org/jira/browse/FLINK-3699
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Scheduler, TaskManager, YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>  Labels: kerberos, security, yarn
>
> Currently, authentication in a secure ("Kerberized") environment is performed 
> once, when a standalone cluster or a YARN session is started up. This means that 
> all submitted jobs are executed with the privileges of the user that 
> started the cluster. This is reasonable in many situations but prevents 
> fine-grained control over ACLs when Flink is involved.
> Adding a way for each job submission to be independently authenticated would 
> allow each job to run with the privileges of a specific user, enabling much 
> more granular control over ACLs, in particular in the context of existing 
> secure cluster setups.
> So far, a known workaround to this limitation (at least when running on YARN) 
> is to run a per-job cluster as a specific user.





[jira] [Updated] (FLINK-3699) Allow per-job Kerberos authentication

2016-05-24 Thread Stefano Baghino (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefano Baghino updated FLINK-3699:
---
Assignee: (was: Stefano Baghino)

> Allow per-job Kerberos authentication 
> --
>
> Key: FLINK-3699
> URL: https://issues.apache.org/jira/browse/FLINK-3699
> Project: Flink
>  Issue Type: Improvement
>  Components: JobManager, Scheduler, TaskManager, YARN Client
>Affects Versions: 1.0.0
>Reporter: Stefano Baghino
>  Labels: kerberos, security, yarn
>
> Currently, authentication in a secure ("Kerberized") environment is performed 
> once as a standalone cluster or a YARN session is started up. This means that 
> jobs submitted will all be executed with the privileges of the user that 
> started up the cluster. This is reasonable in many situations but prevents 
> fine-grained control over ACLs when Flink is involved.
> Adding a way for each job submission to be independently authenticated would 
> allow each job to run with the privileges of a specific user, enabling much 
> more granular control over ACLs, in particular in the context of existing 
> secure cluster setups.
> So far, a known workaround to this limitation (at least when running on YARN) 
> is to run a per-job cluster as a specific user.





[GitHub] flink pull request: [FLINK-3728] [tableAPI] Improve error message ...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2018#issuecomment-221191485
  
merging


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3728) Throw meaningful exceptions for unsupported SQL features

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297854#comment-15297854
 ] 

ASF GitHub Bot commented on FLINK-3728:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2018#issuecomment-221191485
  
merging


> Throw meaningful exceptions for unsupported SQL features
> 
>
> Key: FLINK-3728
> URL: https://issues.apache.org/jira/browse/FLINK-3728
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Vasia Kalavri
>Assignee: Fabian Hueske
>
> We must explicitly exclude unsupported SQL features such as Grouping Sets 
> from being translated to Flink programs. 
> Otherwise, the resulting program will compute invalid results.
> For that we must restrict the Calcite rules that translate Logical 
> {{RelNodes}} into {{DataSetRel}} or {{DataStreamRel}} nodes.
> We may only translate to {{DataSetRel}} or {{DataStreamRel}} nodes if these 
> support the semantics of the {{RelNode}}.
> Not translating a {{RelNode}} will yield a Calcite {{CannotPlanException}} 
> that we should catch and enrich with a meaningful error message.
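The catch-and-enrich approach described above can be sketched as follows. This is an illustrative Java sketch, not the actual Flink translator code: the `CannotPlanException` here is a local stand-in for Calcite's exception type, and `TableException`, `Planner`, and `translateOrExplain` are hypothetical names.

```java
// Illustrative sketch: catch a terse planner failure and rethrow it with a
// meaningful, user-facing message. All class and method names are stand-ins.
public class PlannerSketch {

    // Stand-in for Calcite's RelOptPlanner.CannotPlanException.
    static class CannotPlanException extends RuntimeException {
        CannotPlanException(String msg) { super(msg); }
    }

    // Stand-in for a Table API exception type surfaced to the user.
    static class TableException extends RuntimeException {
        TableException(String msg, Throwable cause) { super(msg, cause); }
    }

    // Minimal planner abstraction for the sketch.
    interface Planner {
        Object translate(String sql);
    }

    static Object translateOrExplain(Planner planner, String sql) {
        try {
            return planner.translate(sql);
        } catch (CannotPlanException e) {
            // Enrich the planner's error with a hint about why no plan exists.
            throw new TableException(
                "Cannot generate a valid execution plan for the given query. "
                    + "This indicates that the query uses an unsupported "
                    + "SQL feature (e.g., Grouping Sets).", e);
        }
    }
}
```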





[GitHub] flink pull request: [FLINK-3955] [tableAPI] Rename Table.toSink() ...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2023#issuecomment-221191467
  
merging




[jira] [Commented] (FLINK-3955) Change Table.toSink() to Table.writeToSink()

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297853#comment-15297853
 ] 

ASF GitHub Bot commented on FLINK-3955:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2023#issuecomment-221191467
  
merging


> Change Table.toSink() to Table.writeToSink()
> 
>
> Key: FLINK-3955
> URL: https://issues.apache.org/jira/browse/FLINK-3955
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Fabian Hueske
>
> Currently, a {{Table}} can be emitted to a {{TableSink}} using the 
> {{Table.toSink()}} method.
> However, the name of the method indicates that the {{Table}} is converted 
> into a {{Sink}}.
> Therefore, I propose to change the method to {{Table.writeToSink()}}.
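A rename like this is often done by introducing the new name and keeping the old one as a deprecated alias that delegates to it. The sketch below illustrates that generic pattern only; it is not Flink's actual `Table` class, and `TableSink` here is a stand-in interface (the actual PR may simply have renamed the method outright).

```java
// Generic deprecation-delegation sketch for a method rename; all types
// and bodies are illustrative stand-ins, not Flink's implementation.
public class TableSketch {

    interface TableSink {
        void emit(TableSketch table);
    }

    /** New, clearer name: the table is written to the sink. */
    public TableSketch writeToSink(TableSink sink) {
        sink.emit(this);
        return this;
    }

    /** Old name, kept temporarily as a deprecated alias. */
    @Deprecated
    public TableSketch toSink(TableSink sink) {
        return writeToSink(sink);
    }
}
```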





[GitHub] flink pull request: [Flink-2971][table] Add outer joins to the Tab...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1981#issuecomment-221188902
  
No worries and thanks @dawidwys!




[GitHub] flink pull request: [Flink-2971][table] Add outer joins to the Tab...

2016-05-24 Thread dawidwys
Github user dawidwys commented on the pull request:

https://github.com/apache/flink/pull/1981#issuecomment-221187695
  
Hi @fhueske. Yes I will update this PR today. Sorry I haven't done it 
earlier but I was away for the past week.




[GitHub] flink pull request: [Flink-2971][table] Add outer joins to the Tab...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1981#issuecomment-221187388
  
Hi @dawidwys, do you have some time to work on this PR this week?
I will be away next week plus a couple of days and would really like to merge 
it before then.
Thanks, Fabian




[jira] [Commented] (FLINK-3728) Throw meaningful exceptions for unsupported SQL features

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297824#comment-15297824
 ] 

ASF GitHub Bot commented on FLINK-3728:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2018#issuecomment-221186713
  
Thanks :-)


> Throw meaningful exceptions for unsupported SQL features
> 
>
> Key: FLINK-3728
> URL: https://issues.apache.org/jira/browse/FLINK-3728
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Vasia Kalavri
>Assignee: Fabian Hueske
>
> We must explicitly exclude unsupported SQL features such as Grouping Sets 
> from being translated to Flink programs. 
> Otherwise, the resulting program will compute invalid results.
> For that we must restrict the Calcite rules that translate Logical 
> {{RelNodes}} into {{DataSetRel}} or {{DataStreamRel}} nodes.
> We may only translate to {{DataSetRel}} or {{DataStreamRel}} nodes if these 
> support the semantics of the {{RelNode}}.
> Not translating a {{RelNode}} will yield a Calcite {{CannotPlanException}} 
> that we should catch and enrich with a meaningful error message.





[jira] [Commented] (FLINK-3941) Add support for UNION (with duplicate elimination)

2016-05-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297823#comment-15297823
 ] 

ASF GitHub Bot commented on FLINK-3941:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2025#issuecomment-221186678
  
Thanks for the PR, @yjshen. Looks good except for a few minor comments.
Can you also add one test method to the Scala SQL UnionITCase, such that 
the SQL side is also covered?

Union on streams cannot be supported right now. It would need a lot of the 
windowing logic to deduplicate rows.

Thanks, Fabian


> Add support for UNION (with duplicate elimination)
> --
>
> Key: FLINK-3941
> URL: https://issues.apache.org/jira/browse/FLINK-3941
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Affects Versions: 1.1.0
>Reporter: Fabian Hueske
>Assignee: Yijie Shen
>Priority: Minor
>
> Currently, only UNION ALL is supported by Table API and SQL.
> UNION (with duplicate elimination) can be supported by applying a 
> {{DataSet.distinct()}} after the union on all fields. This issue includes:
> - Extending {{DataSetUnion}}
> - Relaxing {{DataSetUnionRule}} to translate non-all unions.
> - Extending the Table API with a {{union()}} method.
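The proposed translation, a union followed by a distinct over all fields, can be illustrated with plain Java collections. This is a conceptual sketch of the semantics only, not the DataSet API:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class UnionSketch {

    // UNION ALL: plain concatenation, duplicates preserved.
    static <T> List<T> unionAll(List<T> left, List<T> right) {
        return Stream.concat(left.stream(), right.stream())
                     .collect(Collectors.toList());
    }

    // UNION: concatenation followed by duplicate elimination over all
    // fields, mirroring a DataSet.distinct() applied after the union.
    static <T> List<T> union(List<T> left, List<T> right) {
        return Stream.concat(left.stream(), right.stream())
                     .distinct()
                     .collect(Collectors.toList());
    }
}
```

For example, `unionAll([1,2,3], [2,3,4])` keeps all six elements, while `union` of the same inputs yields four distinct elements.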





[GitHub] flink pull request: [FLINK-3941][TableAPI]Add support for UNION (w...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2025#issuecomment-221186678
  
Thanks for the PR, @yjshen. Looks good except for a few minor comments.
Can you also add one test method to the Scala SQL UnionITCase, such that 
the SQL side is also covered?

Union on streams cannot be supported right now. It would need a lot of the 
windowing logic to deduplicate rows.

Thanks, Fabian




[GitHub] flink pull request: [FLINK-3728] [tableAPI] Improve error message ...

2016-05-24 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/2018#issuecomment-221186713
  
Thanks :-)



