[jira] [Commented] (FLINK-1284) Uniform random sampling operator over windows

2016-04-26 Thread Austin Ouyang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259486#comment-15259486
 ] 

Austin Ouyang commented on FLINK-1284:
--

Hi [~till.rohrmann],

I've noticed that there's been little activity on this issue over the last year. 
Would it be possible for me to tackle this issue and be assigned to it? Thanks!

> Uniform random sampling operator over windows
> -
>
> Key: FLINK-1284
> URL: https://issues.apache.org/jira/browse/FLINK-1284
> Project: Flink
>  Issue Type: New Feature
>  Components: Streaming
>Reporter: Paris Carbone
>Priority: Minor
>
> It would be useful for several use cases to have a built-in uniform random 
> sampling operator in the streaming API that can operate on windows. This can 
> be used for example for online machine learning operations, evaluating 
> heuristics or continuous visualisation of representative values.
> The operator could be given a field and a number of random samples needed, 
> following a window statement as such:
> mystream.window(..).sample(fieldID,#samples)
> Given that pre-aggregation is enabled, this could perhaps be implemented as a 
> binary reduce operator or a combinable groupreduce that pre-aggregates the 
> empiricals of that field.
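
For illustration, a minimal sketch of the kind of uniform sampling (Algorithm R 
reservoir sampling) such a window operator could build on; the {{ReservoirSampler}} 
class below is hypothetical and not part of any Flink API:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Keeps a uniform random sample of at most k elements from a stream of values. */
public class ReservoirSampler<T> {

    private final int k;
    private final List<T> reservoir;
    private final Random random = new Random();
    private long seen = 0;

    public ReservoirSampler(int k) {
        this.k = k;
        this.reservoir = new ArrayList<>(k);
    }

    /** Offer one element; every element seen so far stays in the sample with probability k / seen. */
    public void offer(T element) {
        seen++;
        if (reservoir.size() < k) {
            reservoir.add(element);
        } else {
            long index = (long) (random.nextDouble() * seen);
            if (index < k) {
                reservoir.set((int) index, element);
            }
        }
    }

    public List<T> sample() {
        return new ArrayList<>(reservoir);
    }
}
{code}

Applied per window, the operator would feed every record of the window into 
{{offer(...)}} and emit {{sample()}} when the window fires.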



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2220) Join on Pojo without hashCode() silently fails

2016-04-26 Thread GaoLun (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259367#comment-15259367
 ] 

GaoLun commented on FLINK-2220:
---

I wrote a test for the generic type and the problem still arose. The small fix PR 
adds a log warning if a POJO is used as a key but does not override the two methods.
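
For context, a minimal sketch of how such a check could look; this only illustrates 
the idea and is not the code of the PR (class and method names are made up):

{code}
import java.lang.reflect.Method;

public final class KeyTypeChecks {

    /** Returns true if the given class declares its own hashCode() and equals(), i.e. does not inherit Object's. */
    public static boolean overridesHashCodeAndEquals(Class<?> clazz) {
        try {
            Method hashCode = clazz.getMethod("hashCode");
            Method equals = clazz.getMethod("equals", Object.class);
            return hashCode.getDeclaringClass() != Object.class
                    && equals.getDeclaringClass() != Object.class;
        } catch (NoSuchMethodException e) {
            // cannot happen, both methods are defined on Object
            throw new AssertionError(e);
        }
    }

    /** Example of the warning such a fix could log when a POJO is used as a key. */
    public static void warnIfUnsuitableKey(Class<?> keyClass) {
        if (!overridesHashCodeAndEquals(keyClass)) {
            System.err.println("WARNING: " + keyClass.getName()
                    + " is used as a key but does not override hashCode()/equals(); "
                    + "hash partitioning will be based on object identity.");
        }
    }
}
{code}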

> Join on Pojo without hashCode() silently fails
> --
>
> Key: FLINK-2220
> URL: https://issues.apache.org/jira/browse/FLINK-2220
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 0.9, 0.8.1
>Reporter: Marcus Leich
>
> I need to perform a join using a complete Pojo as the join key.
> With DOP > 1 this only works if the Pojo comes with a meaningful hashCode() 
> implementation, as otherwise equal objects will get hashed to different 
> partitions based on their memory address and not on their content.
> I guess it's fine if users are required to implement hashCode() themselves, 
> but it would be nice if the documentation, or better yet Flink itself, could 
> alert users that this is a requirement, similar to how Comparable is required 
> for keys.
> Use the following code to reproduce the issue:
> {code}
> public class Pojo implements Comparable<Pojo> {
>     public byte[] data;
> 
>     public Pojo() {
>     }
> 
>     public Pojo(byte[] data) {
>         this.data = data;
>     }
> 
>     @Override
>     public int compareTo(Pojo o) {
>         return UnsignedBytes.lexicographicalComparator().compare(data, o.data);
>     }
> 
>     // uncomment me for making the join work
>     /* @Override
>     public int hashCode() {
>         return Arrays.hashCode(data);
>     } */
> }
> 
> public void testJoin() throws Exception {
>     final ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment();
>     env.setParallelism(4);
>     DataSet<Tuple2<Pojo, String>> left = env.fromElements(
>         new Tuple2<>(new Pojo(new byte[] {0, 24, 23, 1, 3}), "black"),
>         new Tuple2<>(new Pojo(new byte[] {0, 14, 13, 14, 13}), "red"),
>         new Tuple2<>(new Pojo(new byte[] {1}), "Spark"),
>         new Tuple2<>(new Pojo(new byte[] {2}), "good"),
>         new Tuple2<>(new Pojo(new byte[] {5}), "bug"));
>     DataSet<Tuple2<Pojo, String>> right = env.fromElements(
>         new Tuple2<>(new Pojo(new byte[] {0, 24, 23, 1, 3}), "white"),
>         new Tuple2<>(new Pojo(new byte[] {0, 14, 13, 14, 13}), "green"),
>         new Tuple2<>(new Pojo(new byte[] {1}), "Flink"),
>         new Tuple2<>(new Pojo(new byte[] {2}), "evil"),
>         new Tuple2<>(new Pojo(new byte[] {5}), "fix"));
>     // will not print anything unless Pojo has a real hashCode() implementation
>     left.join(right).where(0).equalTo(0).projectFirst(1).projectSecond(1).print();
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259202#comment-15259202
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1939#issuecomment-214922363
  
Thanks for the review @vasia. I addressed your comments.


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}
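
As a reading aid, a purely illustrative sketch of the two proposed concepts; all 
names below are hypothetical and do not reflect the interfaces added by the PR:

{code}
// Purely illustrative sketch of the proposal; names are hypothetical.
public interface SketchTableSource<T> {
    /** Names of the fields the source exposes to the Table API. */
    String[] getFieldNames();

    /** The type of the records produced by the source, e.g. a Row type. */
    Class<T> getProducedType();
}

/** A "static" source only describes its schema; an "adaptive" one is also informed about the plan. */
interface SketchAdaptiveTableSource<T> extends SketchTableSource<T> {
    /** Called when the planner resolves a field of this source (filtering or selection). */
    void notifyResolvedField(String fieldName, boolean usedInFilter);

    /** Called with predicates that the source may react to internally. */
    void notifyPredicate(String predicateExpression);
}
{code}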



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1939#issuecomment-214922363
  
Thanks for the review @vasia. I addressed your comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2946) Add orderBy() to Table API

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259045#comment-15259045
 ] 

ASF GitHub Bot commented on FLINK-2946:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1926#issuecomment-214903475
  
Hi @dawidwys, thanks for the fast update. I'll have a look at the PR 
tomorrow. 
The JIRA for Table API outer joins is FLINK-2971. I'll assign it to you. 
Thanks!


> Add orderBy() to Table API
> --
>
> Key: FLINK-2946
> URL: https://issues.apache.org/jira/browse/FLINK-2946
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Dawid Wysakowicz
>
> In order to implement a FLINK-2099 prototype that uses the Table API's code 
> generation facilities, the Table API needs a sorting feature.
> I would implement it in the next few days. Ideas on how to implement such a 
> sorting feature are very welcome. Is there a more efficient way than 
> {{.sortPartition(...).setParallelism(1)}}? Is it better to sort locally on the 
> nodes first and then do a final sort on one node afterwards?
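
For context, the workaround mentioned above looks roughly like this in the DataSet 
API (a minimal sketch; the data and types are made up for illustration):

{code}
import org.apache.flink.api.common.operators.Order;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class SortExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> data = env.fromElements(
                new Tuple2<>(3, "c"), new Tuple2<>(1, "a"), new Tuple2<>(2, "b"));

        // The workaround discussed in the issue: a totally ordered result by
        // sorting a single partition, i.e. with parallelism 1.
        data.sortPartition(0, Order.ASCENDING)
            .setParallelism(1)
            .print();
    }
}
{code}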



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2971) Add outer joins to the Table API

2016-04-26 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-2971:
-
Assignee: Dawid Wysakowicz

> Add outer joins to the Table API
> 
>
> Key: FLINK-2971
> URL: https://issues.apache.org/jira/browse/FLINK-2971
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Dawid Wysakowicz
>
> Since Flink now supports outer joins, the Table API can also support left, 
> right, and full outer joins, given that null values are properly supported by 
> RowSerializer etc.
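
For reference, a minimal sketch of the DataSet-level left outer join that the Table 
API could translate to (field values are made up; the join function handles the 
possibly-null right side):

{code}
import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class OuterJoinExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> left = env.fromElements(
                new Tuple2<>(1, "a"), new Tuple2<>(2, "b"));
        DataSet<Tuple2<Integer, String>> right = env.fromElements(
                new Tuple2<>(1, "x"));

        // Left outer join on the first field; the right side is null for unmatched keys.
        left.leftOuterJoin(right)
            .where(0)
            .equalTo(0)
            .with(new JoinFunction<Tuple2<Integer, String>, Tuple2<Integer, String>, String>() {
                @Override
                public String join(Tuple2<Integer, String> l, Tuple2<Integer, String> r) {
                    return l.f1 + " / " + (r == null ? "null" : r.f1);
                }
            })
            .print();
    }
}
{code}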



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2946] Add orderBy() to Table API

2016-04-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1926#issuecomment-214903475
  
Hi @dawidwys, thanks for the fast update. I'll have a look at the PR 
tomorrow. 
The JIRA for Table API outer joins is FLINK-2971. I'll assign it to you. 
Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3669) WindowOperator registers a lot of timers at StreamTask

2016-04-26 Thread Konstantin Knauf (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258912#comment-15258912
 ] 

Konstantin Knauf commented on FLINK-3669:
-

[~aljoscha] Hmpf. You're right. I was actually a little bit suspicious that I 
did not need any additional state ;) Anyway, I pushed a new version, this time 
also with modifications to StreamTask to set deleteOnCancelPolicy(true).

https://github.com/knaufk/flink/tree/FLINK-3669

Aside: a lot of my Travis builds fail with a compilation problem in 
flink-connector-elasticsearch2 even on master (e.g. 
https://travis-ci.org/knaufk/flink/builds/125920011, jobs 3 and 5). What's the 
reason?

> WindowOperator registers a lot of timers at StreamTask
> --
>
> Key: FLINK-3669
> URL: https://issues.apache.org/jira/browse/FLINK-3669
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.0.1
>Reporter: Aljoscha Krettek
>Assignee: Konstantin Knauf
>Priority: Blocker
>
> Right now, the WindowOperator registers a timer at the StreamTask for every 
> processing-time timer that a Trigger registers. We should combine several 
> registered trigger timers to only register one low-level timer (timer 
> coalescing).
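
For illustration, a minimal sketch of timer coalescing with plain Java collections; 
the {{TimerService}} interface below is only a stand-in for the StreamTask facility, 
not Flink's actual internals:

{code}
import java.util.TreeSet;

/**
 * Many logical trigger timers are tracked here, but only one low-level timer
 * (the earliest pending timestamp) is registered at a time.
 */
public class CoalescingTimers {

    public interface TimerService {
        void registerTimer(long timestamp);
    }

    private final TreeSet<Long> pending = new TreeSet<>();
    private final TimerService timerService;

    public CoalescingTimers(TimerService timerService) {
        this.timerService = timerService;
    }

    /** Called for every trigger timer; only registers a low-level timer if it becomes the earliest. */
    public void registerLogicalTimer(long timestamp) {
        boolean isNewEarliest = pending.isEmpty() || timestamp < pending.first();
        pending.add(timestamp);
        if (isNewEarliest) {
            timerService.registerTimer(timestamp);
        }
    }

    /** Called when the low-level timer fires; fires all due logical timers and re-registers once. */
    public void onLowLevelTimer(long currentTime) {
        while (!pending.isEmpty() && pending.first() <= currentTime) {
            fireLogicalTimer(pending.pollFirst());
        }
        if (!pending.isEmpty()) {
            timerService.registerTimer(pending.first());
        }
    }

    private void fireLogicalTimer(long timestamp) {
        // in the real operator this would invoke the window trigger for this timestamp
        System.out.println("firing logical timer for " + timestamp);
    }
}
{code}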



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3257) Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258904#comment-15258904
 ] 

ASF GitHub Bot commented on FLINK-3257:
---

Github user senorcarbone commented on the pull request:

https://github.com/apache/flink/pull/1668#issuecomment-214883298
  
Ok, good to know @uce! Let me get back to it in a couple of weeks and make 
it right; right now it is nearly impossible to find time.


> Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
> ---
>
> Key: FLINK-3257
> URL: https://issues.apache.org/jira/browse/FLINK-3257
> Project: Flink
>  Issue Type: Improvement
>Reporter: Paris Carbone
>Assignee: Paris Carbone
>
> The current snapshotting algorithm cannot support cycles in the execution 
> graph. An alternative scheme can potentially include records in-transit 
> through the back-edges of a cyclic execution graph (ABS [1]) to achieve the 
> same guarantees.
> One straightforward implementation of ABS for cyclic graphs can work along 
> the following lines:
> 1) Upon triggering a barrier in an IterationHead from the TaskManager, block 
> output and start upstream backup of all records forwarded from the 
> respective IterationSink.
> 2) The IterationSink should eventually forward the current snapshotting epoch 
> barrier to the IterationSource.
> 3) Upon receiving a barrier from the IterationSink, the IterationSource 
> should finalize the snapshot, unblock its output and emit all records 
> in-transit in FIFO order and continue the usual execution.
> --
> Upon restart the IterationSource should emit all records from the injected 
> snapshot first and then continue its usual execution.
> Several optimisations and slight variations can potentially be added, but 
> this can be the initial implementation.
> [1] http://arxiv.org/abs/1506.08603
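
For illustration, a minimal sketch of steps 1 and 3 from the point of view of the 
iteration head; all types are simplified placeholders, not Flink classes:

{code}
import java.util.ArrayDeque;
import java.util.Queue;

public class IterationHeadSketch {

    private boolean outputBlocked = false;
    private final Queue<Object> backEdgeLog = new ArrayDeque<>();

    /** Step 1: a barrier arrives from the TaskManager -> block output, start logging back-edge records. */
    public void onCheckpointBarrier(long checkpointId) {
        outputBlocked = true;
    }

    /** Records arriving over the back edge while the snapshot is in progress are logged. */
    public void onBackEdgeRecord(Object record) {
        if (outputBlocked) {
            backEdgeLog.add(record);
        } else {
            emit(record);
        }
    }

    /** Step 3: the barrier comes back from the iteration sink -> finalize, unblock, replay the log in FIFO order. */
    public void onBarrierFromIterationSink(long checkpointId) {
        snapshotState(backEdgeLog);
        outputBlocked = false;
        while (!backEdgeLog.isEmpty()) {
            emit(backEdgeLog.poll());
        }
    }

    private void snapshotState(Queue<Object> inTransitRecords) {
        // persist the logged in-transit records as part of the checkpoint
    }

    private void emit(Object record) {
        // forward the record downstream
    }
}
{code}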



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3257] Add Exactly-Once Processing Guara...

2016-04-26 Thread senorcarbone
Github user senorcarbone commented on the pull request:

https://github.com/apache/flink/pull/1668#issuecomment-214883298
  
Ok, good to know @uce! Let me get back to it in a couple of weeks and make 
it right; right now it is nearly impossible to find time.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2946] Add orderBy() to Table API

2016-04-26 Thread dawidwys
Github user dawidwys commented on the pull request:

https://github.com/apache/flink/pull/1926#issuecomment-214862766
  
I tried to address all your comments. Unfortunately I didn't notice the lines 
longer than 100 characters yesterday, sorry for that.

I started working on outer joins, but unfortunately I could not find an 
appropriate JIRA issue, so I would be grateful if you could point me to one. 
Afterwards I will happily work on introducing storages.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2946) Add orderBy() to Table API

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258772#comment-15258772
 ] 

ASF GitHub Bot commented on FLINK-2946:
---

Github user dawidwys commented on the pull request:

https://github.com/apache/flink/pull/1926#issuecomment-214862766
  
I tried to address all your comments. Unfortunately I didn't notice the lines 
longer than 100 characters yesterday, sorry for that.

I started working on outer joins, but unfortunately I could not find an 
appropriate JIRA issue, so I would be grateful if you could point me to one. 
Afterwards I will happily work on introducing storages.


> Add orderBy() to Table API
> --
>
> Key: FLINK-2946
> URL: https://issues.apache.org/jira/browse/FLINK-2946
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Dawid Wysakowicz
>
> In order to implement a FLINK-2099 prototype that uses the Table API's code 
> generation facilities, the Table API needs a sorting feature.
> I would implement it in the next few days. Ideas on how to implement such a 
> sorting feature are very welcome. Is there a more efficient way than 
> {{.sortPartition(...).setParallelism(1)}}? Is it better to sort locally on the 
> nodes first and then do a final sort on one node afterwards?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-04-26 Thread Saumitra Shahapure (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258680#comment-15258680
 ] 

Saumitra Shahapure commented on FLINK-2254:
---

Hi [~vkalavri], I started working on this issue, but could not manage time 
properly to finish it. Feel free to reassign it.

> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>Assignee: Saumitra Shahapure
>  Labels: requires-design-doc
>
> A bipartite graph is a graph whose vertices can be divided into two disjoint 
> sets such that every edge with a source vertex in the first set has its target 
> vertex in the second set. We would like to support efficient operations for 
> this type of graph along with a set of metrics 
> (http://jponnela.com/web_documents/twomode.pdf).
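
For illustration, the bipartite property expressed as a minimal check over plain 
Java collections (not Gelly types):

{code}
import java.util.Set;

public class BipartiteCheck {

    /** An edge from a vertex in the "top" set to a vertex in the "bottom" set. */
    public static class Edge {
        final long source;
        final long target;

        Edge(long source, long target) {
            this.source = source;
            this.target = target;
        }
    }

    /** Returns true if every edge goes from the given top set to the given bottom set. */
    public static boolean respectsBipartition(Iterable<Edge> edges, Set<Long> top, Set<Long> bottom) {
        for (Edge e : edges) {
            if (!top.contains(e.source) || !bottom.contains(e.target)) {
                return false;
            }
        }
        return true;
    }
}
{code}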



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (FLINK-2254) Add Bipartite Graph Support for Gelly

2016-04-26 Thread Saumitra Shahapure (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saumitra Shahapure updated FLINK-2254:
--
Assignee: (was: Saumitra Shahapure)

> Add Bipartite Graph Support for Gelly
> -
>
> Key: FLINK-2254
> URL: https://issues.apache.org/jira/browse/FLINK-2254
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>  Labels: requires-design-doc
>
> A bipartite graph is a graph whose vertices can be divided into two disjoint 
> sets such that every edge with a source vertex in the first set has its target 
> vertex in the second set. We would like to support efficient operations for 
> this type of graph along with a set of metrics 
> (http://jponnela.com/web_documents/twomode.pdf).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-951) Reworking of Iteration Synchronization, Accumulators and Aggregators

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258599#comment-15258599
 ] 

ASF GitHub Bot commented on FLINK-951:
--

Github user markus-h closed the pull request at:

https://github.com/apache/flink/pull/570


> Reworking of Iteration Synchronization, Accumulators and Aggregators
> 
>
> Key: FLINK-951
> URL: https://issues.apache.org/jira/browse/FLINK-951
> Project: Flink
>  Issue Type: Improvement
>  Components: Iterations, Optimizer
>Affects Versions: 0.9
>Reporter: Markus Holzemer
>Assignee: Markus Holzemer
>  Labels: refactoring
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I just realized that there is no real Jira issue for the task I am currently 
> working on. 
> I am currently reworking a few things regarding Iteration Synchronization, 
> Accumulators and Aggregators. Currently the synchronization at the end of one 
> superstep is done through channel events. That makes it hard to track the 
> current status of iterations. That is why I am changing this synchronization 
> to use RPC calls with the JobManager, so that the JobManager manages the 
> current status of all iterations.
> Currently we use Accumulators outside of iterations and Aggregators inside of 
> iterations. Both serve a similar function, but with slightly different 
> interfaces and handling. I want to unify these two concepts. I propose that in 
> the future we stick to Accumulators only. Aggregators are therefore removed and 
> Accumulators are extended to cover the use cases Aggregators were used for 
> before. The switch to RPC for iterations also makes it possible to send the 
> current Accumulator values at the end of each superstep, so that the 
> JobManager (and thereby the web interface) will be able to print intermediate 
> accumulation results.
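
For context, a minimal self-contained example of the existing Accumulator API that 
the proposal would keep as the single concept (a sketch against the DataSet API; the 
job itself is made up):

{code}
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.accumulators.IntCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.DiscardingOutputFormat;
import org.apache.flink.configuration.Configuration;

public class AccumulatorExample {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> lines = env.fromElements("a", "bb", "ccc");

        lines.map(new RichMapFunction<String, Integer>() {
            private final IntCounter counter = new IntCounter();

            @Override
            public void open(Configuration parameters) {
                // register the accumulator; its value is merged by the JobManager at the end
                getRuntimeContext().addAccumulator("num-records", counter);
            }

            @Override
            public Integer map(String value) {
                counter.add(1);
                return value.length();
            }
        }).output(new DiscardingOutputFormat<Integer>());

        JobExecutionResult result = env.execute("accumulator example");
        System.out.println("records: " + result.getAccumulatorResult("num-records"));
    }
}
{code}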



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-951) Reworking of Iteration Synchronization, Accumulators and Aggregators

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258598#comment-15258598
 ] 

ASF GitHub Bot commented on FLINK-951:
--

Github user markus-h commented on the pull request:

https://github.com/apache/flink/pull/570#issuecomment-214833775
  
Sorry for not driving this further. I think this pull request is now way 
too outdated to have any chance of rebasing it onto the current master, so 
I will close it.


> Reworking of Iteration Synchronization, Accumulators and Aggregators
> 
>
> Key: FLINK-951
> URL: https://issues.apache.org/jira/browse/FLINK-951
> Project: Flink
>  Issue Type: Improvement
>  Components: Iterations, Optimizer
>Affects Versions: 0.9
>Reporter: Markus Holzemer
>Assignee: Markus Holzemer
>  Labels: refactoring
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I just realized that there is no real Jira issue for the task I am currently 
> working on. 
> I am currently reworking a few things regarding Iteration Synchronization, 
> Accumulators and Aggregators. Currently the synchronization at the end of one 
> superstep is done through channel events. That makes it hard to track the 
> current status of iterations. That is why I am changing this synchronization 
> to use RPC calls with the JobManager, so that the JobManager manages the 
> current status of all iterations.
> Currently we use Accumulators outside of iterations and Aggregators inside of 
> iterations. Both serve a similar function, but with slightly different 
> interfaces and handling. I want to unify these two concepts. I propose that in 
> the future we stick to Accumulators only. Aggregators are therefore removed and 
> Accumulators are extended to cover the use cases Aggregators were used for 
> before. The switch to RPC for iterations also makes it possible to send the 
> current Accumulator values at the end of each superstep, so that the 
> JobManager (and thereby the web interface) will be able to print intermediate 
> accumulation results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...

2016-04-26 Thread markus-h
Github user markus-h commented on the pull request:

https://github.com/apache/flink/pull/570#issuecomment-214833775
  
Sorry for not driving this further. I think this pull request is now way 
too outdated to have any chance of rebasing it onto the current master, so 
I will close it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-951] Reworking of Iteration Synchroniza...

2016-04-26 Thread markus-h
Github user markus-h closed the pull request at:

https://github.com/apache/flink/pull/570


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2899) The groupReduceOn* methods which take types as a parameter fail with TypeErasure

2016-04-26 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258588#comment-15258588
 ] 

Greg Hogan commented on FLINK-2899:
---

Is this due to a limitation of {{TypeExtractor.createTypeInfo}} or the lack of 
parameterization in user code?

The {{TypeInformation}} is extracted from {{NeighborsFunctionWithVertexValue, NullValue, Vertex>}} with 
{{TypeExtractor.createTypeInfo(NeighborsFunctionWithVertexValue.class, 
function.getClass(), 3, null, null)}} (link 0).

The function is instantiated with {{new PropagateNeighborValues()}}, so no 
parameterization. Does this limit what {{TypeExtractor}} can extract?

I agree with [~Zentol] that if {{ApplyNeighborCoGroupFunction}} is extracting 
{{TypeInformation}} then accepting this as a function parameter looks 
superfluous.

0: 
[https://github.com/andralungu/flink/commit/1fbc09ca35e80f61f31cf7a6166f27f2a6f142c1#diff-e2c6679629dbf39dbd9ba785d1516f2cR188]
1: 
[https://github.com/andralungu/flink/commit/1fbc09ca35e80f61f31cf7a6166f27f2a6f142c1#diff-e2c6679629dbf39dbd9ba785d1516f2cR96]
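
As a side note, the erasure behaviour in question can be reproduced outside Flink; a 
minimal sketch in plain Java (no Flink classes involved): type arguments are only 
recoverable by reflection when a subclass or implementing class fixes them.

{code}
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

public class ErasureDemo {

    static class FixedFunction implements Function<Long, String> {
        @Override
        public String apply(Long value) {
            return String.valueOf(value);
        }
    }

    static class GenericFunction<T> implements Function<T, String> {
        @Override
        public String apply(T value) {
            return String.valueOf(value);
        }
    }

    public static void main(String[] args) {
        printTypeArguments(new FixedFunction().getClass());         // prints Long, String
        printTypeArguments(new GenericFunction<Long>().getClass()); // prints T, String (T is erased)
    }

    static void printTypeArguments(Class<?> clazz) {
        for (Type iface : clazz.getGenericInterfaces()) {
            if (iface instanceof ParameterizedType) {
                for (Type arg : ((ParameterizedType) iface).getActualTypeArguments()) {
                    System.out.println(arg);
                }
            }
        }
    }
}
{code}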

> The groupReduceOn* methods which take types as a parameter fail with 
> TypeErasure
> 
>
> Key: FLINK-2899
> URL: https://issues.apache.org/jira/browse/FLINK-2899
> Project: Flink
>  Issue Type: Bug
>  Components: Gelly
>Affects Versions: 0.10.0
>Reporter: Andra Lungu
>
> I tried calling groupReduceOnEdges(EdgesFunctionWithVertexValue<..., T> 
> edgesFunction, EdgeDirection direction, TypeInformation<T> typeInfo) in order 
> to make the vertex-centric version of the Triangle Count library method 
> applicable to any kind of key, and I got a type erasure exception.
> After doing a bit of debugging (see the hack in 
> https://github.com/andralungu/flink/tree/trianglecount-vertexcentric), I saw 
> that the call to 
> TypeExtractor.createTypeInfo(NeighborsFunctionWithVertexValue.class, ...) in 
> ApplyNeighborCoGroupFunction does not actually work properly, i.e. it returns 
> null.
> From what I see, the coGroup in groupReduceOnNeighbors tries to infer a type 
> before "returns" is called. 
> I may be missing something, but that particular feature (groupReduceOn with 
> types) is not documented or tested so we would also need some tests for that. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3771) Methods for translating Graphs

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258532#comment-15258532
 ] 

ASF GitHub Bot commented on FLINK-3771:
---

Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1900#issuecomment-214825734
  
By reuse I was referring to needing `MapVertexValueLongToString` and 
`MapEdgeValueLongToString` since the current `Graph` methods operate at the 
level of `Vertex` and `Edge`. And then to translate labels you need 
`MapVertexLabelLongToString` and `MapEdgeLabelLongToString`. In this PR we 
simply have a single `LongToString` which can be used to translate vertex 
labels, edge labels, vertex values, and edge values.

I agree that `Translate` methods taking a `Graph` as input would be better 
as methods on `Graph`.

And if we don't implement as `MapFunction` then we still need `Translate` 
to handle the `TypeInformation`.


> Methods for translating Graphs
> --
>
> Key: FLINK-3771
> URL: https://issues.apache.org/jira/browse/FLINK-3771
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> Provide methods for translation of the type or value of graph labels, vertex 
> values, and edge values.
> Sample use cases:
> * shifting graph labels in order to union generated graphs or graphs read 
> from multiple sources
> * downsizing labels or values since algorithms prefer to generate wide types 
> which may be expensive for further computation
> * changing label type for testing or benchmarking alternative code paths



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3771] [gelly] Methods for translating G...

2016-04-26 Thread greghogan
Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1900#issuecomment-214825734
  
By reuse I was referring to needing `MapVertexValueLongToString` and 
`MapEdgeValueLongToString` since the current `Graph` methods operate at the 
level of `Vertex` and `Edge`. And then to translate labels you need 
`MapVertexLabelLongToString` and `MapEdgeLabelLongToString`. In this PR we 
simply have a single `LongToString` which can be used to translate vertex 
labels, edge labels, vertex values, and edge values.

I agree that `Translate` methods taking a `Graph` as input would be better 
as methods on `Graph`.

And if we don't implement as `MapFunction` then we still need `Translate` 
to handle the `TypeInformation`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3771) Methods for translating Graphs

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258510#comment-15258510
 ] 

ASF GitHub Bot commented on FLINK-3771:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1900#issuecomment-214819709
  
Implementations of `MapFunction`s can also be reused :)
What I was thinking was to provide the translator simply as map functions, 
e.g. like `Tuple2ToVertexMap`. We can add them to 
`org.apache.flink.graph.utils` or create a subpackage for that. Then we can add 
the `translate*` methods to `Graph`. If someone wants to use the provided 
translator on a dataset of vertices or edges, they can simply do this with a 
mapper.

My concern is that we should try to be consistent with the existing Gelly 
API. e.g. something like `graph.translateIds(new LongValueToStringValue())` is 
Gelly-like, while
`Translate.translateGraphLabels(graph, new LongValueToStringValue())` is 
not. Also, I think that this feature is simple enough to be implemented as a 
collection of map functions instead of a separate utility.
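
For illustration, a minimal sketch of such a translator implemented as a plain 
{{MapFunction}}; the class name mirrors the one used in the discussion, but the 
implementation below is only a sketch:

{code}
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.types.LongValue;
import org.apache.flink.types.StringValue;

/** Translates a LongValue into its decimal string representation. */
public class LongValueToStringValue implements MapFunction<LongValue, StringValue> {

    @Override
    public StringValue map(LongValue value) {
        return new StringValue(Long.toString(value.getValue()));
    }
}
{code}

Such a function could back a {{graph.translateIds(...)}} style method, or be applied 
directly to a DataSet of vertices or edges with a plain mapper, as described above.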


> Methods for translating Graphs
> --
>
> Key: FLINK-3771
> URL: https://issues.apache.org/jira/browse/FLINK-3771
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> Provide methods for translation of the type or value of graph labels, vertex 
> values, and edge values.
> Sample use cases:
> * shifting graph labels in order to union generated graphs or graphs read 
> from multiple sources
> * downsizing labels or values since algorithms prefer to generate wide types 
> which may be expensive for further computation
> * changing label type for testing or benchmarking alternative code paths



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-3717 - Be able to re-start reading from ...

2016-04-26 Thread kl0u
Github user kl0u commented on the pull request:

https://github.com/apache/flink/pull/1895#issuecomment-214817636
  
The problem with this change would be that we do not yet have a way to 
deserialize a record, so when we try to advance readRecords in the block, we 
get a NullPointerException.

Actually this is what we do in the AvroInputFormat case, but there it works.

Let me know if you have any other ideas on how to do it, or if you think that 
it may also be a problem for the Avro format.


> On Apr 25, 2016, at 4:24 PM, Aljoscha Krettek wrote:
> 
> I had some inline comments but overall the changes look good!
> 
> I think we can simplify the BinaryInputFormat by getting rid of the filePos 
> and justReadAllRecords fields and just snapshotting the blockPos. The filePos 
> and justReadAllRecords information functionally depends on the blockPos, so 
> storing the filePos and justReadAllRecords fields just adds more complexity 
> since we're keeping track of all of them.
> 
> The snapshot would then just be (blockPos, readRecords); upon restore, the 
> correct file read position can be derived from the block/split start position.
> 
> —
> You are receiving this because you authored the thread.
> Reply to this email directly or view it on GitHub 





---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-3825) Update CEP documentation to include Scala API

2016-04-26 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann updated FLINK-3825:
-
Assignee: Stefan Richter

> Update CEP documentation to include Scala API
> -
>
> Key: FLINK-3825
> URL: https://issues.apache.org/jira/browse/FLINK-3825
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP, Documentation
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Stefan Richter
>  Labels: documentation
>
> After adding the Scala CEP API FLINK-3708, we should update the online 
> documentation to also contain the Scala API. This can be done similarly to 
> the {{DataSet}} and {{DataStream}} API by providing Java and Scala code for 
> all examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Flink 3750 fixed

2016-04-26 Thread fpompermaier
GitHub user fpompermaier opened a pull request:

https://github.com/apache/flink/pull/1941

Flink 3750 fixed

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

-
New cleaned PR aligned with the current master

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fpompermaier/flink FLINK-3750-fixed

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1941.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1941


commit 206920c85f617728d67d5d02f4fe07da228e069b
Author: Flavio Pompermaier 
Date:   2016-04-26T17:10:53Z

FLINK-3750 new PR (the old one was messed up...)

commit df0d9bff8fe8f5d291ab415e3a77db45f4fdbc92
Author: Flavio Pompermaier 
Date:   2016-04-26T17:13:09Z

Removed commented method




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3086) ExpressionParser does not support concatenation of suffix operations

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258480#comment-15258480
 ] 

ASF GitHub Bot commented on FLINK-3086:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1912#issuecomment-214814868
  
Yes, same here ;-)


> ExpressionParser does not support concatenation of suffix operations
> 
>
> Key: FLINK-3086
> URL: https://issues.apache.org/jira/browse/FLINK-3086
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> The ExpressionParser of the Table API does not support concatenation of 
> suffix operations. For example, 
> {code}table.select("field.cast(STRING).substring(2)"){code} throws an 
> exception.
> {code}
> org.apache.flink.api.table.ExpressionException: Could not parse expression: 
> string matching regex `\z' expected but `.' found
>   at 
> org.apache.flink.api.table.parser.ExpressionParser$.parseExpressionList(ExpressionParser.scala:224)
> {code}
> However, the Scala implicit Table Expression API supports this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3086] [table] ExpressionParser does not...

2016-04-26 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/1912#issuecomment-214814868
  
Yes, same here ;-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3750) Make JDBCInputFormat a parallel source

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258471#comment-15258471
 ] 

ASF GitHub Bot commented on FLINK-3750:
---

Github user fpompermaier closed the pull request at:

https://github.com/apache/flink/pull/1885


> Make JDBCInputFormat a parallel source
> --
>
> Key: FLINK-3750
> URL: https://issues.apache.org/jira/browse/FLINK-3750
> Project: Flink
>  Issue Type: Improvement
>  Components: Batch
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>Priority: Minor
>  Labels: connector, jdbc
>
> At the moment the batch JDBC InputFormat does not support parallelism 
> (NonParallelInput). I'd like to remove this limitation.
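
For illustration, a minimal sketch of the usual approach to parallelising such a 
source: splitting a numeric key range into per-split query parameters (e.g. bound to 
a "BETWEEN ? AND ?" clause). Names are made up and do not refer to the classes in 
the PR:

{code}
import java.util.ArrayList;
import java.util.List;

public class RangeSplitter {

    /** Splits [min, max] into at most numSplits inclusive [lower, upper] ranges. */
    public static List<long[]> split(long min, long max, int numSplits) {
        List<long[]> splits = new ArrayList<>();
        long total = max - min + 1;
        long sizePerSplit = (total + numSplits - 1) / numSplits; // ceiling division
        for (long lower = min; lower <= max; lower += sizePerSplit) {
            long upper = Math.min(lower + sizePerSplit - 1, max);
            splits.add(new long[] {lower, upper});
        }
        return splits;
    }

    public static void main(String[] args) {
        // e.g. keys 1..1000 on 4 parallel readers -> [1,250], [251,500], [501,750], [751,1000]
        for (long[] range : split(1, 1000, 4)) {
            System.out.println(range[0] + " - " + range[1]);
        }
    }
}
{code}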



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Improved JDBCInputFormat (FLINK-3750) and othe...

2016-04-26 Thread fpompermaier
Github user fpompermaier commented on the pull request:

https://github.com/apache/flink/pull/1885#issuecomment-214813684
  
Sorry messed up...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3750) Make JDBCInputFormat a parallel source

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258472#comment-15258472
 ] 

ASF GitHub Bot commented on FLINK-3750:
---

Github user fpompermaier commented on the pull request:

https://github.com/apache/flink/pull/1885#issuecomment-214813684
  
Sorry messed up...


> Make JDBCInputFormat a parallel source
> --
>
> Key: FLINK-3750
> URL: https://issues.apache.org/jira/browse/FLINK-3750
> Project: Flink
>  Issue Type: Improvement
>  Components: Batch
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>Priority: Minor
>  Labels: connector, jdbc
>
> At the moment the batch JDBC InputFormat does not support parallelism 
> (NonParallelInput). I'd like to remove this limitation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: Improved JDBCInputFormat (FLINK-3750) and othe...

2016-04-26 Thread fpompermaier
Github user fpompermaier closed the pull request at:

https://github.com/apache/flink/pull/1885


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3800) ExecutionGraphs can become orphans

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258466#comment-15258466
 ] 

ASF GitHub Bot commented on FLINK-3800:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1923


> ExecutionGraphs can become orphans
> --
>
> Key: FLINK-3800
> URL: https://issues.apache.org/jira/browse/FLINK-3800
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> The {{JobManager.cancelAndClearEverything}} method fails all currently 
> executing jobs on the {{JobManager}} and then clears the list of 
> {{currentJobs}} kept in the JobManager. This can become problematic if the 
> user has set a restart strategy for a job, because the {{RestartStrategy}} 
> will try to restart the job. This can lead to unwanted re-deployments of the 
> job, which consume resources and thus disturb the execution of other jobs. If 
> the restart strategy never stops, this prevents the {{ExecutionGraph}} from 
> ever being properly terminated.
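
For context, a minimal sketch of the job-side setting that triggers this behaviour, 
assuming the {{RestartStrategies}} configuration API:

{code}
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RestartConfigExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // The RestartStrategy configured here keeps trying to restart the job after failures.
        env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
                3,        // number of restart attempts
                10000L)); // delay between attempts in milliseconds
    }
}
{code}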



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-3800) ExecutionGraphs can become orphans

2016-04-26 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann closed FLINK-3800.

Resolution: Fixed

Fixed via 28c57c3a57cbd7a02e756ee98c0b1168cec69feb

> ExecutionGraphs can become orphans
> --
>
> Key: FLINK-3800
> URL: https://issues.apache.org/jira/browse/FLINK-3800
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Affects Versions: 1.0.0, 1.1.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>
> The {{JobManager.cancelAndClearEverything}} method fails all currently 
> executing jobs on the {{JobManager}} and then clears the list of 
> {{currentJobs}} kept in the JobManager. This can become problematic if the 
> user has set a restart strategy for a job, because the {{RestartStrategy}} 
> will try to restart the job. This can lead to unwanted re-deployments of the 
> job, which consume resources and thus disturb the execution of other jobs. If 
> the restart strategy never stops, this prevents the {{ExecutionGraph}} from 
> ever being properly terminated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3800] [jobmanager] Terminate ExecutionG...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1923


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3800] [jobmanager] Terminate ExecutionG...

2016-04-26 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1923#issuecomment-214812452
  
Unrelated test case failures. Will be merging this PR to master and the 
1.0-release branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258451#comment-15258451
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61123187
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/schema/TableSourceTable.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.schema
+
+import org.apache.flink.api.table.Row
+import org.apache.flink.api.table.sources.TableSource
+import org.apache.flink.api.table.typeutils.RowTypeInfo
+
+/** Table which defines an external table via a [[TableSource]] */
+class TableSourceTable(val tableSource: TableSource[_])
--- End diff --




> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61123187
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/schema/TableSourceTable.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.schema
+
+import org.apache.flink.api.table.Row
+import org.apache.flink.api.table.sources.TableSource
+import org.apache.flink.api.table.typeutils.RowTypeInfo
+
+/** Table which defines an external table via a [[TableSource]] */
+class TableSourceTable(val tableSource: TableSource[_])
--- End diff --

😦


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (FLINK-3708) Scala API for CEP

2016-04-26 Thread Till Rohrmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-3708.
--
Resolution: Fixed

Added via e29ac036a76cc78ea608ffe1f0784ec15e351c60

> Scala API for CEP
> -
>
> Key: FLINK-3708
> URL: https://issues.apache.org/jira/browse/FLINK-3708
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Stefan Richter
>
> Currently, the CEP library does not support Scala case classes, because the 
> {{TypeExtractor}} cannot handle them. In order to support them, it would be 
> necessary to offer a Scala API for the CEP library.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3825) Update CEP documentation to include Scala API

2016-04-26 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3825:


 Summary: Update CEP documentation to include Scala API
 Key: FLINK-3825
 URL: https://issues.apache.org/jira/browse/FLINK-3825
 Project: Flink
  Issue Type: Improvement
  Components: CEP, Documentation
Affects Versions: 1.1.0
Reporter: Till Rohrmann


After adding the Scala CEP API FLINK-3708, we should update the online 
documentation to also contain the Scala API. This can be done similarly to the 
{{DataSet}} and {{DataStream}} API by providing Java and Scala code for all 
examples.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3824) ResourceManager may repeatedly connect to outdated JobManager in HA mode

2016-04-26 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3824:
-

 Summary: ResourceManager may repeatedly connect to outdated 
JobManager in HA mode
 Key: FLINK-3824
 URL: https://issues.apache.org/jira/browse/FLINK-3824
 Project: Flink
  Issue Type: Bug
  Components: ResourceManager
Affects Versions: 1.1.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
Priority: Minor
 Fix For: 1.1.0


When the ResourceManager receives a new leading JobManager via the 
LeaderRetrievalService, it tries to register with this JobManager until 
connected. If a new leader gets elected during registration, the 
ResourceManager may still repeatedly try to register with the old one. This 
doesn't affect the registration with the new JobManager, but it leaves error 
messages in the log file and may cause unnecessary messages to be processed.
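
For illustration, a minimal sketch of the fix idea (registration attempts tagged 
with the leader session ID they belong to, so attempts for a superseded leader are 
dropped); names are illustrative, not the actual ResourceManager code:

{code}
import java.util.UUID;

public class LeaderAwareRegistration {

    private UUID currentLeaderSessionId;

    /** Called by the leader retrieval service when a new leading JobManager is announced. */
    public synchronized void notifyNewLeader(UUID leaderSessionId) {
        this.currentLeaderSessionId = leaderSessionId;
        scheduleRegistrationAttempt(leaderSessionId);
    }

    /** Each (possibly retried) registration attempt checks that its leader is still current. */
    public synchronized void onRegistrationAttempt(UUID leaderSessionId) {
        if (!leaderSessionId.equals(currentLeaderSessionId)) {
            return; // outdated leader: stop retrying instead of logging errors repeatedly
        }
        sendRegistrationMessage(leaderSessionId);
    }

    private void scheduleRegistrationAttempt(UUID leaderSessionId) {
        // in the real code this would go through the retry/timeout machinery
        onRegistrationAttempt(leaderSessionId);
    }

    private void sendRegistrationMessage(UUID leaderSessionId) {
        // send the registration message for this leader session to the JobManager
    }
}
{code}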



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3708) Scala API for CEP

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258427#comment-15258427
 ] 

ASF GitHub Bot commented on FLINK-3708:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1905


> Scala API for CEP
> -
>
> Key: FLINK-3708
> URL: https://issues.apache.org/jira/browse/FLINK-3708
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Stefan Richter
>
> Currently, the CEP library does not support Scala case classes, because the 
> {{TypeExtractor}} cannot handle them. In order to support them, it would be 
> necessary to offer a Scala API for the CEP library.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258417#comment-15258417
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61121181
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/dataSet/BatchTableSourceScanRule.scala
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.rules.dataSet
+
+import org.apache.calcite.plan.{Convention, RelOptRule, RelOptRuleCall, 
RelTraitSet}
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.convert.ConverterRule
+import org.apache.calcite.rel.core.TableScan
+import org.apache.calcite.rel.logical.LogicalTableScan
+import 
org.apache.flink.api.table.plan.nodes.dataset.{BatchTableSourceScan, 
DataSetConvention}
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.sources.BatchTableSource
+
+/** Rule to convert a [[LogicalTableScan]] into a 
[[BatchTableSourceScan]]. */
+class BatchTableSourceScanRule
+  extends ConverterRule(
+  classOf[LogicalTableScan],
+  Convention.NONE,
+  DataSetConvention.INSTANCE,
+  "BatchTableSourceScanRule")
+  {
+
+  /** Rule must only match if TableScan targets a [[BatchTableSource]] */
--- End diff --

Yes, that would be possible. Would you prefer that? I have a slight 
preference on having separate rules but if you think it is better to merge 
them, I'll do it.


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}
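
For orientation, a rough sketch of what such TableSource interfaces could look like (the actual definitions live in PR #1939; the method names below are illustrative assumptions, not the PR's code):

{code}
// Hypothetical shape of the proposed interfaces; names are assumptions.
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.java.{DataSet, ExecutionEnvironment}

/** Describes an external table: its schema and the type of records it produces. */
trait TableSource[T] {
  def getFieldNames: Array[String]
  def getFieldTypes: Array[TypeInformation[_]]
  def getReturnType: TypeInformation[T]
}

/** A TableSource that can be scanned in a batch (DataSet) program. */
trait BatchTableSource[T] extends TableSource[T] {
  def getDataSet(execEnv: ExecutionEnvironment): DataSet[T]
}

// A corresponding StreamTableSource would analogously expose
// getDataStream(env: StreamExecutionEnvironment): DataStream[T].
{code}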



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61121181
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/dataSet/BatchTableSourceScanRule.scala
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.rules.dataSet
+
+import org.apache.calcite.plan.{Convention, RelOptRule, RelOptRuleCall, 
RelTraitSet}
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.convert.ConverterRule
+import org.apache.calcite.rel.core.TableScan
+import org.apache.calcite.rel.logical.LogicalTableScan
+import 
org.apache.flink.api.table.plan.nodes.dataset.{BatchTableSourceScan, 
DataSetConvention}
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.sources.BatchTableSource
+
+/** Rule to convert a [[LogicalTableScan]] into a 
[[BatchTableSourceScan]]. */
+class BatchTableSourceScanRule
+  extends ConverterRule(
+  classOf[LogicalTableScan],
+  Convention.NONE,
+  DataSetConvention.INSTANCE,
+  "BatchTableSourceScanRule")
+  {
+
+  /** Rule must only match if TableScan targets a [[BatchTableSource]] */
--- End diff --

Yes, that would be possible. Would you prefer that? I have a slight 
preference for having separate rules, but if you think it is better to merge 
them, I'll do it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3708) Scala API for CEP

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258406#comment-15258406
 ] 

ASF GitHub Bot commented on FLINK-3708:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1905#issuecomment-214806804
  
Failing test case is unrelated. Really good work @StefanRRichter. Will 
merge it. 

As a follow up, we should update the CEP documentation to also include 
Scala code examples.


> Scala API for CEP
> -
>
> Key: FLINK-3708
> URL: https://issues.apache.org/jira/browse/FLINK-3708
> Project: Flink
>  Issue Type: Improvement
>  Components: CEP
>Affects Versions: 1.1.0
>Reporter: Till Rohrmann
>Assignee: Stefan Richter
>
> Currently, the CEP library does not support Scala case classes, because the 
> {{TypeExtractor}} cannot handle them. In order to support them, it would be 
> necessary to offer a Scala API for the CEP library.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3708] Scala API for CEP (initial).

2016-04-26 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1905#issuecomment-214806804
  
Failing test case is unrelated. Really good work @StefanRRichter. Will 
merge it. 

As a follow up, we should update the CEP documentation to also include 
Scala code examples.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258402#comment-15258402
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61120533
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/schema/TableSourceTable.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.schema
+
+import org.apache.flink.api.table.Row
+import org.apache.flink.api.table.sources.TableSource
+import org.apache.flink.api.table.typeutils.RowTypeInfo
+
+/** Table which defines an external table via a [[TableSource]] */
+class TableSourceTable(val tableSource: TableSource[_])
--- End diff --

You might not have noticed, but I renamed `TableTable` to `RelTable` :-D


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258399#comment-15258399
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61120275
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/StreamTableSourceScan.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.StreamTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.StreamTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+import org.apache.flink.streaming.api.datastream.DataStream
+
+import scala.collection.JavaConversions._
+
+/** Flink RelNode to read data from an external source defined by a 
[[StreamTableSource]]. */
+class StreamTableSourceScan(
--- End diff --

Yes, I would go for a common base class here as well.


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61120275
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/StreamTableSourceScan.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.StreamTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.StreamTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+import org.apache.flink.streaming.api.datastream.DataStream
+
+import scala.collection.JavaConversions._
+
+/** Flink RelNode to read data from an external source defined by a 
[[StreamTableSource]]. */
+class StreamTableSourceScan(
--- End diff --

Yes, I would go for a common base class here as well.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258398#comment-15258398
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61120172
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/BatchTableSourceScan.scala
 ---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.dataset
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.calcite.rel.metadata.RelMetadataQuery
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.DataSet
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.BatchTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.BatchTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+
+import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+
+/** Flink RelNode to read data from an external source defined by a 
[[BatchTableSource]]. */
+class BatchTableSourceScan(
--- End diff --

I agree, both classes share a lot of code, but another difference apart 
from the unwrapping is that `DataSetScan` simply forwards its `DataSet` whereas 
`BatchTableSourceScan` creates a new `DataSet` using the `BatchTableSource` and 
the table environment. So I would like to keep separate classes, but it makes 
sense to let them have a common abstract base class. What do you think?
Btw. `DataSetSource` was recently renamed to `DataSetScan` ;-)
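
As a purely illustrative sketch of the kind of shared base class meant here (the name `BatchScan` and the structure are assumptions, not the refactoring that was eventually merged):

{code}
// Illustrative only: a common Calcite TableScan base for DataSetScan and
// BatchTableSourceScan, so shared row-type conversion logic lives in one place.
import org.apache.calcite.plan.{RelOptCluster, RelOptTable, RelTraitSet}
import org.apache.calcite.rel.core.TableScan

abstract class BatchScan(
    cluster: RelOptCluster,
    traitSet: RelTraitSet,
    table: RelOptTable)
  extends TableScan(cluster, traitSet, table) {

  // Subclasses produce the DataSet differently (forwarding an existing DataSet
  // vs. calling the BatchTableSource); shared conversion helpers would go here.
}
{code}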


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}

[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61120172
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/BatchTableSourceScan.scala
 ---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.dataset
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.calcite.rel.metadata.RelMetadataQuery
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.DataSet
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.BatchTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.BatchTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+
+import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+
+/** Flink RelNode to read data from an external source defined by a 
[[BatchTableSource]]. */
+class BatchTableSourceScan(
--- End diff --

I agree, both classes share a lot of code, but another difference apart 
from the unwrapping is that `DataSetScan` simply forwards its `DataSet` whereas 
`BatchTableSourceScan` creates a new `DataSet` using the `BatchTableSource` and 
the table environment. So I would like to keep separate classes, but it makes 
sense to let them have a common abstract base class. What do you think?
Btw. `DataSetSource` was recently renamed to `DataSetScan` ;-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3741) Travis Compile Error: MissingRequirementError: object scala.runtime in compiler mirror not found.

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258361#comment-15258361
 ] 

ASF GitHub Bot commented on FLINK-3741:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1931#issuecomment-214799876
  
This PR does not solve the issue FLINK-3741.


> Travis Compile Error: MissingRequirementError: object scala.runtime in 
> compiler mirror not found.
> -
>
> Key: FLINK-3741
> URL: https://issues.apache.org/jira/browse/FLINK-3741
> Project: Flink
>  Issue Type: Bug
>Reporter: Todd Lisonbee
>  Labels: CI, build
>
> Build failed on one of my pull requests and at least 3 others from other 
> people.
> Seems like problem is in latest master as of 4/11/2016 (my pull request only 
> had a Javadoc comment added).
> OpenJDK 7, hadoop.profile=1 and other profiles
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122460456/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122381837/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122589293/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122589296/log.txt
> Error:
> [INFO] 
> /home/travis/build/apache/flink/flink-libraries/flink-ml/src/main/scala:-1: 
> info: compiling
> [INFO] Compiling 43 source files to 
> /home/travis/build/apache/flink/flink-libraries/flink-ml/target/classes at 
> 1460421450188
> [ERROR] error: error while loading , error in opening zip file
> [ERROR] error: scala.reflect.internal.MissingRequirementError: object 
> scala.runtime in compiler mirror not found.
> [ERROR]   at 
> scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
> [ERROR]   at 
> scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:40)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getPackage(Mirrors.scala:172)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getRequiredPackage(Mirrors.scala:175)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackage$lzycompute(Definitions.scala:183)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackage(Definitions.scala:183)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackageClass$lzycompute(Definitions.scala:184)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackageClass(Definitions.scala:184)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.AnnotationDefaultAttr$lzycompute(Definitions.scala:1024)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.AnnotationDefaultAttr(Definitions.scala:1023)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreClasses$lzycompute(Definitions.scala:1153)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreClasses(Definitions.scala:1152)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode$lzycompute(Definitions.scala:1196)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode(Definitions.scala:1196)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1261)
> [INFO]at scala.tools.nsc.Global$Run.<init>(Global.scala:1290)
> [INFO]at scala.tools.nsc.Driver.doCompile(Driver.scala:32)
> [INFO]at scala.tools.nsc.Main$.doCompile(Main.scala:79)
> [INFO]at scala.tools.nsc.Driver.process(Driver.scala:54)
> [INFO]at scala.tools.nsc.Driver.main(Driver.scala:67)
> [INFO]at scala.tools.nsc.Main.main(Main.scala)
> [INFO]at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [INFO]at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> [INFO]at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [INFO]at java.lang.reflect.Method.invoke(Method.java:606)
> [INFO]at 
> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
> [INFO]at 
> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3741) Travis Compile Error: MissingRequirementError: object scala.runtime in compiler mirror not found.

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258362#comment-15258362
 ] 

ASF GitHub Bot commented on FLINK-3741:
---

Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/1931


> Travis Compile Error: MissingRequirementError: object scala.runtime in 
> compiler mirror not found.
> -
>
> Key: FLINK-3741
> URL: https://issues.apache.org/jira/browse/FLINK-3741
> Project: Flink
>  Issue Type: Bug
>Reporter: Todd Lisonbee
>  Labels: CI, build
>
> Build failed on one of my pull requests and at least 3 others from other 
> people.
> Seems like problem is in latest master as of 4/11/2016 (my pull request only 
> had a Javadoc comment added).
> OpenJDK 7, hadoop.profile=1 and other profiles
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122460456/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122381837/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122589293/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122589296/log.txt
> Error:
> [INFO] 
> /home/travis/build/apache/flink/flink-libraries/flink-ml/src/main/scala:-1: 
> info: compiling
> [INFO] Compiling 43 source files to 
> /home/travis/build/apache/flink/flink-libraries/flink-ml/target/classes at 
> 1460421450188
> [ERROR] error: error while loading , error in opening zip file
> [ERROR] error: scala.reflect.internal.MissingRequirementError: object 
> scala.runtime in compiler mirror not found.
> [ERROR]   at 
> scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
> [ERROR]   at 
> scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:40)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getPackage(Mirrors.scala:172)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getRequiredPackage(Mirrors.scala:175)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackage$lzycompute(Definitions.scala:183)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackage(Definitions.scala:183)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackageClass$lzycompute(Definitions.scala:184)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackageClass(Definitions.scala:184)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.AnnotationDefaultAttr$lzycompute(Definitions.scala:1024)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.AnnotationDefaultAttr(Definitions.scala:1023)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreClasses$lzycompute(Definitions.scala:1153)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreClasses(Definitions.scala:1152)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode$lzycompute(Definitions.scala:1196)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode(Definitions.scala:1196)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1261)
> [INFO]at scala.tools.nsc.Global$Run.<init>(Global.scala:1290)
> [INFO]at scala.tools.nsc.Driver.doCompile(Driver.scala:32)
> [INFO]at scala.tools.nsc.Main$.doCompile(Main.scala:79)
> [INFO]at scala.tools.nsc.Driver.process(Driver.scala:54)
> [INFO]at scala.tools.nsc.Driver.main(Driver.scala:67)
> [INFO]at scala.tools.nsc.Main.main(Main.scala)
> [INFO]at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [INFO]at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> [INFO]at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [INFO]at java.lang.reflect.Method.invoke(Method.java:606)
> [INFO]at 
> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
> [INFO]at 
> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3741] [build] Replace maven-scala-plugi...

2016-04-26 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1931#issuecomment-214799876
  
This PR does not solve the issue FLINK-3741.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3741] [build] Replace maven-scala-plugi...

2016-04-26 Thread tillrohrmann
Github user tillrohrmann closed the pull request at:

https://github.com/apache/flink/pull/1931


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2220] Join on Pojo without hashCode() s...

2016-04-26 Thread gallenvara
GitHub user gallenvara opened a pull request:

https://github.com/apache/flink/pull/1940

[FLINK-2220] Join on Pojo without hashCode() silently fails

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message

Add a check to verify that the POJO has overridden `hashCode()` and 
`equals()` when it is used as a key for operations (join, coGroup, etc.).
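
For reference, a minimal sketch of such a check (assumed names; this is not the patch in this PR): reflection can tell whether a class still inherits Object's identity-based implementations.

{code}
// Sketch only, with assumed names; not the code in PR #1940.
object PojoKeyChecks {

  /** True if the class (or a superclass other than Object) overrides both
    * hashCode() and equals(); otherwise hash partitioning uses object identity. */
  def overridesHashCodeAndEquals(clazz: Class[_]): Boolean = {
    val hashCodeOwner = clazz.getMethod("hashCode").getDeclaringClass
    val equalsOwner   = clazz.getMethod("equals", classOf[Object]).getDeclaringClass
    hashCodeOwner != classOf[Object] && equalsOwner != classOf[Object]
  }

  /** Emit a warning when a POJO key type relies on Object.hashCode()/equals(). */
  def warnIfMissing(clazz: Class[_]): Unit = {
    if (!overridesHashCodeAndEquals(clazz)) {
      System.err.println(
        s"${clazz.getName} is used as a key but does not override hashCode()/equals(); " +
          "records that are equal by content may be hashed to different partitions.")
    }
  }
}
{code}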



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gallenvara/flink flink-2220

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1940.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1940


commit 2f8bfe59540831f3e2e9b181f3a51f0565693cb2
Author: gallenvara 
Date:   2016-04-26T16:08:27Z

Check that hashCode() and equals() are overridden when a POJO is used as a key.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request:

2016-04-26 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:


https://github.com/apache/flink/commit/1ccc798914a2bd94157f2a86f7961bc1cc5de490#commitcomment-17257298
  
Yes, will add it right away.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3199) KafkaITCase.testOneToOneSources

2016-04-26 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258317#comment-15258317
 ] 

Till Rohrmann commented on FLINK-3199:
--

Found another instance: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/125795632/log.txt

> KafkaITCase.testOneToOneSources
> ---
>
> Key: FLINK-3199
> URL: https://issues.apache.org/jira/browse/FLINK-3199
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Matthias J. Sax
>Priority: Critical
>  Labels: test-stability
>
> https://travis-ci.org/mjsax/flink/jobs/100167558
> {noformat}
> Failed tests: 
> KafkaITCase.testOneToOneSources:96->KafkaConsumerTestBase.runOneToOneExactlyOnceTest:521->KafkaTestBase.tryExecute:318
>  Test failed: The program execution failed: Job execution failed.
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request:

2016-04-26 Thread uce
Github user uce commented on the pull request:


https://github.com/apache/flink/commit/1ccc798914a2bd94157f2a86f7961bc1cc5de490#commitcomment-17256873
  
Can we add this to release-1.0 as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3741) Travis Compile Error: MissingRequirementError: object scala.runtime in compiler mirror not found.

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258316#comment-15258316
 ] 

ASF GitHub Bot commented on FLINK-3741:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1931#issuecomment-214790629
  
The initial problem still persists :-( 

Still, I would like to merge this PR to update the `scala-maven-plugin` 
used in Flink-ML.


> Travis Compile Error: MissingRequirementError: object scala.runtime in 
> compiler mirror not found.
> -
>
> Key: FLINK-3741
> URL: https://issues.apache.org/jira/browse/FLINK-3741
> Project: Flink
>  Issue Type: Bug
>Reporter: Todd Lisonbee
>  Labels: CI, build
>
> Build failed on one of my pull requests and at least 3 others from other 
> people.
> Seems like problem is in latest master as of 4/11/2016 (my pull request only 
> had a Javadoc comment added).
> OpenJDK 7, hadoop.profile=1 and other profiles
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122460456/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122381837/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122589293/log.txt
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/122589296/log.txt
> Error:
> [INFO] 
> /home/travis/build/apache/flink/flink-libraries/flink-ml/src/main/scala:-1: 
> info: compiling
> [INFO] Compiling 43 source files to 
> /home/travis/build/apache/flink/flink-libraries/flink-ml/target/classes at 
> 1460421450188
> [ERROR] error: error while loading , error in opening zip file
> [ERROR] error: scala.reflect.internal.MissingRequirementError: object 
> scala.runtime in compiler mirror not found.
> [ERROR]   at 
> scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
> [ERROR]   at 
> scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:40)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getPackage(Mirrors.scala:172)
> [INFO]at 
> scala.reflect.internal.Mirrors$RootsBase.getRequiredPackage(Mirrors.scala:175)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackage$lzycompute(Definitions.scala:183)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackage(Definitions.scala:183)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackageClass$lzycompute(Definitions.scala:184)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.RuntimePackageClass(Definitions.scala:184)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.AnnotationDefaultAttr$lzycompute(Definitions.scala:1024)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.AnnotationDefaultAttr(Definitions.scala:1023)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreClasses$lzycompute(Definitions.scala:1153)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.syntheticCoreClasses(Definitions.scala:1152)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode$lzycompute(Definitions.scala:1196)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.symbolsNotPresentInBytecode(Definitions.scala:1196)
> [INFO]at 
> scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1261)
> [INFO]at scala.tools.nsc.Global$Run.<init>(Global.scala:1290)
> [INFO]at scala.tools.nsc.Driver.doCompile(Driver.scala:32)
> [INFO]at scala.tools.nsc.Main$.doCompile(Main.scala:79)
> [INFO]at scala.tools.nsc.Driver.process(Driver.scala:54)
> [INFO]at scala.tools.nsc.Driver.main(Driver.scala:67)
> [INFO]at scala.tools.nsc.Main.main(Main.scala)
> [INFO]at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [INFO]at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> [INFO]at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [INFO]at java.lang.reflect.Method.invoke(Method.java:606)
> [INFO]at 
> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
> [INFO]at 
> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3741] [build] Replace maven-scala-plugi...

2016-04-26 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1931#issuecomment-214790629
  
The initial problem still persists :-( 

Still, I would like to merge this PR to update the `scala-maven-plugin` 
used in Flink-ML.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258282#comment-15258282
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1939#issuecomment-214782320
  
Thanks @fhueske, PR looks good :)
Maybe we can refactor the source RelNode to avoid much code duplication. 
And do you think we need a test for `CsvTableSource`?


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3771) Methods for translating Graphs

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258288#comment-15258288
 ] 

ASF GitHub Bot commented on FLINK-3771:
---

Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1900#issuecomment-214782976
  
Hi @vasia,
Comparing the APIs, `Translate` processes at the level of a label or value, 
whereas `mapVertices/Edges` process whole vertices and edges. What are the use 
cases for the extra context (namely, the label) provided by `mapVertices/Edges`? 
Implementations of `Translator` can be reused across functions, whereas the 
`MapFunction` provided to `mapVertices/Edges` is specific to that method. 
Would `mapLabels` require two new `MapFunction`s or use an interface like 
`Translator`?

`Translator` is wrapped by a specific `MapFunction` for each `Translate` 
function.

Only one of the five current `Translate` functions operates on a `Graph`. 
`translateGraphLabels` could be moved to `Graph`, and `translateVertexValues` 
and `translateEdgeValues` could also be added to `Graph`. We don't always have 
a `Graph`; sometimes algorithms return a `DataSet` of vertices or edges, so 
having the static methods is helpful for these specialized `Vertex` and `Edge` 
operations.
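
As a sketch of the wrapping pattern described here (names like `Translator`, `OffsetTranslator`, and `TranslateMap` are illustrative assumptions, not necessarily the API in PR #1900):

{code}
// Illustrative sketch: a reusable per-value Translator and the MapFunction
// wrapper that a translate* method could apply to a DataSet of labels/values.
import org.apache.flink.api.common.functions.MapFunction

/** Translates a single label or value, independent of where it occurs. */
trait Translator[OLD, NEW] extends Serializable {
  def translate(value: OLD): NEW
}

/** Example: shift Long labels by an offset so generated graphs can be unioned. */
class OffsetTranslator(offset: Long) extends Translator[Long, Long] {
  override def translate(value: Long): Long = value + offset
}

/** Wraps a Translator so it can be used wherever a MapFunction is expected. */
class TranslateMap[OLD, NEW](translator: Translator[OLD, NEW])
  extends MapFunction[OLD, NEW] {
  override def map(value: OLD): NEW = translator.translate(value)
}
{code}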


> Methods for translating Graphs
> --
>
> Key: FLINK-3771
> URL: https://issues.apache.org/jira/browse/FLINK-3771
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> Provide methods for translation of the type or value of graph labels, vertex 
> values, and edge values.
> Sample use cases:
> * shifting graph labels in order to union generated graphs or graphs read 
> from multiple sources
> * downsizing labels or values since algorithms prefer to generate wide types 
> which may be expensive for further computation
> * changing label type for testing or benchmarking alternative code paths



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3771] [gelly] Methods for translating G...

2016-04-26 Thread greghogan
Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1900#issuecomment-214782976
  
Hi @vasia,
Comparing the APIs, `Translate` processes at the level of a label or value, 
whereas `mapVertices/Edges` process whole vertices and edges. What are the use 
cases for the extra context (namely, the label) provided by `mapVertices/Edges`? 
Implementations of `Translator` can be reused across functions, whereas the 
`MapFunction` provided to `mapVertices/Edges` is specific to that method. 
Would `mapLabels` require two new `MapFunction`s or use an interface like 
`Translator`?

`Translator` is wrapped by a specific `MapFunction` for each `Translate` 
function.

Only one of the five current `Translate` functions operates on a `Graph`. 
`translateGraphLabels` could be moved to `Graph`, and `translateVertexValues` 
and `translateEdgeValues` could also be added to `Graph`. We don't always have 
a `Graph`; sometimes algorithms return a `DataSet` of vertices or edges, so 
having the static methods is helpful for these specialized `Vertex` and `Edge` 
operations.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61103432
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/StreamTableSourceScan.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.StreamTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.StreamTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+import org.apache.flink.streaming.api.datastream.DataStream
+
+import scala.collection.JavaConversions._
+
+/** Flink RelNode to read data from an external source defined by a 
[[StreamTableSource]]. */
+class StreamTableSourceScan(
--- End diff --

Same with this class and `DataStreamSource`, no?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258271#comment-15258271
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61106226
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/table/test/TableSourceITCase.scala
 ---
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.table.test
+
+import org.apache.flink.api.common.io.GenericInputFormat
+import org.apache.flink.api.common.typeinfo.{TypeInformation, 
BasicTypeInfo}
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.java.{ExecutionEnvironment => JavaExecEnv, 
DataSet => JavaSet}
+import org.apache.flink.api.scala.ExecutionEnvironment
+import org.apache.flink.api.table.sources.BatchTableSource
+import org.apache.flink.api.table.typeutils.RowTypeInfo
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.api.table.test.utils.TableProgramsTestBase
+import 
org.apache.flink.api.table.test.utils.TableProgramsTestBase.TableConfigMode
+import 
org.apache.flink.test.util.MultipleProgramsTestBase.TestExecutionMode
+import org.apache.flink.test.util.TestBaseUtils
+import org.junit.Test
+import org.junit.runner.RunWith
+import org.junit.runners.Parameterized
+
+import scala.collection.JavaConverters._
+
+@RunWith(classOf[Parameterized])
+class TableSourceITCase(
+mode: TestExecutionMode,
+configMode: TableConfigMode)
+  extends TableProgramsTestBase(mode, configMode) {
+
+  @Test
+  def testStreamTableSourceTableAPI(): Unit = {
+
+val env = ExecutionEnvironment.getExecutionEnvironment
+val tEnv = TableEnvironment.getTableEnvironment(env)
+
+tEnv.registerTableSource("MyTestTable", new TestBatchTableSource())
+val results = tEnv
+  .scan("MyTestTable")
+  .where('amount < 4)
+  .select('amount * 'id, 'name)
+  .toDataSet[Row].collect()
+
+val expected = Seq(
+  "0,Record_0", "0,Record_16", "0,Record_32", "1,Record_1", 
"17,Record_17",
+  "36,Record_18", "4,Record_2", "57,Record_19", 
"9,Record_3").mkString("\n")
+TestBaseUtils.compareResultAsText(results.asJava, expected)
+  }
+
+  @Test
+  def testStreamTableSourceSQL(): Unit = {
--- End diff --

here too :)


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && 

[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258270#comment-15258270
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61106195
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/api/scala/table/test/TableSourceITCase.scala
 ---
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala.table.test
+
+import org.apache.flink.api.common.io.GenericInputFormat
+import org.apache.flink.api.common.typeinfo.{TypeInformation, 
BasicTypeInfo}
+import org.apache.flink.api.scala._
+import org.apache.flink.api.scala.table._
+import org.apache.flink.api.java.{ExecutionEnvironment => JavaExecEnv, 
DataSet => JavaSet}
+import org.apache.flink.api.scala.ExecutionEnvironment
+import org.apache.flink.api.table.sources.BatchTableSource
+import org.apache.flink.api.table.typeutils.RowTypeInfo
+import org.apache.flink.api.table.{Row, TableEnvironment}
+import org.apache.flink.api.table.test.utils.TableProgramsTestBase
+import 
org.apache.flink.api.table.test.utils.TableProgramsTestBase.TableConfigMode
+import 
org.apache.flink.test.util.MultipleProgramsTestBase.TestExecutionMode
+import org.apache.flink.test.util.TestBaseUtils
+import org.junit.Test
+import org.junit.runner.RunWith
+import org.junit.runners.Parameterized
+
+import scala.collection.JavaConverters._
+
+@RunWith(classOf[Parameterized])
+class TableSourceITCase(
+mode: TestExecutionMode,
+configMode: TableConfigMode)
+  extends TableProgramsTestBase(mode, configMode) {
+
+  @Test
+  def testStreamTableSourceTableAPI(): Unit = {
--- End diff --

-> test*Batch*TableSourceTableAPI?


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258266#comment-15258266
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61105940
  
--- Diff: 
flink-libraries/flink-table/src/test/java/org/apache/flink/api/java/table/test/TableSourceITCase.java
 ---
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.table.test;
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.table.BatchTableEnvironment;
+import org.apache.flink.api.scala.table.test.GeneratingInputFormat;
+import org.apache.flink.api.table.Row;
+import org.apache.flink.api.table.Table;
+import org.apache.flink.api.table.TableEnvironment;
+import org.apache.flink.api.table.sources.BatchTableSource;
+import org.apache.flink.api.table.test.utils.TableProgramsTestBase;
+import org.apache.flink.api.table.typeutils.RowTypeInfo;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.util.List;
+
+
+@RunWith(Parameterized.class)
+public class TableSourceITCase extends TableProgramsTestBase {
+
+   public TableSourceITCase(TestExecutionMode mode, TableConfigMode 
configMode) {
+   super(mode, configMode);
+   }
+
+   @Test
+   public void testStreamTableSourceTableAPI() throws Exception {
+   ExecutionEnvironment env = 
ExecutionEnvironment.getExecutionEnvironment();
+   BatchTableEnvironment tableEnv = 
TableEnvironment.getTableEnvironment(env, config());
+
+   tableEnv.registerTableSource("MyTable", new 
TestBatchTableSource());
+
+   Table result = tableEnv.scan("MyTable")
+   .where("amount < 4")
+   .select("amount * id, name");
+
+   DataSet resultSet = tableEnv.toDataSet(result, Row.class);
+   List results = resultSet.collect();
+
+   String expected = "0,Record_0\n" + "0,Record_16\n" + 
"0,Record_32\n" + "1,Record_1\n" +
+   "17,Record_17\n" + "36,Record_18\n" + "4,Record_2\n" + 
"57,Record_19\n" + "9,Record_3\n";
+
+   compareResultAsText(results, expected);
+   }
+
+   @Test
+   public void testStreamTableSourceSQL() throws Exception {
--- End diff --

-> test*Batch*TableSourceSQL?


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 

[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258264#comment-15258264
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61105889
  
--- Diff: 
flink-libraries/flink-table/src/test/java/org/apache/flink/api/java/table/test/TableSourceITCase.java
 ---
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.java.table.test;
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.table.BatchTableEnvironment;
+import org.apache.flink.api.scala.table.test.GeneratingInputFormat;
+import org.apache.flink.api.table.Row;
+import org.apache.flink.api.table.Table;
+import org.apache.flink.api.table.TableEnvironment;
+import org.apache.flink.api.table.sources.BatchTableSource;
+import org.apache.flink.api.table.test.utils.TableProgramsTestBase;
+import org.apache.flink.api.table.typeutils.RowTypeInfo;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.Parameterized;
+
+import java.util.List;
+
+
+@RunWith(Parameterized.class)
+public class TableSourceITCase extends TableProgramsTestBase {
+
+   public TableSourceITCase(TestExecutionMode mode, TableConfigMode 
configMode) {
+   super(mode, configMode);
+   }
+
+   @Test
+   public void testStreamTableSourceTableAPI() throws Exception {
--- End diff --

-> test*Batch*TableSourceTableAPI?


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258258#comment-15258258
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61105638
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/schema/TableSourceTable.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.schema
+
+import org.apache.flink.api.table.Row
+import org.apache.flink.api.table.sources.TableSource
+import org.apache.flink.api.table.typeutils.RowTypeInfo
+
+/** Table which defines an external table via a [[TableSource]] */
+class TableSourceTable(val tableSource: TableSource[_])
--- End diff --

This is my favorite table name after `TableTable` :))
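
Joking aside, the wrapper is presumably what a registerTableSource() call puts
into the catalog so that a later scan can resolve the external source. A short
usage sketch based on the ITCases in this PR:

{code}
val env = ExecutionEnvironment.getExecutionEnvironment
val tEnv = TableEnvironment.getTableEnvironment(env)

// registering the source is assumed to wrap it in a TableSourceTable internally
tEnv.registerTableSource("MyTestTable", new TestBatchTableSource())

// scanning the registered name then gives the planner a table scan over that source
val names = tEnv.scan("MyTestTable").select('name)
{code}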


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258231#comment-15258231
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61103432
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/datastream/StreamTableSourceScan.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.StreamTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.StreamTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+import org.apache.flink.streaming.api.datastream.DataStream
+
+import scala.collection.JavaConversions._
+
+/** Flink RelNode to read data from an external source defined by a 
[[StreamTableSource]]. */
+class StreamTableSourceScan(
--- End diff --

Same with this class and `DataStreamSource`, no?


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258247#comment-15258247
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61104611
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/dataSet/BatchTableSourceScanRule.scala
 ---
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.rules.dataSet
+
+import org.apache.calcite.plan.{Convention, RelOptRule, RelOptRuleCall, 
RelTraitSet}
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.convert.ConverterRule
+import org.apache.calcite.rel.core.TableScan
+import org.apache.calcite.rel.logical.LogicalTableScan
+import 
org.apache.flink.api.table.plan.nodes.dataset.{BatchTableSourceScan, 
DataSetConvention}
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.sources.BatchTableSource
+
+/** Rule to convert a [[LogicalTableScan]] into a 
[[BatchTableSourceScan]]. */
+class BatchTableSourceScanRule
+  extends ConverterRule(
+  classOf[LogicalTableScan],
+  Convention.NONE,
+  DataSetConvention.INSTANCE,
+  "BatchTableSourceScanRule")
+  {
+
+  /** Rule must only match if TableScan targets a [[BatchTableSource]] */
--- End diff --

Could we merge this rule with `DataSetScanRule`, so that a `TableSourceTable` 
is converted to a `BatchTableSourceScan` and a `DataSetTable` to a 
`DataSetSource`?
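
A merged rule could match both kinds of tables and decide in convert() which
scan node to create. A rough sketch only, using the class names discussed here
(not the PR's code):

{code}
override def matches(call: RelOptRuleCall): Boolean = {
  val scan: TableScan = call.rel(0)
  // fire for tables backed by a TableSource as well as for registered DataSets
  val isTableSource = scan.getTable.unwrap(classOf[TableSourceTable]) != null
  val isDataSet = scan.getTable.unwrap(classOf[DataSetTable[_]]) != null
  isTableSource || isDataSet
}
{code}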


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258219#comment-15258219
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61101726
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/nodes/dataset/BatchTableSourceScan.scala
 ---
@@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.plan.nodes.dataset
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.RelNode
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.TableScan
+import org.apache.calcite.rel.metadata.RelMetadataQuery
+import org.apache.flink.api.common.functions.MapFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.DataSet
+import org.apache.flink.api.java.typeutils.PojoTypeInfo
+import org.apache.flink.api.table.BatchTableEnvironment
+import org.apache.flink.api.table.codegen.CodeGenerator
+import org.apache.flink.api.table.plan.schema.TableSourceTable
+import org.apache.flink.api.table.runtime.MapRunner
+import org.apache.flink.api.table.sources.BatchTableSource
+import 
org.apache.flink.api.table.typeutils.TypeConverter.determineReturnType
+
+import scala.collection.JavaConversions._
+import scala.collection.JavaConverters._
+
+/** Flink RelNode to read data from an external source defined by a 
[[BatchTableSource]]. */
+class BatchTableSourceScan(
--- End diff --

This class has much in common with `DataSetSource`; the only difference is how 
we unwrap the table, right? Could we maybe refactor to share the common code?
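
One possible shape for such a refactoring, sketched under the assumption that
the shared part is the conversion of the input DataSet into the expected row
type (this is an illustration, not the PR's code):

{code}
// hypothetical mix-in shared by BatchTableSourceScan and DataSetSource
trait BatchScan {

  // the two nodes differ only in how they obtain their input DataSet ...
  def getInputDataSet(tableEnv: BatchTableEnvironment): DataSet[Any]

  // ... and could share the conversion of that DataSet into the expected type
  def convertToExpectedType(
      input: DataSet[Any],
      expectedType: TypeInformation[Any]): DataSet[Any]
}
{code}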


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61098440
  
--- Diff: docs/apis/batch/libs/table.md ---
@@ -67,6 +67,165 @@ The central concept of the Table API is a `Table` which 
represents a table with
 
 The following sections show by example how to use the Table API embedded 
in the Scala and Java DataSet APIs.
 
+### Registering Tables to and Accessing Tables from TableEnvironments
+
+`TableEnvironment`s have an internal table catalog to which tables can be 
registered with a unique name. After registration, a table can be accessed from 
the `TableEnvironment` by its name. Tables can be registered in different ways.
+
+#### Register a DataSet
+
+A `DataSet` is registered as a `Table` in a `BatchTableEnvironment` as 
follows:
+
+
+
+{% highlight java %}
+ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
+
+// register the DataSet cust as table "Customers" with fields derived from 
the dataset
+tableEnv.registerDataSet("Customers", cust);
+
+// register the DataSet ord as table "Orders" with fields user, product, 
and amount
+tableEnv.registerDataSet("Orders", ord, "user, product, amount");
+{% endhighlight %}
+
+
+
+{% highlight scala %}
+val env = ExecutionEnvironment.getExecutionEnvironment
+val tableEnv = TableEnvironment.getTableEnvironment(env)
+
+// register the DataSet cust as table "Customers" with fields derived from 
the dataset
+tableEnv.registerDataSet("Customers", cust)
+
+// register the DataSet ord as table "Orders" with fields user, product, 
and amount
+tableEnv.registerDataSet("Orders", ord, 'user, 'product, 'amount)
+{% endhighlight %}
+
+
+
+*Note: DataSet table names are not allowed to follow the 
`^_DataSetTable_[0-9]+` pattern, as these are reserved for internal use only.*
+
+#### Register a DataStream
+
+A `DataStream` is registered as a `Table` in a `StreamTableEnvironment` as 
follows:
+
+
+
+{% highlight java %}
+StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+StreamTableEnvironment tableEnv = 
TableEnvironment.getTableEnvironment(env);
+
+// register the DataStream cust as table "Customers" with fields derived 
from the datastream
+tableEnv.registerDataStream("Customers", cust);
+
+// register the DataStream ord as table "Orders" with fields user, 
product, and amount
+tableEnv.registerDataStream("Orders", ord, "user, product, amount");
+{% endhighlight %}
+
+
+
+{% highlight scala %}
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+val tableEnv = TableEnvironment.getTableEnvironment(env)
+
+// register the DataStream cust as table "Customers" with fields derived 
from the datastream
+tableEnv.registerDataStream("Customers", cust)
+
+// register the DataStream ord as table "Orders" with fields user, 
product, and amount
+tableEnv.registerDataStream("Orders", ord, 'user, 'product, 'amount)
+{% endhighlight %}
+
+
+
+*Note: DataStream table names are not allowed to follow the 
`^_DataStreamTable_[0-9]+` pattern, as these are reserved for internal use 
only.*
+
+#### Register a Table
+
+A `Table` that originates from a Table API operation or a SQL query is registered in a `TableEnvironment` as follows:
+
+
+
+{% highlight java %}
+// works for StreamExecutionEnvironment identically
+ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
+BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
+
+// convert a DataSet into a Table
+Table custT = tableEnv
+  .toTable(custDs, "name, zipcode")
+  .where("zipcode = '12345'")
+  .select("name");
+
+// register the Table custT as table "custNames"
+tableEnv.registerTable("custNames", custT);
+{% endhighlight %}
+
+
+
+{% highlight scala %}
+// works for StreamExecutionEnvironment identically
+val env = ExecutionEnvironment.getExecutionEnvironment
+val tableEnv = TableEnvironment.getTableEnvironment(env)
+
+// convert a DataSet into a Table
+val custT = custDs
+  .toTable(tableEnv, 'name, 'zipcode)
+  .where('zipcode === "12345")
+  .select('name)
+
+// register the Table custT as table "custNames"
+tableEnv.registerTable("custNames", custT)
+{% endhighlight %}
+
+
+
+A registered `Table` that originates from a Table API operation or a SQL query is treated similarly to a view known from relational DBMSs, i.e., it can be inlined when the query is optimized.
+
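For illustration, once registered, such a table can be referenced by name from a SQL query just like a view (sketch only; it assumes a `sql()` entry point on the `TableEnvironment`, which is not shown in this section):

{% highlight java %}
// "custNames" was registered above; the optimizer may inline its defining query,
// exactly as it would for a view.
Table filtered = tableEnv.sql("SELECT name FROM custNames WHERE name LIKE 'A%'");
{% endhighlight %}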
+#### Register an external table using a 

[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258192#comment-15258192
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61098044
  
--- Diff: docs/apis/batch/libs/table.md ---
@@ -67,6 +67,165 @@ The central concept of the Table API is a `Table` which 
represents a table with
 
 The following sections show by example how to use the Table API embedded 
in the Scala and Java DataSet APIs.
 
+### Registering Tables to and Accessing Tables from TableEnvironments
+
+`TableEnvironment`s have an internal table catalog to which tables can be 
registered with a unique name. After registration, a table can be accessed from 
the `TableEnvironment` by its name. Tables can be registered in different ways.
--- End diff --

Shall we add a comment that registering a table is only required for SQL 
queries and not for "pure" Table API programs?


> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}
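
For orientation, a rough sketch of the two table source flavors described above; the names and signatures below are illustrative only, not the actual interfaces proposed in the pull request:

{code}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

import java.util.List;

// Common schema contract (illustrative).
interface TableSource<T> {
	String[] getFieldNames();              // field names of the produced table
	TypeInformation<?>[] getFieldTypes();  // field types of the produced table
}

// Static: returns a fixed DataSet, no adaptation to the query plan.
interface StaticTableSource<T> extends TableSource<T> {
	DataSet<T> getDataSet(ExecutionEnvironment env);
}

// Adaptive: is notified about resolved fields and predicates and may
// return an adapted DataSet in the "translate" step.
interface AdaptiveTableSource<T> extends TableSource<T> {
	void notifyResolvedField(String field, boolean filtering);
	void notifyPredicates(List<String> predicates);
	DataSet<T> getDataSet(ExecutionEnvironment env);
}
{code}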



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/1939#discussion_r61098044
  
--- Diff: docs/apis/batch/libs/table.md ---
@@ -67,6 +67,165 @@ The central concept of the Table API is a `Table` which 
represents a table with
 
 The following sections show by example how to use the Table API embedded 
in the Scala and Java DataSet APIs.
 
+### Registering Tables to and Accessing Tables from TableEnvironments
+
+`TableEnvironment`s have an internal table catalog to which tables can be 
registered with a unique name. After registration, a table can be accessed from 
the `TableEnvironment` by its name. Tables can be registered in different ways.
--- End diff --

Shall we add a comment that registering a table is only required for SQL 
queries and not for "pure" Table API programs?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request:

2016-04-26 Thread aljoscha
Github user aljoscha commented on the pull request:


https://github.com/apache/flink/commit/879bb1bb029ed37e33adc7a655328940135cfcb3#commitcomment-17254978
  
Could do, but it does not really impact performance very much.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3086) ExpressionParser does not support concatenation of suffix operations

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258160#comment-15258160
 ] 

ASF GitHub Bot commented on FLINK-3086:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1912#issuecomment-214762168
  
I'm not very familiar with the `ExpressionParser`, but I don't see anything 
suspicious :)


> ExpressionParser does not support concatenation of suffix operations
> 
>
> Key: FLINK-3086
> URL: https://issues.apache.org/jira/browse/FLINK-3086
> Project: Flink
>  Issue Type: Bug
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Timo Walther
>
> The ExpressionParser of the Table API does not support concatenation of 
> suffix operations. e.g. 
> {code}table.select("field.cast(STRING).substring(2)"){code} throws  an 
> exception.
> {code}
> org.apache.flink.api.table.ExpressionException: Could not parse expression: 
> string matching regex `\z' expected but `.' found
>   at 
> org.apache.flink.api.table.parser.ExpressionParser$.parseExpressionList(ExpressionParser.scala:224)
> {code}
> However, the Scala implicit Table Expression API supports this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3086] [table] ExpressionParser does not...

2016-04-26 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/1912#issuecomment-214762168
  
I'm not very familiar with the `ExpressionParser`, but I don't see anything 
suspicious :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request:

2016-04-26 Thread uce
Github user uce commented on the pull request:


https://github.com/apache/flink/commit/879bb1bb029ed37e33adc7a655328940135cfcb3#commitcomment-17254739
  
Should we add this to `release-1.0` as well?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3777) Add open and close methods to manage IF lifecycle

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258127#comment-15258127
 ] 

ASF GitHub Bot commented on FLINK-3777:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1903


> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>  Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an inputFormat are not managed, 
> although open() could be (improperly IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database 
> connections) and manage their release. 
> Probably the best option would be to add 2 methods (i.e. openInputFormat() 
> and closeInputFormat() ) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename 
> the current open() and close() to openSplit() and closeSplit() respectively 
> while using open() and close() methods for the IF lifecycle management, but 
> this would cause a backward compatibility issue...
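
To make the proposed lifecycle concrete, here is a sketch of an input format that reuses a connection across splits; it assumes the new openInputFormat()/closeInputFormat() hooks land on RichInputFormat (which GenericInputFormat extends) without checked exceptions:

{code}
import org.apache.flink.api.common.io.GenericInputFormat;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ConnectionReusingInputFormat extends GenericInputFormat<String> {

	// expensive resource, acquired once per task and reused across all splits
	private transient Connection connection;
	private boolean emitted;

	@Override
	public void openInputFormat() {
		// proposed hook: called once before the first split is opened
		try {
			connection = DriverManager.getConnection("jdbc:h2:mem:example"); // placeholder URL
		} catch (SQLException e) {
			throw new RuntimeException("Could not open connection", e);
		}
	}

	@Override
	public void closeInputFormat() {
		// proposed hook: called once after the last split was closed
		try {
			if (connection != null) {
				connection.close();
			}
		} catch (SQLException e) {
			throw new RuntimeException("Could not close connection", e);
		}
	}

	@Override
	public boolean reachedEnd() {
		return emitted;
	}

	@Override
	public String nextRecord(String reuse) {
		// per-split reading would use the shared connection here
		emitted = true;
		return "dummy record";
	}
}
{code}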



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-3777) Add open and close methods to manage IF lifecycle

2016-04-26 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-3777.
---
Resolution: Fixed

Implemented in ac2137cfa5e63bd4f53a4b7669dc591ab210093f

> Add open and close methods to manage IF lifecycle
> -
>
> Key: FLINK-3777
> URL: https://issues.apache.org/jira/browse/FLINK-3777
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.1
>Reporter: Flavio Pompermaier
>Assignee: Flavio Pompermaier
>  Labels: inputformat, lifecycle
>
> At the moment the opening and closing of an inputFormat are not managed, 
> although open() could be (improperly IMHO) simulated by configure().
> This limits the possibility to reuse expensive resources (like database 
> connections) and manage their release. 
> Probably the best option would be to add 2 methods (i.e. openInputFormat() 
> and closeInputFormat() ) to RichInputFormat*
> * NOTE: the best option from a "semantic" point of view would be to rename 
> the current open() and close() to openSplit() and closeSplit() respectively 
> while using open() and close() methods for the IF lifecycle management, but 
> this would cause a backward compatibility issue...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3777] Managed closeInputFormat()

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1903


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3618) Rename abstract UDF classes in Scatter-Gather implementation

2016-04-26 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258097#comment-15258097
 ] 

Vasia Kalavri commented on FLINK-3618:
--

I agree. The sooner we make this change the better. Would any of you like to 
take over this issue [~mju], [~greghogan]?

> Rename abstract UDF classes in Scatter-Gather implementation
> 
>
> Key: FLINK-3618
> URL: https://issues.apache.org/jira/browse/FLINK-3618
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.1.0, 1.0.1
>Reporter: Martin Junghanns
>Priority: Minor
>
> We now offer three Vertex-centric computing abstractions:
> * Pregel
> * Gather-Sum-Apply
> * Scatter-Gather
> Each of these abstractions provides abstract classes that need to be 
> implemented by the user:
> * Pregel: {{ComputeFunction}}
> * GSA: {{GatherFunction}}, {{SumFunction}}, {{ApplyFunction}}
> * Scatter-Gather: {{MessagingFunction}}, {{VertexUpdateFunction}}
> In Pregel and GSA, the names of those functions follow the name of the 
> abstraction or the name suggested in the corresponding papers. For 
> consistency of the API, I propose to rename {{MessagingFunction}} to 
> {{ScatterFunction}} and {{VertexUpdateFunction}} to {{GatherFunction}}.
> Also for consistency, I would like to change the parameter order in 
> {{Graph.runScatterGatherIteration(VertexUpdateFunction f1, MessagingFunction 
> f2}} to  {{Graph.runScatterGatherIteration(ScatterFunction f1, GatherFunction 
> f2}} (like in {{Graph.runGatherSumApplyFunction(...)}})
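
A sketch of what user code could look like after the proposed renaming; it assumes the renamed ScatterFunction / GatherFunction keep the contracts of today's MessagingFunction / VertexUpdateFunction (these class names do not exist yet):

{code}
// Scatter phase: propagate the current vertex value to all neighbors.
class MinScatter extends ScatterFunction<Long, Long, Long, Double> {
	@Override
	public void sendMessages(Vertex<Long, Long> vertex) throws Exception {
		for (Edge<Long, Double> edge : getEdges()) {
			sendMessageTo(edge.getTarget(), vertex.getValue());
		}
	}
}

// Gather phase: keep the minimum of the received values.
class MinGather extends GatherFunction<Long, Long, Long> {
	@Override
	public void updateVertex(Vertex<Long, Long> vertex, MessageIterator<Long> messages) throws Exception {
		long min = vertex.getValue();
		for (long m : messages) {
			min = Math.min(min, m);
		}
		if (min < vertex.getValue()) {
			setNewVertexValue(min);
		}
	}
}

// proposed parameter order: scatter first, then gather
// Graph<Long, Long, Double> result =
//     graph.runScatterGatherIteration(new MinScatter(), new MinGather(), maxIterations);
{code}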



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2828] [tableAPI] Add TableSource interf...

2016-04-26 Thread fhueske
GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/1939

[FLINK-2828] [tableAPI] Add TableSource interfaces for external tables.

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

- Add CsvTableSource as a reference implementation for TableSources.
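
For reviewers, a minimal usage sketch of the CSV source added here; the constructor and the registration method name are assumptions based on my reading of this PR and may differ in the final version:

{code}
// Sketch: expose a CSV file to the Table API through the new TableSource abstraction.
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
BatchTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);

CsvTableSource csvSource = new CsvTableSource(
	"/path/to/customers.csv",            // placeholder path
	new String[] { "name", "zipcode" },
	new TypeInformation<?>[] {
		BasicTypeInfo.STRING_TYPE_INFO,
		BasicTypeInfo.STRING_TYPE_INFO
	});

// register the source under a name; it can then be used like any other registered table
tableEnv.registerTableSource("Customers", csvSource);
{code}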

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink tableSource

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1939.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1939


commit 06076767623895cc380d98543417feb3493396d9
Author: Fabian Hueske 
Date:   2016-04-25T17:00:09Z

[FLINK-2828] [tableAPI] Add TableSource interfaces for external tables.

- Add CsvTableSource as a reference implementation for TableSources.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2828) Add interfaces for Table API input formats

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258078#comment-15258078
 ] 

ASF GitHub Bot commented on FLINK-2828:
---

GitHub user fhueske opened a pull request:

https://github.com/apache/flink/pull/1939

[FLINK-2828] [tableAPI] Add TableSource interfaces for external tables.

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message

- [X] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed

- Add CsvTableSource as a reference implementation for TableSources.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/fhueske/flink tableSource

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1939.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1939


commit 06076767623895cc380d98543417feb3493396d9
Author: Fabian Hueske 
Date:   2016-04-25T17:00:09Z

[FLINK-2828] [tableAPI] Add TableSource interfaces for external tables.

- Add CsvTableSource as a reference implementation for TableSources.




> Add interfaces for Table API input formats
> --
>
> Key: FLINK-2828
> URL: https://issues.apache.org/jira/browse/FLINK-2828
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API
>Reporter: Timo Walther
>Assignee: Fabian Hueske
>
> In order to support input formats for the Table API, interfaces are 
> necessary. I propose two types of TableSources:
> - AdaptiveTableSources can adapt their output to the requirements of the 
> plan. Although the output schema stays the same, the TableSource can react on 
> field resolution and/or predicates internally and can return adapted 
> DataSet/DataStream versions in the "translate" step.
> - StaticTableSources are an easy way to provide the Table API with additional 
> input formats without much implementation effort (e.g. for fromCsvFile())
> TableSources need to be deeply integrated into the Table API.
> The TableEnvironment requires a newly introduced AbstractExecutionEnvironment 
> (common super class of all ExecutionEnvironments for DataSets and 
> DataStreams).
> Here's what a TableSource can see from more complicated queries:
> {code}
> getTableJava(tableSource1)
>   .filter("a===5 || a===6")
>   .select("a as a4, b as b4, c as c4")
>   .filter("b4===7")
>   .join(getTableJava(tableSource2))
>   .where("a===a4 && c==='Test' && c4==='Test2'")
> // Result predicates for tableSource1:
> //  List("a===5 || a===6", "b===7", "c==='Test2'")
> // Result predicates for tableSource2:
> //  List("c==='Test'")
> // Result resolved fields for tableSource1 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("a", false), ("b", true), ("b", false), ("c", false), 
> ("c", true))
> // Result resolved fields for tableSource2 (true = filtering, 
> false=selection):
> //  Set(("a", true), ("c", true))
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3771) Methods for translating Graphs

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258076#comment-15258076
 ] 

ASF GitHub Bot commented on FLINK-3771:
---

Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/1900#discussion_r61083553
  
--- Diff: 
flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/translate/Translate.java
 ---
@@ -0,0 +1,362 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.translate;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.DataSet;
+import 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.graph.Edge;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.graph.Vertex;
+
+/**
+ * Methods for translation of the type or modification of the data of graph
+ * labels, vertex values, and edge values.
+ */
+public class Translate {
+
+   // --------------------------------------------------------------------------------------------
+   //  Translate graph labels
+   // --------------------------------------------------------------------------------------------
+
+   /**
+    * Relabels {@link Vertex Vertices} and {@link Edge}s of a {@link Graph} using the given {@link Translator}.
+    *
+    * @param graph input graph
+    * @param translator implements conversion from {@code OLD} to {@code NEW}
+    * @param <OLD> old graph label type
+    * @param <NEW> new graph label type
+    * @param <VV> vertex value type
+    * @param <EV> edge value type
+    * @return translated graph
+    */
+   public static <OLD, NEW, VV, EV> Graph<NEW, VV, EV> translateGraphLabels(Graph<OLD, VV, EV> graph, Translator<OLD, NEW> translator) {
+       return translateGraphLabels(graph, translator, ExecutionConfig.PARALLELISM_UNKNOWN);
+   }
+
+   /**
+    * Relabels {@link Vertex Vertices} and {@link Edge}s of a {@link Graph} using the given {@link Translator}.
+    *
+    * @param graph input graph
+    * @param translator implements conversion from {@code OLD} to {@code NEW}
+    * @param parallelism operator parallelism
+    * @param <OLD> old graph label type
+    * @param <NEW> new graph label type
+    * @param <VV> vertex value type
+    * @param <EV> edge value type
+    * @return translated graph
+    */
+   public static <OLD, NEW, VV, EV> Graph<NEW, VV, EV> translateGraphLabels(Graph<OLD, VV, EV> graph, Translator<OLD, NEW> translator, int parallelism) {
+       // Vertices
+       DataSet<Vertex<NEW, VV>> translatedVertices = translateVertexLabels(graph.getVertices(), translator, parallelism);
+
+       // Edges
+       DataSet<Edge<NEW, EV>> translatedEdges = translateEdgeLabels(graph.getEdges(), translator, parallelism);
+
+       // Graph
+       return Graph.fromDataSet(translatedVertices, translatedEdges, graph.getContext());
+   }
+
+   // --------------------------------------------------------------------------------------------
+   //  Translate vertex labels
+   // --------------------------------------------------------------------------------------------
+
+   /**
+    * Translate {@link Vertex} labels using the given {@link Translator}.
+    *
+    * @param vertices input vertices
+    * @param translator implements conversion from {@code OLD} to {@code NEW}
+    * @param <OLD> old vertex label type
+    * @param <NEW> new vertex label type
+    * @param <VV> vertex value type
+    * @return translated vertices
+    */
+   public static <OLD, NEW, VV> DataSet<Vertex<NEW, VV>> translateVertexLabels(DataSet<Vertex<OLD, VV>> 
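
The quoted class is cut off above. As orientation, a usage sketch; it assumes Translator is a simple OLD-to-NEW conversion function, which may not match the exact contract in this PR:

{code}
// Sketch: relabel a Graph<LongValue, Long, Double> to String vertex ids.
Graph<String, Long, Double> relabeled = Translate.translateGraphLabels(
	graph,
	new Translator<LongValue, String>() {
		@Override
		public String translate(LongValue value, String reuse) {
			return "v" + value.getValue();  // derive the new label from the old one
		}
	});
{code}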

[GitHub] flink pull request: [FLINK-3771] [gelly] Methods for translating G...

2016-04-26 Thread greghogan
Github user greghogan commented on a diff in the pull request:

https://github.com/apache/flink/pull/1900#discussion_r61083553
  
--- Diff: 
flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/translate/Translate.java
 ---
@@ -0,0 +1,362 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.translate;
+
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.DataSet;
+import 
org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields;
+import org.apache.flink.api.java.typeutils.TupleTypeInfo;
+import org.apache.flink.graph.Edge;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.graph.Vertex;
+
+/**
+ * Methods for translation of the type or modification of the data of graph
+ * labels, vertex values, and edge values.
+ */
+public class Translate {
+
+   // --------------------------------------------------------------------------------------------
+   //  Translate graph labels
+   // --------------------------------------------------------------------------------------------
+
+   /**
+    * Relabels {@link Vertex Vertices} and {@link Edge}s of a {@link Graph} using the given {@link Translator}.
+    *
+    * @param graph input graph
+    * @param translator implements conversion from {@code OLD} to {@code NEW}
+    * @param <OLD> old graph label type
+    * @param <NEW> new graph label type
+    * @param <VV> vertex value type
+    * @param <EV> edge value type
+    * @return translated graph
+    */
+   public static <OLD, NEW, VV, EV> Graph<NEW, VV, EV> translateGraphLabels(Graph<OLD, VV, EV> graph, Translator<OLD, NEW> translator) {
+       return translateGraphLabels(graph, translator, ExecutionConfig.PARALLELISM_UNKNOWN);
+   }
+
+   /**
+    * Relabels {@link Vertex Vertices} and {@link Edge}s of a {@link Graph} using the given {@link Translator}.
+    *
+    * @param graph input graph
+    * @param translator implements conversion from {@code OLD} to {@code NEW}
+    * @param parallelism operator parallelism
+    * @param <OLD> old graph label type
+    * @param <NEW> new graph label type
+    * @param <VV> vertex value type
+    * @param <EV> edge value type
+    * @return translated graph
+    */
+   public static <OLD, NEW, VV, EV> Graph<NEW, VV, EV> translateGraphLabels(Graph<OLD, VV, EV> graph, Translator<OLD, NEW> translator, int parallelism) {
+       // Vertices
+       DataSet<Vertex<NEW, VV>> translatedVertices = translateVertexLabels(graph.getVertices(), translator, parallelism);
+
+       // Edges
+       DataSet<Edge<NEW, EV>> translatedEdges = translateEdgeLabels(graph.getEdges(), translator, parallelism);
+
+       // Graph
+       return Graph.fromDataSet(translatedVertices, translatedEdges, graph.getContext());
+   }
+
+   // --------------------------------------------------------------------------------------------
+   //  Translate vertex labels
+   // --------------------------------------------------------------------------------------------
+
+   /**
+    * Translate {@link Vertex} labels using the given {@link Translator}.
+    *
+    * @param vertices input vertices
+    * @param translator implements conversion from {@code OLD} to {@code NEW}
+    * @param <OLD> old vertex label type
+    * @param <NEW> new vertex label type
+    * @param <VV> vertex value type
+    * @return translated vertices
+    */
+   public static <OLD, NEW, VV> DataSet<Vertex<NEW, VV>> translateVertexLabels(DataSet<Vertex<OLD, VV>> vertices, Translator<OLD, NEW> translator) {
+       return translateVertexLabels(vertices, translator, ExecutionConfig.PARALLELISM_UNKNOWN);
+   }
+
+   /**
+    * Translate {@link Vertex} labels using the given 

[jira] [Resolved] (FLINK-3799) Graph checksum should execute single job

2016-04-26 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan resolved FLINK-3799.
---
Resolution: Implemented

Implemented in 06bf4bf3f465b1c019b40dfd7c1662507d7fa2e3

> Graph checksum should execute single job
> 
>
> Key: FLINK-3799
> URL: https://issues.apache.org/jira/browse/FLINK-3799
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> {{GraphUtils.checksumHashCode()}} calls {{DataSetUtils.checksumHashCode()}} 
> for both the vertex and edge {{DataSet}} which each require a separate job. 
> Rewrite this to only execute a single job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3799) Graph checksum should execute single job

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258049#comment-15258049
 ] 

ASF GitHub Bot commented on FLINK-3799:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1922


> Graph checksum should execute single job
> 
>
> Key: FLINK-3799
> URL: https://issues.apache.org/jira/browse/FLINK-3799
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> {{GraphUtils.checksumHashCode()}} calls {{DataSetUtils.checksumHashCode()}} 
> for both the vertex and edge {{DataSet}} which each require a separate job. 
> Rewrite this to only execute a single job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3799] [gelly] Graph checksum should exe...

2016-04-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1922


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-3799) Graph checksum should execute single job

2016-04-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15258044#comment-15258044
 ] 

ASF GitHub Bot commented on FLINK-3799:
---

Github user greghogan commented on the pull request:

https://github.com/apache/flink/pull/1922#issuecomment-214734797
  
Merging ...


> Graph checksum should execute single job
> 
>
> Key: FLINK-3799
> URL: https://issues.apache.org/jira/browse/FLINK-3799
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.1.0
>
>
> {{GraphUtils.checksumHashCode()}} calls {{DataSetUtils.checksumHashCode()}} 
> for both the vertex and edge {{DataSet}} which each require a separate job. 
> Rewrite this to only execute a single job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-3777] Managed closeInputFormat()

2016-04-26 Thread zentol
Github user zentol commented on the pull request:

https://github.com/apache/flink/pull/1903#issuecomment-214734647
  
Merging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

