

[jira] [Commented] (FLINK-5954) Always assign names to the window in the Stream SQL API

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893879#comment-15893879
 ] 

ASF GitHub Bot commented on FLINK-5954:
---

Github user sunjincheng121 commented on the issue:

https://github.com/apache/flink/pull/3461
  
Hi @haohu, thanks for your contribution!
Can you tell me why we need this change? And could you add a simple unit test?

Best,
SunJincheng


> Always assign names to the window in the Stream SQL API
> ---
>
> Key: FLINK-5954
> URL: https://issues.apache.org/jira/browse/FLINK-5954
> Project: Flink
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> CALCITE-1603 and CALCITE-1615 bring in support for {{TUMBLE}}, {{HOP}}, and 
> {{SESSION}} grouped windows, as well as the corresponding auxiliary functions 
> that allow users to query the start and the end of the windows (e.g., 
> {{TUMBLE_START()}} and {{TUMBLE_END()}}; see 
> http://calcite.apache.org/docs/stream.html for more details).
> The goal of this JIRA is to add support for these auxiliary functions in 
> Flink. Flink already has runtime support for them, as these functions are 
> essentially mapped to the {{WindowStart}} and {{WindowEnd}} classes.
> To implement this feature, the transformation needs to 
> recognize these functions and map them to the {{WindowStart}} and 
> {{WindowEnd}} classes.
> The problem is that both classes can only refer to windows by alias. 
> Therefore this JIRA proposes to assign a unique name to each window to 
> enable the transformation.
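>
> A query using these auxiliary functions might look like the following (a 
> sketch only; the stream and column names are illustrative, not from the 
> issue):
> {code}
> SELECT 
>   c, 
>   TUMBLE_START(rowtime, INTERVAL '1' HOUR) AS wStart, 
>   TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS wEnd, 
>   SUM(b) AS sumB
> FROM myStream
> GROUP BY c, TUMBLE(rowtime, INTERVAL '1' HOUR)
> {code}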



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5654) Add processing time OVER RANGE BETWEEN x PRECEDING aggregation to SQL

2017-03-02 Thread radu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893873#comment-15893873
 ] 

radu commented on FLINK-5654:
-

Thanks sunjincheng - I saw the message too late, unfortunately. I can do another 
pull request after the commit is merged to update the aggregations to your 
interface.

> Add processing time OVER RANGE BETWEEN x PRECEDING aggregation to SQL
> -
>
> Key: FLINK-5654
> URL: https://issues.apache.org/jira/browse/FLINK-5654
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: radu
>
> The goal of this issue is to add support for OVER RANGE aggregations on 
> processing time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN INTERVAL '1' 
> HOUR PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN INTERVAL '1' 
> HOUR PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single 
> threaded execution).
> - The ORDER BY clause may only have procTime() as parameter. procTime() is a 
> parameterless scalar function that just indicates processing time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5657)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROW aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).





[jira] [Commented] (FLINK-5791) Resource should be strictly matched when allocating for yarn

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893867#comment-15893867
 ] 

ASF GitHub Bot commented on FLINK-5791:
---

Github user shuai-xu commented on the issue:

https://github.com/apache/flink/pull/3304
  
The failed case passes in my local working copy; I think it may be due to the 
Travis environment.


> Resource should be strictly matched when allocating for yarn
> 
>
> Key: FLINK-5791
> URL: https://issues.apache.org/jira/browse/FLINK-5791
> Project: Flink
>  Issue Type: Improvement
>  Components: YARN
>Reporter: shuai.xu
>Assignee: shuai.xu
>  Labels: flip-6
>
> In FLIP-6, for YARN mode, resources should be assigned as requested to avoid 
> resource waste and OOM.
> 1. YarnResourceManager will request containers according to the ResourceProfile 
> in the slot request from the JM.
> 2. The RM will pass the ResourceProfile to the TM for initializing its slots.
> 3. The RM should strictly match the slots offered by the TM with the SlotRequest from the JM.







[jira] [Commented] (FLINK-5927) Remove old Aggregate interface and built-in functions

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893844#comment-15893844
 ] 

ASF GitHub Bot commented on FLINK-5927:
---

GitHub user shaoxuan-wang opened a pull request:

https://github.com/apache/flink/pull/3465

[FLINK-5927] [table] Remove old Aggregate interface and built-in functions

This PR deprecates and removes the old Aggregate interface, built-in 
functions, and associated Agg functions.

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaoxuan-wang/flink F5927-submit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3465.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3465


commit 5365b7502a31e0d4b51ec5e397edffd0373dcc17
Author: shaoxuan-wang 
Date:   2017-03-03T07:05:00Z

[FLINK-5927] [table] Remove old Aggregate interface and built-in functions




> Remove old Aggregate interface and built-in functions
> -
>
> Key: FLINK-5927
> URL: https://issues.apache.org/jira/browse/FLINK-5927
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>








[jira] [Commented] (FLINK-5803) Add [partitioned] processing time OVER RANGE BETWEEN UNBOUNDED PRECEDING aggregation to SQL

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893836#comment-15893836
 ] 

ASF GitHub Bot commented on FLINK-5803:
---

Github user sunjincheng121 commented on the issue:

https://github.com/apache/flink/pull/3397
  
Hi @fhueske, I have rebased the code on #3423's commit and updated the PR. 
I would appreciate it if you could have a look at this PR again.
Thanks,
SunJincheng


> Add [partitioned] processing time OVER RANGE BETWEEN UNBOUNDED PRECEDING 
> aggregation to SQL
> ---
>
> Key: FLINK-5803
> URL: https://issues.apache.org/jira/browse/FLINK-5803
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> The goal of this issue is to add support for OVER RANGE aggregations on 
> processing time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN UNBOUNDED 
> PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN UNBOUNDED 
> PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The ORDER BY clause may only have procTime() as parameter. procTime() is a 
> parameterless scalar function that just indicates processing time mode.
> - bounded PRECEDING is not supported (see FLINK-5654)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROW aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).







[jira] [Commented] (FLINK-5929) Allow Access to Per-Window State in ProcessWindowFunction

2017-03-02 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893830#comment-15893830
 ] 

Aljoscha Krettek commented on FLINK-5929:
-

Sorry, I overlooked the other comment. Yes, I think throwing an exception is good 
for now. As for the tests, I think we're good if we have solid tests in 
{{WindowOperatorContractTest}}.

> Allow Access to Per-Window State in ProcessWindowFunction
> -
>
> Key: FLINK-5929
> URL: https://issues.apache.org/jira/browse/FLINK-5929
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Aljoscha Krettek
>
> Right now, the state that a {{WindowFunction}} or {{ProcessWindowFunction}} 
> can access is scoped to the key of the window but not the window itself. That 
> is, state is global across all windows for a given key.
> For some use cases it is beneficial to keep state scoped to a window. For 
> example, if you expect to have several {{Trigger}} firings (due to early and 
> late firings) a user can keep state per window to keep some information 
> between those firings.
> The per-window state has to be cleaned up in some way. For this I see two 
> options:
>  - Keep track of all state that a user uses and clean up when we reach the 
> window GC horizon.
>  - Add a method {{cleanup()}} to {{ProcessWindowFunction}} which is called 
> when we reach the window GC horizon that users can/should use to clean up 
> their state.
> On the API side, we can add a method {{windowState()}} on 
> {{ProcessWindowFunction.Context}} that retrieves the per-window state and 
> {{globalState()}} that would allow access to the (already available) global 
> state. The {{Context}} would then look like this:
> {code}
> /**
>  * The context holding window metadata
>  */
> public abstract class Context {
> /**
>  * @return The window that is being evaluated.
>  */
> public abstract W window();
> /**
>  * State accessor for per-key and per-window state.
>  */
> KeyedStateStore windowState();
> /**
>  * State accessor for per-key global state.
>  */
> KeyedStateStore globalState();
> }
> {code}
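>
> A {{ProcessWindowFunction}} could then use the per-window state roughly like 
> this (a sketch against the proposed API above, not existing code; the class 
> name and state descriptor are made up for illustration):
> {code}
> public class CountFirings
>         extends ProcessWindowFunction<Long, Long, String, TimeWindow> {
>
>     @Override
>     public void process(String key, Context ctx, Iterable<Long> elements,
>             Collector<Long> out) throws Exception {
>         // scoped to this key *and* this window, so it survives between
>         // early/late trigger firings of the same window
>         ValueState<Long> firings = ctx.windowState().getState(
>                 new ValueStateDescriptor<>("firings", Long.class));
>         Long count = firings.value();
>         firings.update(count == null ? 1L : count + 1);
>         out.collect(firings.value());
>     }
> }
> {code}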





[jira] [Commented] (FLINK-5653) Add processing time OVER ROWS BETWEEN x PRECEDING aggregation to SQL

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893813#comment-15893813
 ] 

ASF GitHub Bot commented on FLINK-5653:
---

Github user shaoxuan-wang commented on a diff in the pull request:

https://github.com/apache/flink/pull/3443#discussion_r104096189
  
--- Diff: 
flink-libraries/flink-table/src/main/java/org/apache/flink/table/plan/nodes/datastream/aggs/DoubleSummaryAggregation.java
 ---
@@ -0,0 +1,214 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.plan.nodes.datastream.aggs;
+
+import static 
org.apache.flink.api.java.summarize.aggregation.CompensatedSum.ZERO;
+
+import org.apache.flink.api.java.summarize.aggregation.Aggregator;
+import org.apache.flink.api.java.summarize.aggregation.CompensatedSum;
+import 
org.apache.flink.api.java.summarize.aggregation.NumericSummaryAggregator;
+
+public class DoubleSummaryAggregation extends 
NumericSummaryAggregator {
--- End diff --

@huawei-flink, I have replied to your question in the UDAGG design doc. 
AggregateFunction is the base class for UDAGGs. We are very cautious about adding 
any new method to this interface. As mentioned in the UDAGG design doc, only 
createAccumulator, getValue, and accumulate are must-have methods for an 
aggregate. The merge method is optional and only useful for advanced optimization of 
the runtime execution plan. Retract may also be a must-have if users care 
about correctness. I do not see why reset is necessary for an aggregate. 
If it is helpful in your case, you can always add this method in your own 
user-defined aggregate function. UDAGG support is still on the way, but I think 
it should be available very soon. 
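
As a rough illustration of that contract (plain Java, not the actual Flink 
interface; class and method names follow the comment, everything else is made 
up), an aggregate with the three must-have methods plus the optional ones 
might look like:

```java
// Hypothetical sketch of the UDAGG contract described above.
public class SumAggregate {

    /** Accumulator holding the intermediate state of the aggregation. */
    public static class SumAccumulator {
        long sum;
    }

    // must-have 1: create a fresh accumulator
    public SumAccumulator createAccumulator() {
        return new SumAccumulator();
    }

    // must-have 2: fold one input value into the accumulator
    public void accumulate(SumAccumulator acc, long value) {
        acc.sum += value;
    }

    // must-have 3: extract the final result
    public Long getValue(SumAccumulator acc) {
        return acc.sum;
    }

    // optional: merge two accumulators (advanced runtime optimization)
    public void merge(SumAccumulator acc, SumAccumulator other) {
        acc.sum += other.sum;
    }

    // optional: retract a value (needed for correctness under updates)
    public void retract(SumAccumulator acc, long value) {
        acc.sum -= value;
    }
}
```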


> Add processing time OVER ROWS BETWEEN x PRECEDING aggregation to SQL
> 
>
> Key: FLINK-5653
> URL: https://issues.apache.org/jira/browse/FLINK-5653
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Fabian Hueske
>Assignee: Stefano Bortoli
>
> The goal of this issue is to add support for OVER ROWS aggregations on 
> processing time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY procTime() ROWS BETWEEN 2 PRECEDING 
> AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY procTime() ROWS BETWEEN 2 PRECEDING 
> AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single 
> threaded execution).
> - The ORDER BY clause may only have procTime() as parameter. procTime() is a 
> parameterless scalar function that just indicates processing time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5656)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROW aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).







[jira] [Commented] (FLINK-5955) Merging a list of buffered records will have problem when ObjectReuse is turned on

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893780#comment-15893780
 ] 

ASF GitHub Bot commented on FLINK-5955:
---

GitHub user shaoxuan-wang opened a pull request:

https://github.com/apache/flink/pull/3464

[FLINK-5955] [table] Merging a list of buffered records will have problem 
when ObjectReuse is turned on

This PR changes the DataSet AGG merge to pair-merge. 

If we buffer the iterated records for group-merge, we get wrong results when 
ObjectReuse is turned on. Alternatively, we could deep-copy every record and 
buffer the copies for group-merge, but that is expensive in terms of memory 
and CPU. We can add group-merge later when needed (in the future we should 
add rules to select either pair-merge or group-merge, but for now all 
built-in aggregates should work fine with pair-merge).
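
The object-reuse pitfall can be illustrated without Flink (a minimal sketch; 
the reusing iterator below stands in for Flink's record iterator with object 
reuse enabled, and all names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ObjectReuseDemo {

    /** A mutable record object, like a reused Flink record. */
    static class Holder {
        int value;
    }

    /** Returns the SAME mutable object on every call to next(),
     *  mimicking Flink with object reuse turned on. */
    static Iterator<Holder> reusingIterator(int[] values) {
        Holder reused = new Holder();
        return new Iterator<Holder>() {
            int i = 0;
            public boolean hasNext() { return i < values.length; }
            public Holder next() {
                reused.value = values[i++];
                return reused; // same instance every time
            }
        };
    }

    /** Buggy group-merge: buffers references, so every buffered entry
     *  ends up pointing at the last value seen. */
    static int buggySum(Iterator<Holder> it) {
        List<Holder> buffer = new ArrayList<>();
        while (it.hasNext()) buffer.add(it.next());
        int sum = 0;
        for (Holder h : buffer) sum += h.value;
        return sum;
    }

    /** Pair-merge: folds each record in immediately, never buffering,
     *  so object reuse is harmless. */
    static int pairMergeSum(Iterator<Holder> it) {
        int sum = 0;
        while (it.hasNext()) sum += it.next().value;
        return sum;
    }
}
```

For input {1, 2, 3}, the buffered variant sums three references to the same 
object holding the final value, while the pair-merge variant returns the 
correct total.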

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [X] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [X] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shaoxuan-wang/flink F5955-submit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3464.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3464


commit e6cdab7cd309f16d028894943f177f4321889630
Author: shaoxuan-wang 
Date:   2017-03-03T05:50:29Z

[FLINK-5955] [table] Merging a list of buffered records will have problem 
when ObjectReuse is turned on




> Merging a list of buffered records will have problem when ObjectReuse is 
> turned on
> --
>
> Key: FLINK-5955
> URL: https://issues.apache.org/jira/browse/FLINK-5955
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>
> Turn on ObjectReuse in MultipleProgramsTestBase:
> TestEnvironment clusterEnv = new TestEnvironment(cluster, 4, true);
> Then the tests "testEventTimeSessionGroupWindow", 
> "testEventTimeSessionGroupWindow", and 
> "testEventTimeTumblingGroupWindowOverTime"  will fail.
> The reason is that we have buffered iterated records for group-merge. I think 
> we should change the Agg merge to pair-merge, and later add group-merge when 
> needed (in the future we should add rules to select either pair-merge or 
> group-merge, but for now all built-in aggregates should work fine with 
> pair-merge).







[jira] [Updated] (FLINK-5955) Merging a list of buffered records will have problem when ObjectReuse is turned on

2017-03-02 Thread Kurt Young (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young updated FLINK-5955:
--
Component/s: Table API & SQL

> Merging a list of buffered records will have problem when ObjectReuse is 
> turned on
> --
>
> Key: FLINK-5955
> URL: https://issues.apache.org/jira/browse/FLINK-5955
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>
> Turn on ObjectReuse in MultipleProgramsTestBase:
> TestEnvironment clusterEnv = new TestEnvironment(cluster, 4, true);
> Then the tests "testEventTimeSessionGroupWindow", 
> "testEventTimeSessionGroupWindow", and 
> "testEventTimeTumblingGroupWindowOverTime"  will fail.
> The reason is that we have buffered iterated records for group-merge. I think 
> we should change the Agg merge to pair-merge, and later add group-merge when 
> needed (in the future we should add rules to select either pair-merge or 
> group-merge, but for now all built-in aggregates should work fine with 
> pair-merge).





[jira] [Commented] (FLINK-5856) Need return redundant containers to yarn for yarn mode

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893761#comment-15893761
 ] 

ASF GitHub Bot commented on FLINK-5856:
---

Github user shuai-xu commented on the issue:

https://github.com/apache/flink/pull/3398
  
The Travis failure seems to be unrelated to the change.


> Need return redundant containers to yarn for yarn mode
> --
>
> Key: FLINK-5856
> URL: https://issues.apache.org/jira/browse/FLINK-5856
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Reporter: shuai.xu
>Assignee: shuai.xu
>  Labels: flip-6
>
> In FLIP-6, for Flink-on-YARN mode, the RM requests containers from YARN according 
> to the requirements of the JM. But the AMRMClientAsync used for YARN doesn't 
> guarantee that the number of containers returned exactly equals the number 
> requested. So the Flink RM needs to record the number of containers it requested 
> and return the redundant ones to YARN.
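A minimal sketch of the bookkeeping this calls for (the class and method names are hypothetical, not the actual YarnResourceManager code): count the pending requests and treat any allocation beyond that count as surplus to release back to YARN:

```java
import java.util.ArrayList;
import java.util.List;

public class ContainerBookkeepingDemo {
    private int pendingRequests;

    void requestContainers(int n) { pendingRequests += n; }

    // On allocation, keep at most the pending count and collect the surplus,
    // mimicking what the Flink RM must do because AMRMClientAsync may
    // hand back more containers than were requested.
    List<String> onContainersAllocated(List<String> allocated) {
        List<String> surplus = new ArrayList<>();
        for (String container : allocated) {
            if (pendingRequests > 0) {
                pendingRequests--;          // accepted
            } else {
                surplus.add(container);     // to be released back to YARN
            }
        }
        return surplus;
    }

    public static void main(String[] args) {
        ContainerBookkeepingDemo rm = new ContainerBookkeepingDemo();
        rm.requestContainers(2);
        // Three containers arrive for two requests: one is surplus.
        System.out.println(rm.onContainersAllocated(List.of("c1", "c2", "c3")));
    }
}
```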





[GitHub] flink issue #3398: [FLINK-5856] [FLIP-6] return redundant containers to yarn...

2017-03-02 Thread shuai-xu
Github user shuai-xu commented on the issue:

https://github.com/apache/flink/pull/3398
  
The Travis failure seems to have nothing to do with the change.




[jira] [Commented] (FLINK-5952) JobCancellationWithSavepointHandlersTest uses deprecated JsonNode#getValuesAsText

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893743#comment-15893743
 ] 

ASF GitHub Bot commented on FLINK-5952:
---

Github user mtunique closed the pull request at:

https://github.com/apache/flink/pull/3463


> JobCancellationWithSavepointHandlersTest uses deprecated 
> JsonNode#getValuesAsText
> -
>
> Key: FLINK-5952
> URL: https://issues.apache.org/jira/browse/FLINK-5952
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests, Webfrontend
>Affects Versions: 1.3.0
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> Usage of JsonNode#getValuesAsText() should be replaced with JsonNode#asText().





[GitHub] flink pull request #3463: [FLINK-5952] JobCancellationWithSavepointHandlersT...

2017-03-02 Thread mtunique
Github user mtunique closed the pull request at:

https://github.com/apache/flink/pull/3463




[jira] [Commented] (FLINK-5952) JobCancellationWithSavepointHandlersTest uses deprecated JsonNode#getValuesAsText

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893718#comment-15893718
 ] 

ASF GitHub Bot commented on FLINK-5952:
---

GitHub user mtunique opened a pull request:

https://github.com/apache/flink/pull/3463

[FLINK-5952] JobCancellationWithSavepointHandlersTest uses deprecated 
JsonNode#getValuesAsText

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mtunique/flink FLINK-5952

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3463.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3463


commit d73d108ae9f8013aa541fb49b0b2d3d8ee7c3ae5
Author: mtunique 
Date:   2017-03-03T03:37:51Z

[FLINK-5952] JobCancellationWithSavepointHandlersTest uses deprecated 
JsonNode#getValuesAsText




> JobCancellationWithSavepointHandlersTest uses deprecated 
> JsonNode#getValuesAsText
> -
>
> Key: FLINK-5952
> URL: https://issues.apache.org/jira/browse/FLINK-5952
> Project: Flink
>  Issue Type: Improvement
>  Components: Tests, Webfrontend
>Affects Versions: 1.3.0
>Reporter: Chesnay Schepler
>Priority: Trivial
>
> Usage of JsonNode#getValuesAsText() should be replaced with JsonNode#asText().





[GitHub] flink pull request #3463: [FLINK-5952] JobCancellationWithSavepointHandlersT...

2017-03-02 Thread mtunique
GitHub user mtunique opened a pull request:

https://github.com/apache/flink/pull/3463

[FLINK-5952] JobCancellationWithSavepointHandlersTest uses deprecated 
JsonNode#getValuesAsText

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mtunique/flink FLINK-5952

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3463.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3463


commit d73d108ae9f8013aa541fb49b0b2d3d8ee7c3ae5
Author: mtunique 
Date:   2017-03-03T03:37:51Z

[FLINK-5952] JobCancellationWithSavepointHandlersTest uses deprecated 
JsonNode#getValuesAsText






[jira] [Commented] (FLINK-5584) Support Sliding-count row-window on streaming sql

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893670#comment-15893670
 ] 

ASF GitHub Bot commented on FLINK-5584:
---

Github user hongyuhong closed the pull request at:

https://github.com/apache/flink/pull/3175


> Support Sliding-count row-window on streaming sql
> -
>
> Key: FLINK-5584
> URL: https://issues.apache.org/jira/browse/FLINK-5584
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Yuhong Hong
>Assignee: Yuhong Hong
>
> Calcite already supports sliding-count row-windows; the grammar looks like:
> select sum(amount) over (rows 10 preceding) from Order;
> select sum(amount) over (partition by user rows 10 preceding) from Order;
> Calcite parses such SQL into a LogicalWindow RelNode. The LogicalWindow 
> contains the aggregate function info and the window info, similar to Flink's 
> LogicalWindowAggregate, so we can add a conversion rule that directly converts 
> LogicalWindow into a DataStreamAggregate RelNode; if Calcite supports more 
> grammar, we can extend the conversion rule.





[GitHub] flink pull request #3175: [FLINK-5584]support sliding-count row-window on st...

2017-03-02 Thread hongyuhong
Github user hongyuhong closed the pull request at:

https://github.com/apache/flink/pull/3175




[jira] [Updated] (FLINK-5541) Missing null check for localJar in FlinkSubmitter#submitTopology()

2017-03-02 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-5541:
--
Description: 
{code}
  if (localJar == null) {
try {
  for (final URL url : ((ContextEnvironment) 
ExecutionEnvironment.getExecutionEnvironment())
  .getJars()) {
// TODO verify that there is only one jar
localJar = new File(url.toURI()).getAbsolutePath();
  }
} catch (final URISyntaxException e) {
  // ignore
} catch (final ClassCastException e) {
  // ignore
}
  }

  logger.info("Submitting topology " + name + " in distributed mode with 
conf " + serConf);
  client.submitTopologyWithOpts(name, localJar, topology);
{code}
Since the try block may encounter URISyntaxException / ClassCastException, we 
should check that localJar is not null before calling submitTopologyWithOpts().

  was:
{code}
  if (localJar == null) {
try {
  for (final URL url : ((ContextEnvironment) 
ExecutionEnvironment.getExecutionEnvironment())
  .getJars()) {
// TODO verify that there is only one jar
localJar = new File(url.toURI()).getAbsolutePath();
  }
} catch (final URISyntaxException e) {
  // ignore
} catch (final ClassCastException e) {
  // ignore
}
  }

  logger.info("Submitting topology " + name + " in distributed mode with 
conf " + serConf);
  client.submitTopologyWithOpts(name, localJar, topology);
{code}

Since the try block may encounter URISyntaxException / ClassCastException, we 
should check that localJar is not null before calling submitTopologyWithOpts().


> Missing null check for localJar in FlinkSubmitter#submitTopology()
> --
>
> Key: FLINK-5541
> URL: https://issues.apache.org/jira/browse/FLINK-5541
> Project: Flink
>  Issue Type: Bug
>  Components: Storm Compatibility
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
>   if (localJar == null) {
> try {
>   for (final URL url : ((ContextEnvironment) 
> ExecutionEnvironment.getExecutionEnvironment())
>   .getJars()) {
> // TODO verify that there is only one jar
> localJar = new File(url.toURI()).getAbsolutePath();
>   }
> } catch (final URISyntaxException e) {
>   // ignore
> } catch (final ClassCastException e) {
>   // ignore
> }
>   }
>   logger.info("Submitting topology " + name + " in distributed mode with 
> conf " + serConf);
>   client.submitTopologyWithOpts(name, localJar, topology);
> {code}
> Since the try block may encounter URISyntaxException / ClassCastException, we 
> should check that localJar is not null before calling 
> submitTopologyWithOpts().
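As a sketch of the proposed guard, failing fast with a clear message beats passing a null jar downstream. The submit method below is a hypothetical stand-in for submitTopologyWithOpts(), not the actual Storm-compatibility code:

```java
public class NullCheckDemo {
    // Stand-in for the submission call; localJar must not be null here,
    // because the try block above may have swallowed a URISyntaxException
    // or ClassCastException and left it unset.
    static String submit(String name, String localJar) {
        if (localJar == null) {
            throw new IllegalStateException(
                "Could not determine the jar for topology " + name
                + "; please submit with an explicit jar path.");
        }
        return "submitted " + name + " with " + localJar;
    }

    public static void main(String[] args) {
        System.out.println(submit("demo", "/tmp/topology.jar"));
        try {
            submit("demo", null);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```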





[jira] [Comment Edited] (FLINK-5629) Unclosed RandomAccessFile in StaticFileServerHandler#respondAsLeader()

2017-03-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854366#comment-15854366
 ] 

Ted Yu edited comment on FLINK-5629 at 3/3/17 3:19 AM:
---

RandomAccessFile#length() may throw IOE.
raf is used in the following code path where DefaultFileRegion is not involved:

{code}
} else {
  lastContentFuture = ctx.writeAndFlush(new HttpChunkedInput(new 
ChunkedFile(raf, 0, fileLength, 8192)),
{code}
It is good practice to close RandomAccessFile in all code paths.


was (Author: yuzhih...@gmail.com):
RandomAccessFile#length() may throw IOE.
raf is used in the following code path where DefaultFileRegion is not involved:
{code}
} else {
  lastContentFuture = ctx.writeAndFlush(new HttpChunkedInput(new 
ChunkedFile(raf, 0, fileLength, 8192)),
{code}
It is good practice to close RandomAccessFile in all code paths.

> Unclosed RandomAccessFile in StaticFileServerHandler#respondAsLeader()
> --
>
> Key: FLINK-5629
> URL: https://issues.apache.org/jira/browse/FLINK-5629
> Project: Flink
>  Issue Type: Bug
>  Components: Webfrontend
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> final RandomAccessFile raf;
> try {
>   raf = new RandomAccessFile(file, "r");
> ...
> long fileLength = raf.length();
> {code}
> The RandomAccessFile should be closed upon return from method.
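In the handler itself the file must outlive the method until Netty finishes the transfer, so the real fix would close it in a completion or error listener. As a general pattern, though, try-with-resources closes the RandomAccessFile on every path, including when length() throws. A minimal, self-contained sketch with illustrative names:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;

public class RafCloseDemo {
    // try-with-resources guarantees the file is closed on every code path,
    // including when length() throws an IOException.
    static long fileLength(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.length();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = Files.createTempFile("raf-demo", ".bin").toFile();
        Files.write(f.toPath(), new byte[]{1, 2, 3});
        System.out.println(fileLength(f)); // 3
        f.delete();
    }
}
```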





[jira] [Commented] (FLINK-5219) Add non-grouped session windows for batch tables

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893618#comment-15893618
 ] 

ASF GitHub Bot commented on FLINK-5219:
---

Github user sunjincheng121 commented on the issue:

https://github.com/apache/flink/pull/3266
  
Hi @fhueske, I have rebased the code onto PR 
[#3423|https://github.com/apache/flink/pull/3423]'s commit.


> Add non-grouped session windows for batch tables
> 
>
> Key: FLINK-5219
> URL: https://issues.apache.org/jira/browse/FLINK-5219
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> Add non-grouped session windows for batch tables as described in 
> [FLIP-11|https://cwiki.apache.org/confluence/display/FLINK/FLIP-11%3A+Table+API+Stream+Aggregations].





[GitHub] flink issue #3266: [FLINK-5219][TableAPI] Add non-grouped session window...

2017-03-02 Thread sunjincheng121
Github user sunjincheng121 commented on the issue:

https://github.com/apache/flink/pull/3266
  
Hi @fhueske, I have rebased the code onto PR 
[#3423|https://github.com/apache/flink/pull/3423]'s commit.




[jira] [Created] (FLINK-5955) Merging a list of buffered records will have problem when ObjectReuse is turned on

2017-03-02 Thread Shaoxuan Wang (JIRA)
Shaoxuan Wang created FLINK-5955:


 Summary: Merging a list of buffered records will have problem when 
ObjectReuse is turned on
 Key: FLINK-5955
 URL: https://issues.apache.org/jira/browse/FLINK-5955
 Project: Flink
  Issue Type: Bug
Reporter: Shaoxuan Wang
Assignee: Shaoxuan Wang


Turn on ObjectReuse in MultipleProgramsTestBase:
TestEnvironment clusterEnv = new TestEnvironment(cluster, 4, true);

Then the tests "testEventTimeSessionGroupWindow", 
"testEventTimeSessionGroupWindow", and 
"testEventTimeTumblingGroupWindowOverTime"  will fail.

The reason is that we buffer the iterated records for the group-merge. I think 
we should change the Agg merge to pair-merge, and later add group-merge when 
needed (in the future we should add rules to select either pair-merge or 
group-merge, but for now all built-in aggregates should work fine with 
pair-merge).





[jira] [Commented] (FLINK-5917) Remove MapState.size()

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893604#comment-15893604
 ] 

ASF GitHub Bot commented on FLINK-5917:
---

GitHub user shixiaogang opened a pull request:

https://github.com/apache/flink/pull/3462

[FLINK-5917][state] Remove size() method from MapState

The `size()` method is removed from `MapState` because its implementation 
is costly in the backends.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alibaba/flink flink-5917

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3462.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3462


commit 6906b15ff593f46e106348aa1f5772e6b78efe74
Author: xiaogang.sxg 
Date:   2017-03-03T02:27:11Z

Remove size() method from MapState




> Remove MapState.size()
> --
>
> Key: FLINK-5917
> URL: https://issues.apache.org/jira/browse/FLINK-5917
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.3.0
>Reporter: Aljoscha Krettek
>Assignee: Xiaogang Shi
>Priority: Blocker
> Fix For: 1.3.0
>
>
> I'm proposing to remove {{size()}} because it is a prohibitively expensive 
> operation and users might not be aware of it. Instead of {{size()}}, users can 
> use an iterator over all mappings to determine the size; when doing this, they 
> will be aware of the fact that it is a costly operation.
> Right now, {{size()}} is only costly on the RocksDB state backend but I think 
> with future developments on the in-memory state backend it might also become 
> an expensive operation there.
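The suggested replacement, determining the size by iterating over all mappings, can be sketched with a plain java.util.Map standing in for the state backend (the stand-in is an assumption for illustration; Flink's MapState is not used here):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class MapStateSizeDemo {
    // Counting mappings through the iterator makes the O(n) cost explicit,
    // which is the trade-off the issue wants users to be aware of.
    static <K, V> int sizeByIteration(Iterable<Map.Entry<K, V>> entries) {
        int count = 0;
        for (Iterator<Map.Entry<K, V>> it = entries.iterator(); it.hasNext(); it.next()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, Integer> state = new HashMap<>();
        state.put("a", 1);
        state.put("b", 2);
        System.out.println(sizeByIteration(state.entrySet())); // 2
    }
}
```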





[GitHub] flink pull request #3462: [FLINK-5917][state] Remove size() method from MapS...

2017-03-02 Thread shixiaogang
GitHub user shixiaogang opened a pull request:

https://github.com/apache/flink/pull/3462

[FLINK-5917][state] Remove size() method from MapState

The `size()` method is removed from `MapState` because its implementation 
is costly in the backends.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alibaba/flink flink-5917

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3462.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3462


commit 6906b15ff593f46e106348aa1f5772e6b78efe74
Author: xiaogang.sxg 
Date:   2017-03-03T02:27:11Z

Remove size() method from MapState






[jira] [Commented] (FLINK-5586) Extend TableProgramsTestBase for object reuse modes

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893595#comment-15893595
 ] 

ASF GitHub Bot commented on FLINK-5586:
---

Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/3339
  
Seems like this will affect the new aggregate interface and make the tests 
fail; I will wait for that fix first.


> Extend TableProgramsTestBase for object reuse modes
> ---
>
> Key: FLINK-5586
> URL: https://issues.apache.org/jira/browse/FLINK-5586
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Kurt Young
>
> We should also test if all runtime operators of the Table API work correctly 
> if object reuse mode is set to true. This should be done for all 
> cluster-based ITCases, not the collection-based ones.





[GitHub] flink issue #3339: [FLINK-5586] [table] Extend TableProgramsClusterTestBase ...

2017-03-02 Thread KurtYoung
Github user KurtYoung commented on the issue:

https://github.com/apache/flink/pull/3339
  
Seems like this will affect the new aggregate interface and make the tests 
fail; I will wait for that fix first.




[jira] [Commented] (FLINK-5954) Always assign names to the window in the Stream SQL API

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893551#comment-15893551
 ] 

ASF GitHub Bot commented on FLINK-5954:
---

GitHub user haohui opened a pull request:

https://github.com/apache/flink/pull/3461

[FLINK-5954] Always assign names to the window in the Stream SQL API.

Please see jira for more details. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/haohui/flink FLINK-5954

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3461


commit 93ace85d026959dce9332085eeb750fa9f50ce52
Author: Haohui Mai 
Date:   2017-03-03T01:44:43Z

[FLINK-5954] Always assign names to the window in the Stream SQL API.




> Always assign names to the window in the Stream SQL API
> ---
>
> Key: FLINK-5954
> URL: https://issues.apache.org/jira/browse/FLINK-5954
> Project: Flink
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> CALCITE-1603 and CALCITE-1615 bring in support for {{TUMBLE}}, {{HOP}}, and 
> {{SESSION}} grouped windows, as well as the corresponding auxiliary functions 
> that allow users to query the start and the end of the windows (e.g., 
> {{TUMBLE_START()}} and {{TUMBLE_END()}}; see 
> http://calcite.apache.org/docs/stream.html for more details).
> The goal of this JIRA is to add support for these auxiliary functions in 
> Flink. Flink already has runtime support for them, as these functions are 
> essentially mapped to the {{WindowStart}} and {{WindowEnd}} classes.
> To implement this feature, the transformation needs to 
> recognize these functions and map them to the {{WindowStart}} and 
> {{WindowEnd}} classes.
> The problem is that both classes can only refer to the windows using aliases. 
> Therefore this JIRA proposes to assign a unique name for each window to 
> enable the transformation.





[GitHub] flink pull request #3461: [FLINK-5954] Always assign names to the window in ...

2017-03-02 Thread haohui
GitHub user haohui opened a pull request:

https://github.com/apache/flink/pull/3461

[FLINK-5954] Always assign names to the window in the Stream SQL API.

Please see jira for more details. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/haohui/flink FLINK-5954

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3461


commit 93ace85d026959dce9332085eeb750fa9f50ce52
Author: Haohui Mai 
Date:   2017-03-03T01:44:43Z

[FLINK-5954] Always assign names to the window in the Stream SQL API.






[jira] [Updated] (FLINK-5954) Always assign names to the window in the Stream SQL API

2017-03-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-5954:
--
Summary: Always assign names to the window in the Stream SQL API  (was: 
Always assign names to the window in the Stream SQL APi)

> Always assign names to the window in the Stream SQL API
> ---
>
> Key: FLINK-5954
> URL: https://issues.apache.org/jira/browse/FLINK-5954
> Project: Flink
>  Issue Type: Improvement
>Reporter: Haohui Mai
>Assignee: Haohui Mai
>
> CALCITE-1603 and CALCITE-1615 bring in support for {{TUMBLE}}, {{HOP}}, and 
> {{SESSION}} grouped windows, as well as the corresponding auxiliary functions 
> that allow users to query the start and the end of the windows (e.g., 
> {{TUMBLE_START()}} and {{TUMBLE_END()}}; see 
> http://calcite.apache.org/docs/stream.html for more details).
> The goal of this JIRA is to add support for these auxiliary functions in 
> Flink. Flink already has runtime support for them, as these functions are 
> essentially mapped to the {{WindowStart}} and {{WindowEnd}} classes.
> To implement this feature, the transformation needs to 
> recognize these functions and map them to the {{WindowStart}} and 
> {{WindowEnd}} classes.
> The problem is that both classes can only refer to the windows using aliases. 
> Therefore this JIRA proposes to assign a unique name for each window to 
> enable the transformation.





[jira] [Created] (FLINK-5954) Always assign names to the window in the Stream SQL APi

2017-03-02 Thread Haohui Mai (JIRA)
Haohui Mai created FLINK-5954:
-

 Summary: Always assign names to the window in the Stream SQL APi
 Key: FLINK-5954
 URL: https://issues.apache.org/jira/browse/FLINK-5954
 Project: Flink
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai


CALCITE-1603 and CALCITE-1615 bring in support for {{TUMBLE}}, {{HOP}}, and 
{{SESSION}} grouped windows, as well as the corresponding auxiliary functions 
that allow users to query the start and the end of the windows (e.g., 
{{TUMBLE_START()}} and {{TUMBLE_END()}}; see 
http://calcite.apache.org/docs/stream.html for more details).

The goal of this JIRA is to add support for these auxiliary functions in Flink. 
Flink already has runtime support for them, as these functions are essentially 
mapped to the {{WindowStart}} and {{WindowEnd}} classes.

To implement this feature, the transformation needs to recognize these 
functions and map them to the {{WindowStart}} and {{WindowEnd}} classes.

The problem is that both classes can only refer to the windows using aliases. 
Therefore this JIRA proposes to assign a unique name for each window to enable 
the transformation.
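For illustration, a grouped-window query of the kind these auxiliary functions enable might look as follows (the table and column names are hypothetical; see the Calcite streaming documentation linked above for the authoritative grammar):

```sql
SELECT
  user_id,
  SUM(amount),
  TUMBLE_START(rowtime, INTERVAL '1' HOUR) AS window_start,
  TUMBLE_END(rowtime, INTERVAL '1' HOUR)   AS window_end
FROM Orders
GROUP BY user_id, TUMBLE(rowtime, INTERVAL '1' HOUR)
```

The TUMBLE_START/TUMBLE_END calls must resolve to the same window as the TUMBLE in the GROUP BY, which is why a stable window name (alias) is needed during translation.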





[jira] [Commented] (FLINK-5918) port range support for config taskmanager.rpc.port

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893383#comment-15893383
 ] 

ASF GitHub Bot commented on FLINK-5918:
---

Github user barcahead commented on a diff in the pull request:

https://github.com/apache/flink/pull/3416#discussion_r104065323
  
--- Diff: flink-core/pom.xml ---
@@ -165,6 +165,7 @@ under the License.

org.apache.flink.configuration.ConfigConstants#ENABLE_QUARANTINE_MONITOR

org.apache.flink.configuration.ConfigConstants#NETWORK_REQUEST_BACKOFF_INITIAL_KEY

org.apache.flink.configuration.ConfigConstants#NETWORK_REQUEST_BACKOFF_MAX_KEY
+   
org.apache.flink.configuration.ConfigConstants#DEFAULT_TASK_MANAGER_IPC_PORT
--- End diff --

Thanks for the review, I will use a new constant instead.


> port range support for config taskmanager.rpc.port
> --
>
> Key: FLINK-5918
> URL: https://issues.apache.org/jira/browse/FLINK-5918
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.3.0
>Reporter: Yelei Feng
>Assignee: Yelei Feng
> Fix For: 1.3.0
>
>
> we should support setting a port range for the config {{taskmanager.rpc.port}}
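A minimal sketch of how such a port-range value might be parsed. The "50100-50200" syntax and the parseRange helper are assumptions for illustration, not necessarily the format the PR implements:

```java
public class PortRangeDemo {
    // Parses either a single port ("6123") or an inclusive range
    // ("50100-50200") into {start, end}.
    static int[] parseRange(String spec) {
        int dash = spec.indexOf('-');
        if (dash < 0) {
            int p = Integer.parseInt(spec.trim());
            return new int[]{p, p};
        }
        int start = Integer.parseInt(spec.substring(0, dash).trim());
        int end = Integer.parseInt(spec.substring(dash + 1).trim());
        if (start > end) {
            throw new IllegalArgumentException("Invalid port range: " + spec);
        }
        return new int[]{start, end};
    }

    public static void main(String[] args) {
        int[] r = parseRange("50100-50200");
        System.out.println(r[0] + ".." + r[1]); // 50100..50200
    }
}
```

The TaskManager would then try each port in the range until binding succeeds, instead of failing on a single occupied port.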





[GitHub] flink pull request #3416: [FLINK-5918] [runtime] port range support for conf...

2017-03-02 Thread barcahead
Github user barcahead commented on a diff in the pull request:

https://github.com/apache/flink/pull/3416#discussion_r104065323
  
--- Diff: flink-core/pom.xml ---
@@ -165,6 +165,7 @@ under the License.

org.apache.flink.configuration.ConfigConstants#ENABLE_QUARANTINE_MONITOR

org.apache.flink.configuration.ConfigConstants#NETWORK_REQUEST_BACKOFF_INITIAL_KEY

org.apache.flink.configuration.ConfigConstants#NETWORK_REQUEST_BACKOFF_MAX_KEY
+   
org.apache.flink.configuration.ConfigConstants#DEFAULT_TASK_MANAGER_IPC_PORT
--- End diff --

Thanks for the review, I will use a new constant instead.




[jira] [Closed] (FLINK-5597) Improve the LocalClusteringCoefficient documentation

2017-03-02 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-5597.
-
Resolution: Implemented

Implemented in cb9e409b764f95e07441a0c8da6c24e21bc1564b

> Improve the LocalClusteringCoefficient documentation
> 
>
> Key: FLINK-5597
> URL: https://issues.apache.org/jira/browse/FLINK-5597
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Gelly
>Affects Versions: 1.3.0
>Reporter: Vasia Kalavri
>Assignee: Greg Hogan
> Fix For: 1.3.0
>
>
> The LocalClusteringCoefficient usage section should explain what the 
> algorithm's output is and how to retrieve the actual local clustering 
> coefficient scores from it.





[jira] [Closed] (FLINK-4896) PageRank algorithm for directed graphs

2017-03-02 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-4896.
-
Resolution: Implemented

Implemented in ea14053fe32280ffc36e586b5d3712c751fa1f84

> PageRank algorithm for directed graphs
> --
>
> Key: FLINK-4896
> URL: https://issues.apache.org/jira/browse/FLINK-4896
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>  Labels: algorithm
> Fix For: 1.3.0
>
>
> Gelly includes PageRank implementations for scatter-gather and 
> gather-sum-apply. Both ship with the warning "The implementation assumes that 
> each page has at least one incoming and one outgoing link."
> PageRank is a directed algorithm and sources and sinks are common in directed 
> graphs.
> Sinks drain the total score across the graph which affects convergence and 
> the balance of the random hop (convergence is not currently a feature of 
> Gelly's PageRanks, as this is a very recent feature from FLINK-3888).
> Sources are handled nicely by the algorithm highlighted on Flink's features 
> page under "Iterations and Delta Iterations" since score deltas are 
> transmitted and a source's score never changes (is always equal to the random 
> hop probability divided by the vertex count).
>   https://flink.apache.org/features.html
> We should find an implementation featuring convergence and unrestricted 
> processing of directed graphs and move other implementations to Gelly 
> examples.





[jira] [Commented] (FLINK-4896) PageRank algorithm for directed graphs

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893249#comment-15893249
 ] 

ASF GitHub Bot commented on FLINK-4896:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2733


> PageRank algorithm for directed graphs
> --
>
> Key: FLINK-4896
> URL: https://issues.apache.org/jira/browse/FLINK-4896
> Project: Flink
>  Issue Type: New Feature
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>  Labels: algorithm
> Fix For: 1.3.0
>
>
> Gelly includes PageRank implementations for scatter-gather and 
> gather-sum-apply. Both ship with the warning "The implementation assumes that 
> each page has at least one incoming and one outgoing link."
> PageRank is a directed algorithm and sources and sinks are common in directed 
> graphs.
> Sinks drain the total score across the graph, which affects convergence and 
> the balance of the random hop (convergence is not currently a feature of 
> Gelly's PageRanks, as this is a very recent feature from FLINK-3888).
> Sources are handled nicely by the algorithm highlighted on Flink's features 
> page under "Iterations and Delta Iterations", since score deltas are 
> transmitted and a source's score never changes (it is always equal to the random 
> hop probability divided by the vertex count).
>   https://flink.apache.org/features.html
> We should find an implementation featuring convergence and unrestricted 
> processing of directed graphs and move other implementations to Gelly 
> examples.





[jira] [Commented] (FLINK-5597) Improve the LocalClusteringCoefficient documentation

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893250#comment-15893250
 ] 

ASF GitHub Bot commented on FLINK-5597:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3404


> Improve the LocalClusteringCoefficient documentation
> 
>
> Key: FLINK-5597
> URL: https://issues.apache.org/jira/browse/FLINK-5597
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Gelly
>Affects Versions: 1.3.0
>Reporter: Vasia Kalavri
>Assignee: Greg Hogan
> Fix For: 1.3.0
>
>
> The LocalClusteringCoefficient usage section should explain what the 
> algorithm output is and how to retrieve the actual local clustering coefficient 
> scores from it.





[GitHub] flink pull request #2733: [FLINK-4896] [gelly] PageRank algorithm for direct...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2733




[GitHub] flink pull request #3404: [FLINK-5597] [docs] Improve the LocalClusteringCoe...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3404




[jira] [Commented] (FLINK-5414) Bump up Calcite version to 1.11

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893216#comment-15893216
 ] 

ASF GitHub Bot commented on FLINK-5414:
---

Github user haohui commented on the issue:

https://github.com/apache/flink/pull/3426
  
Fixed the unit tests.

There are two additional changes:

1. There are precision differences when converting `double` to 
`BigDecimal`; the affected unit tests have been fixed.
2. When registering UDFs, Flink needs to distinguish nullable from 
non-nullable types; `UserDefinedFunctionUtils` has been patched. We need a solution like 
FLINK-5177 to handle these cases systematically.


> Bump up Calcite version to 1.11
> ---
>
> Key: FLINK-5414
> URL: https://issues.apache.org/jira/browse/FLINK-5414
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> The upcoming Calcite release 1.11 has a lot of stability fixes and new 
> features. We should update it for the Table API.
> E.g. we can hopefully merge FLINK-4864





[GitHub] flink issue #3426: [FLINK-5414] [table] Bump up Calcite version to 1.11

2017-03-02 Thread haohui
Github user haohui commented on the issue:

https://github.com/apache/flink/pull/3426
  
Fixed the unit tests.

There are two additional changes:

1. There are precision differences when converting `double` to 
`BigDecimal`; the affected unit tests have been fixed.
2. When registering UDFs, Flink needs to distinguish nullable from 
non-nullable types; `UserDefinedFunctionUtils` has been patched. We need a solution like 
FLINK-5177 to handle these cases systematically.




[GitHub] flink issue #3356: [FLINK-5253] Remove special treatment of "dynamic propert...

2017-03-02 Thread mariusz89016
Github user mariusz89016 commented on the issue:

https://github.com/apache/flink/pull/3356
  
Thanks for the review! 
Currently, dynamic properties are set by 
`GlobalConfiguration.dynamicProperties` instead of saving in cluster descriptor.




[jira] [Commented] (FLINK-5253) Remove special treatment of "dynamic properties"

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893108#comment-15893108
 ] 

ASF GitHub Bot commented on FLINK-5253:
---

Github user mariusz89016 commented on the issue:

https://github.com/apache/flink/pull/3356
  
Thanks for the review! 
Currently, dynamic properties are set by 
`GlobalConfiguration.dynamicProperties` instead of saving in cluster descriptor.


> Remove special treatment of "dynamic properties"
> 
>
> Key: FLINK-5253
> URL: https://issues.apache.org/jira/browse/FLINK-5253
> Project: Flink
>  Issue Type: Sub-task
>  Components: YARN
> Environment: {{flip-6}} feature branch
>Reporter: Stephan Ewen
>  Labels: flip-6
>
> The YARN client accepts configuration keys as command line parameters.
> Currently these are send to the AppMaster and TaskManager as "dynamic 
> properties", encoded in a special way via environment variables.
> The mechanism is quite fragile. We should simplify it:
>   - The YARN client takes the local {{flink-conf.yaml}} as the base.
>   - It overwrites config entries with command line properties when preparing 
> the configuration to be shipped to YARN container processes (JM / TM).
>   - No additional handling necessary.
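The simplified merge described in the issue can be sketched as follows. This is an illustrative, hypothetical sketch (the class and method names are not Flink API): take the entries loaded from the local flink-conf.yaml as the base and let command line properties overwrite them before the configuration is shipped to the YARN containers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed mechanism: base config from
// flink-conf.yaml, overwritten by command line "dynamic properties".
// Names are illustrative, not actual Flink classes.
public class DynamicPropertiesSketch {

    public static Map<String, String> mergeDynamicProperties(
            Map<String, String> baseConfig, Map<String, String> cliProperties) {
        Map<String, String> merged = new HashMap<>(baseConfig);
        merged.putAll(cliProperties); // command line entries win over flink-conf.yaml
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("jobmanager.heap.mb", "1024");
        base.put("taskmanager.numberOfTaskSlots", "1");

        Map<String, String> cli = new HashMap<>();
        cli.put("taskmanager.numberOfTaskSlots", "4");

        Map<String, String> merged = mergeDynamicProperties(base, cli);
        System.out.println(merged.get("taskmanager.numberOfTaskSlots")); // 4
    }
}
```

With this approach the merged configuration is shipped as-is, so no special environment-variable encoding is needed.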





[jira] [Closed] (FLINK-5945) Close function in OuterJoinOperatorBase#executeOnCollections

2017-03-02 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-5945.
-
   Resolution: Fixed
Fix Version/s: (was: 1.1.4)
   1.1.5

1.3.0: 243ef69bf5233998dd7f849721cfcb83669b663c
1.2.1: 54a02d9a4b81aeb462f958bdeda0aaa509357677
1.1.5: 01703e60e0b583d6d32c2cba395f6199c5773c5e

> Close function in OuterJoinOperatorBase#executeOnCollections
> 
>
> Key: FLINK-5945
> URL: https://issues.apache.org/jira/browse/FLINK-5945
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.2.0, 1.1.4, 1.3.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.3.0, 1.1.5, 1.2.1
>
>
> {{OuterJoinOperatorBase#executeOnCollections}} does not call 
> {{FunctionUtils.closeFunction(function);}}. I am seeing this affect the Gelly 
> test for the {{HITS}} algorithm when using a convergence threshold rather 
> than a fixed number of iterations.
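The lifecycle issue can be illustrated with a minimal sketch. These are not Flink's actual classes; the point is that collection-based execution must pair every open() with a close(), since rich functions may release resources or finalize converged state in close():

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative sketch (not Flink's actual API) of the rich-function
// lifecycle the fix is about.
public class CollectionExecutionSketch {

    interface RichMapFunction<I, O> extends Function<I, O> {
        default void open() {}
        default void close() {}
    }

    static <I, O> List<O> executeOnCollection(List<I> input, RichMapFunction<I, O> fn) {
        fn.open();
        try {
            List<O> out = new ArrayList<>();
            for (I i : input) {
                out.add(fn.apply(i));
            }
            return out;
        } finally {
            fn.close(); // the missing call this issue adds to executeOnCollections
        }
    }
}
```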





[jira] [Updated] (FLINK-3679) Allow Kafka consumer to skip corrupted messages

2017-03-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated FLINK-3679:
--
Summary: Allow Kafka consumer to skip corrupted messages  (was: 
DeserializationSchema should handle zero or more outputs for every input)

> Allow Kafka consumer to skip corrupted messages
> ---
>
> Key: FLINK-3679
> URL: https://issues.apache.org/jira/browse/FLINK-3679
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API, Kafka Connector
>Reporter: Jamie Grier
>Assignee: Haohui Mai
>
> There are a couple of issues with the DeserializationSchema API that I think 
> should be improved.  This request has come to me via an existing Flink user.
> The main issue is simply that the API assumes that there is a one-to-one 
> mapping between input and outputs.  In reality there are scenarios where one 
> input message (say from Kafka) might actually map to zero or more logical 
> elements in the pipeline.
> Particularly important here is the case where you receive a message from a 
> source (such as Kafka) and say the raw bytes don't deserialize properly.  
> Right now the only recourse is to throw IOException and therefore fail the 
> job.  
> This is definitely not good since bad data is a reality and failing the job 
> is not the right option.  If the job fails we'll just end up replaying the 
> bad data and the whole thing will start again.
> Instead in this case it would be best if the user could just return the empty 
> set.
> The other case is where one input message should logically be multiple output 
> messages.  This case is probably less important since there are other ways to 
> do this but in general it might be good to make the 
> DeserializationSchema.deserialize() method return a collection rather than a 
> single element.
> Maybe we need to support a DeserializationSchema variant that has semantics 
> more like that of FlatMap.
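A FlatMap-style variant of the schema described above could look like the following sketch. The interface name and signature are hypothetical (not the actual Flink API); the point is that deserialize() emits zero or more records per raw message, so a corrupted message can emit nothing instead of failing the job:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of a FlatMap-style deserialization schema: zero or more
// output records per input message. Names are illustrative only.
public class FlatDeserializationSketch {

    interface FlatDeserializationSchema<T> {
        /** Emit zero or more records for one raw message. */
        void deserialize(byte[] message, List<T> out);
    }

    /** Example schema: parse ASCII integers, silently skipping corrupt input. */
    static final FlatDeserializationSchema<Integer> INT_SCHEMA = (message, out) -> {
        try {
            out.add(Integer.parseInt(new String(message)));
        } catch (NumberFormatException e) {
            // corrupted record: emit nothing rather than throwing IOException
        }
    };

    static List<Integer> consume(List<byte[]> rawMessages) {
        List<Integer> records = new ArrayList<>();
        for (byte[] msg : rawMessages) {
            INT_SCHEMA.deserialize(msg, records);
        }
        return records;
    }
}
```

The same shape also covers the one-to-many case: a schema may add several records to the output list for a single input message.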





[jira] [Closed] (FLINK-5768) Apply new aggregation functions for datastream and dataset tables

2017-03-02 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-5768.

   Resolution: Implemented
Fix Version/s: 1.3.0

Implemented with 438276de8fab4f1a8f2b62b6452c2e5b2998ce5a

> Apply new aggregation functions for datastream and dataset tables
> -
>
> Key: FLINK-5768
> URL: https://issues.apache.org/jira/browse/FLINK-5768
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
> Fix For: 1.3.0
>
>
> Apply new aggregation functions for datastream and dataset tables
> This includes:
> 1. Change the implementation of the DataStream aggregation runtime code to 
> use the new aggregation functions and the aggregate DataStream API.
> 2. DataStream will always run in incremental mode, as explained on 
> 06/Feb/2017 in FLINK-5564.
> 3. Change the implementation of the DataSet aggregation runtime code to use 
> the new aggregation functions.
> 4. Clean up unused classes and methods.





[GitHub] flink pull request #3423: [FLINK-5768] [table] Apply new aggregation functio...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3423




[jira] [Commented] (FLINK-5768) Apply new aggregation functions for datastream and dataset tables

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892937#comment-15892937
 ] 

ASF GitHub Bot commented on FLINK-5768:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3423


> Apply new aggregation functions for datastream and dataset tables
> -
>
> Key: FLINK-5768
> URL: https://issues.apache.org/jira/browse/FLINK-5768
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>
> Apply new aggregation functions for datastream and dataset tables
> This includes:
> 1. Change the implementation of the DataStream aggregation runtime code to 
> use the new aggregation functions and the aggregate DataStream API.
> 2. DataStream will always run in incremental mode, as explained on 
> 06/Feb/2017 in FLINK-5564.
> 3. Change the implementation of the DataSet aggregation runtime code to use 
> the new aggregation functions.
> 4. Clean up unused classes and methods.





[jira] [Comment Edited] (FLINK-4949) Refactor Gelly driver inputs

2017-03-02 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892696#comment-15892696
 ] 

Vasia Kalavri edited comment on FLINK-4949 at 3/2/17 6:05 PM:
--

Thank you [~greghogan]. I can review during the weekend.


was (Author: vkalavri):
Thanks you [~greghogan]. I can review during the weekend.

> Refactor Gelly driver inputs
> 
>
> Key: FLINK-4949
> URL: https://issues.apache.org/jira/browse/FLINK-4949
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.3.0
>
>
> The Gelly drivers started as simple wrappers around library algorithms but 
> have grown to handle a matrix of input sources while often running multiple 
> algorithms and analytics with custom parameterization.
> This ticket will refactor the sourcing of the input graph into separate 
> classes for CSV files and RMat which will simplify the inclusion of new data 
> sources.





[jira] [Commented] (FLINK-4949) Refactor Gelly driver inputs

2017-03-02 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892696#comment-15892696
 ] 

Vasia Kalavri commented on FLINK-4949:
--

Thank you [~greghogan]. I can review during the weekend.

> Refactor Gelly driver inputs
> 
>
> Key: FLINK-4949
> URL: https://issues.apache.org/jira/browse/FLINK-4949
> Project: Flink
>  Issue Type: Improvement
>  Components: Gelly
>Affects Versions: 1.2.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
> Fix For: 1.3.0
>
>
> The Gelly drivers started as simple wrappers around library algorithms but 
> have grown to handle a matrix of input sources while often running multiple 
> algorithms and analytics with custom parameterization.
> This ticket will refactor the sourcing of the input graph into separate 
> classes for CSV files and RMat which will simplify the inclusion of new data 
> sources.





[jira] [Commented] (FLINK-1579) Create a Flink History Server

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892693#comment-15892693
 ] 

ASF GitHub Bot commented on FLINK-1579:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3460

[FLINK-1579] Implement History Server

This PR adds a slightly unpolished HistoryServer implementation. It is 
missing tests and some documentation, but is working.

This PR builds on top of #3377.

The basic idea is as follows:

The ```MemoryArchivist```, upon receiving an ```ExecutionGraph```, writes a 
set of JSON files into a directory structure resembling the REST API, using the 
features introduced in FLINK-5870, FLINK-5852 and FLINK-5941. The target 
location is configurable using ```job-manager.archive.dir```. Each job resides 
in its own directory, using the job ID as the directory name. As such, each 
archive is consistent on its own and multiple JobManagers may use the same 
archive dir.

The ```HistoryServer``` polls certain directories, configured via 
```historyserver.archive.dirs```, in regular intervals, configured via 
```historyserver.refresh-interval```, for new job archives. If a new archive is 
found it is downloaded and integrated into a cache of job archives in the local 
file system, configurable using ```historyserver.web.dir```. These files are 
served to a slightly modified WebFrontend using the 
```HistoryServerStaticFileServerHandler```.

In the end the HistoryServer is little more than an aggregator and archive 
viewer.

None of the directory configuration options have defaults; as it stands the 
entire feature is opt-in.

Should a file that the WebFrontend requests be missing, a separate fetch 
routine kicks in and attempts to fetch the missing file. This is primarily 
aimed at eventually-consistent file systems.

The HistoryServer is started using the new historyserver.sh script, which 
works similarly to job- or taskmanager scripts: ```./bin/historyserver.sh 
[start|stop]```

Two larger refactorings were made to existing code to increase the amount of 
shared code:
* the netty setup in the WebRuntimeMonitor was moved into a separate 
NettySetup class, which the HistoryServer can use as well
* an AbstractStaticFileServerHandler was added, which both 
StaticFileServerHandler and HistoryServerStaticFileServerHandler extend
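The polling and caching flow described above can be sketched with a toy model. This is a hedged illustration only (real file I/O, the configuration keys, and the actual class names are elided): each refresh interval the server lists the remote archives and pulls only the job archives it has not cached yet.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the HistoryServer's "download once, then serve from the
// local cache" polling idea. Not the actual implementation.
public class HistoryServerPollSketch {

    private final Set<String> cachedJobIds = new HashSet<>();

    /** One poll cycle: returns the job IDs newly fetched into the local cache. */
    public List<String> poll(Map<String, String> remoteArchives) {
        List<String> fetched = new ArrayList<>();
        for (String jobId : remoteArchives.keySet()) {
            if (cachedJobIds.add(jobId)) { // true only for unseen archives
                fetched.add(jobId);        // here the real server downloads the files
            }
        }
        return fetched;
    }
}
```

Repeated polls over an unchanged archive directory fetch nothing, which is what makes the refresh interval cheap.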

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 1579_history_server_pr

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3460.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3460


commit 61a07456f151ac8f5418ac66629751e1a83ada3a
Author: zentol 
Date:   2017-01-24T09:13:24Z

[FLINK-1579] Implement History Server - Frontend

commit e6316e544fea160f7d050dd1b087301a83345d31
Author: zentol 
Date:   2017-02-21T11:36:17Z

[FLINK-5645] Store accumulators/metrics for canceled/failed tasks

commit 84fd2746b09ce41c2d9bd5be7f6e8a8cc1a3291d
Author: zentol 
Date:   2017-03-02T12:31:56Z

Refactor netty setup into separate class

commit 81d7e6b92fe69326d6edf6b63f3f9c95f5ebd0ef
Author: zentol 
Date:   2017-02-22T14:47:07Z

[FLINK-1579] Implement History Server - Backend

commit 8d1e8c59690ea97be4bbaf1a011c8ec4a68f5892
Author: zentol 
Date:   2017-03-02T11:09:36Z

Rebuild frontend




> Create a Flink History Server
> -
>
> Key: FLINK-1579
> URL: https://issues.apache.org/jira/browse/FLINK-1579
> Project: Flink
>  Issue Type: New Feature
>  Components: Distributed Coordination
>Affects Versions: 0.9
>Reporter: Robert Metzger
>Assignee: Chesnay Schepler
>
> Right now its not possible to analyze the job results for jobs that ran on 
> YARN, because we'll loose the information once the JobManager has stopped.
> Therefore, I propose to implement a "Flink History Server" which serves  the 
> results from these jobs.
> I haven't started thinking about the implementation, but I suspect it 
> involves some JSON files stored in HDFS :)





[GitHub] flink pull request #3460: [FLINK-1579] Implement History Server

2017-03-02 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/3460

[FLINK-1579] Implement History Server

This PR adds a slightly unpolished HistoryServer implementation. It is 
missing tests and some documentation, but is working.

This PR builds on top of #3377.

The basic idea is as follows:

The ```MemoryArchivist```, upon receiving an ```ExecutionGraph```, writes a 
set of JSON files into a directory structure resembling the REST API, using the 
features introduced in FLINK-5870, FLINK-5852 and FLINK-5941. The target 
location is configurable using ```job-manager.archive.dir```. Each job resides 
in its own directory, using the job ID as the directory name. As such, each 
archive is consistent on its own and multiple JobManagers may use the same 
archive dir.

The ```HistoryServer``` polls certain directories, configured via 
```historyserver.archive.dirs```, in regular intervals, configured via 
```historyserver.refresh-interval```, for new job archives. If a new archive is 
found it is downloaded and integrated into a cache of job archives in the local 
file system, configurable using ```historyserver.web.dir```. These files are 
served to a slightly modified WebFrontend using the 
```HistoryServerStaticFileServerHandler```.

In the end the HistoryServer is little more than an aggregator and archive 
viewer.

None of the directory configuration options have defaults; as it stands the 
entire feature is opt-in.

Should a file that the WebFrontend requests be missing, a separate fetch 
routine kicks in and attempts to fetch the missing file. This is primarily 
aimed at eventually-consistent file systems.

The HistoryServer is started using the new historyserver.sh script, which 
works similarly to job- or taskmanager scripts: ```./bin/historyserver.sh 
[start|stop]```

Two larger refactorings were made to existing code to increase the amount of 
shared code:
* the netty setup in the WebRuntimeMonitor was moved into a separate 
NettySetup class, which the HistoryServer can use as well
* an AbstractStaticFileServerHandler was added, which both 
StaticFileServerHandler and HistoryServerStaticFileServerHandler extend

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 1579_history_server_pr

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3460.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3460


commit 61a07456f151ac8f5418ac66629751e1a83ada3a
Author: zentol 
Date:   2017-01-24T09:13:24Z

[FLINK-1579] Implement History Server - Frontend

commit e6316e544fea160f7d050dd1b087301a83345d31
Author: zentol 
Date:   2017-02-21T11:36:17Z

[FLINK-5645] Store accumulators/metrics for canceled/failed tasks

commit 84fd2746b09ce41c2d9bd5be7f6e8a8cc1a3291d
Author: zentol 
Date:   2017-03-02T12:31:56Z

Refactor netty setup into separate class

commit 81d7e6b92fe69326d6edf6b63f3f9c95f5ebd0ef
Author: zentol 
Date:   2017-02-22T14:47:07Z

[FLINK-1579] Implement History Server - Backend

commit 8d1e8c59690ea97be4bbaf1a011c8ec4a68f5892
Author: zentol 
Date:   2017-03-02T11:09:36Z

Rebuild frontend






[jira] [Commented] (FLINK-4565) Support for SQL IN operator

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892689#comment-15892689
 ] 

ASF GitHub Bot commented on FLINK-4565:
---

Github user DmytroShkvyra commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103987812
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/api/table/expressions/ScalarFunctionsTest.scala
 ---
@@ -1101,6 +1101,45 @@ class ScalarFunctionsTest extends ExpressionTestBase 
{
   "true")
   }
 
+  @Test
+  def testInExpressions(): Unit = {
+testTableApi(
--- End diff --

@twalthr Are you sure we need to use IN with POJOs/Tuples/case classes?
First of all, it will hurt performance, because comparing these types is too 
complicated; it is easier to get a subset of IDs and use IN than to compare POJOs and 
case classes. Moreover, we cannot use them in SQL string statements (we would need to 
parse POJOs and case classes from strings).


> Support for SQL IN operator
> ---
>
> Key: FLINK-4565
> URL: https://issues.apache.org/jira/browse/FLINK-4565
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Dmytro Shkvyra
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.





[GitHub] flink pull request #2870: [FLINK-4565] Support for SQL IN operator

2017-03-02 Thread DmytroShkvyra
Github user DmytroShkvyra commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103987812
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/api/table/expressions/ScalarFunctionsTest.scala
 ---
@@ -1101,6 +1101,45 @@ class ScalarFunctionsTest extends ExpressionTestBase 
{
   "true")
   }
 
+  @Test
+  def testInExpressions(): Unit = {
+testTableApi(
--- End diff --

@twalthr Are you sure we need to use IN with POJOs/Tuples/case classes?
First of all, it will hurt performance, because comparing these types is too 
complicated; it is easier to get a subset of IDs and use IN than to compare POJOs and 
case classes. Moreover, we cannot use them in SQL string statements (we would need to 
parse POJOs and case classes from strings).




[jira] [Updated] (FLINK-5953) Sample Kinesis Scala code doesn't work

2017-03-02 Thread Matthew Billson (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Billson updated FLINK-5953:
---
Description: 
In the Scala Kinesis example here: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kinesis.html
 we see the following: 

{code}val consumerConfig = new Properties();
consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
consumerConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, 
"aws_access_key_id");
consumerConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, 
"LATEST");{code}

but in Scala ConsumerConfigConstants does not inherit the static members of 
AWSConfigConstants and so only the 4th line of this actually works.

  was:
In the Scala Kinesis example here: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kinesis.html
 we see the following: 

val consumerConfig = new Properties();
consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
consumerConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, 
"aws_access_key_id");
consumerConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

but in Scala ConsumerConfigConstants does not inherit the static members of 
AWSConfigConstants and so only the 4th line of this actually works.


> Sample Kinesis Scala code doesn't work
> --
>
> Key: FLINK-5953
> URL: https://issues.apache.org/jira/browse/FLINK-5953
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.2.0
>Reporter: Matthew Billson
>Priority: Minor
>  Labels: documentation, kinesis, scala
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In the Scala Kinesis example here: 
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kinesis.html
>  we see the following: 
> {code}val consumerConfig = new Properties();
> consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
> consumerConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, 
> "aws_access_key_id");
> consumerConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, 
> "aws_secret_access_key");
> consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, 
> "LATEST");{code}
> but in Scala ConsumerConfigConstants does not inherit the static members of 
> AWSConfigConstants and so only the 4th line of this actually works.





[jira] [Commented] (FLINK-5107) Job Manager goes out of memory from long history of prior execution attempts

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892658#comment-15892658
 ] 

ASF GitHub Bot commented on FLINK-5107:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2837#discussion_r103984233
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/JobManagerOptions.java
 ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.jobmanager;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.configuration.ConfigOption;
+
+import static org.apache.flink.configuration.ConfigOptions.key;
+
+@PublicEvolving
+public class JobManagerOptions {
+
+   /**
+* The maximum number of prior execution attempts kept in history.
+*/
+   public static final ConfigOption<Integer> MAX_ATTEMPTS_HISTORY_SIZE =
+   key("job-manager.max-attempts-history-size").defaultValue(16);
--- End diff --

This key deviates from the existing job manager config constants, which all 
start with "jobmanager". Is this intended?
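For readers following along without the Flink sources, the `ConfigOptions.key(...).defaultValue(...)` builder pattern under discussion can be sketched with a minimal self-contained stand-in. The `ConfigOption`/`OptionBuilder` types below are simplified illustrations, not Flink's real `org.apache.flink.configuration` classes, and the key uses the `jobmanager.` prefix that the review suggests:

```java
// Minimal stand-in for Flink's ConfigOptions.key(...).defaultValue(...) builder.
// The real classes live in org.apache.flink.configuration; these simplified
// types are illustrative only.
public class ConfigOptionDemo {

    public static final class ConfigOption<T> {
        private final String key;
        private final T defaultValue;

        ConfigOption(String key, T defaultValue) {
            this.key = key;
            this.defaultValue = defaultValue;
        }

        public String key() { return key; }
        public T defaultValue() { return defaultValue; }
    }

    public static final class OptionBuilder {
        private final String key;

        OptionBuilder(String key) { this.key = key; }

        public <T> ConfigOption<T> defaultValue(T value) {
            return new ConfigOption<>(key, value);
        }
    }

    public static OptionBuilder key(String key) { return new OptionBuilder(key); }

    // Key prefixed with "jobmanager." for consistency with the existing options.
    public static final ConfigOption<Integer> MAX_ATTEMPTS_HISTORY_SIZE =
            key("jobmanager.max-attempts-history-size").defaultValue(16);

    public static void main(String[] args) {
        System.out.println(MAX_ATTEMPTS_HISTORY_SIZE.key()
                + " = " + MAX_ATTEMPTS_HISTORY_SIZE.defaultValue());
    }
}
```

The builder keeps key and default value together in one constant, which is what makes a rename like `job-manager.` to `jobmanager.` a one-line change.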


> Job Manager goes out of memory from long history of prior execution attempts
> 
>
> Key: FLINK-5107
> URL: https://issues.apache.org/jira/browse/FLINK-5107
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> We have observed that the job manager can run out of memory during long 
> running jobs with many vertexes. Analysis of the heap dump shows, that the 
> ever-growing history of prior execution attempts is the culprit for this 
> problem.
> We should limit this history to a number of n most recent attempts. 





[GitHub] flink pull request #2837: [FLINK-5107] Introduced limit for prior execution ...

2017-03-02 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/2837#discussion_r103984233
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/JobManagerOptions.java
 ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.jobmanager;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.configuration.ConfigOption;
+
+import static org.apache.flink.configuration.ConfigOptions.key;
+
+@PublicEvolving
+public class JobManagerOptions {
+
+   /**
+* The maximum number of prior execution attempts kept in history.
+*/
+   public static final ConfigOption<Integer> MAX_ATTEMPTS_HISTORY_SIZE =
+   key("job-manager.max-attempts-history-size").defaultValue(16);
--- End diff --

This key deviates from the existing job manager config constants, which all 
start with "jobmanager". Is this intended?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (FLINK-5953) Sample Kinesis Scala code doesn't work

2017-03-02 Thread Matthew Billson (JIRA)
Matthew Billson created FLINK-5953:
--

 Summary: Sample Kinesis Scala code doesn't work
 Key: FLINK-5953
 URL: https://issues.apache.org/jira/browse/FLINK-5953
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.2.0
Reporter: Matthew Billson
Priority: Minor


In the Scala Kinesis example here: 
https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kinesis.html
 we see the following: 

val consumerConfig = new Properties();
consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
consumerConfig.put(ConsumerConfigConstants.AWS_ACCESS_KEY_ID, 
"aws_access_key_id");
consumerConfig.put(ConsumerConfigConstants.AWS_SECRET_ACCESS_KEY, 
"aws_secret_access_key");
consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

but in Scala ConsumerConfigConstants does not inherit the static members of 
AWSConfigConstants and so only the 4th line of this actually works.
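The root cause is a Java/Scala interop rule: Java resolves static members through the superclass chain, while Scala only sees statics on the class that declares them. A self-contained Java sketch of the lookup that works in Java is below; `Base` and `Derived` are hypothetical stand-ins for `AWSConfigConstants` and `ConsumerConfigConstants`, and the key strings are made up. From Scala, the workaround is to reference `AWSConfigConstants` directly for the inherited keys.

```java
// Java resolves Derived.AWS_REGION by walking up to Base; Scala does not
// perform this lookup on Java classes, which is why the documented Scala
// example fails to compile for the inherited constants.
public class StaticInheritanceDemo {

    public static class Base {
        public static final String AWS_REGION = "aws.region"; // hypothetical key
    }

    public static class Derived extends Base {
        public static final String STREAM_INITIAL_POSITION = "flink.stream.initpos"; // hypothetical key
    }

    public static void main(String[] args) {
        // Legal in Java: the static member is found on the superclass.
        System.out.println(Derived.AWS_REGION);
        // Declared directly on Derived: visible from both Java and Scala.
        System.out.println(Derived.STREAM_INITIAL_POSITION);
    }
}
```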





[GitHub] flink pull request #3444: [FLINK-5941] Integrate Archiver pattern into handl...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3444




[jira] [Commented] (FLINK-5941) Let handlers take part in job archiving

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892645#comment-15892645
 ] 

ASF GitHub Bot commented on FLINK-5941:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3444


> Let handlers take part in job archiving
> ---
>
> Key: FLINK-5941
> URL: https://issues.apache.org/jira/browse/FLINK-5941
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.3.0
>
>
> The key idea behind the HistoryServer is to pre-compute all JSON responses 
> which the WebFrontend could request and store them as files in a directory 
> structure resembling the REST-API.
> For this we require a mechanism to generate the responses and their 
> corresponding REST URLs.
> FLINK-5852 made it easier to re-use the JSON generation code, while 
> FLINK-5870 made handlers aware of the REST URLs that they are registered on.
> The aim of this JIRA is to extend job-related handlers, building on the above 
> JIRAs, enabling them to generate a number of (Path, Json) pairs for a given 
> ExecutionGraph, containing all responses that they could generate for the 
> given graph and their respective REST URLs.
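The (Path, Json) pairing described above can be sketched as follows. All names here (`ArchivedJson`, `archiveJobDetails`) are hypothetical illustrations of the idea, not Flink's actual flink-runtime interfaces:

```java
// Sketch: a job-related handler pre-computes the response it would serve for a
// finished job and pairs it with its REST path, so the HistoryServer can store
// it as a file in a directory structure resembling the REST API.
import java.util.Collections;
import java.util.List;

public class ArchiverSketch {

    public static final class ArchivedJson {
        public final String path;
        public final String json;

        ArchivedJson(String path, String json) {
            this.path = path;
            this.json = json;
        }
    }

    // The response a "/jobs/:jobid" handler could generate for a finished job.
    public static List<ArchivedJson> archiveJobDetails(String jobId) {
        String path = "/jobs/" + jobId;
        String json = "{\"jid\":\"" + jobId + "\",\"state\":\"FINISHED\"}";
        return Collections.singletonList(new ArchivedJson(path, json));
    }

    public static void main(String[] args) {
        ArchivedJson entry = archiveJobDetails("abc123").get(0);
        System.out.println(entry.path + " -> " + entry.json);
    }
}
```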





[jira] [Closed] (FLINK-5941) Let handlers take part in job archiving

2017-03-02 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-5941.
---
Resolution: Fixed

Implemented in 7fe0eb477df52cfd7254695a67d41f3cba34ef0a.

> Let handlers take part in job archiving
> ---
>
> Key: FLINK-5941
> URL: https://issues.apache.org/jira/browse/FLINK-5941
> Project: Flink
>  Issue Type: New Feature
>  Components: Webfrontend
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.3.0
>
>
> The key idea behind the HistoryServer is to pre-compute all JSON responses 
> which the WebFrontend could request and store them as files in a directory 
> structure resembling the REST-API.
> For this we require a mechanism to generate the responses and their 
> corresponding REST URLs.
> FLINK-5852 made it easier to re-use the JSON generation code, while 
> FLINK-5870 made handlers aware of the REST URLs that they are registered on.
> The aim of this JIRA is to extend job-related handlers, building on the above 
> JIRAs, enabling them to generate a number of (Path, Json) pairs for a given 
> ExecutionGraph, containing all responses that they could generate for the 
> given graph and their respective REST URLs.





[jira] [Commented] (FLINK-5441) Directly allow SQL queries on a Table

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892640#comment-15892640
 ] 

ASF GitHub Bot commented on FLINK-5441:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3107
  
Yes, you are right. Forgot that we lazily translate the plans. :-/
Let's keep the tables then.


> Directly allow SQL queries on a Table
> -
>
> Key: FLINK-5441
> URL: https://issues.apache.org/jira/browse/FLINK-5441
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> Right now a user has to register a table before it can be used in SQL 
> queries. In order to allow more fluent programming we propose calling SQL 
> directly on a table. An underscore can be used to reference the current table:
> {code}
> myTable.sql("SELECT a, b, c FROM _ WHERE d = 12")
> {code}
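One way such a placeholder could be resolved is to register the table under a generated unique name and rewrite the query string before handing it to the SQL parser. This is only a hypothetical sketch of the idea from the proposal, not the implementation in PR #3107:

```java
// Hypothetical: replace a stand-alone "_" token with a freshly generated
// table name before parsing the query.
import java.util.concurrent.atomic.AtomicInteger;

public class SqlOnTableSketch {

    private static final AtomicInteger COUNTER = new AtomicInteger();

    // Replace "_" surrounded by whitespace with a unique generated name.
    public static String rewrite(String query) {
        String generatedName = "_tmp_table_" + COUNTER.incrementAndGet();
        return query.replaceAll("(?<=\\s)_(?=\\s)", generatedName);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("SELECT a, b, c FROM _ WHERE d = 12"));
    }
}
```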





[GitHub] flink issue #3107: [FLINK-5441] [table] Directly allow SQL queries on a Tabl...

2017-03-02 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3107
  
Yes, you are right. Forgot that we lazily translate the plans. :-/
Let's keep the tables then.




[jira] [Closed] (FLINK-5769) Apply new aggregation functions for dataset tables

2017-03-02 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske closed FLINK-5769.

Resolution: Duplicate

Will be done as part of FLINK-5768.

> Apply new aggregation functions for dataset tables
> --
>
> Key: FLINK-5769
> URL: https://issues.apache.org/jira/browse/FLINK-5769
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>
> Change the implementation of the Dataset aggregation runtime code to use new 
> aggregation functions.





[jira] [Commented] (FLINK-5768) Apply new aggregation functions for datastream and dataset tables

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892625#comment-15892625
 ] 

ASF GitHub Bot commented on FLINK-5768:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3423
  
Thanks for the update @shaoxuan-wang. The PR looks good to merge. 
I'll do some final tests and run another build.

Thanks, Fabian


> Apply new aggregation functions for datastream and dataset tables
> -
>
> Key: FLINK-5768
> URL: https://issues.apache.org/jira/browse/FLINK-5768
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Shaoxuan Wang
>Assignee: Shaoxuan Wang
>
> Apply new aggregation functions for datastream and dataset tables.
> This includes:
> 1. Change the implementation of the DataStream aggregation runtime code to 
> use the new aggregation functions and the aggregate DataStream API.
> 2. DataStream will always run in incremental mode, as explained on 
> 06/Feb/2017 in FLINK-5564.
> 3. Change the implementation of the DataSet aggregation runtime code to use 
> the new aggregation functions.
> 4. Clean up unused classes and methods.





[GitHub] flink issue #3423: [FLINK-5768] [table] Apply new aggregation functions for ...

2017-03-02 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3423
  
Thanks for the update @shaoxuan-wang. The PR looks good to merge. 
I'll do some final tests and run another build.

Thanks, Fabian




[jira] [Commented] (FLINK-4565) Support for SQL IN operator

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892619#comment-15892619
 ] 

ASF GitHub Bot commented on FLINK-4565:
---

Github user DmytroShkvyra commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103976547
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -150,7 +150,35 @@ class Table(
 * }}}
 */
   def filter(predicate: Expression): Table = {
-new Table(tableEnv, Filter(predicate, logicalPlan).validate(tableEnv))
+
+predicate match {
--- End diff --

Did you mean something like this:
`SELECT b.IN[1,2,3] as a FROM T as b`?


> Support for SQL IN operator
> ---
>
> Key: FLINK-4565
> URL: https://issues.apache.org/jira/browse/FLINK-4565
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Dmytro Shkvyra
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.





[GitHub] flink pull request #2870: [FLINK-4565] Support for SQL IN operator

2017-03-02 Thread DmytroShkvyra
Github user DmytroShkvyra commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103976547
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -150,7 +150,35 @@ class Table(
 * }}}
 */
   def filter(predicate: Expression): Table = {
-new Table(tableEnv, Filter(predicate, logicalPlan).validate(tableEnv))
+
+predicate match {
--- End diff --

Did you mean something like this:
`SELECT b.IN[1,2,3] as a FROM T as b`?




[jira] [Commented] (FLINK-5654) Add processing time OVER RANGE BETWEEN x PRECEDING aggregation to SQL

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892596#comment-15892596
 ] 

ASF GitHub Bot commented on FLINK-5654:
---

GitHub user huawei-flink opened a pull request:

https://github.com/apache/flink/pull/3459

[FLINK-5654] Add processing time OVER RANGE BETWEEN x PRECEDING aggregation 
to SQL

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huawei-flink/flink FLINK-5654

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3459.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3459


commit 72ec35a7380a4d73bd092ce14962ab2248139bae
Author: Stefano Bortoli 
Date:   2017-02-01T16:15:58Z

First implementation of ProcTime()

commit e98c28616af1cf67d3ad3277d9cc2ca335604eca
Author: rtudoran 
Date:   2017-02-02T10:30:40Z

Disambiguate for the OVER BY clause, which should not be treated as a
RexOver expression in Logical Project

commit b7e6a673b88b6181c06071cd6c7bda55c25a62b4
Author: Stefano Bortoli 
Date:   2017-02-02T12:07:11Z

Added return to disambiguation method for rexover

commit cda17565d5969f29b16923b631178a2cbf64791b
Author: rtudoran 
Date:   2017-02-02T16:00:20Z

Enable the LogicalWindow operators in query translation

commit 4b3e54281018b83c818f91e09a5321c34bbf297b
Author: rtudoran 
Date:   2017-02-03T14:59:39Z

Added a DataStreamRel version that can be extended in java

commit cc960d699db369cc8dc4e155cc5c5f6c3baf74a4
Author: rtudoran 
Date:   2017-02-03T15:35:18Z

Add skeleton for the implementation of the aggregates over sliding
window with processing time and time boundaries

commit 2390a9d3dc15afba01185c47f61a9ea830ea5acc
Author: Stefano Bortoli 
Date:   2017-02-06T10:33:57Z

committing changes with stub modifications before chekout proctime
branch

commit eaf4e92784dab01b17004390968ca4b1fe7c4bea
Author: Stefano Bortoli 
Date:   2017-02-06T13:17:43Z

ignore aggregation test and implemented simple proctime test

commit 16ccd7f5bf019803ea8b53f09a126ec53a5a6d59
Author: Stefano Bortoli 
Date:   2017-02-06T14:17:03Z

Merge branch 'FLINK-5710' of https://github.com/huawei-flink/flink into 
FLINK-5653

commit 10f7bc5e2086e41ec76cbabdcd069c71a491671a
Author: Stefano Bortoli 
Date:   2017-02-07T09:42:41Z

committing first key selector and utils

commit 31060e46f78729880c03e8cab0f92ff06faec4f0
Author: Stefano Bortoli 
Date:   2017-02-07T11:16:43Z

Changed ProcTime from time to timestamp

commit 69289bad836a5fdace271b28a15ca0e309e50b17
Author: rtudoran 
Date:   2017-02-07T13:13:23Z

Merge branch 'FLINK-5710' of https://github.com/huawei-flink/flink into 
FLINK-5654

commit 3392817045ed166df5f55d22fde34cbd98c775db
Author: rtudoran 
Date:   2017-02-07T13:14:50Z

Merge branch 'FLINK-5653' of https://github.com/huawei-flink/flink into 
FLINK-5654

commit d2ea0076b5e3561585c4eaea84025e50beaacf9a
Author: Stefano Bortoli 
Date:   2017-02-07T09:42:41Z

fixing linelength and other issues

commit f29f564bb7fe7496b9f3d2f45a6b4469af559378
Author: Stefano Bortoli 
Date:   2017-02-07T13:46:30Z

Merge branch 'FLINK-5653' of https://github.com/huawei-flink/flink.git
into FLINK-5653

Conflicts:


[GitHub] flink pull request #3459: [FLINK-5654] Add processing time OVER RANGE BETWEE...

2017-03-02 Thread huawei-flink
GitHub user huawei-flink opened a pull request:

https://github.com/apache/flink/pull/3459

[FLINK-5654] Add processing time OVER RANGE BETWEEN x PRECEDING aggregation 
to SQL

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [x] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/huawei-flink/flink FLINK-5654

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3459.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3459


commit 72ec35a7380a4d73bd092ce14962ab2248139bae
Author: Stefano Bortoli 
Date:   2017-02-01T16:15:58Z

First implementation of ProcTime()

commit e98c28616af1cf67d3ad3277d9cc2ca335604eca
Author: rtudoran 
Date:   2017-02-02T10:30:40Z

Disambiguate for the OVER BY clause, which should not be treated as a
RexOver expression in Logical Project

commit b7e6a673b88b6181c06071cd6c7bda55c25a62b4
Author: Stefano Bortoli 
Date:   2017-02-02T12:07:11Z

Added return to disambiguation method for rexover

commit cda17565d5969f29b16923b631178a2cbf64791b
Author: rtudoran 
Date:   2017-02-02T16:00:20Z

Enable the LogicalWindow operators in query translation

commit 4b3e54281018b83c818f91e09a5321c34bbf297b
Author: rtudoran 
Date:   2017-02-03T14:59:39Z

Added a DataStreamRel version that can be extended in java

commit cc960d699db369cc8dc4e155cc5c5f6c3baf74a4
Author: rtudoran 
Date:   2017-02-03T15:35:18Z

Add skeleton for the implementation of the aggregates over sliding
window with processing time and time boundaries

commit 2390a9d3dc15afba01185c47f61a9ea830ea5acc
Author: Stefano Bortoli 
Date:   2017-02-06T10:33:57Z

committing changes with stub modifications before chekout proctime
branch

commit eaf4e92784dab01b17004390968ca4b1fe7c4bea
Author: Stefano Bortoli 
Date:   2017-02-06T13:17:43Z

ignore aggregation test and implemented simple proctime test

commit 16ccd7f5bf019803ea8b53f09a126ec53a5a6d59
Author: Stefano Bortoli 
Date:   2017-02-06T14:17:03Z

Merge branch 'FLINK-5710' of https://github.com/huawei-flink/flink into 
FLINK-5653

commit 10f7bc5e2086e41ec76cbabdcd069c71a491671a
Author: Stefano Bortoli 
Date:   2017-02-07T09:42:41Z

committing first key selector and utils

commit 31060e46f78729880c03e8cab0f92ff06faec4f0
Author: Stefano Bortoli 
Date:   2017-02-07T11:16:43Z

Changed ProcTime from time to timestamp

commit 69289bad836a5fdace271b28a15ca0e309e50b17
Author: rtudoran 
Date:   2017-02-07T13:13:23Z

Merge branch 'FLINK-5710' of https://github.com/huawei-flink/flink into 
FLINK-5654

commit 3392817045ed166df5f55d22fde34cbd98c775db
Author: rtudoran 
Date:   2017-02-07T13:14:50Z

Merge branch 'FLINK-5653' of https://github.com/huawei-flink/flink into 
FLINK-5654

commit d2ea0076b5e3561585c4eaea84025e50beaacf9a
Author: Stefano Bortoli 
Date:   2017-02-07T09:42:41Z

fixing linelength and other issues

commit f29f564bb7fe7496b9f3d2f45a6b4469af559378
Author: Stefano Bortoli 
Date:   2017-02-07T13:46:30Z

Merge branch 'FLINK-5653' of https://github.com/huawei-flink/flink.git
into FLINK-5653

Conflicts:

flink-libraries/flink-table/src/main/java/org/apache/flink/table/plan/logical/rel/DataStreamWindowRowAggregate.java

flink-libraries/flink-table/src/main/java/org/apache/flink/table/plan/logical/rel/util/StreamGroupKeySelector.java

commit ea145ecefc2be1bea71e995dbf39585e7fa44012
Author: rtudoran 

[GitHub] flink issue #3107: [FLINK-5441] [table] Directly allow SQL queries on a Tabl...

2017-03-02 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/3107
  
I think the point is that we don't know if the table will ever be used 
again. If we unregister them after optimization, we can not have multiple 
`toDataStream()` calls for one table.




[jira] [Commented] (FLINK-5441) Directly allow SQL queries on a Table

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892592#comment-15892592
 ] 

ASF GitHub Bot commented on FLINK-5441:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/3107
  
I think the point is that we don't know if the table will ever be used 
again. If we unregister them after optimization, we can not have multiple 
`toDataStream()` calls for one table.


> Directly allow SQL queries on a Table
> -
>
> Key: FLINK-5441
> URL: https://issues.apache.org/jira/browse/FLINK-5441
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> Right now a user has to register a table before it can be used in SQL 
> queries. In order to allow more fluent programming we propose calling SQL 
> directly on a table. An underscore can be used to reference the current table:
> {code}
> myTable.sql("SELECT a, b, c FROM _ WHERE d = 12")
> {code}





[GitHub] flink pull request #2870: [FLINK-4565] Support for SQL IN operator

2017-03-02 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103963098
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/calls/ScalarOperators.scala
 ---
@@ -79,6 +79,72 @@ object ScalarOperators {
 }
   }
 
+  def generateIn(
+  nullCheck: Boolean,
+  left: GeneratedExpression,
+  right: scala.collection.mutable.Buffer[GeneratedExpression],
+  addReusableCodeCallback: (String, String) => Any)
+: GeneratedExpression = {
+val resultTerm = newName("result")
+val isNull = newName("isNull")
+
+val topNumericalType: Option[TypeInformation[_]] = {
--- End diff --

IMHO, calculating the top numerical type seems not too specific to me.
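For context, "calculating the top numerical type" means finding the widest numeric type among the IN operands so every literal can be cast to one common type before comparison. The simplified stand-in below works on `Class` tokens; Flink's real code generation works on `TypeInformation`, so the names here are hypothetical:

```java
// Compute the widest ("top") numeric type among a list of operand types.
import java.util.Arrays;
import java.util.List;

public class TopNumericTypeDemo {

    // Widening order, narrowest first (simplified; ignores BigDecimal, char, etc.).
    private static final List<Class<?>> ORDER = Arrays.asList(
            Byte.class, Short.class, Integer.class, Long.class, Float.class, Double.class);

    /** Returns the widest numeric type, or null if any operand is non-numeric. */
    public static Class<?> topNumericType(List<Class<?>> operandTypes) {
        Class<?> top = null;
        for (Class<?> t : operandTypes) {
            int rank = ORDER.indexOf(t);
            if (rank < 0) {
                return null; // non-numeric operand: no common numeric type
            }
            if (top == null || rank > ORDER.indexOf(top)) {
                top = t;
            }
        }
        return top;
    }

    public static void main(String[] args) {
        System.out.println(topNumericType(Arrays.asList(Integer.class, Long.class, Integer.class)));
    }
}
```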




[jira] [Commented] (FLINK-5441) Directly allow SQL queries on a Table

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892517#comment-15892517
 ] 

ASF GitHub Bot commented on FLINK-5441:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3107
  
But what's the point of keeping tables that will never be used again?


> Directly allow SQL queries on a Table
> -
>
> Key: FLINK-5441
> URL: https://issues.apache.org/jira/browse/FLINK-5441
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Jark Wu
>
> Right now a user has to register a table before it can be used in SQL 
> queries. In order to allow more fluent programming we propose calling SQL 
> directly on a table. An underscore can be used to reference the current table:
> {code}
> myTable.sql("SELECT a, b, c FROM _ WHERE d = 12")
> {code}





[jira] [Commented] (FLINK-4565) Support for SQL IN operator

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892519#comment-15892519
 ] 

ASF GitHub Bot commented on FLINK-4565:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103963098
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/calls/ScalarOperators.scala
 ---
@@ -79,6 +79,72 @@ object ScalarOperators {
 }
   }
 
+  def generateIn(
+  nullCheck: Boolean,
+  left: GeneratedExpression,
+  right: scala.collection.mutable.Buffer[GeneratedExpression],
+  addReusableCodeCallback: (String, String) => Any)
+: GeneratedExpression = {
+val resultTerm = newName("result")
+val isNull = newName("isNull")
+
+val topNumericalType: Option[TypeInformation[_]] = {
--- End diff --

IMHO, calculating the top numerical type seems not too specific to me.


> Support for SQL IN operator
> ---
>
> Key: FLINK-4565
> URL: https://issues.apache.org/jira/browse/FLINK-4565
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Dmytro Shkvyra
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.





[jira] [Commented] (FLINK-4565) Support for SQL IN operator

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892534#comment-15892534
 ] 

ASF GitHub Bot commented on FLINK-4565:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2870
  
Thanks for taking care of this PR @DmytroShkvyra. It would be great if we 
could use `SubQueryRemoveRule` rules together with the `RexSubQuery.in()` RexNode 
in order to reduce code complexity.


> Support for SQL IN operator
> ---
>
> Key: FLINK-4565
> URL: https://issues.apache.org/jira/browse/FLINK-4565
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Dmytro Shkvyra
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.





[GitHub] flink issue #2870: [FLINK-4565] Support for SQL IN operator

2017-03-02 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/2870
  
Thanks for taking care of this PR @DmytroShkvyra. It would be great if we 
could use `SubQueryRemoveRule` rules together with the `RexSubQuery.in()` RexNode 
in order to reduce code complexity.




[jira] [Commented] (FLINK-4565) Support for SQL IN operator

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892526#comment-15892526
 ] 

ASF GitHub Bot commented on FLINK-4565:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103963525
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/calls/ScalarOperators.scala
 ---
@@ -1002,12 +1068,17 @@ object ScalarOperators {
 val resultTypeTerm = primitiveTypeTermForTypeInfo(resultType)
 // no casting necessary
 if (operandType == resultType) {
-  (operandTerm) => s"$operandTerm"
+  if (isDecimal(operandType)) {
+(operandTerm) => s"$operandTerm.stripTrailingZeros()"
--- End diff --

Ok, I'm fine with this change.
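For readers wondering why `stripTrailingZeros()` is needed: `BigDecimal.equals()` compares the scale as well as the numeric value, so two numerically equal decimals with different representations are not `equals()` until normalized. A self-contained illustration of that behavior (the generated cast code in `ScalarOperators` presumably guards against exactly this):

```java
// BigDecimal.equals() is scale-sensitive; stripTrailingZeros() normalizes
// the representation so numerically equal values compare equal.
import java.math.BigDecimal;

public class DecimalEqualityDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("11.10"); // scale 2
        BigDecimal b = new BigDecimal("11.1");  // scale 1
        System.out.println(a.equals(b));        // scale-sensitive: not equal
        System.out.println(a.compareTo(b) == 0);                                   // numerically equal
        System.out.println(a.stripTrailingZeros().equals(b.stripTrailingZeros())); // normalized: equal
    }
}
```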


> Support for SQL IN operator
> ---
>
> Key: FLINK-4565
> URL: https://issues.apache.org/jira/browse/FLINK-4565
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Dmytro Shkvyra
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.




[GitHub] flink issue #3107: [FLINK-5441] [table] Directly allow SQL queries on a Tabl...

2017-03-02 Thread fhueske
Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3107
  
But what's the point of keeping tables that will never be used again?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4565) Support for SQL IN operator

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892485#comment-15892485
 ] 

ASF GitHub Bot commented on FLINK-4565:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/2870#discussion_r103962015
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala
 ---
@@ -150,7 +150,35 @@ class Table(
 * }}}
 */
   def filter(predicate: Expression): Table = {
-new Table(tableEnv, Filter(predicate, logicalPlan).validate(tableEnv))
+
+predicate match {
--- End diff --

The IN operator can also be used in a `select()` statement, but if we do the 
translation here, only `filter()` is supported.


> Support for SQL IN operator
> ---
>
> Key: FLINK-4565
> URL: https://issues.apache.org/jira/browse/FLINK-4565
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Dmytro Shkvyra
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.




[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892476#comment-15892476
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103912590
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala
 ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.sources
+
+/**
+  * Adds support for projection push-down to a [[TableSource]] with nested 
fields.
+  * A [[TableSource]] extending this interface is able
+  * to project the nested fields of the return table.
+  *
+  * @tparam T The return type of the 
[[NestedFieldsProjectableTableSource]].
+  */
+trait NestedFieldsProjectableTableSource[T] extends 
ProjectableTableSource[T] {
--- End diff --

I think it's not necessary to extend from `ProjectableTableSource`.


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892474#comment-15892474
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103960833
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/plan/rules/util/RexProgramProjectExtractorTest.scala
 ---
@@ -55,6 +56,34 @@ class RexProgramProjectExtractorTest {
   }
 
   @Test
+  def testExtractRefNestedInputFields(): Unit = {
+val rexProgram = buildRexProgramWithNesting()
+val usedFields = extractRefInputFields(rexProgram)
+val usedNestedFields = extractRefNestedInputFields(rexProgram, 
usedFields)
+val expected = Array[Array[String]](Array("amount"), Array("*"))
+assertThat(usedNestedFields, is(expected))
+  }
+
+  @Test
+  def testExtractRefNestedInputFieldsWithNoNesting(): Unit = {
+val rexProgram = buildRexProgram()
+val usedFields = extractRefInputFields(rexProgram)
+val usedNestedFields = extractRefNestedInputFields(rexProgram, 
usedFields)
+val expected = Array[Array[String]](Array("*"), Array("*"), Array("*"))
+assertThat(usedNestedFields, is(expected))
+  }
+
+  @Test
+  def testExtractDeepRefNestedInputFields(): Unit = {
+val rexProgram = buildRexProgramWithDeepNesting()
+val usedFields = extractRefInputFields(rexProgram)
+val usedNestedFields = extractRefNestedInputFields(rexProgram, 
usedFields)
+val expected = Array[Array[String]](Array("amount"), 
Array("passport.status"))
--- End diff --

Another test would be to reference the nested attribute in a call, for 
example something like `payments.amount * 10`.


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892480#comment-15892480
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103959655
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/plan/rules/util/RexProgramProjectExtractorTest.scala
 ---
@@ -55,6 +56,34 @@ class RexProgramProjectExtractorTest {
   }
 
   @Test
+  def testExtractRefNestedInputFields(): Unit = {
+val rexProgram = buildRexProgramWithNesting()
+val usedFields = extractRefInputFields(rexProgram)
+val usedNestedFields = extractRefNestedInputFields(rexProgram, 
usedFields)
+val expected = Array[Array[String]](Array("amount"), Array("*"))
+assertThat(usedNestedFields, is(expected))
+  }
+
+  @Test
+  def testExtractRefNestedInputFieldsWithNoNesting(): Unit = {
+val rexProgram = buildRexProgram()
+val usedFields = extractRefInputFields(rexProgram)
+val usedNestedFields = extractRefNestedInputFields(rexProgram, 
usedFields)
+val expected = Array[Array[String]](Array("*"), Array("*"), Array("*"))
+assertThat(usedNestedFields, is(expected))
+  }
+
+  @Test
+  def testExtractDeepRefNestedInputFields(): Unit = {
+val rexProgram = buildRexProgramWithDeepNesting()
+val usedFields = extractRefInputFields(rexProgram)
+val usedNestedFields = extractRefNestedInputFields(rexProgram, 
usedFields)
+val expected = Array[Array[String]](Array("amount"), 
Array("passport.status"))
--- End diff --

It would be good to have a test where the Array with the nested fields 
contains more than one entry.


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892478#comment-15892478
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103913118
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala
 ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.sources
+
+/**
+  * Adds support for projection push-down to a [[TableSource]] with nested 
fields.
+  * A [[TableSource]] extending this interface is able
+  * to project the nested fields of the return table.
--- End diff --

return -> returned


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892479#comment-15892479
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103958345
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala
 ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.sources
+
+/**
+  * Adds support for projection push-down to a [[TableSource]] with nested 
fields.
+  * A [[TableSource]] extending this interface is able
+  * to project the nested fields of the return table.
+  *
+  * @tparam T The return type of the 
[[NestedFieldsProjectableTableSource]].
+  */
+trait NestedFieldsProjectableTableSource[T] extends 
ProjectableTableSource[T] {
+
+  /**
+* Creates a copy of the [[NestedFieldsProjectableTableSource]]
+* that projects its output on the specified nested fields.
+*
+* @param fields The indexes of the fields to return.
+* @param nestedFields hold the nested fields and has identical size 
with fields array
+*
+* e.g.
+* tableSchema = {
+*   id,
+*   student<\school<\city, tuition>, age, name>,
+*   teacher<\age, name>
+*   }
+*
+* select (id, student.school.city, student.age, teacher)
+*
+* fields = field = [0, 1, 2]
+* nestedFields  \[\[], ["school.city", "age"], ["*"\]\]
--- End diff --

That would also be OK with me, but the documentation would need to be 
adapted.


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892475#comment-15892475
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103918266
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/util/RexProgramProjectExtractor.scala
 ---
@@ -84,6 +108,49 @@ object RexProgramProjectExtractor {
 }
 
 /**
+  * A RexVisitor to extract used nested input fields
+  */
+class RefFieldAccessorVisitor(
+names: List[String],
+usedFields: Array[Int]) extends RexVisitorImpl[Unit](true) {
+
+  private val projectedFields = new util.ArrayList[Array[String]]
+
+  names.foreach { n =>
+projectedFields.add(Array.empty)
+  }
+
+  private val order: Map[Int, Int] = 
names.indices.zip(usedFields).map(_.swap).toMap
--- End diff --

`names` and `usedFields` might not have the same length.
The result of `zip` has the length of the smaller of both lists, which is 
not intended here, right?
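
The truncation pitfall pointed out here can be reproduced in isolation. The
following standalone sketch uses made-up values (it is not Flink code) to show
how `zip` silently drops the third name when `usedFields` is shorter:

```scala
object ZipTruncationDemo extends App {
  // Hypothetical inputs: three field names but only two used-field indexes.
  val names = List("a", "b", "c")
  val usedFields = Array(0, 2)

  // zip stops at the shorter side, so the pair for index 2 of `names`
  // never appears; this is the silent loss the review comment warns about.
  val zipped = names.indices.zip(usedFields)
  assert(zipped.length == 2)

  // Swapping and converting to a Map mirrors the expression in the diff.
  val order: Map[Int, Int] = zipped.map(_.swap).toMap
  assert(order == Map(0 -> 0, 2 -> 1))
}
```

Guarding with `require(names.length == usedFields.length)` before the `zip`
would turn the silent truncation into an explicit failure.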


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.





[jira] [Commented] (FLINK-3695) ValueArray types

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892473#comment-15892473
 ] 

ASF GitHub Bot commented on FLINK-3695:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/3382
  
@vasia, if you are available to review, I think the easiest way to do so is 
to first review the classes for `LongValue` and then compare the diff for 
`IntValue` and `StringValue`. I think we are forced to duplicate code across 
the value types to keep the method calls monomorphic.


> ValueArray types
> 
>
> Key: FLINK-3695
> URL: https://issues.apache.org/jira/browse/FLINK-3695
> Project: Flink
>  Issue Type: New Feature
>  Components: Type Serialization System
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Minor
> Fix For: 1.3.0
>
>
> Flink provides mutable {{Value}} type implementations of Java primitives 
> along with efficient serializers and comparators. It would be useful to have 
> corresponding {{ValueArray}} implementations backed by primitive rather than 
> object arrays, along with an {{ArrayableValue}} interface tying a {{Value}} 
> to its {{ValueArray}}.





[jira] [Commented] (FLINK-5698) Add NestedFieldsProjectableTableSource interface

2017-03-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892477#comment-15892477
 ] 

ASF GitHub Bot commented on FLINK-5698:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3269#discussion_r103958236
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/sources/NestedFieldsProjectableTableSource.scala
 ---
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.sources
+
+/**
+  * Adds support for projection push-down to a [[TableSource]] with nested 
fields.
+  * A [[TableSource]] extending this interface is able
+  * to project the nested fields of the return table.
+  *
+  * @tparam T The return type of the 
[[NestedFieldsProjectableTableSource]].
+  */
+trait NestedFieldsProjectableTableSource[T] extends 
ProjectableTableSource[T] {
+
+  /**
+* Creates a copy of the [[NestedFieldsProjectableTableSource]]
+* that projects its output on the specified nested fields.
+*
+* @param fields The indexes of the fields to return.
+* @param nestedFields hold the nested fields and has identical size 
with fields array
+*
+* e.g.
+* tableSchema = {
+*   id,
+*   student<\school<\city, tuition>, age, name>,
+*   teacher<\age, name>
+*   }
+*
+* select (id, student.school.city, student.age, teacher)
+*
+* fields = field = [0, 1, 2]
+* nestedFields  \[\[], ["school.city", "age"], ["*"\]\]
--- End diff --

I think with the current implementation we would get `\[\["*"], 
["school.city", "age"], ["*"\]\]`
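
The `fields`/`nestedFields` convention discussed in this thread can be written
out as a small self-contained sketch. The schema and query are taken from the
Scaladoc example in the diff; everything else is illustrative, not the final
Flink API:

```scala
object NestedProjectionArraysDemo extends App {
  // Schema from the review thread:
  //   id, student<school<city, tuition>, age, name>, teacher<age, name>
  // Query: select (id, student.school.city, student.age, teacher)
  val fields = Array(0, 1, 2)

  // A fully projected top-level field is represented as "*", so both `id`
  // and `teacher` map to Array("*"), while `student` lists its accessed
  // nested fields.
  val nestedFields: Array[Array[String]] =
    Array(Array("*"), Array("school.city", "age"), Array("*"))

  // The two arrays must stay index-aligned: entry i of nestedFields
  // describes the nested accesses of top-level field fields(i).
  assert(fields.length == nestedFields.length)
}
```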


> Add NestedFieldsProjectableTableSource interface
> 
>
> Key: FLINK-5698
> URL: https://issues.apache.org/jira/browse/FLINK-5698
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: Anton Solovev
>Assignee: Anton Solovev
>
> Add a NestedFieldsProjectableTableSource interface for TableSource 
> implementations that support nested projection push-down.
> The interface could look as follows
> {code}
> def trait NestedFieldsProjectableTableSource {
>   def projectNestedFields(fields: Array[String]): 
> NestedFieldsProjectableTableSource[T]
> }
> {code}
> This interface works together with ProjectableTableSource.


