[jira] [Commented] (FLINK-9884) Slot request may not be removed when it has already been assigned in the slot manager
[ https://issues.apache.org/jira/browse/FLINK-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599496#comment-16599496 ]

ASF GitHub Bot commented on FLINK-9884:
---
TisonKun edited a comment on issue #6360: [FLINK-9884] [runtime] fix slot request may not be removed when it has already been assigned in slot manager
URL: https://github.com/apache/flink/pull/6360#issuecomment-417825840
cc @tillrohrmann @GJL

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Slot request may not be removed when it has already been assigned in the slot manager
> ---
> Key: FLINK-9884
> URL: https://issues.apache.org/jira/browse/FLINK-9884
> Project: Flink
> Issue Type: Bug
> Components: Cluster Management
> Reporter: shuai.xu
> Assignee: shuai.xu
> Priority: Major
> Labels: pull-request-available
>
> When a task executor reports slot A with allocationId1, it may happen that the slot manager has recorded slot A as assigned to allocationId2 while the slot request with allocationId1 is still unassigned. The slot manager then updates itself so that slot A is assigned to allocationId1, but it does not clear the pending slot request with allocationId1.
> For example:
> # There is one free slot in the slot manager.
> # Two slot requests arrive, with allocationId1 and allocationId2.
> # The slot is assigned to allocationId1, but the requestSlot call times out.
> # The SlotManager assigns the slot to allocationId2 and inserts a pending slot request with allocationId1.
> # The second requestSlot call to the task executor returns SlotOccupiedException.
> # The SlotManager updates the slot to allocationId1, but the pending slot request is left behind.
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
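The six steps above can be replayed against a toy model of the bookkeeping (hypothetical class and field names, not Flink's actual SlotManager API): when the slot report reveals that the slot is really held by a different allocation, the fix is to remove the now-satisfied pending request in addition to re-recording the assignment.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the bookkeeping bug in FLINK-9884. All names here are
// hypothetical; Flink's real SlotManager is considerably more involved.
public class SlotBookkeepingSketch {

    static class SlotManagerModel {
        // slotId -> allocationId currently recorded for that slot
        final Map<String, String> slotAssignments = new HashMap<>();
        // allocationId -> pending (unfulfilled) slot request
        final Map<String, String> pendingRequests = new HashMap<>();

        // A slot report says slotId is actually occupied by allocationId.
        // Buggy version: only updates the assignment, leaving a stale request.
        void updateSlotBuggy(String slotId, String allocationId) {
            slotAssignments.put(slotId, allocationId);
        }

        // Fixed version: the request that is now satisfied is removed too.
        void updateSlotFixed(String slotId, String allocationId) {
            slotAssignments.put(slotId, allocationId);
            pendingRequests.remove(allocationId);
        }
    }

    // Replays the six steps from the issue description against the model.
    static SlotManagerModel replay(boolean fixed) {
        SlotManagerModel sm = new SlotManagerModel();
        // Steps 1-4: the slot first goes to allocationId1, the call times out,
        // so the slot is re-assigned to allocationId2 and a pending request
        // for allocationId1 is recorded.
        sm.slotAssignments.put("slotA", "allocationId2");
        sm.pendingRequests.put("allocationId1", "request-1");
        // Steps 5-6: the task executor answers with SlotOccupiedException
        // because slotA is really held by allocationId1; the manager updates.
        if (fixed) {
            sm.updateSlotFixed("slotA", "allocationId1");
        } else {
            sm.updateSlotBuggy("slotA", "allocationId1");
        }
        return sm;
    }

    public static void main(String[] args) {
        System.out.println("buggy leaves a pending request: "
                + !replay(false).pendingRequests.isEmpty());
        System.out.println("fixed leaves a pending request: "
                + !replay(true).pendingRequests.isEmpty());
    }
}
```

In the buggy variant the stale request for allocationId1 survives and would keep asking for a slot that allocationId1 already holds; the fixed variant clears it.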
[jira] [Commented] (FLINK-9884) Slot request may not be removed when it has already been assigned in the slot manager
[ https://issues.apache.org/jira/browse/FLINK-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599495#comment-16599495 ]

ASF GitHub Bot commented on FLINK-9884:
---
TisonKun commented on issue #6360: [FLINK-9884] [runtime] fix slot request may not be removed when it has already been assigned in slot manager
URL: https://github.com/apache/flink/pull/6360#issuecomment-417825840
cc @tillrohrmann
[GitHub] TisonKun opened a new pull request #6644: [hotfix] check correct parameter
TisonKun opened a new pull request #6644: [hotfix] check correct parameter
URL: https://github.com/apache/flink/pull/6644

## What is the purpose of the change

The test accidentally checked the error message; this change checks the intended parameter instead.

## Verifying this change

Trivial change.

cc @zentol
[jira] [Commented] (FLINK-10114) Support Orc for StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599433#comment-16599433 ]

zhangminglei commented on FLINK-10114:
--
I can't continue working on this JIRA because I've just joined Alibaba. Anyone who wants to take this over is welcome to finish the PR. Thank you very much.

> Support Orc for StreamingFileSink
> ---
> Key: FLINK-10114
> URL: https://issues.apache.org/jira/browse/FLINK-10114
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming Connectors
> Reporter: zhangminglei
> Priority: Major
[jira] [Assigned] (FLINK-10114) Support Orc for StreamingFileSink
[ https://issues.apache.org/jira/browse/FLINK-10114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhangminglei reassigned FLINK-10114:
Assignee: (was: zhangminglei)
[jira] [Created] (FLINK-10276) Job Manager and Task Manager Metrics Reporter Ports Configuration
Deirdre Kong created FLINK-10276:

Summary: Job Manager and Task Manager Metrics Reporter Ports Configuration
Key: FLINK-10276
URL: https://issues.apache.org/jira/browse/FLINK-10276
Project: Flink
Issue Type: New Feature
Components: Core
Reporter: Deirdre Kong

*Problem Statement:*

When deploying Flink on YARN, the job manager and task manager can land on the same node or on different nodes. Say I specify the port range 9249-9250: if JM and TM are deployed on the same node, the JM gets port 9249 and the TM gets port 9250; if they are deployed on different nodes, both the JM and the TM get port 9249. I can only configure Prometheus once with the ports to scrape for JM and TM metrics, so I won't know whether port 9249 belongs to a JM or a TM. It would be great if we could specify in flink-conf.yaml which port the JM reporter and the TM reporters should use.

*Comment from Till:*

I think we could extend Vino's proposal for Yarn as well: maybe it makes sense to allow overriding certain configuration settings for the TaskManagers when deploying on Yarn. That way one could define a fixed port for the JM and a port range for the TMs. With such a distinction you can configure your Prometheus to scrape the single JM and the TMs individually. However, Flink does not yet support such a feature. You can open a JIRA issue to track the problem.
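For context, the setup the reporter describes looks roughly like this in flink-conf.yaml. The first three keys are the existing Prometheus reporter configuration; the commented-out keys at the bottom are the *proposed* per-component split from Till's comment and are hypothetical — Flink does not support them yet:

```yaml
# Existing: one port (range) shared by every JM/TM process on a node.
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249-9250

# Proposed (hypothetical keys, not supported today): a fixed port for the
# JobManager and a separate range for the TaskManagers, so Prometheus can
# scrape JM and TMs individually.
# metrics.reporter.prom.jobmanager.port: 9249
# metrics.reporter.prom.taskmanager.port: 9250-9260
```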
[jira] [Updated] (FLINK-10268) Document update deployment/aws HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/FLINK-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Yao updated FLINK-10268:
Fix Version/s: 1.7.0

> Document update deployment/aws HADOOP_CLASSPATH
> ---
> Key: FLINK-10268
> URL: https://issues.apache.org/jira/browse/FLINK-10268
> Project: Flink
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.5.3, 1.6.0, 1.7.0
> Reporter: Andy M
> Priority: Minor
> Fix For: 1.7.0
>
> The Deployment/AWS/Custom EMR Installation documentation needs to be updated. Currently the documented steps result in a ClassNotFoundException. A step needs to be added that sets HADOOP_CLASSPATH=`hadoop classpath`.
[jira] [Updated] (FLINK-10268) Document update deployment/aws HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/FLINK-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Yao updated FLINK-10268:
Priority: Major (was: Minor)
[jira] [Updated] (FLINK-10268) Document update deployment/aws HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/FLINK-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Yao updated FLINK-10268:
Affects Version/s: 1.5.3, 1.6.0, 1.7.0
[jira] [Updated] (FLINK-10275) StreamTask support object reuse
[ https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-10275:
Labels: pull-request-available (was: )

> StreamTask support object reuse
> ---
> Key: FLINK-10275
> URL: https://issues.apache.org/jira/browse/FLINK-10275
> Project: Flink
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.7.0
> Reporter: 陈梓立
> Assignee: 陈梓立
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
> Make StreamTask support efficient object reuse. The purpose is to reduce pressure on the garbage collector.
> All objects are reused, without backup copies. Operators and UDFs must be careful not to keep any objects as state and not to modify the objects.
[jira] [Commented] (FLINK-10275) StreamTask support object reuse
[ https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599068#comment-16599068 ]

ASF GitHub Bot commented on FLINK-10275:
TisonKun opened a new pull request #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643

## What is the purpose of the change

Make StreamTask support efficient object reuse. The purpose is to reduce pressure on the garbage collector.

All objects are reused, without backup copies. Operators and UDFs must be careful not to keep any objects as state and not to modify the objects.

## Brief change log

- With `ExecutionConfig#isObjectReuseEnabled` on, reuse the `StreamRecord` associated with the `StreamTask`.
- Also clean up code encountered along the way.

## Verifying this change

Added cases to the unit tests `OneInputStreamTaskTest.java` and `TwoInputStreamTaskTest.java`.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
[jira] [Created] (FLINK-10275) StreamTask support object reuse
陈梓立 created FLINK-10275:

Summary: StreamTask support object reuse
Key: FLINK-10275
URL: https://issues.apache.org/jira/browse/FLINK-10275
Project: Flink
Issue Type: Improvement
Components: Streaming
Affects Versions: 1.7.0
Reporter: 陈梓立
Assignee: 陈梓立
Fix For: 1.7.0

Make StreamTask support efficient object reuse. The purpose is to reduce pressure on the garbage collector.

All objects are reused, without backup copies. Operators and UDFs must be careful not to keep any objects as state and not to modify the objects.
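The caveat in the description — that with object reuse the operators and UDFs must not keep references to records — can be illustrated with a toy example (hypothetical classes, not Flink's actual StreamRecord/StreamTask):

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of why object reuse forbids keeping references:
// with reuse enabled, the runtime hands every invocation the SAME
// record instance, mutated in place. (Hypothetical classes, not
// Flink's real runtime types.)
public class ObjectReuseSketch {

    static class Record {
        long value;
    }

    // Simulates a task delivering n records to a consumer with reuse on.
    static List<Record> collectUnsafely(int n) {
        List<Record> kept = new ArrayList<>();
        Record reused = new Record();          // single reused instance
        for (int i = 0; i < n; i++) {
            reused.value = i;                  // mutated in place per record
            kept.add(reused);                  // BUG: stores the shared instance
        }
        return kept;
    }

    // Safe variant: defensively copy anything kept beyond the call.
    static List<Record> collectSafely(int n) {
        List<Record> kept = new ArrayList<>();
        Record reused = new Record();
        for (int i = 0; i < n; i++) {
            reused.value = i;
            Record copy = new Record();
            copy.value = reused.value;         // copy before keeping
            kept.add(copy);
        }
        return kept;
    }

    public static void main(String[] args) {
        // The unsafe list aliases one object, so every element shows the
        // last value written; the safe list preserves each record.
        System.out.println(collectUnsafely(3).get(0).value); // prints 2
        System.out.println(collectSafely(3).get(0).value);   // prints 0
    }
}
```

This is the trade-off the issue describes: skipping the defensive copies relieves the garbage collector, at the cost of moving the copy responsibility to any operator that stores records as state.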
[jira] [Comment Edited] (FLINK-10274) The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
[ https://issues.apache.org/jira/browse/FLINK-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598983#comment-16598983 ]

YuFeng Shen edited comment on FLINK-10274 at 8/31/18 4:45 PM:
--
After doing some checking I found that Flink stops its components (JM&) by first reading the pid file and then running kill <pid>. Till Rohrmann mentioned that by default the pid file is written to /tmp with the name flink--.pid, and that you can control the directory by setting the env.pid.dir configuration in flink-conf.yaml to avoid this issue temporarily.

So, by default, should Flink use flink-<*CLUSTERID*>.pid instead? Also, the non-public config item env.pid.dir should be published in the official documentation.

> The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
> ---
> Key: FLINK-10274
> URL: https://issues.apache.org/jira/browse/FLINK-10274
> Project: Flink
> Issue Type: Bug
> Components: Configuration
> Affects Versions: 1.5.1, 1.5.2, 1.5.3, 1.6.0
> Reporter: YuFeng Shen
> Priority: Major
>
> When you prepare to upgrade the Flink framework version using the [shadow copy|https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/upgrading.html#upgrading-the-flink-framework-version] strategy, you need to run multiple clusters concurrently. However, when you want to stop the old-version cluster after upgrading, you will find that stop-cluster.sh doesn't work as expected. The following are the steps to reproduce the issue:
> # There is already a running Flink 1.5.x cluster instance.
> # Install another Flink 1.6.x cluster instance on the same cluster machines.
> # Migrate the jobs from Flink 1.5.x to Flink 1.6.x.
> # Go to the bin dir of the Flink 1.5.x cluster instance and run stop-cluster.sh.
> You expect the old Flink 1.5.x cluster instance to be stopped, right? Unfortunately, the stopped cluster is the newly installed Flink 1.6.x cluster instance!
[jira] [Updated] (FLINK-10274) The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
[ https://issues.apache.org/jira/browse/FLINK-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YuFeng Shen updated FLINK-10274:
Description: revised step 1 of the reproduction steps to "There is already a running Flink 1.5.x cluster instance" (was: "There is a running Flink 1.5.x cluster instance"); the rest of the description is unchanged.
[jira] [Updated] (FLINK-10274) The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
[ https://issues.apache.org/jira/browse/FLINK-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YuFeng Shen updated FLINK-10274:
Description: changed "...work as you expected: the following is the details to duplicate the issue" to "...work as you expected, the following is the steps to duplicate the issue"; the rest of the description is unchanged.
[jira] [Commented] (FLINK-10274) The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
[ https://issues.apache.org/jira/browse/FLINK-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598983#comment-16598983 ]

YuFeng Shen commented on FLINK-10274:
--
After doing some checking I found that Flink stops its components (JM&) by first reading the pid file and then running kill <pid>. Till Rohrmann mentioned that by default the pid file is written to /tmp with the name flink--.pid, and that you can control the directory by setting the env.pid.dir configuration in flink-conf.yaml to avoid this issue temporarily.

By default, should Flink use flink--<*CLUSTERID*>-.pid instead? Also, the non-public config item env.pid.dir should be in the official documentation.
[jira] [Comment Edited] (FLINK-10274) The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
[ https://issues.apache.org/jira/browse/FLINK-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598983#comment-16598983 ] YuFeng Shen edited comment on FLINK-10274 at 8/31/18 4:38 PM: -- After doing some checking I found Flink stop its components(JM&) by reading the pid file firstly then run cmd kill pid, and Till Rohrmann mentioned by default the pid file is written to /tmp and has the name flink--.pid and can control the dir by setting the env.pid.dir configuration in flink-conf.yaml to avoid this issue temporarily. By default should the Flink use the flink-<*CLUSERID*>.pid instead? And also the non-public config item env.pid.dir should be in the official document. was (Author: shenyufeng): After doing some checking I found Flink stop its components(JM&) by reading the pid file firstly then run cmd kill pid, and Till Rohrmann mentioned by default the pid file is written to /tmp and has the name flink--.pid and can control the dir by setting the env.pid.dir configuration in flink-conf.yaml to avoid this issue temporarily. By default should the Flink use the flink--<*CLUSERID*>-.pid instead? And also the non-public config item env.pid.dir should be in the official document. 
> The stop-cluster.sh cannot stop cluster properly when there are multiple > clusters running > - > > Key: FLINK-10274 > URL: https://issues.apache.org/jira/browse/FLINK-10274 > Project: Flink > Issue Type: Bug > Components: Configuration >Affects Versions: 1.5.1, 1.5.2, 1.5.3, 1.6.0 >Reporter: YuFeng Shen >Priority: Major > > When you prepare to upgrade the Flink framework version using the [shadow > copy|https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/upgrading.html#upgrading-the-flink-framework-version] > strategy, you need to run multiple clusters concurrently. However, when you > want to stop the old-version cluster after upgrading, you will find that > stop-cluster.sh does not work as expected. The following steps reproduce the > issue: > # There is a running Flink 1.5.x cluster instance > # Install another Flink 1.6.x cluster instance on the same machines > # Migrate the jobs from Flink 1.5.x to Flink 1.6.x > # Go to the bin directory of the Flink 1.5.x cluster instance and run > stop-cluster.sh > You would expect the old Flink 1.5.x cluster instance to be stopped, right? > Unfortunately, the stopped cluster is the newly installed Flink 1.6.x cluster > instance! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
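The collision described above can be sketched in a few lines of plain Java. This is illustrative only: the file names and pids are invented for the demo, not the exact ones stop-cluster.sh uses; the point is that a default pid file name without a cluster id is shared by every install on the machine.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the pid-file collision (names and pids are illustrative).
// The default name is per user and per service, but NOT per cluster,
// so two installs on the same machine resolve the SAME file.
class PidFileCollision {
    public static void main(String[] args) throws IOException {
        Path pidDir = Files.createTempDirectory("flink-pid-demo");
        Path sharedPidFile = pidDir.resolve("flink-demo-standalonesession.pid");

        Files.writeString(sharedPidFile, "11111"); // cluster A (1.5.x) starts first
        Files.writeString(sharedPidFile, "22222"); // cluster B (1.6.x) overwrites it

        // stop-cluster.sh of EITHER install now reads (and would kill) B's pid:
        System.out.println(Files.readString(sharedPidFile)); // prints 22222

        // A name that includes a cluster id, as suggested above, keeps them apart:
        Files.writeString(pidDir.resolve("flink-demo-clusterA-standalonesession.pid"), "11111");
        Files.writeString(pidDir.resolve("flink-demo-clusterB-standalonesession.pid"), "22222");
    }
}
```

Until the default name changes, setting env.pid.dir to a per-cluster directory achieves the same separation.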
[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped
[ https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598966#comment-16598966 ] Andrey Zagrebin commented on FLINK-9891: [~packet], do you see something like this in the logs when you run the job and cancel it? {code:java} Submitting application master {code} from AbstractYarnClusterDescriptor.startAppMaster. This should show the YARN application id. > Flink cluster is not shutdown in YARN mode when Flink client is stopped > --- > > Key: FLINK-9891 > URL: https://issues.apache.org/jira/browse/FLINK-9891 > Project: Flink > Issue Type: Bug >Affects Versions: 1.5.0, 1.5.1 >Reporter: Sergey Krasovskiy >Assignee: Shuyi Chen >Priority: Major > Labels: pull-request-available > > We are not using session mode and detached mode. The command to run the Flink job > on YARN is: > {code:java} > /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm > 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount > {code} > Flink CLI logs: > {code:java} > Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set. > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > 2018-07-18 12:47:03,747 INFO > org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service > address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/ > 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - > No path for the flink jar passed. 
Using the location of class > org.apache.flink.yarn.YarnClusterDescriptor to locate the jar > 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - > No path for the flink jar passed. Using the location of class > org.apache.flink.yarn.YarnClusterDescriptor to locate the jar > 2018-07-18 12:47:04,248 WARN > org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the > HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink > YARN Client needs one of these to be set to properly load the Hadoop > configuration for accessing YARN. > 2018-07-18 12:47:04,409 INFO > org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: > ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, > numberTaskManagers=1, slotsPerTaskManager=1} > 2018-07-18 12:47:04,783 WARN > org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit > local reads feature cannot be used because libhadoop cannot be loaded. > 2018-07-18 12:47:04,788 WARN > org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration > directory > ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf') > contains both LOG4J and Logback configuration files. Please delete or rename > one of them. > 2018-07-18 12:47:07,846 INFO > org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application > master application_1531474158783_10814 > 2018-07-18 12:47:08,073 INFO > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application > application_1531474158783_10814 > 2018-07-18 12:47:08,074 INFO > org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster > to be allocated > 2018-07-18 12:47:08,076 INFO > org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, > current state ACCEPTED > 2018-07-18 12:47:12,864 INFO > org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has > been deployed successfully. 
> {code} > Job Manager logs: > {code:java} > 2018-07-18 12:47:09,913 INFO > org.apache.flink.runtime.entrypoint.ClusterEntrypoint - > > 2018-07-18 12:47:09,915 INFO > org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting > YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ > 11:51:27 GMT) > ... > {code} > Issues: > # The Flink job is running as a Flink session > # Ctrl+C or 'stop' doesn't stop the job and the YARN cluster > # Cancelling the job via the Job Manager web UI doesn't stop the Flink cluster. To kill the > cluster we need to run: yarn application -kill > We also tried to run a Flink job with 'mode: legacy' and we have the same > issues: > # Add property 'mode: legacy' to
[jira] [Created] (FLINK-10274) The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running
YuFeng Shen created FLINK-10274: --- Summary: The stop-cluster.sh cannot stop cluster properly when there are multiple clusters running Key: FLINK-10274 URL: https://issues.apache.org/jira/browse/FLINK-10274 Project: Flink Issue Type: Bug Components: Configuration Affects Versions: 1.6.0, 1.5.3, 1.5.2, 1.5.1 Reporter: YuFeng Shen When you prepare to upgrade the Flink framework version using the [shadow copy|https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/upgrading.html#upgrading-the-flink-framework-version] strategy, you need to run multiple clusters concurrently. However, when you want to stop the old-version cluster after upgrading, you will find that stop-cluster.sh does not work as expected. The following steps reproduce the issue: # There is a running Flink 1.5.x cluster instance # Install another Flink 1.6.x cluster instance on the same machines # Migrate the jobs from Flink 1.5.x to Flink 1.6.x # Go to the bin directory of the Flink 1.5.x cluster instance and run stop-cluster.sh You would expect the old Flink 1.5.x cluster instance to be stopped, right? Unfortunately, the stopped cluster is the newly installed Flink 1.6.x cluster instance! -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-10273) Access composite type fields after a function
[ https://issues.apache.org/jira/browse/FLINK-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598959#comment-16598959 ] Rong Rong edited comment on FLINK-10273 at 8/31/18 4:13 PM: Sounds good. FLINK-10019 is pretty specific, but the underlying CALCITE-2468 might be more generic. I tried to resolve the issue with a quick fix, but it touches some of the fundamentals of Calcite. Let me know if there is anything I can help with. was (Author: walterddr): sounds good. FLINK-10019 is pretty specific. but the underlying CALCITE-2468 might be more generic. I tried to resolve the issue with a quick fix but it touches some of the fundamentals. Let me know if anything I can help. > Access composite type fields after a function > - > > Key: FLINK-10273 > URL: https://issues.apache.org/jira/browse/FLINK-10273 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: Timo Walther >Priority: Major > > If a function returns a composite type, for example {{Row(lon: Float, lat: > Float)}}, there is currently no way of accessing its fields. > Both queries fail with exceptions: > {code} > select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t > {code} > {code} > select toCoords(12).lat > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10273) Access composite type fields after a function
[ https://issues.apache.org/jira/browse/FLINK-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598959#comment-16598959 ] Rong Rong commented on FLINK-10273: --- Sounds good. FLINK-10019 is pretty specific, but the underlying CALCITE-2468 might be more generic. I tried to resolve the issue with a quick fix, but it touches some of the fundamentals. Let me know if there is anything I can help with. > Access composite type fields after a function > - > > Key: FLINK-10273 > URL: https://issues.apache.org/jira/browse/FLINK-10273 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: Timo Walther >Priority: Major > > If a function returns a composite type, for example {{Row(lon: Float, lat: > Float)}}, there is currently no way of accessing its fields. > Both queries fail with exceptions: > {code} > select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t > {code} > {code} > select toCoords(12).lat > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10074) Allowable number of checkpoint failures
[ https://issues.apache.org/jira/browse/FLINK-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598929#comment-16598929 ] ASF GitHub Bot commented on FLINK-10074: azagrebin commented on a change in pull request #6567: [FLINK-10074] Allowable number of checkpoint failures URL: https://github.com/apache/flink/pull/6567#discussion_r214384046 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/tasks/CheckpointExceptionHandlerConfigurationTest.java ## @@ -75,6 +79,19 @@ public CheckpointExceptionHandler createCheckpointExceptionHandler( } }; + if (failOnException) { + CheckpointExceptionHandler exceptionHandler = + inspectingFactory.createCheckpointExceptionHandler(failOnException, environment); + Assert.assertTrue( + exceptionHandler instanceof CheckpointExceptionHandlerFactory.FailingCheckpointExceptionHandler + ); + + CheckpointExceptionHandlerFactory.FailingCheckpointExceptionHandler actuallyHandler = + (CheckpointExceptionHandlerFactory.FailingCheckpointExceptionHandler) exceptionHandler; + + Assert.assertEquals(3, actuallyHandler.tolerableNumber); Review comment: I would rather make fields of `FailingCheckpointExceptionHandler` private and check `tolerableNumber` the same way as `failTaskOnCheckpointException` in previous declaration of `CheckpointExceptionHandlerFactory` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Allowable number of checkpoint failures > > > Key: FLINK-10074 > URL: https://issues.apache.org/jira/browse/FLINK-10074 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Reporter: Thomas Weise >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > > For intermittent checkpoint failures it is desirable to have a mechanism to > avoid restarts. If, for example, a transient S3 error prevents checkpoint > completion, the next checkpoint may very well succeed. The user may wish to > not incur the expense of restart under such scenario and this could be > expressed with a failure threshold (number of subsequent checkpoint > failures), possibly combined with a list of exceptions to tolerate. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10074) Allowable number of checkpoint failures
[ https://issues.apache.org/jira/browse/FLINK-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598930#comment-16598930 ] ASF GitHub Bot commented on FLINK-10074: azagrebin commented on a change in pull request #6567: [FLINK-10074] Allowable number of checkpoint failures URL: https://github.com/apache/flink/pull/6567#discussion_r214384222 ## File path: flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/tasks/CheckpointExceptionHandlerConfigurationTest.java ## @@ -63,6 +63,10 @@ private void testConfigForwarding(boolean failOnException) throws Exception { environment.setTaskStateManager(new TestTaskStateManager()); environment.getExecutionConfig().setFailTaskOnCheckpointError(expectedHandlerFlag); + if (failOnException) { + environment.getExecutionConfig().setTaskTolerableCheckpointFailuresNumber(3); Review comment: this can be always set, as it should be just ignored if `failOnException` is `false` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Allowable number of checkpoint failures > > > Key: FLINK-10074 > URL: https://issues.apache.org/jira/browse/FLINK-10074 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Reporter: Thomas Weise >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > > For intermittent checkpoint failures it is desirable to have a mechanism to > avoid restarts. If, for example, a transient S3 error prevents checkpoint > completion, the next checkpoint may very well succeed. The user may wish to > not incur the expense of restart under such scenario and this could be > expressed with a failure threshold (number of subsequent checkpoint > failures), possibly combined with a list of exceptions to tolerate. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10074) Allowable number of checkpoint failures
[ https://issues.apache.org/jira/browse/FLINK-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598928#comment-16598928 ] ASF GitHub Bot commented on FLINK-10074: azagrebin commented on a change in pull request #6567: [FLINK-10074] Allowable number of checkpoint failures URL: https://github.com/apache/flink/pull/6567#discussion_r214379864 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/CheckpointExceptionHandlerFactory.java ## @@ -37,7 +37,7 @@ public CheckpointExceptionHandler createCheckpointExceptionHandler( Environment environment) { if (failTaskOnCheckpointException) { - return new FailingCheckpointExceptionHandler(); + return new FailingCheckpointExceptionHandler(environment); Review comment: I think we do not really need `environment` in the constructor of `FailingCheckpointExceptionHandler`, but only `tolerableNumber` which should be passed in `createCheckpointExceptionHandler`, the same way as `failTaskOnCheckpointException`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Allowable number of checkpoint failures > > > Key: FLINK-10074 > URL: https://issues.apache.org/jira/browse/FLINK-10074 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Reporter: Thomas Weise >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > > For intermittent checkpoint failures it is desirable to have a mechanism to > avoid restarts. If, for example, a transient S3 error prevents checkpoint > completion, the next checkpoint may very well succeed. 
The user may wish to > not incur the expense of restart under such scenario and this could be > expressed with a failure threshold (number of subsequent checkpoint > failures), possibly combined with a list of exceptions to tolerate. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10074) Allowable number of checkpoint failures
[ https://issues.apache.org/jira/browse/FLINK-10074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598931#comment-16598931 ] ASF GitHub Bot commented on FLINK-10074: azagrebin commented on a change in pull request #6567: [FLINK-10074] Allowable number of checkpoint failures URL: https://github.com/apache/flink/pull/6567#discussion_r214398085 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/CheckpointExceptionHandlerFactory.java ## @@ -48,12 +48,41 @@ public CheckpointExceptionHandler createCheckpointExceptionHandler( */ static final class FailingCheckpointExceptionHandler implements CheckpointExceptionHandler { + final Environment environment; + final int tolerableNumber; + long latestFailedCheckpointID; + int cpFailureCounter; + + FailingCheckpointExceptionHandler(Environment environment) { + this.environment = environment; + this.cpFailureCounter = 0; + this.tolerableNumber = environment.getExecutionConfig().getTaskTolerableCheckpointFailuresNumber(); + } + @Override public void tryHandleCheckpointException( CheckpointMetaData checkpointMetaData, Exception exception) throws Exception { - throw exception; + if (needThrowCheckpointException(checkpointMetaData)) { + throw exception; + } + } + + private boolean needThrowCheckpointException(CheckpointMetaData checkpointMetaData) { + if (tolerableNumber == 0) { + return true; + } + + if (checkpointMetaData.getCheckpointId() - latestFailedCheckpointID == 1) { Review comment: I think rather than relying on sequential numbering of checkpoints, it is better we add one more signal: `CheckpointExceptionHandler.checkpointSucceeded()` where the counter is reset. This method can be called in `AsyncCheckpointRunnable.run()`, e.g. 
after `reportCompletedSnapshotStates` is done: ``` owner.asynchronousCheckpointExceptionHandler.checkpointSucceeded(); // forward it to synchronousCheckpointExceptionHandler inside ``` The checkpoints finish concurrently, so I think we have to use an `AtomicInteger` for the `cpFailureCounter` and `cpFailureCounter.incrementAndGet()`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Allowable number of checkpoint failures > > > Key: FLINK-10074 > URL: https://issues.apache.org/jira/browse/FLINK-10074 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Reporter: Thomas Weise >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > > For intermittent checkpoint failures it is desirable to have a mechanism to > avoid restarts. If, for example, a transient S3 error prevents checkpoint > completion, the next checkpoint may very well succeed. The user may wish to > not incur the expense of restart under such scenario and this could be > expressed with a failure threshold (number of subsequent checkpoint > failures), possibly combined with a list of exceptions to tolerate. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
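The counting scheme proposed in the review above can be sketched in plain Java. This is an illustrative sketch, not the actual Flink classes: the names are invented, but it shows the suggested shape — an AtomicInteger failure counter (checkpoints can finish concurrently) that a success callback resets, instead of relying on sequential checkpoint ids.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a tolerable-checkpoint-failure counter (illustrative names).
final class TolerantFailureSketch {
    private final int tolerableNumber;
    private final AtomicInteger failureCounter = new AtomicInteger(0);

    TolerantFailureSketch(int tolerableNumber) {
        this.tolerableNumber = tolerableNumber;
    }

    /** Called when a checkpoint fails; returns true if the task should fail. */
    boolean onCheckpointFailure() {
        // With no tolerance configured, every checkpoint failure is fatal.
        if (tolerableNumber == 0) {
            return true;
        }
        // Atomic increment: concurrent async checkpoints may report failures.
        return failureCounter.incrementAndGet() > tolerableNumber;
    }

    /** The extra signal suggested above: a success resets the consecutive count. */
    void onCheckpointSuccess() {
        failureCounter.set(0);
    }

    public static void main(String[] args) {
        TolerantFailureSketch handler = new TolerantFailureSketch(2);
        System.out.println(handler.onCheckpointFailure()); // false (1st consecutive failure)
        System.out.println(handler.onCheckpointFailure()); // false (2nd, still tolerated)
        handler.onCheckpointSuccess();                     // a success resets the count
        System.out.println(handler.onCheckpointFailure()); // false
        System.out.println(handler.onCheckpointFailure()); // false
        System.out.println(handler.onCheckpointFailure()); // true (3rd consecutive failure)
    }
}
```

In the PR, the success signal would be wired from AsyncCheckpointRunnable.run() as the reviewer describes.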
[jira] [Commented] (FLINK-10273) Access composite type fields after a function
[ https://issues.apache.org/jira/browse/FLINK-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598919#comment-16598919 ] Timo Walther commented on FLINK-10273: -- The function returns a proper row type. Registering a view and accessing the fields on the view works without problems. I assume that this is a pure Calcite issue, or that we are not using Calcite correctly. Feel free to dig deeper into this if you find time. I will take a deeper look into FLINK-10019 soon; maybe this issue is just a duplicate of FLINK-10019. > Access composite type fields after a function > - > > Key: FLINK-10273 > URL: https://issues.apache.org/jira/browse/FLINK-10273 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: Timo Walther >Priority: Major > > If a function returns a composite type, for example {{Row(lon: Float, lat: > Float)}}, there is currently no way of accessing its fields. > Both queries fail with exceptions: > {code} > select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t > {code} > {code} > select toCoords(12).lat > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
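The view-based workaround mentioned above can be sketched as follows. This is illustrative only: toCoords is the hypothetical function from the issue description, and how the view is registered (SQL Client CREATE VIEW vs. registering the table in the TableEnvironment) depends on the setup; the point is that field access succeeds once the function result sits behind a registered name.

{code}
-- Illustrative sketch: assumes a registered scalar function
-- toCoords(INT) returning Row(lon: Float, lat: Float).

-- Fails today (this issue): field access directly on the function result.
-- select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t

-- Workaround: put the function call behind a registered view ...
CREATE VIEW coords AS SELECT toCoords(12) AS c;

-- ... then field access on the view works:
SELECT coords.c.lat, coords.c.lon FROM coords;
{code}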
[jira] [Updated] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite types (with nested schema or sub schema) such as generic erasure types {code:java} public String eval(Map mapArg) { /* ... */ } public String eval(Map mapArg) { /* ... */ } public String eval(Row rowArg) { /* ... */ } {code} Update 08/2018 With FLINK-9294 covering some of the generic type erasure. The additional changes needed 1. Modify FunctionCatalog lookup to use SQL ANY type when a higher level type matching is not viable. 2. Introduce additional FunctionCatalog lookup checks to ensure that additional informations provided by type inference is used for validation purpose. was: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as {code:java} public String eval(Map mapArg) { /* ... */ } public String eval(Map mapArg) { /* ... */ } {code} The changes needed here I can think of for now are: 1. Ensure SQL ANY type is used for component/field types for composite TypeInformation with GenericTypeInfo nested fields 2. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. > Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. 
> This ticket is specifically to deal with composite types (with nested schema > or sub schema) such as generic erasure types > {code:java} > public String eval(Map mapArg) { /* ... */ } > public String eval(Map mapArg) { /* ... */ } > public String eval(Row rowArg) { /* ... */ } > {code} > Update 08/2018 > With FLINK-9294 covering some of the generic type erasure, the additional > changes needed are: > 1. Modify FunctionCatalog lookup to use the SQL ANY type when a higher-level type > match is not viable. > 2. Introduce additional FunctionCatalog lookup checks to ensure that the > additional information provided by type inference is used for validation > purposes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598895#comment-16598895 ] Rong Rong commented on FLINK-9501: -- https://github.com/apache/flink/pull/6472 introduced some of the generic type inference functionality by relaxing the search in the FunctionCatalog lookup. Thus the purpose of this JIRA has changed: it should enforce additional validations instead. > Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter object type as the SQL ANY type, > while disallowing the SQL ANY type in result objects. > This ticket is specifically to deal with composite types (with nested schema > or sub schema) such as generic erasure types > {code:java} > public String eval(Map mapArg) { /* ... */ } > public String eval(Map mapArg) { /* ... */ } > public String eval(Row rowArg) { /* ... */ } > {code} > Update 08/2018 > With FLINK-9294 covering some of the generic type erasure, the additional > changes needed are: > 1. Modify FunctionCatalog lookup to use the SQL ANY type when a higher-level type > match is not viable. > 2. Introduce additional FunctionCatalog lookup checks to ensure that the > additional information provided by type inference is used for validation > purposes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite types (with nested schema or sub schema) such as generic erasure types {code:java} public String eval(Map mapArg) { /* ... */ } public String eval(Map mapArg) { /* ... */ } public String eval(Row rowArg) { /* ... */ } {code} Update 08/2018 With FLINK-9294 covering some of the generic type erasure. The additional changes needed are: 1. Modify FunctionCatalog lookup to use SQL ANY type when a higher level type matching is not viable. 2. Introduce additional FunctionCatalog lookup checks to ensure that additional informations provided by type inference is used for validation purpose. was: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite types (with nested schema or sub schema) such as generic erasure types {code:java} public String eval(Map mapArg) { /* ... */ } public String eval(Map mapArg) { /* ... */ } public String eval(Row rowArg) { /* ... */ } {code} Update 08/2018 With FLINK-9294 covering some of the generic type erasure. The additional changes needed 1. Modify FunctionCatalog lookup to use SQL ANY type when a higher level type matching is not viable. 2. Introduce additional FunctionCatalog lookup checks to ensure that additional informations provided by type inference is used for validation purpose. 
> Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. > This ticket is specifically to deal with composite types (with nested schema > or sub schema) such as generic erasure types > {code:java} > public String eval(Map mapArg) { /* ... */ } > public String eval(Map mapArg) { /* ... */ } > public String eval(Row rowArg) { /* ... */ } > {code} > Update 08/2018 > With FLINK-9294 covering some of the generic type erasure. The additional > changes needed are: > 1. Modify FunctionCatalog lookup to use SQL ANY type when a higher level type > matching is not viable. > 2. Introduce additional FunctionCatalog lookup checks to ensure that > additional informations provided by type inference is used for validation > purpose. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
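The erasure problem that motivates FLINK-9501 can be seen in plain Java, independent of Flink. A minimal, self-contained sketch (class and variable names are illustrative, not part of the Flink codebase): at runtime, differently parameterized Map types share a single Class object, so reflection-based catalog lookup cannot distinguish the overloaded eval(Map) signatures shown above and has to fall back to a wildcard (SQL ANY-like) match.

```java
import java.util.HashMap;
import java.util.Map;

public class ErasureDemo {
    public static void main(String[] args) {
        Map<String, Integer> byName = new HashMap<>();
        Map<Long, Long> byId = new HashMap<>();

        // Both parameterizations erase to the same runtime class, so a
        // reflective FunctionCatalog-style lookup cannot tell them apart.
        System.out.println(byName.getClass() == byId.getClass()); // true
        System.out.println(byName.getClass().getName());          // java.util.HashMap
    }
}
```

This is why the ticket proposes treating such parameters as SQL ANY on lookup and pushing the real checking into a separate validation step driven by type inference.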
[jira] [Closed] (FLINK-10270) Delete LegacyRestHandlerAdapter
[ https://issues.apache.org/jira/browse/FLINK-10270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-10270. Resolution: Fixed Fix Version/s: 1.7.0 master: 0eb9a29adc85286c083d1839eae5f78a17ab76d3 > Delete LegacyRestHandlerAdapter > --- > > Key: FLINK-10270 > URL: https://issues.apache.org/jira/browse/FLINK-10270 > Project: Flink > Issue Type: Improvement > Components: REST >Affects Versions: 1.7.0 >Reporter: Gary Yao >Assignee: Gary Yao >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Delete {{org.apache.flink.runtime.rest.handler.LegacyRestHandlerAdapter}} and > {{org.apache.flink.runtime.rest.handler.LegacyRestHandler}} because they were > never used for the purposed described in FLINK-7534. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream
walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214387582 ## File path: docs/dev/table/tableApi.md ## @@ -381,6 +381,36 @@ Table result = orders {% highlight java %} Table orders = tableEnv.scan("Orders"); Table result = orders.distinct(); +{% endhighlight %} +Note: For streaming queries the required state to compute the query result might grow infinitely depending on the number of distinct fields. Please provide a query configuration with valid retention interval to prevent excessive state size. See Streaming Concepts for details. + + + + +Distinct Aggregation +Streaming Review comment: I agree, I added the labels. Regarding adding the sections towards each individual `Aggregation` I wasn't able to find a clean construct since some of the discussions (UDAGG, built-in) are general and it's pretty messy to replicate those 3 different ways. I regenerated the page and it looks pretty obvious since it is within the `aggregation` tab and all necessary information (such as Over aggregate only applies to stream) is pretty much in the same place. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-5315) Support distinct aggregations in table api
[ https://issues.apache.org/jira/browse/FLINK-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598833#comment-16598833 ] ASF GitHub Bot commented on FLINK-5315: --- walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214387582 ## File path: docs/dev/table/tableApi.md ## @@ -381,6 +381,36 @@ Table result = orders {% highlight java %} Table orders = tableEnv.scan("Orders"); Table result = orders.distinct(); +{% endhighlight %} +Note: For streaming queries the required state to compute the query result might grow infinitely depending on the number of distinct fields. Please provide a query configuration with valid retention interval to prevent excessive state size. See Streaming Concepts for details. + + + + +Distinct Aggregation +Streaming Review comment: I agree, I added the labels. Regarding adding the sections towards each individual `Aggregation` I wasn't able to find a clean construct since some of the discussions (UDAGG, built-in) are general and it's pretty messy to replicate those 3 different ways. I regenerated the page and it looks pretty obvious since it is within the `aggregation` tab and all necessary information (such as Over aggregate only applies to stream) is pretty much in the same place. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support distinct aggregations in table api > -- > > Key: FLINK-5315 > URL: https://issues.apache.org/jira/browse/FLINK-5315 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Kurt Young >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Support distinct aggregations in Table API in the following format: > For Expressions: > {code:scala} > 'a.count.distinct // Expressions distinct modifier > {code} > For User-defined Function: > {code:scala} > singleArgUdaggFunc.distinct('a) // FunctionCall distinct modifier > multiArgUdaggFunc.distinct('a, 'b) // FunctionCall distinct modifier > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
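As a conceptual illustration of what the distinct modifier described in the ticket asks the runtime to do — this is plain Java, not the Flink Table API, and the method name is hypothetical — the aggregate is evaluated over the de-duplicated values of the argument column rather than over every row:

```java
import java.util.List;

public class DistinctAggSketch {
    // count(distinct a): aggregate only over de-duplicated column values.
    static long countDistinct(List<Integer> column) {
        return column.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 2, 3, 3, 3);
        System.out.println(countDistinct(a)); // 3, while a plain count is 6
    }
}
```

In the Table API the same effect is expressed as `'a.count.distinct` or `udaggFunc.distinct('a)`, with the planner maintaining the de-duplication state.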
[jira] [Commented] (FLINK-5315) Support distinct aggregations in table api
[ https://issues.apache.org/jira/browse/FLINK-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598827#comment-16598827 ] ASF GitHub Bot commented on FLINK-5315: --- walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214386870 ## File path: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/AggregateITCase.scala ## @@ -40,6 +40,84 @@ class AggregateITCase extends StreamingWithStateTestBase { private val queryConfig = new StreamQueryConfig() queryConfig.withIdleStateRetentionTime(Time.hours(1), Time.hours(2)) + @Test + def testDistinctUDAGG(): Unit = { +val env = StreamExecutionEnvironment.getExecutionEnvironment +env.setStateBackend(getStateBackend) +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.clear + +val testAgg = new DataViewTestAgg +val t = StreamTestData.get5TupleDataStream(env).toTable(tEnv, 'a, 'b, 'c, 'd, 'e) + .groupBy('e) + .select('e, testAgg.distinct('d, 'e)) + +val results = t.toRetractStream[Row](queryConfig) +results.addSink(new StreamITCase.RetractingSink).setParallelism(1) +env.execute() + +val expected = mutable.MutableList("1,10", "2,21", "3,12") +assertEquals(expected.sorted, StreamITCase.retractedResults.sorted) + } + + @Test + def testDistinctUDAGGMixedWithNonDistinctUsage(): Unit = { Review comment: I originally discovered the distinct modifier bug using this test. Can we still keep it? I have found that mixed test cases can sometimes expose potentially hard-to-find bugs. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support distinct aggregations in table api > -- > > Key: FLINK-5315 > URL: https://issues.apache.org/jira/browse/FLINK-5315 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Kurt Young >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Support distinct aggregations in Table API in the following format: > For Expressions: > {code:scala} > 'a.count.distinct // Expressions distinct modifier > {code} > For User-defined Function: > {code:scala} > singleArgUdaggFunc.distinct('a) // FunctionCall distinct modifier > multiArgUdaggFunc.distinct('a, 'b) // FunctionCall distinct modifier > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10115) Content-length limit is also applied to FileUploads
[ https://issues.apache.org/jira/browse/FLINK-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598828#comment-16598828 ] ASF GitHub Bot commented on FLINK-10115: GJL commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads URL: https://github.com/apache/flink/pull/6595#discussion_r214386889 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rest/FileUploadHandler.java ## @@ -134,12 +140,18 @@ protected void channelRead0(final ChannelHandlerContext ctx, final HttpObject ms } if (httpContent instanceof LastHttpContent) { + LOG.trace("Finalizing multipart file upload."); ctx.channel().attr(UPLOADED_FILES).set(new FileUploads(currentUploadDir)); - ctx.fireChannelRead(currentHttpRequest); if (currentJsonPayload != null) { + currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length); Review comment: Ok, good point This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Content-length limit is also applied to FileUploads > --- > > Key: FLINK-10115 > URL: https://issues.apache.org/jira/browse/FLINK-10115 > Project: Flink > Issue Type: Bug > Components: REST, Webfrontend >Affects Versions: 1.6.0 >Reporter: Yazdan Shirvany >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.6.1 > > > Uploading jar files via WEB UI not working. After {{initializing upload...}} > it only shows {{saving...}} and file never shows up on UI to be able to > submit it -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] GJL commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads
GJL commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads URL: https://github.com/apache/flink/pull/6595#discussion_r214386889 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rest/FileUploadHandler.java ## @@ -134,12 +140,18 @@ protected void channelRead0(final ChannelHandlerContext ctx, final HttpObject ms } if (httpContent instanceof LastHttpContent) { + LOG.trace("Finalizing multipart file upload."); ctx.channel().attr(UPLOADED_FILES).set(new FileUploads(currentUploadDir)); - ctx.fireChannelRead(currentHttpRequest); if (currentJsonPayload != null) { + currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length); Review comment: Ok, good point This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream
walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214386870 ## File path: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/AggregateITCase.scala ## @@ -40,6 +40,84 @@ class AggregateITCase extends StreamingWithStateTestBase { private val queryConfig = new StreamQueryConfig() queryConfig.withIdleStateRetentionTime(Time.hours(1), Time.hours(2)) + @Test + def testDistinctUDAGG(): Unit = { +val env = StreamExecutionEnvironment.getExecutionEnvironment +env.setStateBackend(getStateBackend) +val tEnv = TableEnvironment.getTableEnvironment(env) +StreamITCase.clear + +val testAgg = new DataViewTestAgg +val t = StreamTestData.get5TupleDataStream(env).toTable(tEnv, 'a, 'b, 'c, 'd, 'e) + .groupBy('e) + .select('e, testAgg.distinct('d, 'e)) + +val results = t.toRetractStream[Row](queryConfig) +results.addSink(new StreamITCase.RetractingSink).setParallelism(1) +env.execute() + +val expected = mutable.MutableList("1,10", "2,21", "3,12") +assertEquals(expected.sorted, StreamITCase.retractedResults.sorted) + } + + @Test + def testDistinctUDAGGMixedWithNonDistinctUsage(): Unit = { Review comment: I originally discovered the distinct modifier bug using this test. Can we still keep it? I have found that mixed test cases can sometimes expose potentially hard-to-find bugs. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-5315) Support distinct aggregations in table api
[ https://issues.apache.org/jira/browse/FLINK-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598825#comment-16598825 ] ASF GitHub Bot commented on FLINK-5315: --- walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214386474 ## File path: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala ## @@ -214,10 +214,15 @@ trait ImplicitExpressionOperations { def varSamp = VarSamp(expr) /** -* Returns multiset aggregate of a given expression. +* Returns multiset aggregate of a given expression. */ def collect = Collect(expr) + /** +* Returns a distinct field reference to a given expression +*/ + def distinct = DistinctAgg(expr) Review comment: I was not sure if this is what you had in mind. please take another look. thx This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support distinct aggregations in table api > -- > > Key: FLINK-5315 > URL: https://issues.apache.org/jira/browse/FLINK-5315 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Kurt Young >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Support distinct aggregations in Table API in the following format: > For Expressions: > {code:scala} > 'a.count.distinct // Expressions distinct modifier > {code} > For User-defined Function: > {code:scala} > singleArgUdaggFunc.distinct('a) // FunctionCall distinct modifier > multiArgUdaggFunc.distinct('a, 'b) // FunctionCall distinct modifier > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream
walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214386474 ## File path: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala ## @@ -214,10 +214,15 @@ trait ImplicitExpressionOperations { def varSamp = VarSamp(expr) /** -* Returns multiset aggregate of a given expression. +* Returns multiset aggregate of a given expression. */ def collect = Collect(expr) + /** +* Returns a distinct field reference to a given expression +*/ + def distinct = DistinctAgg(expr) Review comment: I was not sure if this is what you had in mind. please take another look. thx This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream
walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214386228 ## File path: docs/dev/table/udfs.md ## @@ -650,6 +650,36 @@ tEnv.sqlQuery("SELECT user, wAvg(points, level) AS avgPoints FROM userScores GRO +User-defined aggregation function can be used with `distinct` modifiers. To calculate the aggregate results only for distinct values, simply add the distinct modifier towards the aggregation function. Review comment: Done. yes I agree. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-5315) Support distinct aggregations in table api
[ https://issues.apache.org/jira/browse/FLINK-5315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598824#comment-16598824 ] ASF GitHub Bot commented on FLINK-5315: --- walterddr commented on a change in pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream URL: https://github.com/apache/flink/pull/6521#discussion_r214386228 ## File path: docs/dev/table/udfs.md ## @@ -650,6 +650,36 @@ tEnv.sqlQuery("SELECT user, wAvg(points, level) AS avgPoints FROM userScores GRO +User-defined aggregation function can be used with `distinct` modifiers. To calculate the aggregate results only for distinct values, simply add the distinct modifier towards the aggregation function. Review comment: Done. yes I agree. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support distinct aggregations in table api > -- > > Key: FLINK-5315 > URL: https://issues.apache.org/jira/browse/FLINK-5315 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Kurt Young >Assignee: Rong Rong >Priority: Major > Labels: pull-request-available > > Support distinct aggregations in Table API in the following format: > For Expressions: > {code:scala} > 'a.count.distinct // Expressions distinct modifier > {code} > For User-defined Function: > {code:scala} > singleArgUdaggFunc.distinct('a) // FunctionCall distinct modifier > multiArgUdaggFunc.distinct('a, 'b) // FunctionCall distinct modifier > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10115) Content-length limit is also applied to FileUploads
[ https://issues.apache.org/jira/browse/FLINK-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598823#comment-16598823 ] ASF GitHub Bot commented on FLINK-10115: zentol commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads URL: https://github.com/apache/flink/pull/6595#discussion_r214386020 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rest/FileUploadHandler.java ## @@ -134,12 +140,18 @@ protected void channelRead0(final ChannelHandlerContext ctx, final HttpObject ms } if (httpContent instanceof LastHttpContent) { + LOG.trace("Finalizing multipart file upload."); ctx.channel().attr(UPLOADED_FILES).set(new FileUploads(currentUploadDir)); - ctx.fireChannelRead(currentHttpRequest); if (currentJsonPayload != null) { + currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length); Review comment: to keep the code identical between 1.5 and 1.6. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Content-length limit is also applied to FileUploads > --- > > Key: FLINK-10115 > URL: https://issues.apache.org/jira/browse/FLINK-10115 > Project: Flink > Issue Type: Bug > Components: REST, Webfrontend >Affects Versions: 1.6.0 >Reporter: Yazdan Shirvany >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.6.1 > > > Uploading jar files via WEB UI not working. After {{initializing upload...}} > it only shows {{saving...}} and file never shows up on UI to be able to > submit it -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] zentol commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads
zentol commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads URL: https://github.com/apache/flink/pull/6595#discussion_r214386020 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rest/FileUploadHandler.java ## @@ -134,12 +140,18 @@ protected void channelRead0(final ChannelHandlerContext ctx, final HttpObject ms } if (httpContent instanceof LastHttpContent) { + LOG.trace("Finalizing multipart file upload."); ctx.channel().attr(UPLOADED_FILES).set(new FileUploads(currentUploadDir)); - ctx.fireChannelRead(currentHttpRequest); if (currentJsonPayload != null) { + currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length); Review comment: to keep the code identical between 1.5 and 1.6. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-10273) Access composite type fields after a function
[ https://issues.apache.org/jira/browse/FLINK-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598786#comment-16598786 ] Rong Rong commented on FLINK-10273: --- [~twalthr] Yes, I remember there are cases where a resulting composite type cannot be further combined with composite type operations. I think one of the reasons is that the resulting type is a {{GenericType}} instead of a specific {{RowTypeInfo}}, which I had some hard time dealing with in FLINK-9294. Another thing that might be related is FLINK-10019, where there are some issues with Calcite when trying to infer the type of a {{Struct Type}}. I can dig deeper into this :-) > Access composite type fields after a function > - > > Key: FLINK-10273 > URL: https://issues.apache.org/jira/browse/FLINK-10273 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: Timo Walther >Priority: Major > > If a function returns a composite type, for example {{Row(lon: Float, lat: > Float)}}, there is currently no way of accessing its fields. > Both queries fail with exceptions: > {code} > select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t > {code} > {code} > select toCoords(12).lat > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-10261) INSERT INTO does not work with ORDER BY clause
[ https://issues.apache.org/jira/browse/FLINK-10261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xueyu reassigned FLINK-10261: - Assignee: xueyu > INSERT INTO does not work with ORDER BY clause > -- > > Key: FLINK-10261 > URL: https://issues.apache.org/jira/browse/FLINK-10261 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Reporter: Timo Walther >Assignee: xueyu >Priority: Major > > It seems that INSERT INTO and ORDER BY do not work well together. > An AssertionError is thrown and the ORDER BY clause is duplicated. I guess > this is a Calcite issue. > Example: > {code} > @Test > def testInsertIntoMemoryTable(): Unit = { > val env = StreamExecutionEnvironment.getExecutionEnvironment > env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) > val tEnv = TableEnvironment.getTableEnvironment(env) > MemoryTableSourceSinkUtil.clear() > val t = StreamTestData.getSmall3TupleDataStream(env) > .assignAscendingTimestamps(x => x._2) > .toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime) > tEnv.registerTable("sourceTable", t) > val fieldNames = Array("d", "e", "f", "t") > val fieldTypes = Array(Types.INT, Types.LONG, Types.STRING, > Types.SQL_TIMESTAMP) > .asInstanceOf[Array[TypeInformation[_]]] > val sink = new MemoryTableSourceSinkUtil.UnsafeMemoryAppendTableSink > tEnv.registerTableSink("targetTable", fieldNames, fieldTypes, sink) > val sql = "INSERT INTO targetTable SELECT a, b, c, rowtime FROM > sourceTable ORDER BY a" > tEnv.sqlUpdate(sql) > env.execute() > {code} > Error: > {code} > java.lang.AssertionError: not a query: SELECT `sourceTable`.`a`, > `sourceTable`.`b`, `sourceTable`.`c`, `sourceTable`.`rowtime` > FROM `sourceTable` AS `sourceTable` > ORDER BY `a` > ORDER BY `a` > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3069) > at > org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:557) > at > 
org.apache.flink.table.calcite.FlinkPlannerImpl.rel(FlinkPlannerImpl.scala:104) > at > org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:717) > at > org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:683) > at > org.apache.flink.table.runtime.stream.sql.SqlITCase.testInsertIntoMemoryTable(SqlITCase.scala:735) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10206) Add hbase sink connector
[ https://issues.apache.org/jira/browse/FLINK-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598754#comment-16598754 ] Shimin Yang commented on FLINK-10206: - Hi [~hequn8128] , I am planning to implement the table sinks for append and retract mode, but I think I should finish the datastream sink first since the table sink relies on the datastream sink. BTW, should I just add a link to the design document in the PR or propose a FLIP? > Add hbase sink connector > > > Key: FLINK-10206 > URL: https://issues.apache.org/jira/browse/FLINK-10206 > Project: Flink > Issue Type: New Feature > Components: Streaming Connectors >Affects Versions: 1.6.0 >Reporter: Igloo >Assignee: Shimin Yang >Priority: Major > Original Estimate: 336h > Remaining Estimate: 336h > > Now, there is an hbase source connector for batch operation. > > In some cases, we need to save streaming/batch results into hbase, just like > the cassandra streaming/batch sink implementations. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10206) Add hbase sink connector
[ https://issues.apache.org/jira/browse/FLINK-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598750#comment-16598750 ] Shimin Yang commented on FLINK-10206: - Thank you for your advice. I will briefly talk about the consistency and performance problems here and work on the design document asap. There's an option in the Table Builder named enable buffer; this will buffer the operations and flush them into hbase if the buffer is full. During the snapshot, the hbase sink will flush all the buffered operations in case of failure. In general, it can provide an at-least-once guarantee. > Add hbase sink connector > > > Key: FLINK-10206 > URL: https://issues.apache.org/jira/browse/FLINK-10206 > Project: Flink > Issue Type: New Feature > Components: Streaming Connectors >Affects Versions: 1.6.0 >Reporter: Igloo >Assignee: Shimin Yang >Priority: Major > Original Estimate: 336h > Remaining Estimate: 336h > > Now, there is an hbase source connector for batch operation. > > In some cases, we need to save streaming/batch results into hbase, just like > the cassandra streaming/batch sink implementations. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
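The buffering scheme described in the comment above can be sketched in a few lines of plain Java. This is a conceptual sketch only — BufferedSink, invoke, and snapshotState are illustrative names, not the actual connector API: mutations accumulate in a bounded buffer, and the buffer is flushed both when it fills and on every checkpoint, which is what yields the at-least-once (but not exactly-once) guarantee.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferedSink {
    private final int capacity;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> written = new ArrayList<>(); // stands in for HBase

    BufferedSink(int capacity) { this.capacity = capacity; }

    // Called per record: buffer the mutation, flush when the buffer is full.
    void invoke(String mutation) {
        buffer.add(mutation);
        if (buffer.size() >= capacity) {
            flush();
        }
    }

    // Called on every checkpoint/snapshot so no buffered mutation can be
    // lost on failure; a replay after recovery may rewrite some mutations,
    // hence at-least-once rather than exactly-once.
    void snapshotState() { flush(); }

    private void flush() {
        written.addAll(buffer);
        buffer.clear();
    }

    List<String> written() { return written; }
}
```

The design trade-off is the usual one: a larger buffer improves throughput against HBase, while the checkpoint-time flush bounds how much un-persisted data a failure can leave behind.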
[GitHub] Clarkkkkk commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc
Clarkkkkk commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc URL: https://github.com/apache/flink/pull/6639#issuecomment-417667923 @fhueske @twalthr Yep, I'm using kafka 0.10. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-10208) Bump mockito to 2.0+
[ https://issues.apache.org/jira/browse/FLINK-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598730#comment-16598730 ] ASF GitHub Bot commented on FLINK-10208: TisonKun commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5 URL: https://github.com/apache/flink/pull/6634#discussion_r214349501 ## File path: flink-clients/src/test/java/org/apache/flink/client/cli/CliFrontendSavepointTest.java ## @@ -315,7 +315,7 @@ private static void restoreStdOutAndStdErr() { private static ClusterClient createClusterClient(String expectedResponse) throws Exception { final ClusterClient clusterClient = mock(ClusterClient.class); - when(clusterClient.triggerSavepoint(any(JobID.class), anyString())) + when(clusterClient.triggerSavepoint(any(JobID.class), nullable(String.class))) Review comment: @zentol What is the preference between `nullable()` and `any()`? Inconsistent on this, take a look at `HadoopOutputFormatTest.java` where change `any(FileSystem.class)` to `nullable(FileSystem.class)` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Bump mockito to 2.0+ > > > Key: FLINK-10208 > URL: https://issues.apache.org/jira/browse/FLINK-10208 > Project: Flink > Issue Type: Sub-task > Components: Build System, Tests >Affects Versions: 1.7.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Mockito only properly supports java 9 with version 2. We have to bump the > dependency and fix various API incompatibilities. > Additionally we could investigate whether we still need powermock after > bumping the dependency (which we'd also have to bump otherwise). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] TisonKun commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5
TisonKun commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5 URL: https://github.com/apache/flink/pull/6634#discussion_r214349501 ## File path: flink-clients/src/test/java/org/apache/flink/client/cli/CliFrontendSavepointTest.java ## @@ -315,7 +315,7 @@ private static void restoreStdOutAndStdErr() { private static ClusterClient createClusterClient(String expectedResponse) throws Exception { final ClusterClient clusterClient = mock(ClusterClient.class); - when(clusterClient.triggerSavepoint(any(JobID.class), anyString())) + when(clusterClient.triggerSavepoint(any(JobID.class), nullable(String.class))) Review comment: @zentol Which do we prefer, `nullable()` or `any()`? This PR is inconsistent on this point; take a look at `HadoopOutputFormatTest.java`, where `any(FileSystem.class)` was changed to `nullable(FileSystem.class)`. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
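TisonKun's question stems from a real behavior change: in Mockito 2, `any(SomeClass.class)` performs a type check and no longer matches `null` arguments, while `nullable(SomeClass.class)` matches both `null` and instances of the type. The following self-contained sketch uses hand-rolled predicates (assumptions for illustration, not Mockito's actual matcher implementation) to show the semantics:

```java
import java.util.function.Predicate;

public class MatcherSemanticsDemo {

    // Stand-in for Mockito 2's any(T.class): type-checked, so null no longer matches.
    static Predicate<Object> anyOf(Class<?> type) {
        return arg -> type.isInstance(arg); // Class.isInstance(null) is false
    }

    // Stand-in for nullable(T.class): matches null or an instance of the type.
    static Predicate<Object> nullableOf(Class<?> type) {
        return arg -> arg == null || type.isInstance(arg);
    }

    public static void main(String[] args) {
        System.out.println(anyOf(String.class).test(null));            // false
        System.out.println(nullableOf(String.class).test(null));       // true
        System.out.println(anyOf(String.class).test("savepointPath")); // true
    }
}
```

This is why a stub written with `anyString()` stops matching once the test passes a `null` savepoint directory: switching that argument to `nullable(String.class)` restores the Mockito 1 matching behavior.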
[jira] [Commented] (FLINK-10115) Content-length limit is also applied to FileUploads
[ https://issues.apache.org/jira/browse/FLINK-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598721#comment-16598721 ] ASF GitHub Bot commented on FLINK-10115: GJL commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads URL: https://github.com/apache/flink/pull/6595#discussion_r214305773 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rest/FileUploadHandler.java ## @@ -134,12 +140,18 @@ protected void channelRead0(final ChannelHandlerContext ctx, final HttpObject ms } if (httpContent instanceof LastHttpContent) { + LOG.trace("Finalizing multipart file upload."); ctx.channel().attr(UPLOADED_FILES).set(new FileUploads(currentUploadDir)); - ctx.fireChannelRead(currentHttpRequest); if (currentJsonPayload != null) { + currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length); Review comment: Why don't we use `HttpHeaderNames.CONTENT_LENGTH` here? `HttpHeaders.Names` is deprecated. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Content-length limit is also applied to FileUploads > --- > > Key: FLINK-10115 > URL: https://issues.apache.org/jira/browse/FLINK-10115 > Project: Flink > Issue Type: Bug > Components: REST, Webfrontend >Affects Versions: 1.6.0 >Reporter: Yazdan Shirvany >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.6.1 > > > Uploading jar files via the web UI is not working. After {{initializing upload...}} > it only shows {{saving...}} and the file never shows up in the UI, so it > cannot be submitted -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] GJL commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads
GJL commented on a change in pull request #6595: [FLINK-10115][rest] Ignore content-length limit for file uploads URL: https://github.com/apache/flink/pull/6595#discussion_r214305773 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/rest/FileUploadHandler.java ## @@ -134,12 +140,18 @@ protected void channelRead0(final ChannelHandlerContext ctx, final HttpObject ms } if (httpContent instanceof LastHttpContent) { + LOG.trace("Finalizing multipart file upload."); ctx.channel().attr(UPLOADED_FILES).set(new FileUploads(currentUploadDir)); - ctx.fireChannelRead(currentHttpRequest); if (currentJsonPayload != null) { + currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length); Review comment: Why don't we use `HttpHeaderNames.CONTENT_LENGTH` here? `HttpHeaders.Names` is deprecated. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
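The change GJL is asking for would amount to the following one-line swap (assuming Netty 4.1, where `io.netty.handler.codec.http.HttpHeaderNames` is the non-deprecated replacement for the `HttpHeaders.Names` constants):

```diff
- currentHttpRequest.headers().set(HttpHeaders.Names.CONTENT_LENGTH, currentJsonPayload.length);
+ currentHttpRequest.headers().set(HttpHeaderNames.CONTENT_LENGTH, currentJsonPayload.length);
```

Both constants hold the same header name ("content-length"), so this is purely an API-hygiene change, not a behavioral one.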
[jira] [Commented] (FLINK-10267) [State] Fix arbitrary iterator access on RocksDBMapIterator
[ https://issues.apache.org/jira/browse/FLINK-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598705#comment-16598705 ] ASF GitHub Bot commented on FLINK-10267: azagrebin commented on a change in pull request #6638: [FLINK-10267][State] Fix arbitrary iterator access on RocksDBMapIterator URL: https://github.com/apache/flink/pull/6638#discussion_r214345264 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/state/StateBackendTestBase.java ## @@ -2913,7 +2913,21 @@ public void testMapState() throws Exception { assertEquals(new HashMap() {{ put(103, "103"); put(1031, "1031"); put(1032, "1032"); }}, getSerializedMap(restoredKvState2, "3", keySerializer, VoidNamespace.INSTANCE, namespaceSerializer, userKeySerializer, userValueSerializer)); - backend.dispose(); + // [FLINK-10267] validate arbitrary iterator access not throwing IllegalStateException Review comment: The test cases are already quite huge; can we create a separate one for this behaviour? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [State] Fix arbitrary iterator access on RocksDBMapIterator > --- > > Key: FLINK-10267 > URL: https://issues.apache.org/jira/browse/FLINK-10267 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.5.3, 1.6.0 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Major > Labels: pull-request-available > Fix For: 1.6.1, 1.5.4 > > > Currently, RocksDBMapIterator loads 128 entries into the local cacheEntries > whenever needed. Both RocksDBMapIterator#next() and > RocksDBMapIterator#hasNext() may trigger loading RocksDBEntry objects into > cacheEntries. 
> However, if the iterator's size is larger than 128 and we continue to access the > iterator in the following order: hasNext() -> next() -> hasNext() -> remove(), > we hit an unexpected exception when trying to remove the 128th element: > {code:java} > java.lang.IllegalStateException: The remove operation must be called after a > valid next operation. > {code} > Since we cannot control how users access the iterator, we should fix this bug > to avoid the unexpected exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10267) [State] Fix arbitrary iterator access on RocksDBMapIterator
[ https://issues.apache.org/jira/browse/FLINK-10267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598706#comment-16598706 ] ASF GitHub Bot commented on FLINK-10267: azagrebin commented on a change in pull request #6638: [FLINK-10267][State] Fix arbitrary iterator access on RocksDBMapIterator URL: https://github.com/apache/flink/pull/6638#discussion_r214344987 ## File path: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java ## @@ -595,6 +595,8 @@ private void loadCache() { */ if (lastEntry != null && !lastEntry.deleted) { iterator.next(); + cacheEntries.add(lastEntry); + cacheIndex = 1; Review comment: This could also be just:

```java
if (lastEntry != null && !lastEntry.deleted) {
    cacheIndex = 1;
}
```

This should work in general, but I wonder if we could make it cleaner. Although remove() is not supposed to be called twice for the same next(), if that happens and `lastEntry.deleted` is true, the problem remains. What if we always cache the previously returned value in `nextEntry()` as a class field:

```java
previousEntry = cacheEntries.get(cacheIndex);
```

then remove() could be:

```java
@Override
public void remove() {
    if (previousEntry == null) {
        throw new IllegalStateException("The remove operation must be called after a valid next operation.");
    }
    if (!previousEntry.deleted) {
        previousEntry.remove();
    }
}
```

What do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > [State] Fix arbitrary iterator access on RocksDBMapIterator > --- > > Key: FLINK-10267 > URL: https://issues.apache.org/jira/browse/FLINK-10267 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing >Affects Versions: 1.5.3, 1.6.0 >Reporter: Yun Tang >Assignee: Yun Tang >Priority: Major > Labels: pull-request-available > Fix For: 1.6.1, 1.5.4 > > > Currently, RocksDBMapIterator loads 128 entries into the local cacheEntries > whenever needed. Both RocksDBMapIterator#next() and > RocksDBMapIterator#hasNext() may trigger loading RocksDBEntry objects into > cacheEntries. > However, if the iterator's size is larger than 128 and we continue to access the > iterator in the following order: hasNext() -> next() -> hasNext() -> remove(), > we hit an unexpected exception when trying to remove the 128th element: > {code:java} > java.lang.IllegalStateException: The remove operation must be called after a > valid next operation. > {code} > Since we cannot control how users access the iterator, we should fix this bug > to avoid the unexpected exception. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] azagrebin commented on a change in pull request #6638: [FLINK-10267][State] Fix arbitrary iterator access on RocksDBMapIterator
azagrebin commented on a change in pull request #6638: [FLINK-10267][State] Fix arbitrary iterator access on RocksDBMapIterator URL: https://github.com/apache/flink/pull/6638#discussion_r214345264 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/state/StateBackendTestBase.java ## @@ -2913,7 +2913,21 @@ public void testMapState() throws Exception { assertEquals(new HashMap() {{ put(103, "103"); put(1031, "1031"); put(1032, "1032"); }}, getSerializedMap(restoredKvState2, "3", keySerializer, VoidNamespace.INSTANCE, namespaceSerializer, userKeySerializer, userValueSerializer)); - backend.dispose(); + // [FLINK-10267] validate arbitrary iterator access not throwing IllegalStateException Review comment: The test cases are already quite huge, can we create a separate one for this behaviour? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] azagrebin commented on a change in pull request #6638: [FLINK-10267][State] Fix arbitrary iterator access on RocksDBMapIterator
azagrebin commented on a change in pull request #6638: [FLINK-10267][State] Fix arbitrary iterator access on RocksDBMapIterator URL: https://github.com/apache/flink/pull/6638#discussion_r214344987 ## File path: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBMapState.java ## @@ -595,6 +595,8 @@ private void loadCache() { */ if (lastEntry != null && !lastEntry.deleted) { iterator.next(); + cacheEntries.add(lastEntry); + cacheIndex = 1; Review comment: This could also be just:

```java
if (lastEntry != null && !lastEntry.deleted) {
    cacheIndex = 1;
}
```

This should work in general, but I wonder if we could make it cleaner. Although remove() is not supposed to be called twice for the same next(), if that happens and `lastEntry.deleted` is true, the problem remains. What if we always cache the previously returned value in `nextEntry()` as a class field:

```java
previousEntry = cacheEntries.get(cacheIndex);
```

then remove() could be:

```java
@Override
public void remove() {
    if (previousEntry == null) {
        throw new IllegalStateException("The remove operation must be called after a valid next operation.");
    }
    if (!previousEntry.deleted) {
        previousEntry.remove();
    }
}
```

What do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
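The contract under discussion is the standard `java.util.Iterator` one: `remove()` is valid exactly once after each `next()`, no matter how many `hasNext()` calls happen in between. The sketch below is a simplified, self-contained model of the approach azagrebin proposes (it is not Flink's actual RocksDBMapIterator, and the list-backed "cache" is an assumption for illustration): `next()` remembers the entry it returned in a field, so a later `remove()` stays valid even if a `hasNext()` call reloaded the cache in the meantime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class CachedIteratorDemo {

    static class CachedIterator implements Iterator<String> {
        private final List<String> data;
        private int pos = 0;
        private Integer previousIndex = null; // entry returned by the last next()

        CachedIterator(List<String> data) {
            this.data = data;
        }

        @Override
        public boolean hasNext() {
            return pos < data.size(); // never invalidates a pending remove()
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            previousIndex = pos;
            return data.get(pos++);
        }

        @Override
        public void remove() {
            if (previousIndex == null) {
                throw new IllegalStateException(
                    "The remove operation must be called after a valid next operation.");
            }
            data.remove((int) previousIndex);
            pos = previousIndex;   // shift the cursor back over the removed element
            previousIndex = null;  // a second remove() without next() is invalid
        }
    }

    // Runs the access order from the ticket: hasNext() -> next() -> hasNext() -> remove().
    static List<String> runScenario() {
        List<String> data = new ArrayList<>(Arrays.asList("a", "b", "c"));
        CachedIterator it = new CachedIterator(data);
        it.hasNext();
        it.next();    // returns "a"
        it.hasNext(); // must not break the pending remove()
        it.remove();  // removes "a"
        return data;
    }

    public static void main(String[] args) {
        System.out.println(runScenario()); // [b, c]
    }
}
```

The key design point, mirroring the review suggestion, is that the validity of `remove()` is tracked by a dedicated field set in `next()` rather than inferred from the current cache position, which is what broke at the 128-entry cache boundary.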
[GitHub] fhueske commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc
fhueske commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc URL: https://github.com/apache/flink/pull/6639#issuecomment-417653372 If it is a trailing `0` issue, we should add the quotes, IMO. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] twalthr commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc
twalthr commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc URL: https://github.com/apache/flink/pull/6639#issuecomment-417652040 No, this change is not necessary. `0.11` is correctly converted into the string "0.11". Only trailing zeros cause issues, such as `0.10`, which is translated into "0.1". @Clark, can you validate this change again? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
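The trailing-zero effect twalthr describes can be reproduced with plain Java. This sketches only the numeric round trip, not the SQL client's actual YAML handling: the assumption is that an unquoted scalar like `0.10` ends up parsed as a double and later rendered back to a string.

```java
public class TrailingZeroDemo {
    public static void main(String[] args) {
        // 0.11 survives the double -> string round trip unchanged...
        System.out.println(String.valueOf(0.11)); // prints 0.11
        // ...but 0.10 is the same double as 0.1, so the trailing zero is lost.
        System.out.println(String.valueOf(0.10)); // prints 0.1
    }
}
```

Quoting the value in the YAML file ("0.10") keeps it a string end to end and sidesteps the problem, which is fhueske's argument for adding the quotes.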
[GitHub] fhueske commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc
fhueske commented on issue #6639: [hotfix][doc][sql-client] Modify typo in sql client doc URL: https://github.com/apache/flink/pull/6639#issuecomment-417650852 That's weird. I have a working configuration without quotes. @twalthr, can you have a look at this fix? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-8819) Rework travis script to use build stages
[ https://issues.apache.org/jira/browse/FLINK-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-8819: -- Labels: pull-request-available (was: ) > Rework travis script to use build stages > > > Key: FLINK-8819 > URL: https://issues.apache.org/jira/browse/FLINK-8819 > Project: Flink > Issue Type: Sub-task > Components: Build System, Travis >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Trivial > Labels: pull-request-available > > This issue is for tracking efforts to rework our Travis scripts to use > [stages|https://docs.travis-ci.com/user/build-stages/]. > This feature allows us to define a sequence of jobs that are run one after > another. This implies that we can define dependencies between jobs, in > contrast to our existing jobs, which have to be self-contained. > As an example, we could have a compile stage, and a test stage with multiple > jobs. > The main benefit here is that we no longer have to compile modules multiple > times, which would reduce our build times. > The major issue, however, is that there is no _proper_ support for passing > build artifacts from one stage to the next. According to this > [issue|https://github.com/travis-ci/beta-features/issues/28] it is on their > to-do list, however. > In the meantime, we could manually transfer the artifacts between stages by > either using the Travis cache or some other external storage. The cache > solution would work by setting up a cached directory (just like the mvn > cache) and creating build-scoped directories within it that contain the artifacts > (I have a prototype that works like this). > The major concern here is that of cleaning up the cache/storage. > We can clean things up if > * our script fails > * the last stage succeeds. 
> We can *not* clean things up if > * the build is canceled > * travis fails the build due to a timeout or similar > as apparently there is [no way to run a script at the end of a > build|https://github.com/travis-ci/travis-ci/issues/4221]. > Thus we would either have to periodically clear the cache, or encode more > information into the cached files that would allow _other_ builds to clean up > stale data (for example, the build number or date). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8819) Rework travis script to use build stages
[ https://issues.apache.org/jira/browse/FLINK-8819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598680#comment-16598680 ] ASF GitHub Bot commented on FLINK-8819: --- zentol opened a new pull request #6642: [FLINK-8819][travis] Rework travis script to use stages URL: https://github.com/apache/flink/pull/6642 ## What is the purpose of the change This PR reworks the travis scripts to use stages. Stages allow jobs to be organized in sequential steps, in contrast to the current approach of all jobs running in parallel. This allows jobs to depend on each other, with the obvious use-case of separating code compilation and test execution. A subsequent stage is only executed if the previous stage has completed successfully, meaning that all jobs in the stage completed successfully. In other words, if checkstyle fails, no tests are executed, so be mindful of that. The benefit here really is that we no longer compile (parts of) Flink in each profile, and move part of the compilation overhead into a separate profile. We don't decrease the total runtime due to added overhead (upload/download of cache), but the individual builds are faster, and more manageable in the long term. An example build can be seen here: https://travis-ci.org/zentol/flink/builds/422925766 ## High-level overview The new scripts define 3 stages: Compile, Test and Cleanup. In the compile stage we compile Flink and run QA checks like checkstyle. The compiled Flink project is placed into the travis cache to make it accessible to subsequent builds. The test stage consists of 5 jobs based on our existing test splitting (core, libs, connectors, tests, misc). These builds retrieve the compiled Flink version from the cache, install it into the local repository and subsequently run the tests. The cleanup job deletes the compiled Flink artifact from the cache. This step isn't strictly necessary, but still nice to have. 
Some additional small refactorings have been made to separate `travis_mvn_watchdog.sh` into individual parts, which we can build on in the future. ## Low-level details ### Caching The downside of stages is that there is no easy-to-use way to pass on build artifacts. The caching approach _works_ but has the caveat that builds have to share the same cache. The travis cache is only shared between builds if the build configurations are identical; most notably, they can't call different scripts or have different environment variables. As a workaround we map the `TRAVIS_JOB_NUMBER` to a specific stage. (If you look at the build linked in the PR, `4583.1` would be the value I'm talking about). The order of jobs is deterministic, so for example we always know that `1-2` belong to the compile stage, with `2` always being configured for the legacy codebase. ### travis_controller All stage-related logic is handled by the `travis_controller` script. In short: * it determines where we are in the build process based on `TRAVIS_JOB_NUMBER` * if in compile step * remove existing cached flink versions (fail-safe cleanup to prevent cache from growing larger over time) * compile Flink and do QA checks (shading, dependency convergence, checkstyle etc.) * copy flink to cache location * drop unnecessary files (like original jars) from compiled version * if in test step * fetch flink from cache * update all timestamps to prevent compiler plugins from recompiling classes * execute `travis_mvn_watchdog.sh` * if in cleanup step * well, cleanup stuff ### travis_mvn_watchdog Despite the above changes `travis_mvn_watchdog.sh` works pretty much like it did before. It first `install`s Flink (except now without `clean`, as this would remove already compiled classes) and then runs `mvn verify`. This has the downside that we still package jars twice, which actually takes a while. 
We could skip this in theory by directly invoking the `surefire` plugin, but various issues in our build/tests prevent this from working at the moment, and I don't want to delay this change further. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Rework travis script to use build stages > > > Key: FLINK-8819 > URL: https://issues.apache.org/jira/browse/FLINK-8819 > Project: Flink > Issue Type: Sub-task > Components: Build System, Travis >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority:
[GitHub] zentol opened a new pull request #6642: [FLINK-8819][travis] Rework travis script to use stages
zentol opened a new pull request #6642: [FLINK-8819][travis] Rework travis script to use stages URL: https://github.com/apache/flink/pull/6642 ## What is the purpose of the change This PR reworks the travis scripts to use stages. Stages allow jobs to be organized in sequential steps, in contrast to the current approach of all jobs running in parallel. This allows jobs to depend on each other, with the obvious use-case of separating code compilation and test execution. A subsequent stage is only executed if the previous stage has completed successfully, meaning that all jobs in the stage completed successfully. In other words, if checkstyle fails, no tests are executed, so be mindful of that. The benefit here really is that we no longer compile (parts of) Flink in each profile, and move part of the compilation overhead into a separate profile. We don't decrease the total runtime due to added overhead (upload/download of cache), but the individual builds are faster, and more manageable in the long term. An example build can be seen here: https://travis-ci.org/zentol/flink/builds/422925766 ## High-level overview The new scripts define 3 stages: Compile, Test and Cleanup. In the compile stage we compile Flink and run QA checks like checkstyle. The compiled Flink project is placed into the travis cache to make it accessible to subsequent builds. The test stage consists of 5 jobs based on our existing test splitting (core, libs, connectors, tests, misc). These builds retrieve the compiled Flink version from the cache, install it into the local repository and subsequently run the tests. The cleanup job deletes the compiled Flink artifact from the cache. This step isn't strictly necessary, but still nice to have. Some additional small refactorings have been made to separate `travis_mvn_watchdog.sh` into individual parts, which we can build on in the future. ## Low-level details ### Caching The downside of stages is that there is no easy-to-use way to pass on build artifacts. 
The caching approach _works_ but has the caveat that builds have to share the same cache. The travis cache is only shared between builds if the build configurations are identical; most notably, they can't call different scripts or have different environment variables. As a workaround we map the `TRAVIS_JOB_NUMBER` to a specific stage. (If you look at the build linked in the PR, `4583.1` would be the value I'm talking about). The order of jobs is deterministic, so for example we always know that `1-2` belong to the compile stage, with `2` always being configured for the legacy codebase. ### travis_controller All stage-related logic is handled by the `travis_controller` script. In short: * it determines where we are in the build process based on `TRAVIS_JOB_NUMBER` * if in compile step * remove existing cached flink versions (fail-safe cleanup to prevent cache from growing larger over time) * compile Flink and do QA checks (shading, dependency convergence, checkstyle etc.) * copy flink to cache location * drop unnecessary files (like original jars) from compiled version * if in test step * fetch flink from cache * update all timestamps to prevent compiler plugins from recompiling classes * execute `travis_mvn_watchdog.sh` * if in cleanup step * well, cleanup stuff ### travis_mvn_watchdog Despite the above changes `travis_mvn_watchdog.sh` works pretty much like it did before. It first `install`s Flink (except now without `clean`, as this would remove already compiled classes) and then runs `mvn verify`. This has the downside that we still package jars twice, which actually takes a while. We could skip this in theory by directly invoking the `surefire` plugin, but various issues in our build/tests prevent this from working at the moment, and I don't want to delay this change further. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-10273) Access composite type fields after a function
[ https://issues.apache.org/jira/browse/FLINK-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598676#comment-16598676 ] Timo Walther commented on FLINK-10273: -- [~walterddr] [~suez1224] Have you also experienced these issues? > Access composite type fields after a function > - > > Key: FLINK-10273 > URL: https://issues.apache.org/jira/browse/FLINK-10273 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: Timo Walther >Priority: Major > > If a function returns a composite type, for example {{Row(lon: Float, lat: > Float)}}, there is currently no way of accessing its fields. > Both queries fail with exceptions: > {code} > select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t > {code} > {code} > select toCoords(12).lat > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-10273) Access composite type fields after a function
[ https://issues.apache.org/jira/browse/FLINK-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timo Walther updated FLINK-10273: - Affects Version/s: 1.7.0 > Access composite type fields after a function > - > > Key: FLINK-10273 > URL: https://issues.apache.org/jira/browse/FLINK-10273 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Affects Versions: 1.7.0 >Reporter: Timo Walther >Priority: Major > > If a function returns a composite type, for example {{Row(lon: Float, lat: > Float)}}, there is currently no way of accessing its fields. > Both queries fail with exceptions: > {code} > select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t > {code} > {code} > select toCoords(12).lat > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-10273) Access composite type fields after a function
Timo Walther created FLINK-10273: Summary: Access composite type fields after a function Key: FLINK-10273 URL: https://issues.apache.org/jira/browse/FLINK-10273 Project: Flink Issue Type: Bug Components: Table API SQL Reporter: Timo Walther If a function returns a composite type, for example {{Row(lon: Float, lat: Float)}}, there is currently no way of accessing its fields. Both queries fail with exceptions: {code} select t.c.lat, t.c.lon FROM (select toCoords(12) as c) AS t {code} {code} select toCoords(12).lat {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] yanghua commented on issue #6450: [FLINK-9991] [table] Add regexp_replace supported in TableAPI and SQL
yanghua commented on issue #6450: [FLINK-9991] [table] Add regexp_replace supported in TableAPI and SQL URL: https://github.com/apache/flink/pull/6450#issuecomment-417647424 @xccui Can you review this PR? Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-9991) Add regexp_replace supported in TableAPI and SQL
[ https://issues.apache.org/jira/browse/FLINK-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598671#comment-16598671 ] ASF GitHub Bot commented on FLINK-9991: --- yanghua commented on issue #6450: [FLINK-9991] [table] Add regexp_replace supported in TableAPI and SQL URL: https://github.com/apache/flink/pull/6450#issuecomment-417647424 @xccui Can you review this PR? Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add regexp_replace supported in TableAPI and SQL > > > Key: FLINK-9991 > URL: https://issues.apache.org/jira/browse/FLINK-9991 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: vinoyang >Assignee: vinoyang >Priority: Minor > Labels: pull-request-available > > regexp_replace is a very useful function for processing strings. > For example: > {code:java} > regexp_replace("foobar", "oo|ar", "") // returns 'fb' > {code} > It is supported as a UDF in Hive; for more details, please see [1]. > [1]: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
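The regexp_replace semantics described above can be illustrated with Python's standard `re` module, which behaves the same way for this case. This is only a sketch of the function's contract, not Flink's actual implementation:

```python
import re

def regexp_replace(source: str, pattern: str, replacement: str) -> str:
    """Replace every match of pattern in source, mirroring the
    Hive/Flink regexp_replace contract."""
    return re.sub(pattern, replacement, source)

# Both "oo" and "ar" match and are removed from "foobar", leaving "fb".
print(regexp_replace("foobar", "oo|ar", ""))  # -> fb
```

The same call shape (source string, regex pattern, replacement) is what the Table API and SQL function would expose.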
[jira] [Commented] (FLINK-7964) Add Apache Kafka 1.0/1.1 connectors
[ https://issues.apache.org/jira/browse/FLINK-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598606#comment-16598606 ] ASF GitHub Bot commented on FLINK-7964: --- yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors URL: https://github.com/apache/flink/pull/6577#issuecomment-417636389 Oh, @eliaslevy, now I understand what you mean; it sounds like a good idea. However, I don't know whether the Kafka 0.8 client and the Kafka 1.0/2.0 client can provide matching or close performance if the Kafka server is version 0.8. In addition, I referred to the [source code of Spark](https://github.com/apache/spark/tree/master/external) and found that its connectors are also split by version. I don't mean that Spark's choice is necessarily the most suitable, but I think we should consider this carefully before we do it, or we could start a discussion on the dev mailing list. What do you think? @pnowojski and @aljoscha This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add Apache Kafka 1.0/1.1 connectors > --- > > Key: FLINK-7964 > URL: https://issues.apache.org/jira/browse/FLINK-7964 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Affects Versions: 1.4.0 >Reporter: Hai Zhou >Assignee: vinoyang >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Kafka 1.0.0 is no mere bump of the version number. The Apache Kafka Project > Management Committee has packed a number of valuable enhancements into the > release. Here is a summary of a few of them: > * Since its introduction in version 0.10, the Streams API has become hugely > popular among Kafka users, including the likes of Pinterest, Rabobank, > Zalando, and The New York Times. In 1.0, the API continues to evolve at a > healthy pace. 
To begin with, the builder API has been improved (KIP-120). A > new API has been added to expose the state of active tasks at runtime > (KIP-130). The new cogroup API makes it much easier to deal with partitioned > aggregates with fewer StateStores and fewer moving parts in your code > (KIP-150). Debuggability gets easier with enhancements to the print() and > writeAsText() methods (KIP-160). And if that’s not enough, check out KIP-138 > and KIP-161 too. For more on streams, check out the Apache Kafka Streams > documentation, including some helpful new tutorial videos. > * Operating Kafka at scale requires that the system remain observable, and to > make that easier, we’ve made a number of improvements to metrics. These are > too many to summarize without becoming tedious, but Connect metrics have been > significantly improved (KIP-196), a litany of new health check metrics are > now exposed (KIP-188), and we now have a global topic and partition count > (KIP-168). Check out KIP-164 and KIP-187 for even more. > * We now support Java 9, leading, among other things, to significantly faster > TLS and CRC32C implementations. Over-the-wire encryption will be faster now, > which will keep Kafka fast and compute costs low when encryption is enabled. > * In keeping with the security theme, KIP-152 cleans up the error handling on > Simple Authentication Security Layer (SASL) authentication attempts. > Previously, some authentication error conditions were indistinguishable from > broker failures and were not logged in a clear way. This is cleaner now. > * Kafka can now tolerate disk failures better. Historically, JBOD storage > configurations have not been recommended, but the architecture has > nevertheless been tempting: after all, why not rely on Kafka’s own > replication mechanism to protect against storage failure rather than using > RAID? With KIP-112, Kafka now handles disk failure more gracefully. 
A single > disk failure in a JBOD broker will not bring the entire broker down; rather, > the broker will continue serving any log files that remain on functioning > disks. > * Since release 0.11.0, the idempotent producer (which is the producer used > in the presence of a transaction, which of course is the producer we use for > exactly-once processing) required max.in.flight.requests.per.connection to be > equal to one. As anyone who has written or tested a wire protocol can attest, > this put an upper bound on throughput. Thanks to KAFKA-5949, this can now be > as large as five, relaxing the throughput constraint quite a bit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
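The throughput note at the end of the description can be made concrete as producer configuration. This is a hedged sketch using plain Python dicts: the keys follow the Kafka producer configuration names, and the values are illustrative rather than recommendations:

```python
# Idempotent-producer settings before and after KAFKA-5949 (illustrative).
pre_kafka_1_0 = {
    "enable.idempotence": True,
    # Before KAFKA-5949, idempotence required exactly one in-flight
    # request per connection, which capped throughput.
    "max.in.flight.requests.per.connection": 1,
}

post_kafka_1_0 = {
    "enable.idempotence": True,
    # KAFKA-5949 relaxed the bound: up to five in-flight requests are
    # allowed while ordering guarantees are preserved.
    "max.in.flight.requests.per.connection": 5,
}
```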
[GitHub] yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors
yanghua commented on issue #6577: [FLINK-7964] Add Apache Kafka 1.0/1.1 connectors URL: https://github.com/apache/flink/pull/6577#issuecomment-417636389 Oh, @eliaslevy, now I understand what you mean; it sounds like a good idea. However, I don't know whether the Kafka 0.8 client and the Kafka 1.0/2.0 client can provide matching or close performance if the Kafka server is version 0.8. In addition, I referred to the [source code of Spark](https://github.com/apache/spark/tree/master/external) and found that its connectors are also split by version. I don't mean that Spark's choice is necessarily the most suitable, but I think we should consider this carefully before we do it, or we could start a discussion on the dev mailing list. What do you think? @pnowojski and @aljoscha This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-10272) using Avro's DateConversion causes ClassNotFoundException on Hadoop bundle
Roel Van der Paal created FLINK-10272: - Summary: using Avro's DateConversion causes ClassNotFoundException on Hadoop bundle Key: FLINK-10272 URL: https://issues.apache.org/jira/browse/FLINK-10272 Project: Flink Issue Type: Improvement Affects Versions: 1.6.0, 1.5.3 Reporter: Roel Van der Paal When using org.apache.avro.data.TimeConversions.DateConversion() in a job on a Hadoop-bundled Flink cluster, a ClassNotFoundException is thrown for org.joda.time.ReadablePartial. * it only occurs on the Hadoop-bundled Flink cluster, not on the one without Hadoop * it occurs for both version 1.5.3 and 1.6.0 (I did not check the other versions) * this is probably because org.apache.avro:avro is included in the flink-shaded-hadoop2-uber-x.x.x.jar, but joda-time is not (joda-time is an optional dependency of org.apache.avro:avro) * adding joda-time to the flink lib folder fixes the problem The proposed solution is to add joda-time to the flink-shaded-hadoop2-uber-x.x.x.jar or to remove org.apache.avro:avro from the flink-shaded-hadoop2-uber-x.x.x.jar. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9913) Improve output serialization only once in RecordWriter
[ https://issues.apache.org/jira/browse/FLINK-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598589#comment-16598589 ] ASF GitHub Bot commented on FLINK-9913: --- zhijiangW commented on issue #6417: [FLINK-9913][runtime] Improve output serialization only once in RecordWriter URL: https://github.com/apache/flink/pull/6417#issuecomment-417629346 @NicoK @pnowojski FYI, I have updated the code to cover the above comments. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improve output serialization only once in RecordWriter > -- > > Key: FLINK-9913 > URL: https://issues.apache.org/jira/browse/FLINK-9913 > Project: Flink > Issue Type: Improvement > Components: Network >Affects Versions: 1.6.0 >Reporter: zhijiang >Assignee: zhijiang >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Currently the {{RecordWriter}} emits output into multiple channels via > {{ChannelSelector}} or broadcasts output to all channels directly. Each > channel has a separate {{RecordSerializer}} for serializing outputs, which > means the output will be serialized as many times as the number of selected > channels. > As we know, data serialization is a high-cost operation, so we can get good > benefits by performing the serialization only once. > I would suggest the following changes to realize it. > # Only one {{RecordSerializer}} is created in {{RecordWriter}} for all the > channels. > # The output is serialized into the intermediate data buffer only once for > different channels. > # The intermediate serialization results are copied into different > {{BufferBuilder}}s for different channels. 
> An additional benefit by using a single serializer for all channels is that > we get a potentially significant reduction on heap space overhead from fewer > intermediate serialization buffers (only once we got over 5MiB, these buffers > were pruned back to 128B!). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
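The three steps above can be sketched as follows. This is a simplified Python model of the serialize-once idea, not Flink's actual {{RecordWriter}}; the {{BufferBuilder}} class and the use of pickle as the serializer are hypothetical stand-ins:

```python
import pickle

class BufferBuilder:
    """Hypothetical per-channel output buffer."""
    def __init__(self):
        self.data = bytearray()

    def append(self, serialized: bytes):
        self.data += serialized

def emit(record, selected_channels, buffers):
    # Steps 1+2: one serializer, one serialization pass into an
    # intermediate byte buffer, regardless of how many channels are selected.
    serialized = pickle.dumps(record)
    # Step 3: copy the single serialization result into each selected
    # channel's BufferBuilder.
    for ch in selected_channels:
        buffers[ch].append(serialized)
    return serialized

buffers = {ch: BufferBuilder() for ch in range(3)}
payload = emit({"id": 1}, [0, 2], buffers)
# Channels 0 and 2 receive identical bytes; channel 1 receives nothing.
```

Serialization cost is paid once per record instead of once per selected channel; only the cheap byte copy scales with the channel count.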
[GitHub] zhijiangW commented on issue #6417: [FLINK-9913][runtime] Improve output serialization only once in RecordWriter
zhijiangW commented on issue #6417: [FLINK-9913][runtime] Improve output serialization only once in RecordWriter URL: https://github.com/apache/flink/pull/6417#issuecomment-417629346 @NicoK @pnowojski FYI, I have updated the code to cover the above comments. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-10170) Support string representation for map and array types in descriptor-based Table API
[ https://issues.apache.org/jira/browse/FLINK-10170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598579#comment-16598579 ] ASF GitHub Bot commented on FLINK-10170: tragicjun commented on issue #6578: [FLINK-10170] [table] Support string representation for map and array types in descriptor-based Table API URL: https://github.com/apache/flink/pull/6578#issuecomment-417627819 Ready to be merged; could you take a look, @fhueske @twalthr? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Support string representation for map and array types in descriptor-based > Table API > --- > > Key: FLINK-10170 > URL: https://issues.apache.org/jira/browse/FLINK-10170 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Affects Versions: 1.6.1, 1.7.0 >Reporter: Jun Zhang >Assignee: Jun Zhang >Priority: Major > Labels: pull-request-available > Fix For: 1.6.0 > > Attachments: flink-10170 > > > Since 1.6 the recommended way of creating source/sink tables is using > connector/format/schema descriptors. However, when declaring map types in > the schema descriptor, the following exception would be thrown: > {quote}org.apache.flink.table.api.TableException: A string representation for > map types is not supported yet.{quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] tragicjun commented on issue #6578: [FLINK-10170] [table] Support string representation for map and array types in descriptor-based Table API
tragicjun commented on issue #6578: [FLINK-10170] [table] Support string representation for map and array types in descriptor-based Table API URL: https://github.com/apache/flink/pull/6578#issuecomment-417627819 Ready to be merged; could you take a look, @fhueske @twalthr? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
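Parsing a string representation such as MAP&lt;VARCHAR, INT&gt; into a structured type, as this issue asks for, might look like the following. This is a hedged sketch with a hypothetical helper, not Flink's actual type-string parser, and it handles only one level of top-level nesting per generic:

```python
def parse_type(s: str):
    """Parse a simple SQL-style type string into a nested tuple (hypothetical)."""
    s = s.strip()
    if s.upper().startswith("MAP<") and s.endswith(">"):
        inner = s[4:-1]
        # Split on the top-level comma only, tracking nested angle brackets.
        depth = 0
        for i, c in enumerate(inner):
            if c == "<":
                depth += 1
            elif c == ">":
                depth -= 1
            elif c == "," and depth == 0:
                key, value = inner[:i], inner[i + 1:]
                return ("MAP", parse_type(key), parse_type(value))
        raise ValueError("malformed MAP type: " + s)
    if s.upper().startswith("ARRAY<") and s.endswith(">"):
        return ("ARRAY", parse_type(s[6:-1]))
    return s  # an atomic type name such as VARCHAR or INT

print(parse_type("MAP<VARCHAR, INT>"))  # -> ('MAP', 'VARCHAR', 'INT')
```

A real implementation would map the parsed structure onto the table type system instead of returning tuples.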
[jira] [Commented] (FLINK-10271) flink-connector-elasticsearch6_2.11 have exception
[ https://issues.apache.org/jira/browse/FLINK-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598567#comment-16598567 ] Timo Walther commented on FLINK-10271: -- [~TommyYang] please post the entire stack trace and some more information to reproduce this exception. > flink-connector-elasticsearch6_2.11 have exception > -- > > Key: FLINK-10271 > URL: https://issues.apache.org/jira/browse/FLINK-10271 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 > Environment: LocalStreamEnvironment >Reporter: ting >Priority: Major > Original Estimate: 4h > Remaining Estimate: 4h > > org.apache.flink.runtime.client.JobExecutionException: > TimerException\{java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor;} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10271) flink-connector-elasticsearch6_2.11 have exception
[ https://issues.apache.org/jira/browse/FLINK-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598565#comment-16598565 ] Timo Walther commented on FLINK-10271: -- No, it doesn't look like a duplicate; it is a different exception. But it may also be related to our current packaging. > flink-connector-elasticsearch6_2.11 have exception > -- > > Key: FLINK-10271 > URL: https://issues.apache.org/jira/browse/FLINK-10271 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 > Environment: LocalStreamEnvironment >Reporter: ting >Priority: Major > Original Estimate: 4h > Remaining Estimate: 4h > > org.apache.flink.runtime.client.JobExecutionException: > TimerException\{java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor;} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-9825) Upgrade checkstyle version to 8.6
[ https://issues.apache.org/jira/browse/FLINK-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-9825. --- Resolution: Duplicate > Upgrade checkstyle version to 8.6 > - > > Key: FLINK-9825 > URL: https://issues.apache.org/jira/browse/FLINK-9825 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: Ted Yu >Assignee: dalongliu >Priority: Minor > > We should upgrade checkstyle version to 8.6+ so that we can use the "match > violation message to this regex" feature for suppression. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10208) Bump mockito to 2.0+
[ https://issues.apache.org/jira/browse/FLINK-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598547#comment-16598547 ] ASF GitHub Bot commented on FLINK-10208: zentol commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5 URL: https://github.com/apache/flink/pull/6634#discussion_r214309889 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/taskexecutor/slot/TimerServiceTest.java ## @@ -22,7 +22,7 @@ import org.apache.flink.util.TestLogger; import org.junit.Test; -import org.mockito.internal.util.reflection.Whitebox; +import org.powermock.reflect.Whitebox; Review comment: Yes, but it would be out of scope for this PR to change other usages. For this test in particular, I'll change it to the Flink Whitebox copy, though. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Bump mockito to 2.0+ > > > Key: FLINK-10208 > URL: https://issues.apache.org/jira/browse/FLINK-10208 > Project: Flink > Issue Type: Sub-task > Components: Build System, Tests >Affects Versions: 1.7.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > Mockito only properly supports Java 9 with version 2. We have to bump the > dependency and fix various API incompatibilities. > Additionally we could investigate whether we still need powermock after > bumping the dependency (which we'd also have to bump otherwise). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] zentol commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5
zentol commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5 URL: https://github.com/apache/flink/pull/6634#discussion_r214309889 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/taskexecutor/slot/TimerServiceTest.java ## @@ -22,7 +22,7 @@ import org.apache.flink.util.TestLogger; import org.junit.Test; -import org.mockito.internal.util.reflection.Whitebox; +import org.powermock.reflect.Whitebox; Review comment: Yes, but it would be out of scope for this PR to change other usages. For this test in particular, I'll change it to the Flink Whitebox copy, though. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-7607) Web Frontend Hangs with Large Numbers of Tasks
[ https://issues.apache.org/jira/browse/FLINK-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598545#comment-16598545 ] Mike Pedersen commented on FLINK-7607: -- Yeah, I'm having the same problem. We are building a machine learning pipeline with a series of transforms for each of many features. This results in a graph of approx. 1300 tasks, which Flink itself handles just fine but which completely hangs the UI. > Web Frontend Hangs with Large Numbers of Tasks > -- > > Key: FLINK-7607 > URL: https://issues.apache.org/jira/browse/FLINK-7607 > Project: Flink > Issue Type: Bug > Components: Webfrontend >Affects Versions: 1.3.2 > Environment: Attempted to load the web frontend on a MacBook Pro 15" > (late 2016) with 16 GB of memory using both Chrome 60.0 and Safari 10.1.2. >Reporter: Joshua Griffith >Assignee: Steven Langbroek >Priority: Major > Labels: performance > > Viewing a job with a high number of tasks in the web front-end causes the > page to hang, consuming 100% CPU on a core. At 200 tasks the page slows > noticeably and scrolling results in long, non-responsive pauses. At 400 tasks > the page only updates once per minute and is almost entirely non-responsive. > Initially, I thought this was caused by rendering a complex job graph but > opening the inspector and deleting the canvas did not improve page > performance. Further inspection indicated that the page was redrawing every > DOM element in the task list on every update. > A possible solution is to use an approach similar to > [react-list|https://github.com/orgsync/react-list] and only request > data/render list items that are in view and only update DOM nodes that have > changed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
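The virtualization approach proposed in the description reduces to a window calculation: render only the rows that intersect the viewport. The sketch below models that calculation in Python (react-list does this in JavaScript; the function name and parameters here are illustrative):

```python
def visible_range(scroll_top: int, viewport_height: int,
                  row_height: int, total_rows: int):
    """Return the (first, last_exclusive) row indices to render.

    Only rows intersecting the viewport get DOM nodes, so rendering work
    stays proportional to the viewport size instead of the task count.
    """
    first = max(0, scroll_top // row_height)
    last = min(total_rows, (scroll_top + viewport_height) // row_height + 1)
    return first, last

# With 1300 task rows of 30px in a 600px viewport scrolled to 3000px,
# only ~21 rows need DOM nodes instead of all 1300.
print(visible_range(3000, 600, 30, 1300))  # -> (100, 121)
```

On scroll, the range is recomputed and only the rows entering or leaving the window are added or removed.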
[jira] [Issue Comment Deleted] (FLINK-10269) Elasticsearch 6 UpdateRequest fail because of binary incompatibility
[ https://issues.apache.org/jira/browse/FLINK-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ting updated FLINK-10269: - Comment: was deleted (was: I don't see org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer in the flink project.) > Elasticsearch 6 UpdateRequest fail because of binary incompatibility > > > Key: FLINK-10269 > URL: https://issues.apache.org/jira/browse/FLINK-10269 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Blocker > Fix For: 1.6.1 > > > When trying to send UpdateRequest(s) to Elasticsearch 6, one gets the > following error: > {code} > Caused by: java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor; > at > org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer.add(BulkProcessorIndexer.java:76) > {code} > ElasticsearchSinkFunction: > {code} > import org.elasticsearch.action.update.UpdateRequest > def upsertRequest(element: T): UpdateRequest = { > new UpdateRequest( > "myIndex", > "record", > s"${element.id}") > .doc(element.toMap()) > } > override def process(element: T, runtimeContext: RuntimeContext, > requestIndexer: RequestIndexer): Unit = { > requestIndexer.add(upsertRequest(element)) > } > {code} > This is due to a binary compatibility issue between the base module (which is > compiled against a very old ES version) and the current Elasticsearch version. > As a workaround, you can simply copy > org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer to > your project. This should ensure that the class is compiled correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (FLINK-10271) flink-connector-elasticsearch6_2.11 have exception
[ https://issues.apache.org/jira/browse/FLINK-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ting updated FLINK-10271: - Comment: was deleted (was: The problem is different is with FLINK-10269, I don't see org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer in the flink project.) > flink-connector-elasticsearch6_2.11 have exception > -- > > Key: FLINK-10271 > URL: https://issues.apache.org/jira/browse/FLINK-10271 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 > Environment: LocalStreamEnvironment >Reporter: ting >Priority: Major > Original Estimate: 4h > Remaining Estimate: 4h > > org.apache.flink.runtime.client.JobExecutionException: > TimerException\{java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor;} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10271) flink-connector-elasticsearch6_2.11 have exception
[ https://issues.apache.org/jira/browse/FLINK-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598539#comment-16598539 ] ting commented on FLINK-10271: -- The problem is different from FLINK-10269; I don't see org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer in the Flink project. > flink-connector-elasticsearch6_2.11 have exception > -- > > Key: FLINK-10271 > URL: https://issues.apache.org/jira/browse/FLINK-10271 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 > Environment: LocalStreamEnvironment >Reporter: ting >Priority: Major > Original Estimate: 4h > Remaining Estimate: 4h > > org.apache.flink.runtime.client.JobExecutionException: > TimerException\{java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor;} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10269) Elasticsearch 6 UpdateRequest fail because of binary incompatibility
[ https://issues.apache.org/jira/browse/FLINK-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598532#comment-16598532 ] ting commented on FLINK-10269: -- I don't see org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer in the Flink project. > Elasticsearch 6 UpdateRequest fail because of binary incompatibility > > > Key: FLINK-10269 > URL: https://issues.apache.org/jira/browse/FLINK-10269 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 >Reporter: Timo Walther >Assignee: Timo Walther >Priority: Blocker > Fix For: 1.6.1 > > > When trying to send UpdateRequest(s) to Elasticsearch 6, one gets the > following error: > {code} > Caused by: java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor; > at > org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer.add(BulkProcessorIndexer.java:76) > {code} > ElasticsearchSinkFunction: > {code} > import org.elasticsearch.action.update.UpdateRequest > def upsertRequest(element: T): UpdateRequest = { > new UpdateRequest( > "myIndex", > "record", > s"${element.id}") > .doc(element.toMap()) > } > override def process(element: T, runtimeContext: RuntimeContext, > requestIndexer: RequestIndexer): Unit = { > requestIndexer.add(upsertRequest(element)) > } > {code} > This is due to a binary compatibility issue between the base module (which is > compiled against a very old ES version) and the current Elasticsearch version. > As a workaround, you can simply copy > org.apache.flink.streaming.connectors.elasticsearch.BulkProcessorIndexer to > your project. This should ensure that the class is compiled correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10271) flink-connector-elasticsearch6_2.11 have exception
[ https://issues.apache.org/jira/browse/FLINK-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598510#comment-16598510 ] Dawid Wysakowicz commented on FLINK-10271: -- This is probably a duplicate of [FLINK-10269]. > flink-connector-elasticsearch6_2.11 have exception > -- > > Key: FLINK-10271 > URL: https://issues.apache.org/jira/browse/FLINK-10271 > Project: Flink > Issue Type: Bug > Components: ElasticSearch Connector >Affects Versions: 1.6.0 > Environment: LocalStreamEnvironment >Reporter: ting >Priority: Major > Original Estimate: 4h > Remaining Estimate: 4h > > org.apache.flink.runtime.client.JobExecutionException: > TimerException\{java.lang.NoSuchMethodError: > org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor;} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-6105) Properly handle InterruptedException in HadoopInputFormatBase
[ https://issues.apache.org/jira/browse/FLINK-6105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307281#comment-16307281 ] Ted Yu edited comment on FLINK-6105 at 8/31/18 9:45 AM: In flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/RollingSink.java : {code} try { Thread.sleep(500); } catch (InterruptedException e1) { // ignore it } {code} Interrupt status should be restored, or throw InterruptedIOException . was (Author: yuzhih...@gmail.com): In flink-connectors/flink-connector-filesystem/src/main/java/org/apache/flink/streaming/connectors/fs/RollingSink.java : {code} try { Thread.sleep(500); } catch (InterruptedException e1) { // ignore it } {code} Interrupt status should be restored, or throw InterruptedIOException . > Properly handle InterruptedException in HadoopInputFormatBase > - > > Key: FLINK-6105 > URL: https://issues.apache.org/jira/browse/FLINK-6105 > Project: Flink > Issue Type: Bug > Components: DataStream API >Reporter: Ted Yu >Assignee: zhangminglei >Priority: Major > > When catching InterruptedException, we should throw InterruptedIOException > instead of IOException. > The following example is from HadoopInputFormatBase : > {code} > try { > splits = this.mapreduceInputFormat.getSplits(jobContext); > } catch (InterruptedException e) { > throw new IOException("Could not get Splits.", e); > } > {code} > There may be other places where IOE is thrown. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-9340) ScheduleOrUpdateConsumersTest may fail with Address already in use
[ https://issues.apache.org/jira/browse/FLINK-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16507191#comment-16507191 ] Ted Yu edited comment on FLINK-9340 at 8/31/18 9:45 AM: I wonder if it is easier to reproduce the error when running LegacyScheduleOrUpdateConsumersTest concurrently with this test. was (Author: yuzhih...@gmail.com): I wonder if it is easier to reproduce the error when running LegacyScheduleOrUpdateConsumersTest concurrently with this test . > ScheduleOrUpdateConsumersTest may fail with Address already in use > -- > > Key: FLINK-9340 > URL: https://issues.apache.org/jira/browse/FLINK-9340 > Project: Flink > Issue Type: Test >Reporter: Ted Yu >Priority: Minor > Labels: runtime > > When ScheduleOrUpdateConsumersTest is run in the test suite, I saw: > {code} > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.034 sec <<< > FAILURE! - in > org.apache.flink.runtime.jobmanager.scheduler.ScheduleOrUpdateConsumersTest > org.apache.flink.runtime.jobmanager.scheduler.ScheduleOrUpdateConsumersTest > Time elapsed: 8.034 sec <<< ERROR! 
> java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) > at sun.nio.ch.Net.bind(Net.java:433) > at sun.nio.ch.Net.bind(Net.java:425) > at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > at > org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485) > at > org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1081) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:502) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:487) > at > org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:904) > at > org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198) > at > org.apache.flink.shaded.netty4.io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357) > at > org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) > at > org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) > {code} > Seems there was address / port conflict. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-9825) Upgrade checkstyle version to 8.6
[ https://issues.apache.org/jira/browse/FLINK-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575017#comment-16575017 ]

Ted Yu edited comment on FLINK-9825 at 8/31/18 9:44 AM:

Thanks, Dalong.

> Upgrade checkstyle version to 8.6
> ---------------------------------
>
>                 Key: FLINK-9825
>                 URL: https://issues.apache.org/jira/browse/FLINK-9825
>             Project: Flink
>          Issue Type: Improvement
>          Components: Build System
>            Reporter: Ted Yu
>            Assignee: dalongliu
>            Priority: Minor
>
> We should upgrade the checkstyle version to 8.6+ so that we can use the "match violation message to this regex" feature for suppression.
[jira] [Created] (FLINK-10271) flink-connector-elasticsearch6_2.11 throws an exception
ting created FLINK-10271:

             Summary: flink-connector-elasticsearch6_2.11 throws an exception
                 Key: FLINK-10271
                 URL: https://issues.apache.org/jira/browse/FLINK-10271
             Project: Flink
          Issue Type: Bug
          Components: ElasticSearch Connector
    Affects Versions: 1.6.0
         Environment: LocalStreamEnvironment
            Reporter: ting

org.apache.flink.runtime.client.JobExecutionException: TimerException{java.lang.NoSuchMethodError: org.elasticsearch.action.bulk.BulkProcessor.add(Lorg/elasticsearch/action/ActionRequest;)Lorg/elasticsearch/action/bulk/BulkProcessor;}
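A NoSuchMethodError like the one above usually means two incompatible versions of the same library ended up on the classpath and the wrong one won. As a generic debugging sketch (not part of this report), printing a class's code source shows which jar it was actually loaded from; for this issue one would probe org.elasticsearch.action.bulk.BulkProcessor rather than the placeholder class used here:

```java
import java.security.CodeSource;

public class ClasspathProbe {

    // Report the jar or directory a class was loaded from. For the
    // NoSuchMethodError above, calling this with
    // org.elasticsearch.action.bulk.BulkProcessor.class would reveal
    // which elasticsearch jar is actually on the classpath.
    static String locationOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        return src == null
                ? "bootstrap class loader (JDK class)"
                : src.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println(locationOf(ClasspathProbe.class));
    }
}
```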
[jira] [Commented] (FLINK-10208) Bump mockito to 2.0+
[ https://issues.apache.org/jira/browse/FLINK-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598484#comment-16598484 ]

ASF GitHub Bot commented on FLINK-10208:

azagrebin commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5
URL: https://github.com/apache/flink/pull/6634#discussion_r214297157

## File path: flink-runtime/src/test/java/org/apache/flink/runtime/taskexecutor/slot/TimerServiceTest.java

@@ -22,7 +22,7 @@
 import org.apache.flink.util.TestLogger;
 import org.junit.Test;
-import org.mockito.internal.util.reflection.Whitebox;
+import org.powermock.reflect.Whitebox;

Review comment: OK, does it make sense then to use one thing everywhere, e.g. `flink.Whitebox`?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Bump mockito to 2.0+
> --------------------
>
>                 Key: FLINK-10208
>                 URL: https://issues.apache.org/jira/browse/FLINK-10208
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Build System, Tests
>    Affects Versions: 1.7.0
>            Reporter: Chesnay Schepler
>            Assignee: Chesnay Schepler
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
> Mockito only properly supports Java 9 with version 2. We have to bump the dependency and fix various API incompatibilities.
> Additionally, we could investigate whether we still need powermock after bumping the dependency (which we would otherwise also have to bump).
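The review is about which Whitebox implementation to standardize on. One alternative worth noting (an assumption on my part, not what this PR does) is to drop Whitebox entirely and use plain java.lang.reflect, which removes the dependency on mockito/powermock internals:

```java
import java.lang.reflect.Field;

public class FieldAccess {

    // A plain-reflection stand-in for Whitebox.setInternalState.
    // Simplification vs. Whitebox: getDeclaredField does not search
    // superclasses, so this only works for fields declared directly
    // on the target's class.
    static void setField(Object target, String name, Object value) throws Exception {
        Field f = target.getClass().getDeclaredField(name);
        f.setAccessible(true);
        f.set(target, value);
    }

    // Plain-reflection stand-in for Whitebox.getInternalState.
    static Object getField(Object target, String name) throws Exception {
        Field f = target.getClass().getDeclaredField(name);
        f.setAccessible(true);
        return f.get(target);
    }

    // Hypothetical class used only to demonstrate the helpers.
    static class Sample {
        private int count = 1;
    }

    public static void main(String[] args) throws Exception {
        Sample s = new Sample();
        setField(s, "count", 42);
        System.out.println(getField(s, "count"));
    }
}
```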
[GitHub] azagrebin commented on a change in pull request #6634: [FLINK-10208][build] Bump mockito to 2.21.0 / powermock to 2.0.0-beta.5