[jira] [Assigned] (FLINK-7804) YarnResourceManager does not execute AMRMClientAsync callbacks in main thread

2018-03-05 Thread Gary Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao reassigned FLINK-7804:
---

Assignee: Gary Yao

> YarnResourceManager does not execute AMRMClientAsync callbacks in main thread
> -
>
> Key: FLINK-7804
> URL: https://issues.apache.org/jira/browse/FLINK-7804
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.4.0, 1.5.0
>Reporter: Till Rohrmann
>Assignee: Gary Yao
>Priority: Major
>  Labels: flip-6
>
> The {{YarnResourceManager}} registers callbacks at an {{AMRMClientAsync}} 
> which it uses to react to Yarn container allocations. These callbacks (e.g., 
> {{onContainersAllocated}}) modify the internal state of the 
> {{YarnResourceManager}}. This can lead to race conditions with the 
> {{requestYarnContainer}} method.
> To solve this problem, we have to execute the state-changing 
> operations in the main thread of the {{YarnResourceManager}}.
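The fix described above is the classic thread-confinement pattern. The following is a toy plain-Java sketch (not Flink code; all names are illustrative): callbacks arriving on the async client's thread are re-scheduled onto a single "main thread" executor, so all mutations of the resource manager's state happen sequentially and cannot race.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Toy model: serialize all state changes onto one "main thread" executor. */
public class MainThreadDispatchSketch {
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor();
    private final Set<String> pendingContainers = new HashSet<>(); // only touched on mainThread

    /** Called from the async client's callback thread. */
    public void onContainersAllocated(String containerId) {
        // Do not touch state here; hop to the main thread first.
        mainThread.execute(() -> pendingContainers.remove(containerId));
    }

    /** Called when a new container is requested. */
    public void requestContainer(String containerId) {
        mainThread.execute(() -> pendingContainers.add(containerId));
    }

    /** Drains the executor and reads the state safely. */
    public int pendingCount() throws InterruptedException {
        mainThread.shutdown();
        mainThread.awaitTermination(5, TimeUnit.SECONDS);
        return pendingContainers.size();
    }

    public static void main(String[] args) throws InterruptedException {
        MainThreadDispatchSketch rm = new MainThreadDispatchSketch();
        for (int i = 0; i < 100; i++) {
            rm.requestContainer("c" + i);
            rm.onContainersAllocated("c" + i); // every request gets matched
        }
        System.out.println(rm.pendingCount()); // prints 0
    }
}
```

Because the single-threaded executor runs tasks in submission order, the unsynchronized `HashSet` is safe despite being driven from multiple caller threads.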



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8837) Move DataStreamUtils to package 'experimental'.

2018-03-05 Thread Bowen Li (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385772#comment-16385772
 ] 

Bowen Li commented on FLINK-8837:
-

+1 on adding a new {{experimental}} annotation.

I can take this ticket if everyone agrees on the approach, since I moved 
DataStreamUtils from flink-contrib to flink-streaming-java.

> Move DataStreamUtils to package 'experimental'.
> ---
>
> Key: FLINK-8837
> URL: https://issues.apache.org/jira/browse/FLINK-8837
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Reporter: Stephan Ewen
>Priority: Blocker
> Fix For: 1.5.0
>
>
> The class {{DataStreamUtils}} came from 'flink-contrib' and has now accidentally 
> been moved into the fully supported API packages. It should be in package 
> 'experimental' to properly communicate that it is not guaranteed to be API 
> stable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread Xu Zhang (JIRA)
Xu Zhang created FLINK-8857:
---

 Summary: HBase connector read example throws exception at the end.
 Key: FLINK-8857
 URL: https://issues.apache.org/jira/browse/FLINK-8857
 Project: Flink
  Issue Type: Bug
  Components: Batch Connectors and Input/Output Formats
Affects Versions: 1.2.0
Reporter: Xu Zhang


When running the test example
{code:java}
flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
the result is printed out successfully, but at the end the driver 
throws the following exception:
{code:java}

The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
at 
org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
Caused by: java.lang.RuntimeException: No new data sinks have been defined 
since the last execution. The last execution refers to the latest call to 
'execute()', 'count()', 'collect()', or 'print()'.
at 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
at 
org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
at 
org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
at 
org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
... 13 more
{code}
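The root cause, per the message "No new data sinks have been defined since the last execution", is that eager actions such as {{print()}}, {{collect()}}, and {{count()}} trigger an execution themselves and consume the registered sinks, so a trailing {{env.execute()}} finds nothing left to run. Below is a toy model of that bookkeeping (hypothetical class, not Flink's actual implementation):

```java
/** Toy model of the "no new data sinks" check behind the exception above. */
public class SinkTrackingSketch {
    private int sinksSinceLastExecution = 0;

    public void addSink() { sinksSinceLastExecution++; }

    /** Eager action: registers a sink and executes immediately, like DataSet#print(). */
    public void print() {
        addSink();
        execute();
    }

    public void execute() {
        if (sinksSinceLastExecution == 0) {
            throw new RuntimeException(
                "No new data sinks have been defined since the last execution.");
        }
        sinksSinceLastExecution = 0; // the plan was consumed by this execution
    }

    public static void main(String[] args) {
        SinkTrackingSketch env = new SinkTrackingSketch();
        env.print();          // fine: executes the plan eagerly
        try {
            env.execute();    // fails: print() already consumed the sinks
        } catch (RuntimeException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In the real example the fix is simply to drop the trailing {{env.execute()}} after {{print()}}, or to define a new sink before calling it again.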
 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8845) Introduce `parallel recovery` mode for full checkpoint (savepoint)

2018-03-05 Thread Sihua Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sihua Zhou updated FLINK-8845:
--
Summary:  Introduce `parallel recovery` mode for full checkpoint 
(savepoint)  (was:  Introduce `parallel recovery` mode for fully checkpoint 
(savepoint))

>  Introduce `parallel recovery` mode for full checkpoint (savepoint)
> ---
>
> Key: FLINK-8845
> URL: https://issues.apache.org/jira/browse/FLINK-8845
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Major
> Fix For: 1.6.0
>
>
> Based on {{ingestExternalFile()}} and {{SstFileWriter}} provided by RocksDB, 
> we can restore from a full checkpoint (savepoint) in parallel. This can also 
> be extended to incremental checkpoints easily, but for the sake of simplicity, we 
> do this in two separate tasks.
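The idea above can be sketched without RocksDB: split the snapshot into key ranges, rebuild each range in a parallel worker, then "ingest" all the shards in one step (standing in for writing SST files with {{SstFileWriter}} and calling {{ingestExternalFile()}}). Plain-Java sketch, all names illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Conceptual sketch of parallel restore: build shards concurrently, ingest once. */
public class ParallelRestoreSketch {
    public static Map<Integer, String> restore(Map<Integer, String> snapshot, int parallelism)
            throws Exception {
        List<Map<Integer, String>> shards = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) shards.add(new HashMap<>());

        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        List<Future<?>> futures = new ArrayList<>();
        for (int worker = 0; worker < parallelism; worker++) {
            final int w = worker;
            futures.add(pool.submit(() -> {
                for (Map.Entry<Integer, String> e : snapshot.entrySet()) {
                    if (Math.floorMod(e.getKey(), parallelism) == w) { // this worker's key range
                        shards.get(w).put(e.getKey(), e.getValue());
                    }
                }
            }));
        }
        for (Future<?> f : futures) f.get(); // wait for all shard builders
        pool.shutdown();

        Map<Integer, String> store = new HashMap<>(); // "ingest" all shards at once
        for (Map<Integer, String> shard : shards) store.putAll(shard);
        return store;
    }

    public static void main(String[] args) throws Exception {
        Map<Integer, String> snapshot = new HashMap<>();
        for (int i = 0; i < 1000; i++) snapshot.put(i, "v" + i);
        System.out.println(restore(snapshot, 4).equals(snapshot)); // prints true
    }
}
```

Each worker writes only to its own shard, so no synchronization is needed until the final merge.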



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8839) Table source factory discovery is broken in SQL Client

2018-03-05 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-8839:

Priority: Blocker  (was: Major)

> Table source factory discovery is broken in SQL Client
> --
>
> Key: FLINK-8839
> URL: https://issues.apache.org/jira/browse/FLINK-8839
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
>
> Table source factories cannot be discovered if they were added using a 
> jar file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7804) YarnResourceManager does not execute AMRMClientAsync callbacks in main thread

2018-03-05 Thread Gary Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Yao updated FLINK-7804:

Priority: Blocker  (was: Major)

> YarnResourceManager does not execute AMRMClientAsync callbacks in main thread
> -
>
> Key: FLINK-7804
> URL: https://issues.apache.org/jira/browse/FLINK-7804
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.4.0, 1.5.0
>Reporter: Till Rohrmann
>Assignee: Gary Yao
>Priority: Blocker
>  Labels: flip-6
>
> The {{YarnResourceManager}} registers callbacks at an {{AMRMClientAsync}} 
> which it uses to react to Yarn container allocations. These callbacks (e.g., 
> {{onContainersAllocated}}) modify the internal state of the 
> {{YarnResourceManager}}. This can lead to race conditions with the 
> {{requestYarnContainer}} method.
> To solve this problem, we have to execute the state-changing 
> operations in the main thread of the {{YarnResourceManager}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-8835) Fix TaskManager config keys

2018-03-05 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385857#comment-16385857
 ] 

mingleizhang edited comment on FLINK-8835 at 3/5/18 9:40 AM:
-

Hi [~StephanEwen], I would like to confirm one thing with you. I think renaming 
the key is enough; is it necessary to also refactor the corresponding method and 
variable names? I do not think so, because only the key names are exposed to 
*users*, while method and variable names are only exposed to *developers*.

For example, I just want to rename the key to 
{{taskmanager.registration.initial-backoff}} instead of renaming 
{{INITIAL_REGISTRATION_PAUSE}}. What do you think? Thanks.
{code:java}
public static final ConfigOption<String> INITIAL_REGISTRATION_PAUSE =
key("taskmanager.registration.initial-backoff")
.defaultValue("500 ms");
{code}


was (Author: mingleizhang):
Hi, [~StephanEwen] I would like to confirm one stuff with you. I think we just 
rename the key name is enough, but I found, Is it necessary to refactor the 
corresponding method and variable name ? I do not think so. Because only the 
keys name exposes to users while method and variable name only exposes to 
developer. 

like the following, I just want to rename the key to 
{{taskmanager.registration.initial-backoff}}, instead rename the 
{{INITIAL_REGISTRATION_PAUSE}}.What do you think of that ? Thanks.

{code:java}

public static final ConfigOption INITIAL_REGISTRATION_PAUSE =
key("taskmanager.registration.initial-backoff")
.defaultValue("500 ms")
{code}


> Fix TaskManager config keys
> ---
>
> Key: FLINK-8835
> URL: https://issues.apache.org/jira/browse/FLINK-8835
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Reporter: Stephan Ewen
>Assignee: mingleizhang
>Priority: Blocker
>  Labels: easy-fix
> Fix For: 1.5.0
>
>
> Many new config keys in the TaskManager don't follow the proper naming 
> scheme. We need to clean those up before the release. I would also suggest 
> keeping the key names short, because that makes it easier for users.
> When doing this cleanup pass over the config keys, I would suggest also 
> making some of the existing keys more hierarchical and harmonizing them with the 
> common scheme in Flink.
> h1. New Keys
> * {{taskmanager.network.credit-based-flow-control.enabled}} to 
> {{taskmanager.network.credit-model}}.
> * {{taskmanager.exactly-once.blocking.data.enabled}} to 
> {{task.checkpoint.alignment.blocking}} (we already have 
> {{task.checkpoint.alignment.max-size}})
> h1. Existing Keys
> * {{taskmanager.debug.memory.startLogThread}} => 
> {{taskmanager.debug.memory.log}}
> * {{taskmanager.debug.memory.logIntervalMs}} => 
> {{taskmanager.debug.memory.log-interval}}
> * {{taskmanager.initial-registration-pause}} => 
> {{taskmanager.registration.initial-backoff}}
> * {{taskmanager.max-registration-pause}} => 
> {{taskmanager.registration.max-backoff}}
> * {{taskmanager.refused-registration-pause}} => 
> {{taskmanager.registration.refused-backoff}}
> * {{taskmanager.maxRegistrationDuration}} => 
> {{taskmanager.registration.timeout}}
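Renames like the ones listed above usually keep the old spelling working via deprecated-key fallback. A minimal sketch of that lookup (hypothetical helper, not Flink's {{ConfigOption}} API):

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: resolve a renamed config key, falling back to its deprecated spellings. */
public class RenamedKeySketch {
    public static String get(Map<String, String> conf, String key,
                             String defaultValue, String... deprecatedKeys) {
        if (conf.containsKey(key)) return conf.get(key);
        for (String old : deprecatedKeys) {
            if (conf.containsKey(old)) return conf.get(old); // legacy configs still work
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("taskmanager.initial-registration-pause", "1 s"); // only the old key is set
        String v = get(conf, "taskmanager.registration.initial-backoff",
                "500 ms", "taskmanager.initial-registration-pause");
        System.out.println(v); // prints "1 s"
    }
}
```

The new key always wins when both are present, so users can migrate one key at a time.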



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5634: [FLINK-5479] [kafka] Idleness detection for period...

2018-03-05 Thread tzulitai
GitHub user tzulitai opened a pull request:

https://github.com/apache/flink/pull/5634

[FLINK-5479] [kafka] Idleness detection for periodic per-partition 
watermarks in FlinkKafkaConsumer

## What is the purpose of the change

This commit adds the capability to detect idle partitions in the 
`FlinkKafkaConsumer` when using periodic per-partition watermark generation. 
Users set the partition idle timeout using `setPartitionIdleTimeout(long)`. The 
value of the timeout determines how long an idle partition may block 
watermark advancement downstream.
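The rule can be sketched in a few lines (plain Java, not the `FlinkKafkaConsumer` implementation; all names are illustrative): the emitted watermark is the minimum over all partitions, but partitions that produced no record for longer than the idle timeout are excluded, so a silent partition cannot hold the watermark back forever.

```java
/** Sketch: min-over-partitions watermark that skips idle partitions. */
public class IdlePartitionWatermarkSketch {
    static final long NO_WATERMARK = Long.MIN_VALUE;

    static long combinedWatermark(long[] partitionWatermarks, long[] lastRecordTime,
                                  long now, long idleTimeoutMs) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (int p = 0; p < partitionWatermarks.length; p++) {
            if (now - lastRecordTime[p] > idleTimeoutMs) continue; // idle: ignore
            min = Math.min(min, partitionWatermarks[p]);
            anyActive = true;
        }
        return anyActive ? min : NO_WATERMARK;
    }

    public static void main(String[] args) {
        long[] wms  = {100L, 5L};   // partition 1 lags far behind...
        long[] seen = {1000L, 0L};  // ...and has been silent since t=0
        // with a 500 ms idle timeout at t=1000, partition 1 is skipped:
        System.out.println(combinedWatermark(wms, seen, 1000L, 500L)); // prints 100
        // without idleness detection, it would pin the watermark at 5
    }
}
```

The trade-off is the usual one: a short timeout unblocks downstream operators quickly but risks declaring a slow partition idle and advancing past late data.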


## Brief change log

- Adds a `setPartitionIdleTimeout(long)` configuration method
- Modifies `KafkaTopicPartitionStateWithPeriodicWatermarks` to keep track 
of necessary information to determine partition idleness.
- Adds idleness detection logic to 
`AbstractFetcher.PeriodicWatermarkEmitter`.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (**yes** / no)
  - If yes, how is the feature documented? (not applicable / docs / 
**JavaDocs** / not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tzulitai/flink FLINK-5479

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5634.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5634


commit d7b95ca1c8bb85f77dd6b83becf8fc1d5cccb810
Author: Tzu-Li (Gordon) Tai 
Date:   2018-03-05T09:35:02Z

[FLINK-5479] [kafka] Idleness detection for periodic per-partition 
watermarks

This commit adds the capability to detect idle partitions in the
FlinkKafkaConsumer when using periodic per-partition watermark
generation. Users set the partition idle timeout using
`setPartitionIdleTimeout(long)`. The value of the timeout determines how
long an idle partition may block watermark advancement downstream.




---


[jira] [Commented] (FLINK-8835) Fix TaskManager config keys

2018-03-05 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385857#comment-16385857
 ] 

mingleizhang commented on FLINK-8835:
-

Hi [~StephanEwen], I would like to confirm one thing with you. I think renaming 
the key is enough; is it necessary to also refactor the corresponding method and 
variable names? I do not think so, because only the key names are exposed to 
users, while method and variable names are only exposed to developers.

For example, I just want to rename the key to 
{{taskmanager.registration.initial-backoff}} instead of renaming 
{{INITIAL_REGISTRATION_PAUSE}}. What do you think? Thanks.

{code:java}
public static final ConfigOption<String> INITIAL_REGISTRATION_PAUSE =
key("taskmanager.registration.initial-backoff")
.defaultValue("500 ms");
{code}


> Fix TaskManager config keys
> ---
>
> Key: FLINK-8835
> URL: https://issues.apache.org/jira/browse/FLINK-8835
> Project: Flink
>  Issue Type: Bug
>  Components: TaskManager
>Reporter: Stephan Ewen
>Assignee: mingleizhang
>Priority: Blocker
>  Labels: easy-fix
> Fix For: 1.5.0
>
>
> Many new config keys in the TaskManager don't follow the proper naming 
> scheme. We need to clean those up before the release. I would also suggest 
> keeping the key names short, because that makes it easier for users.
> When doing this cleanup pass over the config keys, I would suggest also 
> making some of the existing keys more hierarchical and harmonizing them with the 
> common scheme in Flink.
> h1. New Keys
> * {{taskmanager.network.credit-based-flow-control.enabled}} to 
> {{taskmanager.network.credit-model}}.
> * {{taskmanager.exactly-once.blocking.data.enabled}} to 
> {{task.checkpoint.alignment.blocking}} (we already have 
> {{task.checkpoint.alignment.max-size}})
> h1. Existing Keys
> * {{taskmanager.debug.memory.startLogThread}} => 
> {{taskmanager.debug.memory.log}}
> * {{taskmanager.debug.memory.logIntervalMs}} => 
> {{taskmanager.debug.memory.log-interval}}
> * {{taskmanager.initial-registration-pause}} => 
> {{taskmanager.registration.initial-backoff}}
> * {{taskmanager.max-registration-pause}} => 
> {{taskmanager.registration.max-backoff}}
> * {{taskmanager.refused-registration-pause}} => 
> {{taskmanager.registration.refused-backoff}}
> * {{taskmanager.maxRegistrationDuration}} => 
> {{taskmanager.registration.timeout}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8859) RocksDB backend should pass WriteOption to Rocks.put() when restoring

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386012#comment-16386012
 ] 

ASF GitHub Bot commented on FLINK-8859:
---

GitHub user sihuazhou opened a pull request:

https://github.com/apache/flink/pull/5635

[FLINK-8859][state backend] RocksDB backend should pass WriteOption to 
Rocks.put() when restoring

## What is the purpose of the change

This PR fixes 
[FLINK-8859](https://issues.apache.org/jira/browse/FLINK-8859). We should pass 
a `WriteOption` to `Rocks.put()` when restoring from a handle (both for full and 
incremental checkpoints). Because of `WriteOption.setDisableWAL(true)`, the 
performance can be increased by about 2x.
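Why disabling the write-ahead log is safe here: during restore, every entry already lives in the checkpoint, so the WAL adds no durability and only costs an extra write per `put()`. A toy illustration of that cost model (plain Java, not RocksDB; all names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy store that optionally write-ahead-logs each put; restore needs no WAL. */
public class RestoreWithoutWalSketch {
    final Map<String, String> data = new HashMap<>();
    final List<String> wal = new ArrayList<>();
    int writes = 0;

    void put(String k, String v, boolean disableWal) {
        if (!disableWal) { wal.add(k + "=" + v); writes++; } // the extra durability write
        data.put(k, v);
        writes++;
    }

    public static void main(String[] args) {
        Map<String, String> checkpoint = Map.of("a", "1", "b", "2", "c", "3");

        RestoreWithoutWalSketch logged = new RestoreWithoutWalSketch();
        checkpoint.forEach((k, v) -> logged.put(k, v, false));

        RestoreWithoutWalSketch unlogged = new RestoreWithoutWalSketch();
        checkpoint.forEach((k, v) -> unlogged.put(k, v, true));

        System.out.println(logged.writes + " vs " + unlogged.writes); // prints "6 vs 3"
        System.out.println(logged.data.equals(unlogged.data));        // prints "true"
    }
}
```

Halving the writes per entry is consistent with the roughly 2x restore speedup claimed in the PR.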

## Brief change log

  - pass WriteOption to Rocks.put() when restoring

## Verifying this change
The changes can be verified by existing tests.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sihuazhou/flink pass_writeoption_2_put

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5635.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5635


commit 08109cb20d26bd1c8a04455b2fa914807a730a28
Author: sihuazhou 
Date:   2018-03-05T12:21:27Z

Pass the WriteOption to RocksDB.put() when restoring from handle.




> RocksDB backend should pass WriteOption to Rocks.put() when restoring
> -
>
> Key: FLINK-8859
> URL: https://issues.apache.org/jira/browse/FLINK-8859
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Major
> Fix For: 1.5.0
>
>
> We should pass a `WriteOption` to Rocks.put() when restoring from a handle (both 
> for full and incremental checkpoints). Because of 
> `WriteOption.setDisableWAL(true)`, the performance can be increased by about 
> 2x.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8560) add KeyedProcessFunction to expose the key in onTimer() and other methods

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385880#comment-16385880
 ] 

ASF GitHub Bot commented on FLINK-8560:
---

Github user kl0u commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172136142
  
--- Diff: docs/dev/stream/operators/process_function.md ---
@@ -242,4 +242,17 @@ class CountWithTimeoutFunction extends 
ProcessFunction[(String, String), (String
 the current processing time as event-time timestamp. This behavior is very 
subtle and might not be noticed by users. Well, it's
 harmful because processing-time timestamps are indeterministic and not 
aligned with watermarks. Besides, user-implemented logic
 depends on this wrong timestamp highly likely is unintendedly faulty. So 
we've decided to fix it. Upon upgrading to 1.4.0, Flink jobs
-that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
\ No newline at end of file
+that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
+
+## The KeyedProcessFunction
+
+`KeyedProcessFunction`, as an extension of `ProcessFunction`, gives access 
to the key of timers in its `onTimer(...)`
+method.
+
+{% highlight java %}
--- End diff --

@bowenli86 As soon as the scala example is added, I can take care of the 
other two comments and merge! Let me know when you update the PR, and thanks 
for the work!


> add KeyedProcessFunction to expose the key in onTimer() and other methods
> -
>
> Key: FLINK-8560
> URL: https://issues.apache.org/jira/browse/FLINK-8560
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.4.0
>Reporter: Jürgen Thomann
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.5.0
>
>
> Currently it is required to store the key of a keyBy() in the processElement 
> method to have access to it in the OnTimerContext.
> This is awkward because the processElement method has to check, for every 
> element, whether the key is already stored, and set it if it is not.
> A possible solution would be adding OnTimerContext#getCurrentKey() or a similar 
> method. Exposing it in the open() method might work as well.
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Getting-Key-from-keyBy-in-ProcessFunction-tt18126.html
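The contrast between the workaround and the proposed API can be modeled in plain Java (no Flink classes; all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: stash-the-key workaround vs. a context-provided current key. */
public class TimerKeySketch {
    // per-key state that only exists to remember the key itself
    private final Map<String, String> storedKey = new HashMap<>();

    /** Workaround: remember the key on every element. */
    public void processElement(String key, String value) {
        storedKey.putIfAbsent(key, key); // checked for every single element
    }

    /** Workaround: read the key back from state inside the timer callback. */
    public String onTimer(String keyScope) {
        return storedKey.get(keyScope);
    }

    /** Proposed API: the runtime already knows the key and hands it over. */
    public String onTimerWithContextKey(String currentKeyFromContext) {
        return currentKeyFromContext; // no extra state, no per-element check
    }

    public static void main(String[] args) {
        TimerKeySketch fn = new TimerKeySketch();
        fn.processElement("user-1", "click");
        System.out.println(fn.onTimer("user-1"));               // prints "user-1"
        System.out.println(fn.onTimerWithContextKey("user-1")); // prints "user-1"
    }
}
```

The second path is what `KeyedProcessFunction#onTimer` with a key-aware context provides: the redundant state and the per-element bookkeeping disappear.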



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8834) Job fails to restart due to some tasks stuck in cancelling state

2018-03-05 Thread Daniel Harper (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385879#comment-16385879
 ] 

Daniel Harper commented on FLINK-8834:
--

Interesting insight, thanks [~StephanEwen]!

Will follow FLINK-8856 with interest

> Job fails to restart due to some tasks stuck in cancelling state
> 
>
> Key: FLINK-8834
> URL: https://issues.apache.org/jira/browse/FLINK-8834
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.4.0
> Environment: AWS EMR 5.12
> Flink 1.4.0
> Beam 2.3.0
>Reporter: Daniel Harper
>Priority: Major
> Fix For: 1.5.0
>
>
> Our job threw an exception overnight, causing the job to commence attempting 
> a restart.
> However it never managed to restart because 2 tasks on one of the Task 
> Managers are stuck in "Cancelling" state, with the following exception
> {code:java}
> 2018-03-02 02:29:31,604 WARN  org.apache.flink.runtime.taskmanager.Task   
>   - Task 'PTransformTranslation.UnknownRawPTransform -> 
> ParDoTranslation.RawParDo -> ParDoTranslation.RawParDo -> 
> uk.co.bbc.sawmill.streaming.pipeline.output.io.file.WriteWindowToFile-RDotRecord/TextIO.Write/WriteFiles/GatherTempFileResults/Reshuffle/Window.Into()/Window.Assign.out
>  -> ParDoTranslation.RawParDo -> ToKeyedWorkItem (24/32)' did not react to 
> cancelling signal, but is stuck in method:
>  java.lang.Thread.blockedOn(Thread.java:239)
> java.lang.System$2.blockedOn(System.java:1252)
> java.nio.channels.spi.AbstractInterruptibleChannel.blockedOn(AbstractInterruptibleChannel.java:211)
> java.nio.channels.spi.AbstractInterruptibleChannel.begin(AbstractInterruptibleChannel.java:170)
> java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:457)
> java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
> java.nio.channels.Channels.writeFully(Channels.java:101)
> java.nio.channels.Channels.access$000(Channels.java:61)
> java.nio.channels.Channels$1.write(Channels.java:174)
> java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:253)
> java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
> java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:145)
> java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:458)
> java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
> java.nio.channels.Channels.writeFully(Channels.java:101)
> java.nio.channels.Channels.access$000(Channels.java:61)
> java.nio.channels.Channels$1.write(Channels.java:174)
> sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
> sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
> sun.nio.cs.StreamEncoder.write(StreamEncoder.java:135)
> java.io.OutputStreamWriter.write(OutputStreamWriter.java:220)
> java.io.Writer.write(Writer.java:157)
> org.apache.beam.sdk.io.TextSink$TextWriter.writeLine(TextSink.java:102)
> org.apache.beam.sdk.io.TextSink$TextWriter.write(TextSink.java:118)
> org.apache.beam.sdk.io.TextSink$TextWriter.write(TextSink.java:76)
> org.apache.beam.sdk.io.WriteFiles.writeOrClose(WriteFiles.java:550)
> org.apache.beam.sdk.io.WriteFiles.access$1000(WriteFiles.java:112)
> org.apache.beam.sdk.io.WriteFiles$WriteShardsIntoTempFilesFn.processElement(WriteFiles.java:718)
> org.apache.beam.sdk.io.WriteFiles$WriteShardsIntoTempFilesFn$DoFnInvoker.invokeProcessElement(Unknown
>  Source)
> org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:177)
> org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:138)
> org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:65)
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:425)
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:524)
> org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:504)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:831)
> org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:809)
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.emit(DoFnOperator.java:888)
> org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.output(DoFnOperator.java:865)
> 

[GitHub] flink pull request #5481: [FLINK-8560] Access to the current key in ProcessF...

2018-03-05 Thread kl0u
Github user kl0u commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172136142
  
--- Diff: docs/dev/stream/operators/process_function.md ---
@@ -242,4 +242,17 @@ class CountWithTimeoutFunction extends 
ProcessFunction[(String, String), (String
 the current processing time as event-time timestamp. This behavior is very 
subtle and might not be noticed by users. Well, it's
 harmful because processing-time timestamps are indeterministic and not 
aligned with watermarks. Besides, user-implemented logic
 depends on this wrong timestamp highly likely is unintendedly faulty. So 
we've decided to fix it. Upon upgrading to 1.4.0, Flink jobs
-that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
\ No newline at end of file
+that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
+
+## The KeyedProcessFunction
+
+`KeyedProcessFunction`, as an extension of `ProcessFunction`, gives access 
to the key of timers in its `onTimer(...)`
+method.
+
+{% highlight java %}
--- End diff --

@bowenli86 As soon as the scala example is added, I can take care of the 
other two comments and merge! Let me know when you update the PR, and thanks 
for the work!


---


[GitHub] flink issue #5601: [FLINK-8818][yarn/s3][tests] harden YarnFileStageTest upl...

2018-03-05 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/5601
  
LGTM 👍 


---


[jira] [Commented] (FLINK-8769) Quickstart job execution in IDE logs contain several exceptions

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385968#comment-16385968
 ] 

ASF GitHub Bot commented on FLINK-8769:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5611


> Quickstart job execution in IDE logs contain several exceptions
> ---
>
> Key: FLINK-8769
> URL: https://issues.apache.org/jira/browse/FLINK-8769
> Project: Flink
>  Issue Type: Bug
>  Components: Local Runtime
>Affects Versions: 1.5.0
>Reporter: Chesnay Schepler
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0
>
>
> While checking out [the PR for 
> FLINK-8761|https://github.com/apache/flink/pull/5569] and running a job in 
> the IDE several exceptions are being logged. The job still runs properly 
> though.
> {code:java}
> ...
> 14:19:52,546 INFO  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint 
>- Failed to load web based job submission extension.
> org.apache.flink.util.FlinkException: Could not load web submission extension.
>   at 
> org.apache.flink.runtime.webmonitor.WebMonitorUtils.loadWebSubmissionExtension(WebMonitorUtils.java:252)
>   at 
> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint.initializeHandlers(DispatcherRestEndpoint.java:111)
>   at 
> org.apache.flink.runtime.rest.RestServerEndpoint.start(RestServerEndpoint.java:124)
>   at 
> org.apache.flink.runtime.minicluster.MiniCluster.start(MiniCluster.java:320)
>   at 
> org.apache.flink.client.LocalExecutor.createJobExecutorService(LocalExecutor.java:144)
>   at org.apache.flink.client.LocalExecutor.start(LocalExecutor.java:118)
>   at 
> org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:212)
>   at 
> org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:91)
>   at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:815)
>   at org.apache.flink.api.java.DataSet.collect(DataSet.java:413)
>   at org.apache.flink.api.java.DataSet.print(DataSet.java:1652)
>   at iqst.BatchJob.main(BatchJob.java:39)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.apache.flink.runtime.webmonitor.WebMonitorUtils.loadWebSubmissionExtension(WebMonitorUtils.java:243)
>   ... 11 more
> Caused by: org.apache.flink.util.ConfigurationException: Config parameter 
> 'Key: 'jobmanager.rpc.address' , default: null (deprecated keys: [])' is 
> missing (hostname/address of JobManager to connect to).
>   at 
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.getJobManagerAddress(HighAvailabilityServicesUtils.java:137)
>   at 
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:79)
>   at 
> org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:148)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:144)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:135)
>   at 
> org.apache.flink.runtime.webmonitor.WebSubmissionExtension.<init>(WebSubmissionExtension.java:61)
>   ... 16 more
> 14:19:53,140 INFO  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint 
>- Rest endpoint listening at 127.0.0.1:64908
> ...
> 14:19:56,546 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor 
>- Close ResourceManager connection b8a2cff59ba07813067a64ebaf7d7889.
> org.apache.flink.util.FlinkException: New ResourceManager leader found under: 
> null(null)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskExecutor.notifyOfNewResourceManagerLeader(TaskExecutor.java:853)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskExecutor.access$900(TaskExecutor.java:127)
>   at 
> org.apache.flink.runtime.taskexecutor.TaskExecutor$ResourceManagerLeaderListener.lambda$notifyLeaderAddress$0(TaskExecutor.java:1359)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:292)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:147)
>   at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$0(AkkaRpcActor.java:129)
>   at 
> akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
>   

[GitHub] flink pull request #5635: [FLINK-8859][state backend] RocksDB backend should...

2018-03-05 Thread sihuazhou
GitHub user sihuazhou opened a pull request:

https://github.com/apache/flink/pull/5635

[FLINK-8859][state backend] RocksDB backend should pass WriteOption to 
Rocks.put() when restoring

## What is the purpose of the change

This PR fixes 
[FLINK-8859](https://issues.apache.org/jira/browse/FLINK-8859). We should pass a 
`WriteOptions` instance to `RocksDB.put()` when restoring from a handle (in both 
full & incremental checkpoints). With `WriteOptions.setDisableWAL(true)`, 
restore performance can improve by roughly 2x.

## Brief change log

  - pass WriteOption to Rocks.put() when restoring

## Verifying this change
The changes can be verified by existing tests.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
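As an illustration only, the following toy model (made-up `FakeRocksDB`/`FakeWriteOptions` stand-ins, not the real `org.rocksdb` JNI classes) shows why disabling the WAL pays off during restore — every re-inserted entry skips one write-ahead-log append:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.rocksdb.WriteOptions.
class FakeWriteOptions {
    private boolean disableWAL;
    FakeWriteOptions setDisableWAL(boolean d) { this.disableWAL = d; return this; }
    boolean isDisableWAL() { return disableWAL; }
}

// Hypothetical stand-in for org.rocksdb.RocksDB that counts WAL appends.
class FakeRocksDB {
    final Map<String, String> data = new HashMap<>();
    int walWrites = 0;

    void put(FakeWriteOptions opts, String key, String value) {
        if (!opts.isDisableWAL()) {
            walWrites++; // a real put() would also append to the write-ahead log
        }
        data.put(key, value);
    }
}

public class RestoreSketch {
    public static void main(String[] args) {
        FakeRocksDB db = new FakeRocksDB();
        // The WAL adds no safety during restore: a failed restore is simply
        // retried from the checkpoint, so it can be disabled for every put().
        FakeWriteOptions restoreOptions = new FakeWriteOptions().setDisableWAL(true);
        db.put(restoreOptions, "key-1", "value-1");
        db.put(restoreOptions, "key-2", "value-2");
        System.out.println(db.data.size() + " entries restored, " + db.walWrites + " WAL writes");
        // → 2 entries restored, 0 WAL writes
    }
}
```

Skipping that per-entry WAL append is where the reported ~2x restore speedup would come from.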


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sihuazhou/flink pass_writeoption_2_put

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5635.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5635


commit 08109cb20d26bd1c8a04455b2fa914807a730a28
Author: sihuazhou 
Date:   2018-03-05T12:21:27Z

Pass the WriteOption to RocksDB.put() when restoring from handle.




---


[jira] [Commented] (FLINK-8849) Wrong link from concepts/runtime to doc on chaining

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385942#comment-16385942
 ] 

ASF GitHub Bot commented on FLINK-8849:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5630
  
We don't _require_ JIRAs for small doc fixes; whether you open one is up to 
you.


> Wrong link from concepts/runtime to doc on chaining
> ---
>
> Key: FLINK-8849
> URL: https://issues.apache.org/jira/browse/FLINK-8849
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Ken Krugler
>Priority: Minor
>
> On 
> https://ci.apache.org/projects/flink/flink-docs-master/concepts/runtime.html 
> there's a link to "chaining docs" that currently points at:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups
> but it should link to:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/#task-chaining-and-resource-groups



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8818) Harden YarnFileStageTest upload test for eventual consistent read-after-write

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385944#comment-16385944
 ] 

ASF GitHub Bot commented on FLINK-8818:
---

Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/5601#discussion_r172152995
  
--- Diff: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnFileStageTest.java ---
@@ -200,13 +201,23 @@ static void testCopyFromLocalRecursive(
while (targetFilesIterator.hasNext()) {
LocatedFileStatus targetFile = 
targetFilesIterator.next();
 
-   try (FSDataInputStream in = 
targetFileSystem.open(targetFile.getPath())) {
-   String absolutePathString = 
targetFile.getPath().toString();
-   String relativePath = 
absolutePathString.substring(workDirPrefixLength);
-   targetFiles.put(relativePath, 
in.readUTF());
-
-   assertEquals("extraneous data in file " 
+ relativePath, -1, in.read());
-   }
+   int retries = 5;
+   do {
+   try (FSDataInputStream in = 
targetFileSystem.open(targetFile.getPath())) {
+   String absolutePathString = 
targetFile.getPath().toString();
+   String relativePath = 
absolutePathString.substring(workDirPrefixLength);
+   targetFiles.put(relativePath, 
in.readUTF());
+
+   assertEquals("extraneous data 
in file " + relativePath, -1, in.read());
+   break;
+   } catch (FileNotFoundException e) {
+   // For S3, read-after-write may 
be eventually consistent, i.e. when trying
+   // to access the object before 
writing it; see
+   // 
https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
+   // -> try again a bit later
+   Thread.sleep(50);
--- End diff --

Same here.


> Harden YarnFileStageTest upload test for eventual consistent read-after-write
> -
>
> Key: FLINK-8818
> URL: https://issues.apache.org/jira/browse/FLINK-8818
> Project: Flink
>  Issue Type: Sub-task
>  Components: FileSystem, Tests, YARN
>Affects Versions: 1.5.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8818) Harden YarnFileStageTest upload test for eventual consistent read-after-write

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385943#comment-16385943
 ] 

ASF GitHub Bot commented on FLINK-8818:
---

Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/5601#discussion_r172152959
  
--- Diff: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnFileStageTest.java ---
@@ -200,13 +201,23 @@ static void testCopyFromLocalRecursive(
while (targetFilesIterator.hasNext()) {
LocatedFileStatus targetFile = 
targetFilesIterator.next();
 
-   try (FSDataInputStream in = 
targetFileSystem.open(targetFile.getPath())) {
-   String absolutePathString = 
targetFile.getPath().toString();
-   String relativePath = 
absolutePathString.substring(workDirPrefixLength);
-   targetFiles.put(relativePath, 
in.readUTF());
-
-   assertEquals("extraneous data in file " 
+ relativePath, -1, in.read());
-   }
+   int retries = 5;
--- End diff --

I wonder if this should be a magic number or better something that can be 
configured?


> Harden YarnFileStageTest upload test for eventual consistent read-after-write
> -
>
> Key: FLINK-8818
> URL: https://issues.apache.org/jira/browse/FLINK-8818
> Project: Flink
>  Issue Type: Sub-task
>  Components: FileSystem, Tests, YARN
>Affects Versions: 1.5.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5601: [FLINK-8818][yarn/s3][tests] harden YarnFileStageT...

2018-03-05 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/5601#discussion_r172152959
  
--- Diff: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnFileStageTest.java ---
@@ -200,13 +201,23 @@ static void testCopyFromLocalRecursive(
while (targetFilesIterator.hasNext()) {
LocatedFileStatus targetFile = 
targetFilesIterator.next();
 
-   try (FSDataInputStream in = 
targetFileSystem.open(targetFile.getPath())) {
-   String absolutePathString = 
targetFile.getPath().toString();
-   String relativePath = 
absolutePathString.substring(workDirPrefixLength);
-   targetFiles.put(relativePath, 
in.readUTF());
-
-   assertEquals("extraneous data in file " 
+ relativePath, -1, in.read());
-   }
+   int retries = 5;
--- End diff --

I wonder if this should be a magic number or better something that can be 
configured?
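One hedged option for the question above: read the retry count from a system property with a default, so tests can override it without a code change (the property name below is invented for the sketch):

```java
// Sketch: replace the magic number 5 with a configurable retry count.
public class RetryConfigSketch {

    // Hypothetical property name; Integer.getInteger returns the default
    // when the property is unset or not a valid integer.
    static int restoreRetries() {
        return Integer.getInteger("yarn.filestage.test.retries", 5);
    }

    public static void main(String[] args) {
        System.out.println("retries = " + restoreRetries());
        // → retries = 5 (when the property is not set)
    }
}
```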


---


[GitHub] flink pull request #5601: [FLINK-8818][yarn/s3][tests] harden YarnFileStageT...

2018-03-05 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/5601#discussion_r172152995
  
--- Diff: 
flink-yarn/src/test/java/org/apache/flink/yarn/YarnFileStageTest.java ---
@@ -200,13 +201,23 @@ static void testCopyFromLocalRecursive(
while (targetFilesIterator.hasNext()) {
LocatedFileStatus targetFile = 
targetFilesIterator.next();
 
-   try (FSDataInputStream in = 
targetFileSystem.open(targetFile.getPath())) {
-   String absolutePathString = 
targetFile.getPath().toString();
-   String relativePath = 
absolutePathString.substring(workDirPrefixLength);
-   targetFiles.put(relativePath, 
in.readUTF());
-
-   assertEquals("extraneous data in file " 
+ relativePath, -1, in.read());
-   }
+   int retries = 5;
+   do {
+   try (FSDataInputStream in = 
targetFileSystem.open(targetFile.getPath())) {
+   String absolutePathString = 
targetFile.getPath().toString();
+   String relativePath = 
absolutePathString.substring(workDirPrefixLength);
+   targetFiles.put(relativePath, 
in.readUTF());
+
+   assertEquals("extraneous data 
in file " + relativePath, -1, in.read());
+   break;
+   } catch (FileNotFoundException e) {
+   // For S3, read-after-write may 
be eventually consistent, i.e. when trying
+   // to access the object before 
writing it; see
+   // 
https://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
+   // -> try again a bit later
+   Thread.sleep(50);
--- End diff --

Same here.
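The retry pattern under review can be sketched as a self-contained loop; the injected `ReadAttempt` interface is made up for the sketch (the real test opens an `FSDataInputStream` inline):

```java
import java.io.FileNotFoundException;

// Sketch of bounded retries for an eventually consistent read-after-write:
// retry on FileNotFoundException with a short sleep, rethrow when exhausted.
public class EventualConsistencyRetrySketch {

    interface ReadAttempt {
        String read() throws Exception;
    }

    static String readWithRetries(ReadAttempt attempt, int retries, long sleepMillis)
            throws Exception {
        while (true) {
            try {
                return attempt.read();
            } catch (FileNotFoundException e) {
                if (--retries <= 0) {
                    throw e; // give up after the configured number of attempts
                }
                Thread.sleep(sleepMillis); // the object may simply not be visible yet
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated eventually consistent store: invisible for the first two reads.
        String result = readWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new FileNotFoundException("not visible yet");
            }
            return "file-contents";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
        // → file-contents after 3 attempts
    }
}
```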


---


[jira] [Created] (FLINK-8859) RocksDB backend should pass WriteOption to Rocks.put() when restoring

2018-03-05 Thread Sihua Zhou (JIRA)
Sihua Zhou created FLINK-8859:
-

 Summary: RocksDB backend should pass WriteOption to Rocks.put() 
when restoring
 Key: FLINK-8859
 URL: https://issues.apache.org/jira/browse/FLINK-8859
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.5.0
Reporter: Sihua Zhou
Assignee: Sihua Zhou
 Fix For: 1.5.0


We should pass a `WriteOptions` instance to `RocksDB.put()` when restoring from 
a handle (in both full & incremental checkpoints). With 
`WriteOptions.setDisableWAL(true)`, restore performance can improve by roughly 2x.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5633: [FLINK-8857] [Hbase] Avoid HBase connector read ex...

2018-03-05 Thread neoremind
GitHub user neoremind opened a pull request:

https://github.com/apache/flink/pull/5633

[FLINK-8857] [Hbase] Avoid HBase connector read example throwing exception 
at the end

## What is the purpose of the change

*This pull request fixes the problem of the HBase read example throwing an 
exception at the end of program execution.*


## Brief change log

  - *Update example 
`flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java`
 by removing the part causing the problem.*


## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/neoremind/flink FLINK-8857

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5633


commit 97c009a4d2308ad1da5629f9653bed9af352a8f7
Author: neoremind 
Date:   2018-03-05T08:50:37Z

Avoid hbase connector read example throwing exception at the end.




---


[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385825#comment-16385825
 ] 

Chesnay Schepler commented on FLINK-8857:
-

This happens because the program calls `ExecutionEnvironment#execute()` after a 
call to `DataSet#print()`. Calling `print()` eagerly kicks off the job 
execution, so the subsequent call to `execute()` simply isn't necessary.
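To see why the second call fails, here is a toy model (hypothetical names, not the Flink API) of an environment whose `print()` eagerly executes the plan and consumes the registered sinks:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of eager execution: print() runs the plan itself, so a
// following execute() finds no newly defined sinks and throws.
public class EagerSinkModel {
    private final List<String> newSinks = new ArrayList<>();

    void addSink(String sink) { newSinks.add(sink); }

    void print() { execute(); } // print() kicks off execution on its own

    void execute() {
        if (newSinks.isEmpty()) {
            throw new RuntimeException(
                "No new data sinks have been defined since the last execution.");
        }
        newSinks.clear(); // the plan has been consumed
    }

    public static void main(String[] args) {
        EagerSinkModel env = new EagerSinkModel();
        env.addSink("hbaseDs");
        env.print();           // runs the job, consumes the sink
        try {
            env.execute();     // fails: nothing new to execute
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```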

> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed successfully, the driver throws the following 
> exception at the end.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8715) RocksDB does not propagate reconfiguration of serializer to the states

2018-03-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-8715:

Fix Version/s: 1.5.0

> RocksDB does not propagate reconfiguration of serializer to the states
> --
>
> Key: FLINK-8715
> URL: https://issues.apache.org/jira/browse/FLINK-8715
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.2
>Reporter: Arvid Heise
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Any changes to the serializer done in #ensureCompatibility are lost during 
> state creation.
> In particular, 
> [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68]
>  always uses a fresh copy of the StateDescriptor.
> An easy fix is to pass the reconfigured serializer as an additional parameter 
> in 
> [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681]
>  , which can be retrieved through the side-output of getColumnFamily
> {code:java}
> kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer()
> {code}
> I encountered it in 1.3.2 but the code in the master seems unchanged (hence 
> the pointer into master). I encountered it in ValueState, but I suspect the 
> same issue can be observed for all kinds of RocksDB states.
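The issue and the suggested fix can be modeled with a toy sketch (all names hypothetical; the real code deals in `TypeSerializer` instances held in `kvStateInformation`):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the bug: state creation reads the serializer from a fresh
// StateDescriptor copy, discarding the reconfigured serializer that
// ensureCompatibility() stored in kvStateInformation.
public class SerializerReconfigSketch {
    static final Map<String, String> kvStateInformation = new HashMap<>();

    static String createStateBuggy(String stateName, String freshSerializer) {
        return freshSerializer; // reconfiguration silently lost
    }

    static String createStateFixed(String stateName, String freshSerializer) {
        // suggested fix: prefer the reconfigured serializer when one exists
        return kvStateInformation.getOrDefault(stateName, freshSerializer);
    }

    public static void main(String[] args) {
        kvStateInformation.put("myValueState", "reconfigured-serializer");
        System.out.println(createStateBuggy("myValueState", "fresh-serializer"));
        System.out.println(createStateFixed("myValueState", "fresh-serializer"));
        // → fresh-serializer
        // → reconfigured-serializer
    }
}
```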



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8269) Set netRuntime in JobExecutionResult

2018-03-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-8269:

Fix Version/s: 1.5.0

> Set netRuntime in JobExecutionResult
> 
>
> Key: FLINK-8269
> URL: https://issues.apache.org/jira/browse/FLINK-8269
> Project: Flink
>  Issue Type: Bug
>  Components: Job-Submission
>Affects Versions: 1.5.0
> Environment: 917fbcbee4599c1d198a4c63942fe1d2762aa64a
>Reporter: Gary Yao
>Assignee: Gary Yao
>Priority: Blocker
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> In FLIP-6 mode, the {{JobMaster}} does not correctly set the field 
> {{netRuntime}} on the {{JobExecutionResult}} when the job status transitions 
> to {{_FINISHED_}}.
> Find the code in question below:
> {code}
> case FINISHED:
>   try {
>   // TODO get correct job duration
>   // job done, let's get the accumulators
>   Map accumulatorResults = 
> executionGraph.getAccumulators();
>   JobExecutionResult result = new JobExecutionResult(jobID, 0L, 
> accumulatorResults);
>   
>   executor.execute(() -> 
> jobCompletionActions.jobFinished(result));
>   }
> {code}
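A hedged sketch of the intended fix: derive `netRuntime` from the job's status timestamps instead of hardcoding `0L` (where those timestamps come from on the execution graph is an assumption of the sketch):

```java
// Sketch: compute the job's net runtime from two status timestamps,
// e.g. when the job was created and when it reached FINISHED.
public class NetRuntimeSketch {

    static long netRuntime(long createdTimestampMillis, long finishedTimestampMillis) {
        // guard against clock skew producing a negative duration
        return Math.max(0L, finishedTimestampMillis - createdTimestampMillis);
    }

    public static void main(String[] args) {
        System.out.println(netRuntime(1_000L, 4_500L));
        // → 3500
    }
}
```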



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-8857:

Labels: easy-fix starter  (was: )

> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed successfully, the driver throws the following 
> exception at the end.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8858) SQL Client to submit long running query in file

2018-03-05 Thread Renjie Liu (JIRA)
Renjie Liu created FLINK-8858:
-

 Summary: SQL Client to submit long running query in file
 Key: FLINK-8858
 URL: https://issues.apache.org/jira/browse/FLINK-8858
 Project: Flink
  Issue Type: New Feature
  Components: Table API & SQL
Affects Versions: 1.6.0
Reporter: Renjie Liu
Assignee: Renjie Liu


The current design of the SQL Client's embedded mode doesn't support long-running 
queries. It would be useful if SQL statements stored in files could be submitted 
as long-running queries; this would cover simple jobs that can be expressed in a 
single SQL statement.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5630: [FLINK-8849][Documentation] Fix link to chaining docs

2018-03-05 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5630
  
merging.


---


[jira] [Commented] (FLINK-8849) Wrong link from concepts/runtime to doc on chaining

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385956#comment-16385956
 ] 

ASF GitHub Bot commented on FLINK-8849:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5630
  
merging.


> Wrong link from concepts/runtime to doc on chaining
> ---
>
> Key: FLINK-8849
> URL: https://issues.apache.org/jira/browse/FLINK-8849
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Ken Krugler
>Priority: Minor
>
> On 
> https://ci.apache.org/projects/flink/flink-docs-master/concepts/runtime.html 
> there's a link to "chaining docs" that currently points at:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups
> but it should link to:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/#task-chaining-and-resource-groups



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8458) Add the switch for keeping both the old mode and the new credit-based mode

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385973#comment-16385973
 ] 

ASF GitHub Bot commented on FLINK-8458:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5317


> Add the switch for keeping both the old mode and the new credit-based mode
> --
>
> Key: FLINK-8458
> URL: https://issues.apache.org/jira/browse/FLINK-8458
> Project: Flink
>  Issue Type: Sub-task
>  Components: Network
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Major
> Fix For: 1.5.0
>
>
> After the whole feature of credit-based flow control is done, we should add a 
> config parameter to switch the new credit-based mode on/off. That way, we can 
> roll back to the old network mode in case of unexpected problems.
> The parameter is defined as 
> {{taskmanager.network.credit-based-flow-control.enabled}} and the default 
> value is true. This switch may be removed after next release.
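With the parameter named above, rolling back to the old network mode would be a one-line change in `flink-conf.yaml`:

```yaml
# Default is true (credit-based flow control enabled); set to false to
# fall back to the old network mode.
taskmanager.network.credit-based-flow-control.enabled: false
```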



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385837#comment-16385837
 ] 

ASF GitHub Bot commented on FLINK-8857:
---

Github user neoremind commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172125701
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

Done.


> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Assignee: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed successfully, the driver throws the following 
> exception at the end.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5481: [FLINK-8560] Access to the current key in ProcessF...

2018-03-05 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172133573
  
--- Diff: docs/dev/stream/operators/process_function.md ---
@@ -242,4 +242,17 @@ class CountWithTimeoutFunction extends 
ProcessFunction[(String, String), (String
 the current processing time as event-time timestamp. This behavior is very 
subtle and might not be noticed by users. Well, it's
 harmful because processing-time timestamps are indeterministic and not 
aligned with watermarks. Besides, user-implemented logic
 depends on this wrong timestamp highly likely is unintendedly faulty. So 
we've decided to fix it. Upon upgrading to 1.4.0, Flink jobs
-that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
\ No newline at end of file
+that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
+
+## The KeyedProcessFunction
+
+`KeyedProcessFunction`, as an extension of `ProcessFunction`, gives access 
to the key of timers in its `onTimer(...)`
+method.
+
+{% highlight java %}
+@Override
+public void onTimer(long timestamp, OnTimerContext ctx, Collector 
out) throws Exception {
--- End diff --

I believe this is now `public void onTimer(long timestamp, 
OnTimerContext ctx, Collector out)`, right? @kl0u you could fix this 
while merging.


---
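For readers following this thread, the contract under discussion (the key of a firing timer being available inside `onTimer(...)` through its context) can be sketched with a self-contained toy model. The classes below are illustrative stand-ins, not Flink's actual API:

```java
import java.util.function.Consumer;

/** Toy model of the contract discussed above: when a timer fires, the
 *  context handed to onTimer(...) exposes the key the timer was registered
 *  for. Illustrative stand-ins, not Flink's actual classes. */
interface Collector<T> { void collect(T record); }

class OnTimerContext {
    private final String currentKey;
    OnTimerContext(String key) { this.currentKey = key; }
    String getCurrentKey() { return currentKey; }
}

abstract class KeyedTimerFunction {
    abstract void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out);
}

public class TimerSketch {
    // Fires a timer registered for a given key against the function.
    static void fireTimer(KeyedTimerFunction fn, String key, long ts, Consumer<String> sink) {
        fn.onTimer(ts, new OnTimerContext(key), sink::accept);
    }

    public static void main(String[] args) {
        KeyedTimerFunction fn = new KeyedTimerFunction() {
            @Override
            void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
                // The key the timer was registered for is recoverable here.
                out.collect(ctx.getCurrentKey() + "@" + timestamp);
            }
        };
        fireTimer(fn, "user-1", 42L, System.out::println); // prints user-1@42
    }
}
```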


[jira] [Commented] (FLINK-8818) Harden YarnFileStageTest upload test for eventual consistent read-after-write

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385964#comment-16385964
 ] 

ASF GitHub Bot commented on FLINK-8818:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5601


> Harden YarnFileStageTest upload test for eventual consistent read-after-write
> -
>
> Key: FLINK-8818
> URL: https://issues.apache.org/jira/browse/FLINK-8818
> Project: Flink
>  Issue Type: Sub-task
>  Components: FileSystem, Tests, YARN
>Affects Versions: 1.5.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>






[GitHub] flink pull request #5601: [FLINK-8818][yarn/s3][tests] harden YarnFileStageT...

2018-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5601


---


[jira] [Commented] (FLINK-8859) RocksDB backend should pass WriteOption to Rocks.put() when restoring

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386014#comment-16386014
 ] 

ASF GitHub Bot commented on FLINK-8859:
---

Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/5635
  
@StefanRRichter Could you please have a look at this? This is trivial 
work, but it can improve performance a lot (it looks like we forgot it when 
restoring).


> RocksDB backend should pass WriteOption to Rocks.put() when restoring
> -
>
> Key: FLINK-8859
> URL: https://issues.apache.org/jira/browse/FLINK-8859
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Major
> Fix For: 1.5.0
>
>
> We should pass `WriteOption` to Rocks.put() when restoring from a handle (in 
> both full and incremental checkpoints). Because of 
> `WriteOption.setDisableWAL(true)`, performance can be increased by about 
> 2x.
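The quoted optimization can be illustrated with a self-contained toy model of the write path. The classes below are stand-ins, not RocksDB's actual Java API: during a restore, the data already comes from a durable snapshot, so the write-ahead-log write on each put() is redundant and can be skipped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the write-path trade-off discussed above: skipping the
 *  write-ahead log (WAL) during a bulk restore avoids one extra write per
 *  record. Class names are illustrative stand-ins, not RocksDB's API. */
class ToyWriteOptions {
    private boolean disableWAL;
    ToyWriteOptions setDisableWAL(boolean d) { this.disableWAL = d; return this; }
    boolean walDisabled() { return disableWAL; }
}

class ToyStore {
    final Map<String, String> data = new HashMap<>();
    final List<String> wal = new ArrayList<>();   // stands in for the write-ahead log

    void put(ToyWriteOptions opts, String key, String value) {
        if (!opts.walDisabled()) {
            wal.add(key + "=" + value);           // extra write that restore can skip
        }
        data.put(key, value);
    }
}

public class RestoreSketch {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        // During restore the data already lives in a durable snapshot,
        // so the WAL write is redundant and can be disabled.
        ToyWriteOptions restoreOpts = new ToyWriteOptions().setDisableWAL(true);
        store.put(restoreOpts, "k1", "v1");
        store.put(restoreOpts, "k2", "v2");
        System.out.println(store.data.size() + " entries, " + store.wal.size() + " WAL records");
        // prints: 2 entries, 0 WAL records
    }
}
```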





[jira] [Updated] (FLINK-8839) Table source factory discovery is broken in SQL Client

2018-03-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-8839:

Fix Version/s: 1.5.0

> Table source factory discovery is broken in SQL Client
> --
>
> Key: FLINK-8839
> URL: https://issues.apache.org/jira/browse/FLINK-8839
> Project: Flink
>  Issue Type: Bug
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Table source factories cannot be discovered if they were added using a 
> jar file.





[jira] [Updated] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-8857:

Affects Version/s: 1.5.0
   1.3.2
   1.4.1

> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Priority: Trivial
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed out successfully, at the end the driver 
> throws the following exception.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  





[GitHub] flink pull request #5633: [FLINK-8857] [Hbase] Avoid HBase connector read ex...

2018-03-05 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172122162
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

I'd just remove it.


---
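For context on the exception discussed in this thread: in Flink's batch API, `print()` is an eager action that executes the plan and consumes its sinks, so a subsequent `env.execute()` finds no new sinks and fails. A self-contained toy model of that contract (illustrative classes, not Flink's actual environment):

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of why calling execute() after print() fails: eager actions
 *  like print() consume the pending sinks, and a later execute() with no
 *  newly defined sinks has nothing to run. Illustrative classes only. */
class ToyEnvironment {
    private final List<String> pendingSinks = new ArrayList<>();

    void addSink(String sink) { pendingSinks.add(sink); }

    // print() is an eager action: it executes the plan immediately.
    void print() {
        execute();
    }

    void execute() {
        if (pendingSinks.isEmpty()) {
            throw new RuntimeException(
                "No new data sinks have been defined since the last execution.");
        }
        pendingSinks.clear(); // plan executed; sinks are consumed
    }
}

public class SinkSketch {
    public static void main(String[] args) {
        ToyEnvironment env = new ToyEnvironment();
        env.addSink("hbaseDs.print");
        env.print();            // runs the plan, consumes the sink
        try {
            env.execute();      // nothing left to run -> fails, as in the report
        } catch (RuntimeException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

This is why the fix in the PR is simply to drop the trailing `env.execute()` call.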


[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385835#comment-16385835
 ] 

ASF GitHub Bot commented on FLINK-8857:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172124498
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

you can add another commit that removes the code


> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Assignee: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed out successfully, at the end the driver 
> throws the following exception.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  





[jira] [Commented] (FLINK-7756) RocksDB state backend Checkpointing (Async and Incremental) is not working with CEP.

2018-03-05 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385836#comment-16385836
 ] 

Aljoscha Krettek commented on FLINK-7756:
-

[~trazdan] & [~shashank734] You think we can also close the remaining two 
issues of FLINK-7830? It seems that the problem was not RocksDB but some 
shading problems/library conflicts.

> RocksDB state backend Checkpointing (Async and Incremental)  is not working 
> with CEP.
> -
>
> Key: FLINK-7756
> URL: https://issues.apache.org/jira/browse/FLINK-7756
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, State Backends, Checkpointing, Streaming
>Affects Versions: 1.4.0, 1.3.2
> Environment: Flink 1.3.2, Yarn, HDFS, RocksDB backend
>Reporter: Shashank Agarwal
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.5.0, 1.4.2
>
> Attachments: jobmanager.log, jobmanager_without_cassandra.log, 
> taskmanager.log, taskmanager_without_cassandra.log
>
>
> When I try to use RocksDBStateBackend on my staging cluster (which uses 
> HDFS as its file system) it crashes. But when I use FsStateBackend on staging 
> (which uses HDFS as its file system) it works fine.
> Locally, with the local file system, it works fine in both cases.
> Please check attached logs. I have around 20-25 tasks in my app.
> {code:java}
> 2017-09-29 14:21:31,639 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=0).
> 2017-09-29 14:21:31,640 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,020 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=1).
> 2017-09-29 14:21:32,022 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,078 INFO  com.datastax.driver.core.NettyUtil  
>   - Found Netty's native epoll transport in the classpath, using 
> it
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (1/2) 
> (b879f192c4e8aae6671cdafb3a24c00a).
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Map (2/2) 
> (1ea5aef6ccc7031edc6b37da2912d90b).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (2/2) 
> (4bac8e764c67520d418a4c755be23d4d).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Co-Flat Map (1/2) (b879f192c4e8aae6671cdafb3a24c00a) switched 
> from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 2 
> for operator Co-Flat Map (1/2).}
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:970)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 2 for 
> operator Co-Flat Map (1/2).
>   ... 6 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:897)
>   ... 5 more
>   Suppressed: java.lang.Exception: Could not properly cancel managed 
> keyed state future.
>   at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:90)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:1023)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:961)
>   ... 5 more
>   Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException
>   at 

[jira] [Commented] (FLINK-7756) RocksDB state backend Checkpointing (Async and Incremental) is not working with CEP.

2018-03-05 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385834#comment-16385834
 ] 

Aljoscha Krettek commented on FLINK-7756:
-

Great that it works now!

Regarding 1) This is probably because you are using a Maven version that is 
newer than 3.3. We have verification in our POMs that no newer version is used 
when creating the release but your build invocation did not build with the 
release profile. There is this section in the doc about it: 
https://ci.apache.org/projects/flink/flink-docs-master/start/building.html#build-flink

Regarding 2) That is a bit surprising, but I can imagine that there are some 
more problems with incompatible libraries that are fixed by your bundling all 
of them. Flink 1.4.2 will have some small changes that should help mitigate 
some of that.

Regarding 3) The difference is that the "without hadoop" version does not 
include Hadoop classes in the lib/ folder. In most cases I would recommend to 
use this version because then you're independent of your Hadoop distribution, 
i.e. it will just pick up the Hadoop dependencies from the classpath when 
running on YARN without risk of clashes with the bundled Hadoop classes.

> RocksDB state backend Checkpointing (Async and Incremental)  is not working 
> with CEP.
> -
>
> Key: FLINK-7756
> URL: https://issues.apache.org/jira/browse/FLINK-7756
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, State Backends, Checkpointing, Streaming
>Affects Versions: 1.4.0, 1.3.2
> Environment: Flink 1.3.2, Yarn, HDFS, RocksDB backend
>Reporter: Shashank Agarwal
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.5.0, 1.4.2
>
> Attachments: jobmanager.log, jobmanager_without_cassandra.log, 
> taskmanager.log, taskmanager_without_cassandra.log
>
>
> When I try to use RocksDBStateBackend on my staging cluster (which uses 
> HDFS as its file system) it crashes. But when I use FsStateBackend on staging 
> (which uses HDFS as its file system) it works fine.
> Locally, with the local file system, it works fine in both cases.
> Please check attached logs. I have around 20-25 tasks in my app.
> {code:java}
> 2017-09-29 14:21:31,639 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=0).
> 2017-09-29 14:21:31,640 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,020 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=1).
> 2017-09-29 14:21:32,022 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,078 INFO  com.datastax.driver.core.NettyUtil  
>   - Found Netty's native epoll transport in the classpath, using 
> it
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (1/2) 
> (b879f192c4e8aae6671cdafb3a24c00a).
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Map (2/2) 
> (1ea5aef6ccc7031edc6b37da2912d90b).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (2/2) 
> (4bac8e764c67520d418a4c755be23d4d).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Co-Flat Map (1/2) (b879f192c4e8aae6671cdafb3a24c00a) switched 
> from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 2 
> for operator Co-Flat Map (1/2).}
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:970)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 2 for 
> operator Co-Flat Map (1/2).
>   ... 6 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> 

[GitHub] flink pull request #5633: [FLINK-8857] [Hbase] Avoid HBase connector read ex...

2018-03-05 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172124498
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

you can add another commit that removes the code


---


[jira] [Commented] (FLINK-8818) Harden YarnFileStageTest upload test for eventual consistent read-after-write

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385947#comment-16385947
 ] 

ASF GitHub Bot commented on FLINK-8818:
---

Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/5601
  
LGTM  


> Harden YarnFileStageTest upload test for eventual consistent read-after-write
> -
>
> Key: FLINK-8818
> URL: https://issues.apache.org/jira/browse/FLINK-8818
> Project: Flink
>  Issue Type: Sub-task
>  Components: FileSystem, Tests, YARN
>Affects Versions: 1.5.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
> Fix For: 1.5.0, 1.4.3
>
>






[GitHub] flink issue #5635: [FLINK-8859][state backend] RocksDB backend should pass W...

2018-03-05 Thread sihuazhou
Github user sihuazhou commented on the issue:

https://github.com/apache/flink/pull/5635
  
@StefanRRichter Could you please have a look at this? This is trivial 
work, but it can improve performance a lot (it looks like we forgot it when 
restoring).


---


[jira] [Assigned] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-8857:
---

Assignee: Xu Zhang

> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Assignee: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed out successfully, at the end the driver 
> throws the following exception.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  





[jira] [Updated] (FLINK-7804) YarnResourceManager does not execute AMRMClientAsync callbacks in main thread

2018-03-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-7804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-7804:

Fix Version/s: 1.5.0

> YarnResourceManager does not execute AMRMClientAsync callbacks in main thread
> -
>
> Key: FLINK-7804
> URL: https://issues.apache.org/jira/browse/FLINK-7804
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, YARN
>Affects Versions: 1.4.0, 1.5.0
>Reporter: Till Rohrmann
>Assignee: Gary Yao
>Priority: Blocker
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> The {{YarnResourceManager}} registers callbacks at a {{AMRMClientAsync}} 
> which it uses to react to Yarn container allocations. These callbacks (e.g. 
> {{onContainersAllocated}} modify the internal state of the 
> {{YarnResourceManager}}. This can lead to race conditions with the 
> {{requestYarnContainer}} method.
> In order to solve this problem we have to execute the state changing 
> operations in the main thread of the {{YarnResourceManager}}.
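The fix described above, funneling all state-changing callback operations onto one main thread, can be sketched generically with a single-threaded executor. The class and method names below are illustrative, not Flink's actual {{YarnResourceManager}} internals:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Generic sketch of the fix described above: callbacks arriving on
 *  arbitrary threads never touch state directly; they hand a closure to a
 *  single-threaded "main thread" executor, so all mutations are serialized.
 *  Names are illustrative, not Flink's actual classes. */
public class MainThreadSketch {
    private final ExecutorService mainThread = Executors.newSingleThreadExecutor();
    private final List<String> pendingContainers = new ArrayList<>(); // mutable state

    // Called by the async client from one of its own threads.
    void onContainersAllocated(String containerId) {
        runInMainThread(() -> pendingContainers.add(containerId));
    }

    void runInMainThread(Runnable stateChange) {
        mainThread.execute(stateChange);
    }

    int shutdownAndCount() throws InterruptedException {
        mainThread.shutdown();
        mainThread.awaitTermination(10, TimeUnit.SECONDS);
        return pendingContainers.size();
    }

    public static void main(String[] args) throws Exception {
        MainThreadSketch rm = new MainThreadSketch();
        int callbacks = 100;
        CountDownLatch submitted = new CountDownLatch(callbacks);
        // Simulate callbacks firing concurrently from many client threads.
        for (int i = 0; i < callbacks; i++) {
            final int id = i;
            new Thread(() -> {
                rm.onContainersAllocated("container-" + id);
                submitted.countDown();
            }).start();
        }
        submitted.await();
        // All mutations ran on the single main thread, so none were lost.
        System.out.println("containers tracked: " + rm.shutdownAndCount());
    }
}
```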





[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385826#comment-16385826
 ] 

ASF GitHub Bot commented on FLINK-8857:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172122162
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

I'd just remove it.


> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed out successfully, at the end the driver 
> throws the following exception.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at com.hulu.ap.flink.scoring.pipeline.ScoringJob.main(ScoringJob.java:82)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
> ... 13 more
> {code}
>  
>  





[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385823#comment-16385823
 ] 

ASF GitHub Bot commented on FLINK-8857:
---

GitHub user neoremind opened a pull request:

https://github.com/apache/flink/pull/5633

[FLINK-8857] [Hbase] Avoid HBase connector read example throwing exception 
at the end

## What is the purpose of the change

*This pull request fixes the problem of the HBase read example throwing an 
exception at the end of program execution.*


## Brief change log

  - *Update example 
`flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java`
 by removing the part causing the problem.*


## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)
  - If yes, how is the feature documented? (not applicable)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/neoremind/flink FLINK-8857

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5633.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5633


commit 97c009a4d2308ad1da5629f9653bed9af352a8f7
Author: neoremind 
Date:   2018-03-05T08:50:37Z

Avoid hbase connector read example throwing exception at the end.




> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>
> Running test case example of
> {code:java}
> flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java{code}
> Although the result is printed out successfully, at the end the driver 
> throws the following exception.
> {code:java}
> 
> The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: The main method 
> caused an error.
> at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:545)
> at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:419)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:381)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:838)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:259)
> at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1086)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1133)
> at org.apache.flink.client.CliFrontend$2.call(CliFrontend.java:1130)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
> at 
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1130)
> Caused by: java.lang.RuntimeException: No new data sinks have been defined 
> since the last execution. The last execution refers to the latest call to 
> 'execute()', 'count()', 'collect()', or 'print()'.
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1050)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1032)
> at 
> org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:59)
> at 
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:926)
> at 

[GitHub] flink pull request #5633: [FLINK-8857] [Hbase] Avoid HBase connector read ex...

2018-03-05 Thread neoremind
Github user neoremind commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172122738
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

Cool! So should I close this PR? 


---
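The failure discussed in this PR comes from calling `env.execute()` after `print()`: in the DataSet API, `print()` is an eager sink that triggers execution itself, so a subsequent `execute()` finds no new sinks and throws. A minimal plain-Java model of that behavior (class and method names here are hypothetical, not Flink's actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model (hypothetical names) of the eager-sink behavior behind
// the exception: print() registers a sink and runs the plan immediately;
// building a plan consumes the registered sinks, so a later execute()
// finds none and throws.
class MiniEnv {
    private final List<String> sinks = new ArrayList<>();

    void print(String dataSet) {
        sinks.add(dataSet); // print() registers a sink...
        execute();          // ...and triggers execution right away
    }

    void execute() {
        if (sinks.isEmpty()) {
            throw new RuntimeException(
                "No new data sinks have been defined since the last execution.");
        }
        sinks.clear(); // the sinks are consumed by this execution
    }
}

public class Main {
    public static void main(String[] args) {
        MiniEnv env = new MiniEnv();
        env.print("hbaseDs"); // succeeds: print() executes the plan eagerly
        try {
            env.execute();    // fails: no new sinks since print()
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Removing the trailing `env.execute()` call, as the diff above does, avoids the exception.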


[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385827#comment-16385827
 ] 

ASF GitHub Bot commented on FLINK-8857:
---

Github user neoremind commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172122738
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

Cool! So should I close this PR? 


> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Assignee: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8829) Flink in EMR(YARN) is down due to Akka communication issue

2018-03-05 Thread Aleksandr Filichkin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385831#comment-16385831
 ] 

Aleksandr Filichkin commented on FLINK-8829:


[~phoenixjiangnan] Have you faced it before? What akka configs should we 
change? Do you mean "akka.client.timeout"?

> Flink in EMR(YARN) is down due to Akka communication issue
> --
>
> Key: FLINK-8829
> URL: https://issues.apache.org/jira/browse/FLINK-8829
> Project: Flink
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 1.3.2
>Reporter: Aleksandr Filichkin
>Priority: Major
>
> Hi,
> We have running Flink 1.3.2 app in Amazon EMR with YARN. Every week our Flink 
> job is down due to:
> _2018-02-16 19:00:04,595 WARN akka.remote.ReliableDeliverySupervisor - 
> Association with remote system 
> [akka.tcp://fl...@ip-10-97-34-209.tr-fr-nonprod.aws-int.thomsonreuters.com:42177]
>  has failed, address is now gated for [5000] ms. Reason: [Association failed 
> with 
> [akka.tcp://fl...@ip-10-97-34-209.tr-fr-nonprod.aws-int.thomsonreuters.com:42177]]
>  Caused by: [Connection refused: 
> ip-10-97-34-209.tr-fr-nonprod.aws-int.thomsonreuters.com/10.97.34.209:42177] 
> 2018-02-16 19:00:05,593 WARN akka.remote.RemoteWatcher - Detected 
> unreachable: 
> [akka.tcp://fl...@ip-10-97-34-209.tr-fr-nonprod.aws-int.thomsonreuters.com:42177]
>  2018-02-16 19:00:05,596 INFO 
> org.apache.flink.runtime.client.JobSubmissionClientActor - Lost connection to 
> JobManager 
> akka.tcp://fl...@ip-10-97-34-209.tr-fr-nonprod.aws-int.thomsonreuters.com:42177/user/jobmanager.
>  Triggering connection timeout._
> Do you have any ideas how to troubleshoot it?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5633: [FLINK-8857] [Hbase] Avoid HBase connector read ex...

2018-03-05 Thread neoremind
Github user neoremind commented on a diff in the pull request:

https://github.com/apache/flink/pull/5633#discussion_r172125701
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseReadExample.java
 ---
@@ -86,8 +86,8 @@ public boolean filter(Tuple2 t) throws 
Exception {
 
hbaseDs.print();
 
-   // kick off execution.
-   env.execute();
+   // kick off execution is not needed.
+   // env.execute();
--- End diff --

Done.


---


[jira] [Commented] (FLINK-5479) Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385863#comment-16385863
 ] 

ASF GitHub Bot commented on FLINK-5479:
---

GitHub user tzulitai opened a pull request:

https://github.com/apache/flink/pull/5634

[FLINK-5479] [kafka] Idleness detection for periodic per-partition 
watermarks in FlinkKafkaConsumer

## What is the purpose of the change

This commit adds the capability to detect idle partitions in the 
`FlinkKafkaConsumer` when using periodic per-partition watermark generation. 
Users set the partition idle timeout using `setPartitionIdleTimeout(long)`. The 
value of the timeout determines how long an idle partition may block 
watermark advancement downstream.


## Brief change log

- Adds a `setPartitionIdleTimeout(long)` configuration method
- Modifies `KafkaTopicPartitionStateWithPeriodicWatermarks` to keep track 
of necessary information to determine partition idleness.
- Adds idleness detection logic to 
`AbstractFetcher.PeriodicWatermarkEmitter`.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (**yes** / no)
  - If yes, how is the feature documented? (not applicable / docs / 
**JavaDocs** / not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tzulitai/flink FLINK-5479

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5634.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5634


commit d7b95ca1c8bb85f77dd6b83becf8fc1d5cccb810
Author: Tzu-Li (Gordon) Tai 
Date:   2018-03-05T09:35:02Z

[FLINK-5479] [kafka] Idleness detection for periodic per-partition 
watermarks

This commit adds the capability to detect idle partitions in the
FlinkKafkaConsumer when using periodic per-partition watermark
generation. Users set the partition idle timeout using
`setPartitionIdleTimeout(long)`. The value of the timeout determines how
long an idle partition may block watermark advancement downstream.




> Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions
> --
>
> Key: FLINK-5479
> URL: https://issues.apache.org/jira/browse/FLINK-5479
> Project: Flink
>  Issue Type: Improvement
>  Components: Kafka Connector
>Reporter: Tzu-Li (Gordon) Tai
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Reported in ML: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-topic-partition-skewness-causes-watermark-not-being-emitted-td11008.html
> Similar to what's happening to idle sources blocking watermark progression in 
> downstream operators (see FLINK-5017), the per-partition watermark mechanism 
> in {{FlinkKafkaConsumer}} is also blocked from progressing watermarks 
> when a partition is idle. The watermark of idle partitions is always 
> {{Long.MIN_VALUE}}, therefore the overall min watermark across all partitions 
> of a consumer subtask will never proceed.
> It's normally not a common case to have Kafka partitions not producing any 
> data, but it'll probably be good to handle this as well. I think we should 
> have a localized solution similar to FLINK-5017 for the per-partition 
> watermarks in {{AbstractFetcher}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
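The idleness detection described in the PR above can be modeled in a few lines: track the last time each partition produced a record, and exclude partitions that have been silent longer than the idle timeout from the min-watermark computation. This is a simplified plain-Java sketch with hypothetical names, not the actual `AbstractFetcher` code:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch (hypothetical names) of per-partition idleness
// detection: a partition that has produced nothing for longer than the
// idle timeout is excluded from the min-watermark computation, so it can
// no longer pin the overall watermark at Long.MIN_VALUE.
class PartitionWatermarkTracker {
    private final long idleTimeoutMs;
    private final Map<String, Long> watermarks = new HashMap<>();
    private final Map<String, Long> lastActivity = new HashMap<>();

    PartitionWatermarkTracker(long idleTimeoutMs) {
        this.idleTimeoutMs = idleTimeoutMs;
    }

    void onRecord(String partition, long watermark, long nowMs) {
        watermarks.put(partition, watermark);
        lastActivity.put(partition, nowMs);
    }

    // Minimum watermark over all partitions that are not idle at nowMs.
    long currentWatermark(long nowMs) {
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : watermarks.entrySet()) {
            boolean idle = nowMs - lastActivity.get(e.getKey()) > idleTimeoutMs;
            if (!idle) {
                min = Math.min(min, e.getValue());
            }
        }
        return min == Long.MAX_VALUE ? Long.MIN_VALUE : min;
    }
}

public class Main {
    public static void main(String[] args) {
        PartitionWatermarkTracker tracker = new PartitionWatermarkTracker(1000);
        tracker.onRecord("p0", Long.MIN_VALUE, 4500); // p0 never advances again
        tracker.onRecord("p1", 100L, 5000);
        // p0 is still considered active, so it holds the watermark down:
        System.out.println(tracker.currentWatermark(5000) == Long.MIN_VALUE); // true
        // 1.5s later p0 is idle and no longer blocks advancement:
        System.out.println(tracker.currentWatermark(6000)); // 100
    }
}
```

The key design point mirrors FLINK-5017: idleness is a local, per-partition decision based on a user-configured timeout, trading a bounded risk of premature watermark advancement for guaranteed progress.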


[jira] [Commented] (FLINK-8857) HBase connector read example throws exception at the end.

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385953#comment-16385953
 ] 

ASF GitHub Bot commented on FLINK-8857:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5633
  
merging.


> HBase connector read example throws exception at the end.
> -
>
> Key: FLINK-8857
> URL: https://issues.apache.org/jira/browse/FLINK-8857
> Project: Flink
>  Issue Type: Bug
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0, 1.3.2, 1.5.0, 1.4.1
>Reporter: Xu Zhang
>Assignee: Xu Zhang
>Priority: Trivial
>  Labels: easy-fix, starter
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5481: [FLINK-8560] Access to the current key in ProcessF...

2018-03-05 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172135390
  
--- Diff: docs/dev/stream/operators/process_function.md ---
@@ -242,4 +242,17 @@ class CountWithTimeoutFunction extends 
ProcessFunction[(String, String), (String
 the current processing time as event-time timestamp. This behavior is very 
subtle and might not be noticed by users. Well, it's
 harmful because processing-time timestamps are indeterministic and not 
aligned with watermarks. Besides, user-implemented logic
 depends on this wrong timestamp highly likely is unintendedly faulty. So 
we've decided to fix it. Upon upgrading to 1.4.0, Flink jobs
-that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
\ No newline at end of file
+that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
+
+## The KeyedProcessFunction
+
+`KeyedProcessFunction`, as an extension of `ProcessFunction`, gives access 
to the key of timers in its `onTimer(...)`
+method.
+
+{% highlight java %}
--- End diff --

Maybe also add Scala example code.


---


[GitHub] flink pull request #5481: [FLINK-8560] Access to the current key in ProcessF...

2018-03-05 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172135104
  
--- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/KeyedProcessOperator.java
 ---
@@ -70,21 +69,15 @@ public void open() throws Exception {
@Override
public void onEventTime(InternalTimer timer) throws 
Exception {
collector.setAbsoluteTimestamp(timer.getTimestamp());
-   onTimerContext.timeDomain = TimeDomain.EVENT_TIME;
-   onTimerContext.timer = timer;
-   userFunction.onTimer(timer.getTimestamp(), onTimerContext, 
collector);
-   onTimerContext.timeDomain = null;
-   onTimerContext.timer = null;
+   reinitialize(userFunction, TimeDomain.EVENT_TIME, timer);
--- End diff --

Hate to be picky, but I think the name is a bit misleading and we could 
probably put all of this in a method `invokeUserTime()` that does what 
`reinitialise()` and `reset()` do.

@kl0u I think you can quickly fix that when merging.


---
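The refactoring suggested above — one helper that sets the context fields, invokes the user function, and resets the fields — can be sketched as follows. This is a simplified model with hypothetical names; the try/finally reset is an added safeguard beyond what the reviewer explicitly asked for:

```java
// Simplified model (hypothetical names) of the suggested refactoring: one
// helper sets the context fields, invokes the user callback, and resets
// the fields. The try/finally ensures the context is also reset when the
// user function throws.
class OnTimerContext {
    String timeDomain; // stands in for the real TimeDomain enum
    Long timer;        // stands in for the real InternalTimer
}

interface TimerCallback {
    void onTimer(long timestamp, OnTimerContext ctx) throws Exception;
}

class MiniOperator {
    final OnTimerContext ctx = new OnTimerContext();

    void invokeUserFunction(TimerCallback f, String timeDomain, long timestamp)
            throws Exception {
        ctx.timeDomain = timeDomain;
        ctx.timer = timestamp;
        try {
            f.onTimer(timestamp, ctx);
        } finally {
            ctx.timeDomain = null; // always reset, even on failure
            ctx.timer = null;
        }
    }
}

public class Main {
    public static void main(String[] args) throws Exception {
        MiniOperator op = new MiniOperator();
        op.invokeUserFunction(
            (ts, ctx) -> System.out.println(ctx.timeDomain + " timer @ " + ts),
            "EVENT_TIME", 42L);
        System.out.println(op.ctx.timeDomain); // null: context was reset
    }
}
```

Consolidating the set/invoke/reset sequence in one place means a future code path (e.g. a new time domain) cannot forget the reset step.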


[jira] [Commented] (FLINK-8560) add KeyedProcessFunction to expose the key in onTimer() and other methods

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385877#comment-16385877
 ] 

ASF GitHub Bot commented on FLINK-8560:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172133573
  
--- Diff: docs/dev/stream/operators/process_function.md ---
@@ -242,4 +242,17 @@ class CountWithTimeoutFunction extends 
ProcessFunction[(String, String), (String
 the current processing time as event-time timestamp. This behavior is very 
subtle and might not be noticed by users. Well, it's
 harmful because processing-time timestamps are indeterministic and not 
aligned with watermarks. Besides, user-implemented logic
 depends on this wrong timestamp highly likely is unintendedly faulty. So 
we've decided to fix it. Upon upgrading to 1.4.0, Flink jobs
-that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
\ No newline at end of file
+that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
+
+## The KeyedProcessFunction
+
+`KeyedProcessFunction`, as an extension of `ProcessFunction`, gives access 
to the key of timers in its `onTimer(...)`
+method.
+
+{% highlight java %}
+@Override
+public void onTimer(long timestamp, OnTimerContext ctx, Collector 
out) throws Exception {
--- End diff --

I believe this is now `public void onTimer(long timestamp, 
OnTimerContext ctx, Collector out)`, right? @kl0u you could fix this 
while merging.


> add KeyedProcessFunction to expose the key in onTimer() and other methods
> -
>
> Key: FLINK-8560
> URL: https://issues.apache.org/jira/browse/FLINK-8560
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.4.0
>Reporter: Jürgen Thomann
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.5.0
>
>
> Currently it is required to store the key of a keyBy() in the processElement 
> method to have access to it in the OnTimerContext.
> This is awkward, because the processElement method has to check for every 
> element whether the key is already stored, and set it if not.
> A possible solution would adding OnTimerContext#getCurrentKey() or a similar 
> method. Maybe having it in the open() method could maybe work as well.
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Getting-Key-from-keyBy-in-ProcessFunction-tt18126.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
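The mechanism that makes the requested feature possible can be modeled simply: the internal timer is registered under the current key, so the key travels with the timer and can be handed back when the timer fires, with no need to stash it in state during processElement(). A plain-Java sketch with hypothetical names (not Flink's actual timer service):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model (hypothetical names): the timer stores the key it was
// registered under, so the key is available again when the timer fires.
class KeyedTimer<K> {
    final K key;
    final long timestamp;

    KeyedTimer(K key, long timestamp) {
        this.key = key;
        this.timestamp = timestamp;
    }
}

interface OnTimer<K> {
    void onTimer(long timestamp, K key);
}

class MiniTimerService<K> {
    private final List<KeyedTimer<K>> timers = new ArrayList<>();

    void registerTimer(K currentKey, long timestamp) {
        timers.add(new KeyedTimer<>(currentKey, timestamp)); // key stored with timer
    }

    // Fire all timers due at 'time'; the callback receives the timer's key.
    void advanceTo(long time, OnTimer<K> callback) {
        for (KeyedTimer<K> t : timers) {
            if (t.timestamp <= time) {
                callback.onTimer(t.timestamp, t.key);
            }
        }
    }
}

public class Main {
    public static void main(String[] args) {
        MiniTimerService<String> timers = new MiniTimerService<>();
        timers.registerTimer("user-1", 10L);
        timers.registerTimer("user-2", 20L);
        timers.advanceTo(15L, (ts, key) -> System.out.println(key + " fired @ " + ts));
        // prints: user-1 fired @ 10
    }
}
```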


[jira] [Commented] (FLINK-8560) add KeyedProcessFunction to expose the key in onTimer() and other methods

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385875#comment-16385875
 ] 

ASF GitHub Bot commented on FLINK-8560:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172135104
  
--- Diff: 
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/KeyedProcessOperator.java
 ---
@@ -70,21 +69,15 @@ public void open() throws Exception {
@Override
public void onEventTime(InternalTimer timer) throws 
Exception {
collector.setAbsoluteTimestamp(timer.getTimestamp());
-   onTimerContext.timeDomain = TimeDomain.EVENT_TIME;
-   onTimerContext.timer = timer;
-   userFunction.onTimer(timer.getTimestamp(), onTimerContext, 
collector);
-   onTimerContext.timeDomain = null;
-   onTimerContext.timer = null;
+   reinitialize(userFunction, TimeDomain.EVENT_TIME, timer);
--- End diff --

Hate to be picky, but I think the name is a bit misleading and we could 
probably put all of this in a method `invokeUserTime()` that does what 
`reinitialise()` and `reset()` do.

@kl0u I think you can quickly fix that when merging.


> add KeyedProcessFunction to expose the key in onTimer() and other methods
> -
>
> Key: FLINK-8560
> URL: https://issues.apache.org/jira/browse/FLINK-8560
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.4.0
>Reporter: Jürgen Thomann
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.5.0
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8560) add KeyedProcessFunction to expose the key in onTimer() and other methods

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385876#comment-16385876
 ] 

ASF GitHub Bot commented on FLINK-8560:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/5481#discussion_r172135390
  
--- Diff: docs/dev/stream/operators/process_function.md ---
@@ -242,4 +242,17 @@ class CountWithTimeoutFunction extends 
ProcessFunction[(String, String), (String
 the current processing time as event-time timestamp. This behavior is very 
subtle and might not be noticed by users. Well, it's
 harmful because processing-time timestamps are indeterministic and not 
aligned with watermarks. Besides, user-implemented logic
 depends on this wrong timestamp highly likely is unintendedly faulty. So 
we've decided to fix it. Upon upgrading to 1.4.0, Flink jobs
-that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
\ No newline at end of file
+that are using this incorrect event-time timestamp will fail, and users 
should adapt their jobs to the correct logic.
+
+## The KeyedProcessFunction
+
+`KeyedProcessFunction`, as an extension of `ProcessFunction`, gives access 
to the key of timers in its `onTimer(...)`
+method.
+
+{% highlight java %}
--- End diff --

Maybe also add Scala example code.


> add KeyedProcessFunction to expose the key in onTimer() and other methods
> -
>
> Key: FLINK-8560
> URL: https://issues.apache.org/jira/browse/FLINK-8560
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Affects Versions: 1.4.0
>Reporter: Jürgen Thomann
>Assignee: Bowen Li
>Priority: Major
> Fix For: 1.5.0
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5630: [FLINK-8849][Documentation] Fix link to chaining docs

2018-03-05 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5630
  
We don't _require_ JIRAs for small doc fixes, whether you open one is up to 
you.


---


[GitHub] flink issue #5627: [doc] Remove missed CheckpointedRestoring

2018-03-05 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5627
  
merging.


---


[GitHub] flink issue #5633: [FLINK-8857] [Hbase] Avoid HBase connector read example t...

2018-03-05 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5633
  
merging.


---


[GitHub] flink pull request #5611: [FLINK-8769][flip6] do not print error causing exc...

2018-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5611


---


[GitHub] flink pull request #5317: [FLINK-8458] Add the switch for keeping both the o...

2018-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5317


---


[GitHub] flink pull request #5636: [FLINK-8703][tests] Port CancelingTestBase to Mini...

2018-03-05 Thread zentol
GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5636

 [FLINK-8703][tests] Port CancelingTestBase to MiniClusterResource 

## What is the purpose of the change

The `CancelingTestBase` now uses the `MiniClusterResource` and can be run 
against both legacy and flip6 clusters.


## Brief change log

* Do not use singleActorSystem in LocalFlinkMiniCluster as this rendered 
the returned client unusable
* port `CancelingTestBase`
* properly disable JoinCancelingITCase


## Verifying this change

Run `MapCancelingITCase` with flip6 profile on/off.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 8703_canceling

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5636.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5636


commit 24d95a00164f4b93ff30a237680cf4772855d7fc
Author: zentol 
Date:   2018-03-05T12:45:33Z

[hotfix][tests] Do not use singleActorSystem in LocalFlinkMiniCluster

Using a singleActorSystem rendered the returned client unusable.

commit 3fad83426a6356dee7966cf9e55d0de40b3bf6da
Author: zentol 
Date:   2018-02-26T14:36:37Z

[FLINK-8703][tests] Port CancelingTestBase to MiniClusterResource

commit 07c77df577e62bcbfc4aeac3e5220151768319dd
Author: zentol 
Date:   2018-02-28T12:43:42Z

[hotfix][tests] Properly disable JoinCancelingITCase




---


[jira] [Closed] (FLINK-8859) RocksDB backend should pass WriteOption to Rocks.put() when restoring

2018-03-05 Thread Stefan Richter (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter closed FLINK-8859.
-
  Resolution: Fixed
Release Note: Merged in 131daa28bf.

> RocksDB backend should pass WriteOption to Rocks.put() when restoring
> -
>
> Key: FLINK-8859
> URL: https://issues.apache.org/jira/browse/FLINK-8859
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Major
> Fix For: 1.5.0
>
>
> We should pass a `WriteOptions` instance to RocksDB's `put()` when restoring from a 
> handle (in both full and incremental checkpoints). With 
> `WriteOptions.setDisableWAL(true)`, restore performance can be increased by about 
> 2 times.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-8703) Migrate tests from LocalFlinkMiniCluster to MiniClusterResource

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386031#comment-16386031
 ] 

ASF GitHub Bot commented on FLINK-8703:
---

GitHub user zentol opened a pull request:

https://github.com/apache/flink/pull/5636

 [FLINK-8703][tests] Port CancelingTestBase to MiniClusterResource 

## What is the purpose of the change

The `CancelingTestBase` now uses the `MiniClusterResource` and can be run 
against both legacy and flip6 clusters.


## Brief change log

* Do not use singleActorSystem in LocalFlinkMiniCluster, as this rendered 
the returned client unusable
* port `CancelingTestBase`
* properly disable JoinCancelingITCase


## Verifying this change

Run `MapCancelingITCase` with flip6 profile on/off.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zentol/flink 8703_canceling

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5636.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5636


commit 24d95a00164f4b93ff30a237680cf4772855d7fc
Author: zentol 
Date:   2018-03-05T12:45:33Z

[hotfix][tests] Do not use singleActorSystem in LocalFlinkMiniCluster

Using a singleActorSystem rendered the returned client unusable.

commit 3fad83426a6356dee7966cf9e55d0de40b3bf6da
Author: zentol 
Date:   2018-02-26T14:36:37Z

[FLINK-8703][tests] Port CancelingTestBase to MiniClusterResource

commit 07c77df577e62bcbfc4aeac3e5220151768319dd
Author: zentol 
Date:   2018-02-28T12:43:42Z

[hotfix][tests] Properly disable JoinCancelingITCase




> Migrate tests from LocalFlinkMiniCluster to MiniClusterResource
> ---
>
> Key: FLINK-8703
> URL: https://issues.apache.org/jira/browse/FLINK-8703
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Aljoscha Krettek
>Assignee: Chesnay Schepler
>Priority: Blocker
> Fix For: 1.5.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8862) Support HBase snapshot read

2018-03-05 Thread Xu Zhang (JIRA)
Xu Zhang created FLINK-8862:
---

 Summary: Support HBase snapshot read
 Key: FLINK-8862
 URL: https://issues.apache.org/jira/browse/FLINK-8862
 Project: Flink
  Issue Type: Improvement
  Components: Batch Connectors and Input/Output Formats
Affects Versions: 1.2.0
Reporter: Xu Zhang


The flink-hbase connector only supports reading/scanning HBase through the region 
server scanner. There is also a snapshot scanning solution: Hadoop provides two 
ways to scan HBase, one being TableInputFormat, the other 
TableSnapshotInputFormat. It would be great if Flink supported both solutions, 
to widen the usage scope and provide alternatives for users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-8863) Add user-defined function support in SQL Client

2018-03-05 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8863:
---

 Summary: Add user-defined function support in SQL Client
 Key: FLINK-8863
 URL: https://issues.apache.org/jira/browse/FLINK-8863
 Project: Flink
  Issue Type: Sub-task
  Components: Table API  SQL
Reporter: Timo Walther


This issue is a subtask of part two "Full Embedded SQL Client" of the 
implementation plan mentioned in 
[FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client].

It should be possible to declare user-defined functions in the SQL client. For 
now, we limit the registration to classes that implement {{ScalarFunction}}, 
{{TableFunction}}, {{AggregateFunction}}. Functions that are implemented in SQL 
are not part of this issue.

I would suggest introducing a {{functions}} top-level property. The 
declaration could look similar to:

{code}
functions:
  - name: testFunction
    from: class                  <-- optional, default: class
    class: org.my.MyScalarFunction
    constructor:                 <-- optional, needed for certain types of functions
      - 42.0
      - class: org.my.Class      <-- possibility to create objects via properties
        constructor:
          - 1
          - true
          - false
          - "whatever"
          - type: INT
            value: 1
{code}
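Resolving such a declaration essentially means loading the named class and picking a constructor that matches the declared argument list. A minimal reflective sketch of that step, assuming nothing about the SQL Client's actual implementation (class and method names here are illustrative):

```java
import java.lang.reflect.Constructor;

public class FunctionResolver {

    // Resolves a function declaration to an instance: 'className' plays the
    // role of the 'class' property, 'args' the 'constructor' list.
    public static Object instantiate(String className, Object... args) throws Exception {
        Class<?> clazz = Class.forName(className);
        for (Constructor<?> c : clazz.getConstructors()) {
            Class<?>[] params = c.getParameterTypes();
            if (params.length != args.length) {
                continue;
            }
            boolean matches = true;
            for (int i = 0; i < params.length; i++) {
                // Pick the first public constructor whose parameter types
                // accept the declared argument values.
                if (!params[i].isInstance(args[i])) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                return c.newInstance(args);
            }
        }
        throw new IllegalArgumentException("No matching constructor for " + className);
    }

    public static void main(String[] args) throws Exception {
        // StringBuilder stands in for a user-defined function class here.
        Object fn = instantiate("java.lang.StringBuilder", "testFunction");
        System.out.println(fn.getClass().getName());
    }
}
```

Nested object declarations (the inner `class`/`constructor` entries) would be resolved by applying the same step recursively before invoking the outer constructor.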





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7756) RocksDB state backend Checkpointing (Async and Incremental) is not working with CEP.

2018-03-05 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386138#comment-16386138
 ] 

Aljoscha Krettek commented on FLINK-7756:
-

[~shashank734] Yes, because that should also fail with a better error message.

> RocksDB state backend Checkpointing (Async and Incremental)  is not working 
> with CEP.
> -
>
> Key: FLINK-7756
> URL: https://issues.apache.org/jira/browse/FLINK-7756
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, State Backends, Checkpointing, Streaming
>Affects Versions: 1.4.0, 1.3.2
> Environment: Flink 1.3.2, Yarn, HDFS, RocksDB backend
>Reporter: Shashank Agarwal
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.5.0, 1.4.1, 1.4.2
>
> Attachments: jobmanager.log, jobmanager_without_cassandra.log, 
> taskmanager.log, taskmanager_without_cassandra.log
>
>
> When I try to use RocksDBStateBackend on my staging cluster (which uses HDFS 
> as its file system) it crashes, but when I use FsStateBackend on the same 
> staging cluster it works fine.
> Locally, with a local file system, both cases work fine.
> Please check the attached logs. I have around 20-25 tasks in my app.
> {code:java}
> 2017-09-29 14:21:31,639 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=0).
> 2017-09-29 14:21:31,640 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,020 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=1).
> 2017-09-29 14:21:32,022 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,078 INFO  com.datastax.driver.core.NettyUtil  
>   - Found Netty's native epoll transport in the classpath, using 
> it
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (1/2) 
> (b879f192c4e8aae6671cdafb3a24c00a).
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Map (2/2) 
> (1ea5aef6ccc7031edc6b37da2912d90b).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (2/2) 
> (4bac8e764c67520d418a4c755be23d4d).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Co-Flat Map (1/2) (b879f192c4e8aae6671cdafb3a24c00a) switched 
> from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 2 
> for operator Co-Flat Map (1/2).}
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:970)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 2 for 
> operator Co-Flat Map (1/2).
>   ... 6 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:897)
>   ... 5 more
>   Suppressed: java.lang.Exception: Could not properly cancel managed 
> keyed state future.
>   at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:90)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:1023)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:961)
>   ... 5 more
>   Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at 

[jira] [Updated] (FLINK-8858) Add support for INSERT INTO in SQL Client

2018-03-05 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-8858:

Issue Type: Sub-task  (was: New Feature)
Parent: FLINK-7594

> Add support for INSERT INTO in SQL Client
> -
>
> Key: FLINK-8858
> URL: https://issues.apache.org/jira/browse/FLINK-8858
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API  SQL
>Affects Versions: 1.6.0
>Reporter: Renjie Liu
>Assignee: Renjie Liu
>Priority: Major
>
> The current design of the SQL Client's embedded mode doesn't support long-running 
> queries. For simple jobs that can be expressed in a single SQL statement, it 
> would be useful to submit SQL statements stored in files as long-running 
> queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8858) Add support for INSERT INTO in SQL Client

2018-03-05 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-8858:

Summary: Add support for INSERT INTO in SQL Client  (was: SQL Client to 
submit long running query in file)

> Add support for INSERT INTO in SQL Client
> -
>
> Key: FLINK-8858
> URL: https://issues.apache.org/jira/browse/FLINK-8858
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API  SQL
>Affects Versions: 1.6.0
>Reporter: Renjie Liu
>Assignee: Renjie Liu
>Priority: Major
>
> The current design of the SQL Client's embedded mode doesn't support long-running 
> queries. For simple jobs that can be expressed in a single SQL statement, it 
> would be useful to submit SQL statements stored in files as long-running 
> queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-8860) SlotManager spamming log files

2018-03-05 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-8860:

Priority: Blocker  (was: Critical)

> SlotManager spamming log files
> --
>
> Key: FLINK-8860
> URL: https://issues.apache.org/jira/browse/FLINK-8860
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, ResourceManager
>Affects Versions: 1.5.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> {{SlotManager}} is spamming the log files a lot with
> {code}
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance b16c4e516995d1e672c0933bb380770c.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance de58fbf1c069620a4275c8b529deb20b.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 86ab5a7e1d57bb2883fc0d1f2aebb304.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 90f9ab2bd433db41b3ab567fd246fb3c.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance ec99fcc5a801272402af9afe08a1001d.
> 2018-03-05 10:45:12,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 4c1c4b5ce52195dc90196c10c26d9ef8.
> 2018-03-05 10:45:12,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 2541d0f1398fc307aaf86bf7750535f1.
> 2018-03-05 10:45:12,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 91d94a9fdfce3cbcef32ac1c6e7b3fbf.
> 2018-03-05 10:45:22,392 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 91d94a9fdfce3cbcef32ac1c6e7b3fbf.
> 2018-03-05 10:45:22,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 90f9ab2bd433db41b3ab567fd246fb3c.
> {code}
> This message is printed once per {{TaskManager}} heartbeat.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5640: [FLINK-8839] [sql-client] Fix table source factory...

2018-03-05 Thread twalthr
GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/5640

[FLINK-8839] [sql-client] Fix table source factory discovery

## What is the purpose of the change

This PR fixes the table source factory discovery by adding dependencies to 
the classloader. It also implements an `ExecutionContext` that can be reused 
during the same session.


## Brief change log

- New `ExecutionContext` abstraction
- Possibility to pass a classloader to the Java service provider
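The class loader change corresponds to the standard `java.util.ServiceLoader` overload that takes an explicit class loader, so that factories packaged in user-supplied jars become discoverable. A minimal, self-contained sketch of that mechanism (the `Factory` interface is a stand-in, not Flink's actual factory SPI):

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ServiceLoader;

public class FactoryDiscovery {

    // Stand-in for the factory SPI; the real code would use Flink's interface.
    public interface Factory {}

    public static int countFactories(ClassLoader cl) {
        int count = 0;
        // Passing the class loader explicitly lets the provider lookup see
        // META-INF/services entries from jars added at runtime, not only
        // those on the original classpath.
        for (Factory f : ServiceLoader.load(Factory.class, cl)) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // An empty URLClassLoader: no jars registered, so no factories found.
        try (URLClassLoader cl = new URLClassLoader(new URL[0])) {
            System.out.println("factories found: " + countFactories(cl));
        }
    }
}
```

With session jars, the class loader would be a `URLClassLoader` built over the jar URLs, so each session can discover its own factories.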


## Verifying this change

- See `DependencyTest`

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): no
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? no
  - If yes, how is the feature documented? JavaDocs


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink FLINK-8839

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5640


commit 8c7b1427f94082dc023073125b32eceda556d8cd
Author: Timo Walther 
Date:   2018-03-05T12:46:41Z

[FLINK-8839] [sql-client] Fix table source factory discovery




---


[jira] [Commented] (FLINK-8839) Table source factory discovery is broken in SQL Client

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386152#comment-16386152
 ] 

ASF GitHub Bot commented on FLINK-8839:
---

GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/5640

[FLINK-8839] [sql-client] Fix table source factory discovery

## What is the purpose of the change

This PR fixes the table source factory discovery by adding dependencies to 
the classloader. It also implements an `ExecutionContext` that can be reused 
during the same session.


## Brief change log

- New `ExecutionContext` abstraction
- Possibility to pass a classloader to the Java service provider


## Verifying this change

- See `DependencyTest`

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): no
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? no
  - If yes, how is the feature documented? JavaDocs


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink FLINK-8839

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5640


commit 8c7b1427f94082dc023073125b32eceda556d8cd
Author: Timo Walther 
Date:   2018-03-05T12:46:41Z

[FLINK-8839] [sql-client] Fix table source factory discovery




> Table source factory discovery is broken in SQL Client
> --
>
> Key: FLINK-8839
> URL: https://issues.apache.org/jira/browse/FLINK-8839
> Project: Flink
>  Issue Type: Bug
>  Components: Table API  SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Table source factories cannot be discovered if they were added using a 
> jar file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #5635: [FLINK-8859][state backend] RocksDB backend should pass W...

2018-03-05 Thread StefanRRichter
Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/5635
  
LGTM 👍 


---


[jira] [Commented] (FLINK-8859) RocksDB backend should pass WriteOption to Rocks.put() when restoring

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386022#comment-16386022
 ] 

ASF GitHub Bot commented on FLINK-8859:
---

Github user StefanRRichter commented on the issue:

https://github.com/apache/flink/pull/5635
  
LGTM  


> RocksDB backend should pass WriteOption to Rocks.put() when restoring
> -
>
> Key: FLINK-8859
> URL: https://issues.apache.org/jira/browse/FLINK-8859
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Major
> Fix For: 1.5.0
>
>
> We should pass a `WriteOptions` instance to RocksDB's `put()` when restoring from a 
> handle (in both full and incremental checkpoints). With 
> `WriteOptions.setDisableWAL(true)`, restore performance can be increased by about 
> 2 times.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-7756) RocksDB state backend Checkpointing (Async and Incremental) is not working with CEP.

2018-03-05 Thread Shashank Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-7756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386040#comment-16386040
 ] 

Shashank Agarwal commented on FLINK-7756:
-

[~aljoscha] We found one thing: in our past setup, we had an entry in our 
Flink conf file for the default scheme.

{code}
fs.default-scheme: hdfs://mydomain.com:8020/flink
{code}

After removing that entry, it now works fine, also with the previous Flink build 
that we built from source against the HDP version. So this problem is solved, 
but since this is still an issue, should I report a new bug for it? Closing this 
issue. Thanks for your great support.

> RocksDB state backend Checkpointing (Async and Incremental)  is not working 
> with CEP.
> -
>
> Key: FLINK-7756
> URL: https://issues.apache.org/jira/browse/FLINK-7756
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP, State Backends, Checkpointing, Streaming
>Affects Versions: 1.4.0, 1.3.2
> Environment: Flink 1.3.2, Yarn, HDFS, RocksDB backend
>Reporter: Shashank Agarwal
>Assignee: Aljoscha Krettek
>Priority: Blocker
> Fix For: 1.5.0, 1.4.2
>
> Attachments: jobmanager.log, jobmanager_without_cassandra.log, 
> taskmanager.log, taskmanager_without_cassandra.log
>
>
> When I try to use RocksDBStateBackend on my staging cluster (which uses HDFS 
> as its file system) it crashes, but when I use FsStateBackend on the same 
> staging cluster it works fine.
> Locally, with a local file system, both cases work fine.
> Please check the attached logs. I have around 20-25 tasks in my app.
> {code:java}
> 2017-09-29 14:21:31,639 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=0).
> 2017-09-29 14:21:31,640 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,020 INFO  
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink  - No state 
> to restore for the BucketingSink (taskIdx=1).
> 2017-09-29 14:21:32,022 INFO  
> org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend  - 
> Initializing RocksDB keyed state backend from snapshot.
> 2017-09-29 14:21:32,078 INFO  com.datastax.driver.core.NettyUtil  
>   - Found Netty's native epoll transport in the classpath, using 
> it
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (1/2) 
> (b879f192c4e8aae6671cdafb3a24c00a).
> 2017-09-29 14:21:34,177 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Map (2/2) 
> (1ea5aef6ccc7031edc6b37da2912d90b).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Attempting to fail task externally Co-Flat Map (2/2) 
> (4bac8e764c67520d418a4c755be23d4d).
> 2017-09-29 14:21:34,178 INFO  org.apache.flink.runtime.taskmanager.Task   
>   - Co-Flat Map (1/2) (b879f192c4e8aae6671cdafb3a24c00a) switched 
> from RUNNING to FAILED.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 2 
> for operator Co-Flat Map (1/2).}
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:970)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 2 for 
> operator Co-Flat Map (1/2).
>   ... 6 more
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:897)
>   ... 5 more
>   Suppressed: java.lang.Exception: Could not properly cancel managed 
> keyed state future.
>   at 
> org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:90)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:1023)
>   at 
> 

[jira] [Commented] (FLINK-8274) Fix Java 64K method compiling limitation for CommonCalc

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386109#comment-16386109
 ] 

ASF GitHub Bot commented on FLINK-8274:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/5613
  
@Xpray and @hequn8128 are you fine with merging this PR for now and then 
open follow-up issues for more splitting (unboxing, expression, class)?


> Fix Java 64K method compiling limitation for CommonCalc
> ---
>
> Key: FLINK-8274
> URL: https://issues.apache.org/jira/browse/FLINK-8274
> Project: Flink
>  Issue Type: Bug
>  Components: Table API  SQL
>Affects Versions: 1.5.0
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>Priority: Critical
>
> For complex SQL queries, the generated code for {{DataStreamCalc}} and 
> {{DataSetCalc}} may exceed Java's 64KB method length limit.
>  
> This issue will split long methods into several sub-method calls.
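The 64KB limit applies per method, not per class, so a generator can move statements into fixed-size helper methods invoked from a driver method. A toy sketch of such chunked code generation, assuming a simple list of statements (this is not Flink's actual code generator):

```java
import java.util.List;

public class MethodSplitter {

    // Wraps generated statements into helper methods of at most 'chunkSize'
    // statements each, plus a driver method that calls them in order.
    public static String split(List<String> statements, int chunkSize) {
        StringBuilder methods = new StringBuilder();
        StringBuilder driver = new StringBuilder("void apply() {\n");
        for (int i = 0; i * chunkSize < statements.size(); i++) {
            driver.append("  split").append(i).append("();\n");
            methods.append("void split").append(i).append("() {\n");
            int end = Math.min((i + 1) * chunkSize, statements.size());
            for (String stmt : statements.subList(i * chunkSize, end)) {
                methods.append("  ").append(stmt).append("\n");
            }
            methods.append("}\n");
        }
        return driver.append("}\n").append(methods).toString();
    }

    public static void main(String[] args) {
        // Three statements with a chunk size of two yield split0() and split1().
        System.out.println(split(List.of("a = 1;", "b = 2;", "c = a + b;"), 2));
    }
}
```

Real generated code additionally has to thread local variables across the helper methods (e.g. via fields), which is why further splitting of unboxing and expression code is discussed as follow-up work.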



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5635: [FLINK-8859][state backend] RocksDB backend should...

2018-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5635


---


[jira] [Commented] (FLINK-8859) RocksDB backend should pass WriteOption to Rocks.put() when restoring

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386029#comment-16386029
 ] 

ASF GitHub Bot commented on FLINK-8859:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5635


> RocksDB backend should pass WriteOption to Rocks.put() when restoring
> -
>
> Key: FLINK-8859
> URL: https://issues.apache.org/jira/browse/FLINK-8859
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Affects Versions: 1.5.0
>Reporter: Sihua Zhou
>Assignee: Sihua Zhou
>Priority: Major
> Fix For: 1.5.0
>
>
> We should pass a `WriteOptions` instance to RocksDB's `put()` when restoring from a 
> handle (in both full and incremental checkpoints). With 
> `WriteOptions.setDisableWAL(true)`, restore performance can be increased by about 
> 2 times.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-4387) Instability in KvStateClientTest.testClientServerIntegration()

2018-03-05 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-4387:
---
Priority: Major  (was: Blocker)

> Instability in KvStateClientTest.testClientServerIntegration()
> --
>
> Key: FLINK-4387
> URL: https://issues.apache.org/jira/browse/FLINK-4387
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.5.0, 1.6.0
>Reporter: Robert Metzger
>Assignee: Nico Kruber
>Priority: Major
>  Labels: test-stability
> Fix For: 1.2.0, 1.6.0
>
>
> According to this log: 
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/151491745/log.txt
> the {{KvStateClientTest}} didn't complete.
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7fb2b400a000 nid=0x29dc in Object.wait() 
> [0x7fb2bcb3b000]
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0xf7c049a0> (a 
> io.netty.util.concurrent.DefaultPromise)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
>   - locked <0xf7c049a0> (a 
> io.netty.util.concurrent.DefaultPromise)
>   at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
>   at 
> org.apache.flink.runtime.query.netty.KvStateServer.shutDown(KvStateServer.java:185)
>   at 
> org.apache.flink.runtime.query.netty.KvStateClientTest.testClientServerIntegration(KvStateClientTest.java:680)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {code}
> and
> {code}
> Exception in thread "globalEventExecutor-1-3" java.lang.AssertionError
>   at 
> io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(AbstractScheduledEventExecutor.java:83)
>   at 
> io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue(GlobalEventExecutor.java:110)
>   at 
> io.netty.util.concurrent.GlobalEventExecutor.takeTask(GlobalEventExecutor.java:95)
>   at 
> io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run(GlobalEventExecutor.java:226)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-4387) Instability in KvStateClientTest.testClientServerIntegration()

2018-03-05 Thread Nico Kruber (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386061#comment-16386061
 ] 

Nico Kruber commented on FLINK-4387:


Upgrading Netty should solve this bug as well.

> Instability in KvStateClientTest.testClientServerIntegration()
> --
>
> Key: FLINK-4387
> URL: https://issues.apache.org/jira/browse/FLINK-4387
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0, 1.5.0, 1.6.0
>Reporter: Robert Metzger
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.2.0, 1.6.0
>
>
> According to this log: 
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/151491745/log.txt
> the {{KvStateClientTest}} didn't complete.
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7fb2b400a000 nid=0x29dc in Object.wait() 
> [0x7fb2bcb3b000]
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   - waiting on <0xf7c049a0> (a 
> io.netty.util.concurrent.DefaultPromise)
>   at java.lang.Object.wait(Object.java:502)
>   at 
> io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:254)
>   - locked <0xf7c049a0> (a 
> io.netty.util.concurrent.DefaultPromise)
>   at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:32)
>   at 
> org.apache.flink.runtime.query.netty.KvStateServer.shutDown(KvStateServer.java:185)
>   at 
> org.apache.flink.runtime.query.netty.KvStateClientTest.testClientServerIntegration(KvStateClientTest.java:680)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> {code}
> and
> {code}
> Exception in thread "globalEventExecutor-1-3" java.lang.AssertionError
>   at 
> io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(AbstractScheduledEventExecutor.java:83)
>   at 
> io.netty.util.concurrent.GlobalEventExecutor.fetchFromScheduledTaskQueue(GlobalEventExecutor.java:110)
>   at 
> io.netty.util.concurrent.GlobalEventExecutor.takeTask(GlobalEventExecutor.java:95)
>   at 
> io.netty.util.concurrent.GlobalEventExecutor$TaskRunner.run(GlobalEventExecutor.java:226)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #5621: [FLINK-8517] fix missing synchronization in TaskEv...

2018-03-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5621


---


[jira] [Updated] (FLINK-8862) Support HBase snapshot read

2018-03-05 Thread Xu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Zhang updated FLINK-8862:

Attachment: FLINK-8862-DesignDoc.pdf

> Support HBase snapshot read
> ---
>
> Key: FLINK-8862
> URL: https://issues.apache.org/jira/browse/FLINK-8862
> Project: Flink
>  Issue Type: Improvement
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0
>Reporter: Xu Zhang
>Priority: Major
> Attachments: FLINK-8862-DesignDoc.pdf
>
>
> Flink-hbase connector only supports reading/scanning HBase over the region 
> server scanner, but there is also a snapshot scanning solution. Hadoop 
> provides two ways to scan HBase: TableInputFormat and 
> TableSnapshotInputFormat. It would be great if Flink supported both 
> solutions to ensure a wider usage scope and provide alternatives for users.





[jira] [Created] (FLINK-8864) Add CLI query history

2018-03-05 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8864:
---

 Summary: Add CLI query history
 Key: FLINK-8864
 URL: https://issues.apache.org/jira/browse/FLINK-8864
 Project: Flink
  Issue Type: Sub-task
  Components: Table API  SQL
Reporter: Timo Walther


This issue is a subtask of part two "Full Embedded SQL Client" of the 
implementation plan mentioned in 
[FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client].

It would be great to have the possibility of persisting the CLI's query 
history, such that queries can be reused when the CLI client is started again. 
A search feature like the one offered by terminals would also be good.





[GitHub] flink pull request #5639: [FLINK-8862] [HBase] Support HBase snapshot read

2018-03-05 Thread neoremind
GitHub user neoremind opened a pull request:

https://github.com/apache/flink/pull/5639

[FLINK-8862] [HBase] Support HBase snapshot read

## What is the purpose of the change

*The Flink-hbase connector only supports reading/scanning HBase over the 
region server scanner, but there is also a 
[snapshot](http://hbase.apache.org/book.html#ops.snapshots) scanning solution. 
Hadoop provides two ways to scan HBase: 
[TableInputFormat](https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableInputFormat.html)
 and 
[TableSnapshotInputFormat](https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableSnapshotInputFormat.html).
 It would be great if Flink supported both solutions to ensure a wider usage 
scope and provide alternatives for users.*


## Brief change log

  - *Create `TableInputSplitStrategy` interface and its implementations as 
abstraction logic for `AbstractTableInputFormat`*
  - *Update `HBaseRowInputFormat` and `TableInputFormat`*
  - *Add `HBaseSnapshotRowInputFormat` and `TableSnapshotInputFormat`*
  - *Extract 2 interfaces including `HBaseTableScannerAware` and 
`ResultToTupleMapper`*
  - *Add `HBaseSnapshotReadExample`*
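The split-strategy abstraction listed in the change log can be sketched roughly as follows. All names and shapes here are illustrative stand-ins mirroring the PR's idea (one strategy per scan path), not the actual PR code, and the "splits" are plain strings rather than real HBase input splits.

```java
import java.util.Arrays;
import java.util.List;

public class SplitStrategyDemo {

    /** Encapsulates how input splits are created for a table scan. */
    interface TableInputSplitStrategy {
        List<String> createSplits();
    }

    /** Scans live region servers (analogous to TableInputFormat). */
    static class RegionScanStrategy implements TableInputSplitStrategy {
        @Override
        public List<String> createSplits() {
            return Arrays.asList("region-1", "region-2");
        }
    }

    /** Reads the HFiles of a snapshot directly (analogous to TableSnapshotInputFormat). */
    static class SnapshotScanStrategy implements TableInputSplitStrategy {
        private final String snapshotName;

        SnapshotScanStrategy(String snapshotName) {
            this.snapshotName = snapshotName;
        }

        @Override
        public List<String> createSplits() {
            return Arrays.asList(snapshotName + "-file-1", snapshotName + "-file-2");
        }
    }

    /** The input format only depends on the strategy interface, not on the scan path. */
    static List<String> planScan(TableInputSplitStrategy strategy) {
        return strategy.createSplits();
    }

    public static void main(String[] args) {
        System.out.println(planScan(new RegionScanStrategy()));
        System.out.println(planScan(new SnapshotScanStrategy("snap1")));
    }
}
```

The design benefit is that `AbstractTableInputFormat` can stay agnostic of whether splits come from region servers or snapshot files.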


## Verifying this change

This change is already covered by existing tests as follows, and new test 
cases have been added as well.

`org.apache.flink.addons.hbase.HBaseConnectorITCase`

This change added tests and can be verified as follows:

  - *Manually create one snapshot for a specific HBase table, and use 
TableSnapshotInputFormat to do a full scan.*
  - *Run the existing HBaseReadExample to do a full scan.*

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (**yes** / no)
  - If yes, how is the feature documented? (not applicable / **docs** / 
**JavaDocs** / not documented)
  - For documentation, please visit the [JIRA 
ticket](https://issues.apache.org/jira/projects/FLINK/issues/FLINK-8862?filter=allopenissues);
 a detailed design doc and class diagram have been attached.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/neoremind/flink snapshot

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5639.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5639


commit 0b36b434f987a971b6463ce3441c483380cfa9dd
Author: neoremind 
Date:   2018-03-05T14:14:09Z

Support HBase snapshot read




---


[jira] [Created] (FLINK-8867) Rocksdb checkpointing failing with fs.default-scheme: hdfs:// config

2018-03-05 Thread Shashank Agarwal (JIRA)
Shashank Agarwal created FLINK-8867:
---

 Summary: Rocksdb checkpointing failing with fs.default-scheme: 
hdfs:// config
 Key: FLINK-8867
 URL: https://issues.apache.org/jira/browse/FLINK-8867
 Project: Flink
  Issue Type: Bug
  Components: Configuration, State Backends, Checkpointing, YARN
Affects Versions: 1.4.1, 1.4.2
Reporter: Shashank Agarwal
 Fix For: 1.5.0, 1.4.3


In our setup, we put an entry in our Flink conf file for the default scheme.

{code}
fs.default-scheme: hdfs://mydomain.com:8020/flink
{code}

Then an application with the RocksDB state backend fails with the following 
exception. When we remove this config, it works fine. It is also working fine 
with the other state backends.

{code}
AsynchronousException{java.lang.Exception: Could not materialize checkpoint 1 
for operator order ip stream (1/1).}
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:948)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: Could not materialize checkpoint 1 for operator 
order ip stream (1/1).
... 6 more
Caused by: java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:894)
... 5 more
Suppressed: java.lang.Exception: Could not properly cancel managed 
keyed state future.
at 
org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:91)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:976)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
... 5 more
Caused by: java.util.concurrent.ExecutionException: 
java.lang.IllegalStateException
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at 
org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
at 
org.apache.flink.runtime.state.StateUtil.discardStateFuture(StateUtil.java:66)
at 
org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:89)
... 7 more
Caused by: java.lang.IllegalStateException
at 
org.apache.flink.util.Preconditions.checkState(Preconditions.java:179)
at 
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBIncrementalSnapshotOperation.materializeSnapshot(RocksDBKeyedStateBackend.java:926)
at 
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$1.call(RocksDBKeyedStateBackend.java:389)
at 
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$1.call(RocksDBKeyedStateBackend.java:386)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:894)
... 5 more
[CIRCULAR REFERENCE:java.lang.IllegalStateException]
{code}





[jira] [Updated] (FLINK-8860) SlotManager spamming log files

2018-03-05 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-8860:
---
Labels: flip-6  (was: flip-6 flip6)

> SlotManager spamming log files
> --
>
> Key: FLINK-8860
> URL: https://issues.apache.org/jira/browse/FLINK-8860
> Project: Flink
>  Issue Type: Bug
>  Components: JobManager, ResourceManager
>Affects Versions: 1.5.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Critical
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> {{SlotManager}} is spamming the log files a lot with
> {code}
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance b16c4e516995d1e672c0933bb380770c.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance de58fbf1c069620a4275c8b529deb20b.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 86ab5a7e1d57bb2883fc0d1f2aebb304.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 90f9ab2bd433db41b3ab567fd246fb3c.
> 2018-03-05 10:45:12,393 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance ec99fcc5a801272402af9afe08a1001d.
> 2018-03-05 10:45:12,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 4c1c4b5ce52195dc90196c10c26d9ef8.
> 2018-03-05 10:45:12,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 2541d0f1398fc307aaf86bf7750535f1.
> 2018-03-05 10:45:12,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 91d94a9fdfce3cbcef32ac1c6e7b3fbf.
> 2018-03-05 10:45:22,392 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 91d94a9fdfce3cbcef32ac1c6e7b3fbf.
> 2018-03-05 10:45:22,394 INFO  
> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
> slot report from instance 90f9ab2bd433db41b3ab567fd246fb3c.
> {code}
> This message is printed once per {{TaskManager}} heartbeat.





[jira] [Comment Edited] (FLINK-8858) SQL Client to submit long running query in file

2018-03-05 Thread Timo Walther (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386047#comment-16386047
 ] 

Timo Walther edited comment on FLINK-8858 at 3/5/18 1:08 PM:
-

Thanks for opening this issue [~liurenjie1024]. What you describe in your issue 
is part of "2. Full Embedded SQL Client" of 
[FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client].
 Such a feature needs the {{INSERT INTO}} statement as well as a unified table 
sink interface similar to FLINK-8240. I will open subtasks so that 
contributors can assign them.


was (Author: twalthr):
Thanks for opening this issue [~liurenjie1024]. What you describe in your issue 
is part of "2. Full Embedded SQL Client" of 
[FLIP-24|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client].]
 Such a feature needs the \{{INSERT INTO}} statement as well as a unified table 
sink interface similar to FLINK-8240. I will open subtaks such that 
contributors can assign them.

> SQL Client to submit long running query in file
> ---
>
> Key: FLINK-8858
> URL: https://issues.apache.org/jira/browse/FLINK-8858
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API  SQL
>Affects Versions: 1.6.0
>Reporter: Renjie Liu
>Assignee: Renjie Liu
>Priority: Major
>
> The current design of the SQL Client embedded mode doesn't support long-running 
> queries. For simple jobs that can be expressed in a single SQL statement, it 
> would be useful if we could submit SQL statements stored in files as 
> long-running queries.





[jira] [Closed] (FLINK-6321) RocksDB state backend Checkpointing is not working with KeyedCEP.

2018-03-05 Thread Shashank Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashank Agarwal closed FLINK-6321.
---
   Resolution: Fixed
Fix Version/s: 1.4.1

Same as https://issues.apache.org/jira/browse/FLINK-7756

> RocksDB state backend Checkpointing is not working with KeyedCEP.
> -
>
> Key: FLINK-6321
> URL: https://issues.apache.org/jira/browse/FLINK-6321
> Project: Flink
>  Issue Type: Sub-task
>  Components: CEP
>Affects Versions: 1.2.0
> Environment: yarn-cluster, RocksDB State backend, Checkpointing every 
> 1000 ms
>Reporter: Shashank Agarwal
>Assignee: Kostas Kloudas
>Priority: Blocker
> Fix For: 1.5.0, 1.4.2, 1.4.1
>
> Attachments: jobmanager.log, taskmanager.log
>
>
> Checkpointing is not working with RocksDBStateBackend when using CEP. It's 
> working fine with FsStateBackend and MemoryStateBackend. Application failing 
> every-time.
> {code}
> 04/18/2017 21:53:20   Job execution switched to status FAILING.
> AsynchronousException{java.lang.Exception: Could not materialize checkpoint 
> 46 for operator KeyedCEPPatternOperator -> Map (1/4).}
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:980)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.Exception: Could not materialize checkpoint 46 for 
> operator KeyedCEPPatternOperator -> Map (1/4).
>   ... 6 more
> Caused by: java.util.concurrent.CancellationException
>   at java.util.concurrent.FutureTask.report(FutureTask.java:121)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at 
> org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:915)
>   ... 5 more
> {code}





[jira] [Commented] (FLINK-8517) StaticlyNestedIterationsITCase.testJobWithoutObjectReuse unstable on Travis

2018-03-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-8517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386129#comment-16386129
 ] 

ASF GitHub Bot commented on FLINK-8517:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/5621


> StaticlyNestedIterationsITCase.testJobWithoutObjectReuse unstable on Travis
> ---
>
> Key: FLINK-8517
> URL: https://issues.apache.org/jira/browse/FLINK-8517
> Project: Flink
>  Issue Type: Bug
>  Components: DataSet API, TaskManager, Tests
>Affects Versions: 1.5.0
>Reporter: Till Rohrmann
>Priority: Blocker
>  Labels: test-stability
> Fix For: 1.5.0
>
>
> The {{StaticlyNestedIterationsITCase.testJobWithoutObjectReuse}} test case 
> fails on Travis. This exception might be relevant:
> {code:java}
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply$mcV$sp(JobManager.scala:891)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:834)
>   at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$6.apply(JobManager.scala:834)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>   at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>   at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
>   at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>   at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>   at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.IllegalStateException: Partition 
> 557b069f2b89f8ba599e6ab0956a3f5a@58f1a6b7d8ae10b9141f17c08d06cecb not 
> registered at task event dispatcher.
>   at 
> org.apache.flink.runtime.io.network.TaskEventDispatcher.subscribeToEvent(TaskEventDispatcher.java:107)
>   at 
> org.apache.flink.runtime.iterative.task.IterationHeadTask.initSuperstepBarrier(IterationHeadTask.java:242)
>   at 
> org.apache.flink.runtime.iterative.task.IterationHeadTask.run(IterationHeadTask.java:266)
>   at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
>   at java.lang.Thread.run(Thread.java:748){code}
>  
> https://api.travis-ci.org/v3/job/60156/log.txt





[jira] [Updated] (FLINK-8862) Support HBase snapshot read

2018-03-05 Thread Xu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Zhang updated FLINK-8862:

Attachment: FLINK-8862-Design-Class-Diagram.png

> Support HBase snapshot read
> ---
>
> Key: FLINK-8862
> URL: https://issues.apache.org/jira/browse/FLINK-8862
> Project: Flink
>  Issue Type: Improvement
>  Components: Batch Connectors and Input/Output Formats
>Affects Versions: 1.2.0
>Reporter: Xu Zhang
>Priority: Major
> Attachments: FLINK-8862-Design-Class-Diagram.png, 
> FLINK-8862-DesignDoc.pdf
>
>
> Flink-hbase connector only supports reading/scanning HBase over the region 
> server scanner, but there is also a snapshot scanning solution. Hadoop 
> provides two ways to scan HBase: TableInputFormat and 
> TableSnapshotInputFormat. It would be great if Flink supported both 
> solutions to ensure a wider usage scope and provide alternatives for users.





[jira] [Created] (FLINK-8865) Add CLI query code completion in SQL Client

2018-03-05 Thread Timo Walther (JIRA)
Timo Walther created FLINK-8865:
---

 Summary: Add CLI query code completion in SQL Client
 Key: FLINK-8865
 URL: https://issues.apache.org/jira/browse/FLINK-8865
 Project: Flink
  Issue Type: Sub-task
  Components: Table API  SQL
Reporter: Timo Walther


This issue is a subtask of part two "Full Embedded SQL Client" of the 
implementation plan mentioned in 
[FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client].

Calcite already offers code completion functionality. It would be great if we 
could also expose this feature through the SQL CLI client.





[jira] [Updated] (FLINK-8812) Possible resource leak in Flip6

2018-03-05 Thread Nico Kruber (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nico Kruber updated FLINK-8812:
---
Labels: flip-6  (was: flip6)

> Possible resource leak in Flip6
> ---
>
> Key: FLINK-8812
> URL: https://issues.apache.org/jira/browse/FLINK-8812
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Reporter: Chesnay Schepler
>Priority: Blocker
>  Labels: flip-6
> Fix For: 1.5.0
>
>
> In this build (https://travis-ci.org/zentol/flink/builds/347373839) I set the 
> codebase to flip6 for half the profiles to find failing tests.
> The "libraries" job (https://travis-ci.org/zentol/flink/jobs/347373851) 
> failed with an OutOfMemoryError.
> This could mean that there is a memory-leak somewhere.





[jira] [Created] (FLINK-8860) SlotManager spamming log files

2018-03-05 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-8860:
--

 Summary: SlotManager spamming log files
 Key: FLINK-8860
 URL: https://issues.apache.org/jira/browse/FLINK-8860
 Project: Flink
  Issue Type: Bug
  Components: JobManager, ResourceManager
Affects Versions: 1.5.0
Reporter: Nico Kruber
Assignee: Nico Kruber
 Fix For: 1.5.0


{{SlotManager}} is spamming the log files a lot with
{code}
2018-03-05 10:45:12,393 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance b16c4e516995d1e672c0933bb380770c.
2018-03-05 10:45:12,393 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance de58fbf1c069620a4275c8b529deb20b.
2018-03-05 10:45:12,393 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 86ab5a7e1d57bb2883fc0d1f2aebb304.
2018-03-05 10:45:12,393 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 90f9ab2bd433db41b3ab567fd246fb3c.
2018-03-05 10:45:12,393 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance ec99fcc5a801272402af9afe08a1001d.
2018-03-05 10:45:12,394 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 4c1c4b5ce52195dc90196c10c26d9ef8.
2018-03-05 10:45:12,394 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 2541d0f1398fc307aaf86bf7750535f1.
2018-03-05 10:45:12,394 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 91d94a9fdfce3cbcef32ac1c6e7b3fbf.
2018-03-05 10:45:22,392 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 91d94a9fdfce3cbcef32ac1c6e7b3fbf.
2018-03-05 10:45:22,394 INFO  
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager  - Received 
slot report from instance 90f9ab2bd433db41b3ab567fd246fb3c.
{code}
This message is printed once per {{TaskManager}} heartbeat.





[jira] [Updated] (FLINK-8686) Improve basic embedded SQL client

2018-03-05 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-8686:

Issue Type: Sub-task  (was: Improvement)
Parent: FLINK-7594

> Improve basic embedded SQL client 
> --
>
> Key: FLINK-8686
> URL: https://issues.apache.org/jira/browse/FLINK-8686
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API  SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Blocker
> Fix For: 1.5.0
>
>
> This issue describes follow-up issues that should be fixed in order to make 
> the SQL client more stable:
>  - Add more tests for executor
>  - Configure JVM heap size
>  - Limit changelog and table buffers
>  - "The input is invalid please check it again." => add allowed range
>  - Load dependencies recursively
>  - Cache table & environments in executor
>  - Clean up results in result store
>  - Improve error message for unsupported batch queries
>  - Add more logging instead of swallowing exceptions
>  - List properties in error message about missing TS factory sorted by name
>  - Add command to show loaded TS factories and their required properties
>  - Add command to reload configuration from files (no need to restart client)
>  - Improve error message in case of invalid json-schema (right now: 
> {{java.lang.IllegalArgumentException: No type could be found in node:}})
>  - Add switch to show full stacktraces of exceptions
>  - Give an error message when setting unknown parameters: 
> {{result-mode=changelog}} does not give an error but should be 
> {{execution.result-mode=changelog}}





[jira] [Updated] (FLINK-8853) SQL Client cannot emit query results that contain a rowtime attribute

2018-03-05 Thread Timo Walther (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-8853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther updated FLINK-8853:

Issue Type: Sub-task  (was: Bug)
Parent: FLINK-7594

> SQL Client cannot emit query results that contain a rowtime attribute
> -
>
> Key: FLINK-8853
> URL: https://issues.apache.org/jira/browse/FLINK-8853
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API  SQL
>Affects Versions: 1.5.0
>Reporter: Fabian Hueske
>Assignee: Timo Walther
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Emitting a query result that contains a rowtime attribute fails with the 
> following exception:
> {code:java}
> Caused by: java.lang.ClassCastException: java.sql.Timestamp cannot be cast to 
> java.lang.Long
>     at 
> org.apache.flink.api.common.typeutils.base.LongSerializer.serialize(LongSerializer.java:27)
>     at 
> org.apache.flink.api.java.typeutils.runtime.RowSerializer.serialize(RowSerializer.java:160)
>     at 
> org.apache.flink.api.java.typeutils.runtime.RowSerializer.serialize(RowSerializer.java:46)
>     at 
> org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:125)
>     at 
> org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:30)
>     at 
> org.apache.flink.streaming.experimental.CollectSink.invoke(CollectSink.java:66)
>     ... 44 more{code}
> The problem is caused by the {{ResultStore}} which configures the 
> {{CollectSink}} with the field types obtained from the {{TableSchema}}. 
> The type of the rowtime field is a {{TimeIndicatorType}} which is serialized 
> as Long. However, in the query result it is represented as Timestamp. Hence, 
> the type must be replaced by a {{SqlTimeTypeInfo}}.
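The mismatch can be reproduced in plain Java, independent of Flink: a value that is actually a `java.sql.Timestamp` cannot be cast to `Long`, which is effectively what a `Long` serializer attempts. This is only a minimal sketch of the failure mode, not the actual serializer code.

```java
import java.sql.Timestamp;

public class RowtimeCastDemo {

    /** Mimics a Long serializer: it blindly casts the field to Long. */
    static long serializeAsLong(Object field) {
        return (Long) field; // throws ClassCastException if the field is a Timestamp
    }

    public static void main(String[] args) {
        Object rowtime = new Timestamp(1_000L);
        try {
            serializeAsLong(rowtime);
        } catch (ClassCastException e) {
            // This is the failure reported above.
            System.out.println("cast failed: " + e);
        }
        // Declaring the field with a SQL timestamp type instead means the value
        // is handled as a Timestamp end-to-end and no cast to Long is attempted.
        Timestamp ts = (Timestamp) rowtime;
        System.out.println(ts.getTime()); // 1000
    }
}
```

This illustrates why the sink's field types must match the runtime representation (Timestamp), not the internal time-indicator representation (Long).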





[GitHub] flink pull request #5637: [FLINK-8860][flip6] stop SlotManager spamming logs...

2018-03-05 Thread NicoK
GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/5637

[FLINK-8860][flip6] stop SlotManager spamming logs for every TM heartbeat 
at log level 'info'

## What is the purpose of the change

For every `TaskManager` heartbeat message, `SlotManager` was writing the 
message `Received slot report from instance...` into the logs at info level, 
although this is clearly debug information that is already printed with a 
different level of detail by the `ResourceManager`.

## Brief change log

- change log level of a slot report to `debug`
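The effect of the change can be illustrated with plain `java.util.logging` (Flink itself logs via SLF4J, but level gating works the same way): a message logged at a finer level than the logger's threshold is never published, so per-heartbeat slot reports disappear from default logs while remaining available at debug level. The class and method names here are illustrative only.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogLevelDemo {

    /** Returns whether a record at msgLevel would be published by a logger at the given threshold. */
    static boolean wouldLog(Level threshold, Level msgLevel) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false);
        log.setLevel(threshold);
        return log.isLoggable(msgLevel);
    }

    public static void main(String[] args) {
        // At the default INFO threshold, per-heartbeat debug/FINE messages are dropped.
        System.out.println(wouldLog(Level.INFO, Level.FINE)); // false
        System.out.println(wouldLog(Level.INFO, Level.INFO)); // true
        // Operators who need the slot reports can lower the threshold.
        System.out.println(wouldLog(Level.FINE, Level.FINE)); // true
    }
}
```

In SLF4J terms, the PR simply switches `LOG.info(...)` to `LOG.debug(...)` for the slot-report message, so the cost is a cheap level check unless debug logging is enabled.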

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): **no**
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: **no**
  - The serializers: **no**
  - The runtime per-record code paths (performance sensitive): **no**
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: **yes**
  - The S3 file system connector: **no**

## Documentation

  - Does this pull request introduce a new feature? **no**
  - If yes, how is the feature documented? **not applicable**


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-8860

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/5637.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5637


commit aea1dc317d8f5280faadace3f872655a51f32b75
Author: Nico Kruber 
Date:   2018-03-05T12:55:04Z

[FLINK-8860][flip6] stop SlotManager spamming logs for every TM heartbeat 
at log level 'info'




---

