[jira] [Commented] (FLINK-4253) Rename "recovery.mode" config key to "high-availability"

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414673#comment-15414673
 ] 

ASF GitHub Bot commented on FLINK-4253:
---

Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/2342
  
Looks like only this build failed 
`https://travis-ci.org/apache/flink/jobs/150992610`, and only because of a 
Cassandra connector test case. It should be unrelated to this PR.


> Rename "recovery.mode" config key to "high-availability"
> 
>
> Key: FLINK-4253
> URL: https://issues.apache.org/jira/browse/FLINK-4253
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ufuk Celebi
>Assignee: ramkrishna.s.vasudevan
>
> Currently, HA is configured via the following configuration keys:
> {code}
> recovery.mode: STANDALONE // No high availability (HA)
> recovery.mode: ZOOKEEPER // HA
> {code}
> This could be made more straightforward by simply renaming the key to 
> {{high-availability}}. Furthermore, the term {{STANDALONE}} is overloaded; we 
> already have a standalone cluster mode.
> {code}
> high-availability: NONE // No HA
> high-availability: ZOOKEEPER // HA via ZooKeeper
> {code}
> The {{recovery.mode}} configuration keys would have to be deprecated before 
> completely removing them.
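A minimal sketch of the proposed fallback during the deprecation window, in plain Java. Only the key names and values come from the issue; the class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

public class HaConfigCompat {

    /**
     * Resolve the HA mode: prefer the new "high-availability" key and fall
     * back to the deprecated "recovery.mode" key, mapping STANDALONE -> NONE.
     */
    static String resolveHaMode(Map<String, String> conf) {
        String v = conf.get("high-availability");
        if (v != null) {
            return v.toUpperCase();
        }
        String legacy = conf.get("recovery.mode");
        if (legacy == null) {
            return "NONE"; // default: no HA
        }
        // The old STANDALONE value maps to the new NONE value.
        return "STANDALONE".equalsIgnoreCase(legacy) ? "NONE" : legacy.toUpperCase();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("recovery.mode", "STANDALONE");
        System.out.println(resolveHaMode(conf)); // NONE
    }
}
```

The actual deprecation mechanics in Flink's ConfigConstants may differ; the point is only that the old key keeps working until it is removed.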



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2342: FLINK-4253 - Rename "recovery.mode" config key to "high-a...

2016-08-09 Thread ramkrish86
Github user ramkrish86 commented on the issue:

https://github.com/apache/flink/pull/2342
  
Looks like only this build failed 
`https://travis-ci.org/apache/flink/jobs/150992610`, and only because of a 
Cassandra connector test case. It should be unrelated to this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4347) Implement SlotManager for new ResourceManager

2016-08-09 Thread Kurt Young (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young updated FLINK-4347:
--
Description: The slot manager is responsible for maintaining the list of slot 
requests and slot allocations. It allows requesting slots from the registered 
TaskExecutors and issues container allocation requests when there are 
not enough resources available.  (was: details will be added later)

> Implement SlotManager for new ResourceManager
> -
>
> Key: FLINK-4347
> URL: https://issues.apache.org/jira/browse/FLINK-4347
> Project: Flink
>  Issue Type: Sub-task
>  Components: Cluster Management
>Reporter: Kurt Young
>
> The slot manager is responsible for maintaining the list of slot requests and 
> slot allocations. It allows requesting slots from the registered 
> TaskExecutors and issues container allocation requests when there are not 
> enough resources available.
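The described behavior can be sketched in a few lines of plain Java. All class and method names here are hypothetical, not taken from the actual implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SlotManagerSketch {
    private final Deque<String> freeSlots = new ArrayDeque<>();        // slots offered by TaskExecutors
    private final Deque<String> pendingRequests = new ArrayDeque<>();  // requests waiting for a slot
    private int containerRequests = 0;

    /** A TaskExecutor offers a slot; serve a pending request first, else keep it free. */
    public void registerSlot(String slotId) {
        if (!pendingRequests.isEmpty()) {
            pendingRequests.poll(); // this request is now satisfied by slotId
        } else {
            freeSlots.push(slotId);
        }
    }

    /** Returns true if the request was served from a free slot immediately. */
    public boolean requestSlot(String requestId) {
        if (!freeSlots.isEmpty()) {
            freeSlots.pop();
            return true;
        }
        pendingRequests.add(requestId);
        containerRequests++; // not enough resources: ask the cluster for a new container
        return false;
    }

    public int getContainerRequests() { return containerRequests; }

    public int getPendingCount() { return pendingRequests.size(); }
}
```

The real SlotManager additionally has to handle slot releases, failures, and timeouts; this only shows the request/allocation bookkeeping the description mentions.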





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread wenlong88
Github user wenlong88 commented on the issue:

https://github.com/apache/flink/pull/2345
  
@aljoscha I am curious about the problem and incompatibility between the 
semi-async snapshot and key groups; could you explain some background?




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414640#comment-15414640
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user wenlong88 commented on the issue:

https://github.com/apache/flink/pull/2345
  
@aljoscha I am curious about the problem and incompatibility between the 
semi-async snapshot and key groups; could you explain some background?


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[jira] [Created] (FLINK-4347) Implement SlotManager for new ResourceManager

2016-08-09 Thread Kurt Young (JIRA)
Kurt Young created FLINK-4347:
-

 Summary: Implement SlotManager for new ResourceManager
 Key: FLINK-4347
 URL: https://issues.apache.org/jira/browse/FLINK-4347
 Project: Flink
  Issue Type: Sub-task
  Components: Cluster Management
Reporter: Kurt Young


details will be added later





[jira] [Commented] (FLINK-4341) Checkpoint state size grows unbounded when task parallelism not uniform

2016-08-09 Thread Scott Kidder (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414599#comment-15414599
 ] 

Scott Kidder commented on FLINK-4341:
-

Unfortunately I don't have the time to develop a test application to 
demonstrate this issue without Kinesis. It might be reproducible with Kafka 
when the number of shards (or the Kafka equivalent) is less than the 
parallelism specified on the job.

Also, I noticed that the 1.1.0 binaries available for download were created on 
August 4 and don't include the latest commits on the release-1.1 branch in Git. 
Do you know when they'll be updated? I deploy Flink as a Docker container and 
use the Flink tar-gzip binary to build the Docker image.



> Checkpoint state size grows unbounded when task parallelism not uniform
> ---
>
> Key: FLINK-4341
> URL: https://issues.apache.org/jira/browse/FLINK-4341
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Scott Kidder
>Priority: Critical
>
> This issue was first encountered with Flink release 1.1.0 (commit 45f7825). I 
> was previously using a 1.1.0 snapshot (commit 18995c8) which performed as 
> expected.  This issue was introduced somewhere between those commits.
> I've got a Flink application that uses the Kinesis Stream Consumer to read 
> from a Kinesis stream with 2 shards. I've got 2 task managers with 2 slots 
> each, providing a total of 4 slots.  When running the application with a 
> parallelism of 4, the Kinesis consumer uses 2 slots (one per Kinesis shard) 
> and 4 slots for subsequent tasks that process the Kinesis stream data. I use 
> an in-memory store for checkpoint data.
> Yesterday I upgraded to Flink 1.1.0 (45f7825) and noticed that checkpoint 
> states were growing unbounded when running with a parallelism of 4, 
> checkpoint interval of 10 seconds:
> {code}
> ID  State Size
> 1   11.3 MB
> 2   20.9 MB
> 3   30.6 MB
> 4   41.4 MB
> 5   52.6 MB
> 6   62.5 MB
> 7   71.5 MB
> 8   83.3 MB
> 9   93.5 MB
> {code}
> The first 4 checkpoints generally succeed, but then fail with an exception 
> like the following:
> {code}
> java.lang.RuntimeException: Error triggering a checkpoint as the result of 
> receiving checkpoint barrier at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:768)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:758)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
>  at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:183)
>  at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at 
> java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Size of 
> the state is larger than the maximum permitted memory-backed state. 
> Size=12105407 , maxSize=5242880 . Consider using a different state backend, 
> like the File System State backend. at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend.checkSize(MemoryStateBackend.java:146)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetBytes(MemoryStateBackend.java:200)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetHandle(MemoryStateBackend.java:190)
>  at 
> org.apache.flink.runtime.state.AbstractStateBackend$CheckpointStateOutputView.closeAndGetHandle(AbstractStateBackend.java:447)
>  at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.snapshotOperatorState(WindowOperator.java:879)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:598)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:762)
>  ... 8 more
> {code}
> Or:
> {code}
> 2016-08-09 17:44:43,626 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Restoring 
> checkpointed state to task Fold: property_id, player -> 10-minute 
> Sliding-Window Percentile Aggregation -> Sink: InfluxDB (2/4)
> 2016-08-09 17:44:51,236 ERROR akka.remote.EndpointWriter- 
> Transient association error (association remains live) 
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to 
> Actor[akka.tcp://flink@10.55.2.212:6123/user/jobmanager#510517238]: max 
> allowed size 10485760 bytes, actual size of encoded class 
> org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint was 
> 10891825 bytes.
> {code}
> This can be fixed by simply submitting the job with a parallelism of 2. I 
> suspect there was a regression introduced relating to assumptions about the 
> number of sub-tasks associated with a job stage (e.g. assuming 4 instead of a 
> value ranging from 1-4). This is currently preventing me from using all 
> available Task Manager slots.

[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414591#comment-15414591
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user wenlong88 commented on the issue:

https://github.com/apache/flink/pull/2345
  
@StephanEwen 
http://rocksdb.org/blog/2609/use-checkpoints-for-efficient-snapshots/ 
Since SST files are immutable once created, when taking a checkpoint 
RocksDB creates hard links to all live SST files in a given directory and copies 
the manifest into that directory. This costs little I/O, and the checkpoint 
directory can be used as a data directory to restore from.
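The hard-link trick described in the comment can be illustrated with plain `java.nio` (a sketch of the idea only, not RocksDB's actual implementation; the file names are made up):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkCheckpoint {

    /**
     * Link every immutable *.sst file from dataDir into checkpointDir.
     * A hard link shares the underlying blocks, so no data is copied.
     * Returns the number of files linked.
     */
    static int snapshot(Path dataDir, Path checkpointDir) throws IOException {
        Files.createDirectories(checkpointDir);
        int linked = 0;
        try (DirectoryStream<Path> ssts = Files.newDirectoryStream(dataDir, "*.sst")) {
            for (Path sst : ssts) {
                Files.createLink(checkpointDir.resolve(sst.getFileName()), sst);
                linked++;
            }
        }
        // RocksDB additionally copies the mutable MANIFEST/CURRENT files, which
        // is what makes the checkpoint directory usable as a data directory.
        return linked;
    }
}
```

RocksDB exposes this natively through its Checkpoint API; the sketch only shows why the operation is cheap (links, not copies) for the immutable SST files.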


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread wenlong88
Github user wenlong88 commented on the issue:

https://github.com/apache/flink/pull/2345
  
@StephanEwen 
http://rocksdb.org/blog/2609/use-checkpoints-for-efficient-snapshots/ 
Since SST files are immutable once created, when taking a checkpoint 
RocksDB creates hard links to all live SST files in a given directory and copies 
the manifest into that directory. This costs little I/O, and the checkpoint 
directory can be used as a data directory to restore from.




[jira] [Created] (FLINK-4346) Implement basic RPC abstraction

2016-08-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4346:
---

 Summary: Implement basic RPC abstraction
 Key: FLINK-4346
 URL: https://issues.apache.org/jira/browse/FLINK-4346
 Project: Flink
  Issue Type: New Feature
  Components: Cluster Management
Reporter: Stephan Ewen


As part of refactoring of the cluster management, we can introduce a new RPC 
abstraction on top of our Akka-based distributed coordination.

It should address the following issues:

  - Add type safety to the sender and receiver of messages. We want properly 
typed methods to be called, rather than having generic message types and 
pattern matching everywhere. This is similar to typed actors.

  - Make the message receivers testable without involving actors, i.e. the 
methods should be callable directly. When used with other components, the 
receiver will be wrapped in an actor that calls the methods based on received 
messages.

  - We want to keep the paradigm of single-threaded execution per "actor".

There is some basic code layout in the following branch and commit:

https://github.com/apache/flink/tree/f1b45d320181284eca64126ba04f010e23757e38/flink-runtime/src/main/java/org/apache/flink/runtime/rpc
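The three bullet points above could be sketched with a typed gateway interface plus a dynamic proxy that funnels every call through a single-threaded executor. All names here are hypothetical, and the linked branch wraps receivers in Akka-based actors rather than a plain executor:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.ExecutorService;

public class RpcSketch {

    // Typed gateway: callers invoke real methods, no untyped messages (bullet 1).
    interface TaskExecutorGateway {
        String requestSlot(String jobId);
    }

    // Plain implementation: directly callable and testable without actors (bullet 2).
    static class TaskExecutor implements TaskExecutorGateway {
        public String requestSlot(String jobId) {
            return "slot-for-" + jobId;
        }
    }

    // Wrap the receiver so every call runs on one thread, preserving the
    // single-threaded-execution-per-"actor" paradigm (bullet 3).
    @SuppressWarnings("unchecked")
    static <T> T singleThreaded(Class<T> iface, T target, ExecutorService mailbox) {
        InvocationHandler handler = (proxy, method, args) ->
                mailbox.submit(() -> method.invoke(target, args)).get();
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[]{iface}, handler);
    }
}
```

In a test, the `TaskExecutor` is used directly; in production, a caller would only ever see the gateway interface, with the dispatch mechanism hidden behind it.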





[jira] [Created] (FLINK-4345) Implement new ResourceManager

2016-08-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4345:
---

 Summary: Implement new ResourceManager
 Key: FLINK-4345
 URL: https://issues.apache.org/jira/browse/FLINK-4345
 Project: Flink
  Issue Type: New Feature
  Components: Cluster Management
Reporter: Stephan Ewen


This is the parent issue for the efforts to implement the {{ResourceManager}} 
changes based on FLIP-6 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)

Because of the breadth of changes, we should implement a new version of the 
{{ResourceManager}} rather than updating the current {{ResourceManager}}. That 
will allow us to keep a working master branch.

Much of the current {{ResourceManager}} can probably be reused for the new 
{{ResourceManager}}, but needs to be adjusted for the new RPC abstraction.

At the point when the new cluster management is on par with the current 
implementation, we will replace the current {{ResourceManager}} with the new 
{{ResourceManager}}.





[jira] [Created] (FLINK-4344) Implement new JobManager

2016-08-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4344:
---

 Summary: Implement new JobManager
 Key: FLINK-4344
 URL: https://issues.apache.org/jira/browse/FLINK-4344
 Project: Flink
  Issue Type: New Feature
  Components: Cluster Management
Reporter: Stephan Ewen


This is the parent issue for the efforts to implement the {{JobManager}} 
changes based on FLIP-6 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)

Because of the breadth of changes, we should implement a new version of the 
{{JobManager}} (let's call it {{JobMaster}}) rather than updating the current 
{{JobManager}}. That will allow us to keep a working master branch.
At the point when the new cluster management is on par with the current 
implementation, we will drop the old {{JobManager}} and rename the {{JobMaster}} 
to {{JobManager}}.





[jira] [Commented] (FLINK-4341) Checkpoint state size grows unbounded when task parallelism not uniform

2016-08-09 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414334#comment-15414334
 ] 

Stephan Ewen commented on FLINK-4341:
-

Thanks for reporting this issue. Sounds pretty serious.

It would be important to see whether this is a general checkpointing regression 
or a change in the behavior of the Kinesis connector since the snapshot version 
you used.

Do you have a way to test this job with a testdata-generating source, i.e. 
without Kinesis?

> Checkpoint state size grows unbounded when task parallelism not uniform
> ---
>
> Key: FLINK-4341
> URL: https://issues.apache.org/jira/browse/FLINK-4341
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Scott Kidder
>Priority: Critical
>
> This issue was first encountered with Flink release 1.1.0 (commit 45f7825). I 
> was previously using a 1.1.0 snapshot (commit 18995c8) which performed as 
> expected.  This issue was introduced somewhere between those commits.
> I've got a Flink application that uses the Kinesis Stream Consumer to read 
> from a Kinesis stream with 2 shards. I've got 2 task managers with 2 slots 
> each, providing a total of 4 slots.  When running the application with a 
> parallelism of 4, the Kinesis consumer uses 2 slots (one per Kinesis shard) 
> and 4 slots for subsequent tasks that process the Kinesis stream data. I use 
> an in-memory store for checkpoint data.
> Yesterday I upgraded to Flink 1.1.0 (45f7825) and noticed that checkpoint 
> states were growing unbounded when running with a parallelism of 4, 
> checkpoint interval of 10 seconds:
> {code}
> ID  State Size
> 1   11.3 MB
> 2   20.9 MB
> 3   30.6 MB
> 4   41.4 MB
> 5   52.6 MB
> 6   62.5 MB
> 7   71.5 MB
> 8   83.3 MB
> 9   93.5 MB
> {code}
> The first 4 checkpoints generally succeed, but then fail with an exception 
> like the following:
> {code}
> java.lang.RuntimeException: Error triggering a checkpoint as the result of 
> receiving checkpoint barrier at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:768)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:758)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
>  at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:183)
>  at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at 
> java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Size of 
> the state is larger than the maximum permitted memory-backed state. 
> Size=12105407 , maxSize=5242880 . Consider using a different state backend, 
> like the File System State backend. at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend.checkSize(MemoryStateBackend.java:146)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetBytes(MemoryStateBackend.java:200)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetHandle(MemoryStateBackend.java:190)
>  at 
> org.apache.flink.runtime.state.AbstractStateBackend$CheckpointStateOutputView.closeAndGetHandle(AbstractStateBackend.java:447)
>  at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.snapshotOperatorState(WindowOperator.java:879)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:598)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:762)
>  ... 8 more
> {code}
> Or:
> {code}
> 2016-08-09 17:44:43,626 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Restoring 
> checkpointed state to task Fold: property_id, player -> 10-minute 
> Sliding-Window Percentile Aggregation -> Sink: InfluxDB (2/4)
> 2016-08-09 17:44:51,236 ERROR akka.remote.EndpointWriter- 
> Transient association error (association remains live) 
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to 
> Actor[akka.tcp://flink@10.55.2.212:6123/user/jobmanager#510517238]: max 
> allowed size 10485760 bytes, actual size of encoded class 
> org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint was 
> 10891825 bytes.
> {code}
> This can be fixed by simply submitting the job with a parallelism of 2. I 
> suspect there was a regression introduced relating to assumptions about the 
> number of sub-tasks associated with a job stage (e.g. assuming 4 instead of a 
> value ranging from 1-4). This is currently preventing me from using all 
> available Task Manager slots.

[jira] [Updated] (FLINK-4341) Checkpoint state size grows unbounded when task parallelism not uniform

2016-08-09 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-4341:

Priority: Critical  (was: Major)

> Checkpoint state size grows unbounded when task parallelism not uniform
> ---
>
> Key: FLINK-4341
> URL: https://issues.apache.org/jira/browse/FLINK-4341
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Scott Kidder
>Priority: Critical
>
> This issue was first encountered with Flink release 1.1.0 (commit 45f7825). I 
> was previously using a 1.1.0 snapshot (commit 18995c8) which performed as 
> expected.  This issue was introduced somewhere between those commits.
> I've got a Flink application that uses the Kinesis Stream Consumer to read 
> from a Kinesis stream with 2 shards. I've got 2 task managers with 2 slots 
> each, providing a total of 4 slots.  When running the application with a 
> parallelism of 4, the Kinesis consumer uses 2 slots (one per Kinesis shard) 
> and 4 slots for subsequent tasks that process the Kinesis stream data. I use 
> an in-memory store for checkpoint data.
> Yesterday I upgraded to Flink 1.1.0 (45f7825) and noticed that checkpoint 
> states were growing unbounded when running with a parallelism of 4, 
> checkpoint interval of 10 seconds:
> {code}
> ID  State Size
> 1   11.3 MB
> 2   20.9 MB
> 3   30.6 MB
> 4   41.4 MB
> 5   52.6 MB
> 6   62.5 MB
> 7   71.5 MB
> 8   83.3 MB
> 9   93.5 MB
> {code}
> The first 4 checkpoints generally succeed, but then fail with an exception 
> like the following:
> {code}
> java.lang.RuntimeException: Error triggering a checkpoint as the result of 
> receiving checkpoint barrier at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:768)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:758)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
>  at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:183)
>  at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at 
> java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Size of 
> the state is larger than the maximum permitted memory-backed state. 
> Size=12105407 , maxSize=5242880 . Consider using a different state backend, 
> like the File System State backend. at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend.checkSize(MemoryStateBackend.java:146)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetBytes(MemoryStateBackend.java:200)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetHandle(MemoryStateBackend.java:190)
>  at 
> org.apache.flink.runtime.state.AbstractStateBackend$CheckpointStateOutputView.closeAndGetHandle(AbstractStateBackend.java:447)
>  at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.snapshotOperatorState(WindowOperator.java:879)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:598)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:762)
>  ... 8 more
> {code}
> Or:
> {code}
> 2016-08-09 17:44:43,626 INFO  
> org.apache.flink.streaming.runtime.tasks.StreamTask   - Restoring 
> checkpointed state to task Fold: property_id, player -> 10-minute 
> Sliding-Window Percentile Aggregation -> Sink: InfluxDB (2/4)
> 2016-08-09 17:44:51,236 ERROR akka.remote.EndpointWriter- 
> Transient association error (association remains live) 
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to 
> Actor[akka.tcp://flink@10.55.2.212:6123/user/jobmanager#510517238]: max 
> allowed size 10485760 bytes, actual size of encoded class 
> org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint was 
> 10891825 bytes.
> {code}
> This can be fixed by simply submitting the job with a parallelism of 2. I 
> suspect there was a regression introduced relating to assumptions about the 
> number of sub-tasks associated with a job stage (e.g. assuming 4 instead of a 
> value ranging from 1-4). This is currently preventing me from using all 
> available Task Manager slots.





[jira] [Updated] (FLINK-4319) Rework Cluster Management (FLIP-6)

2016-08-09 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen updated FLINK-4319:

Description: This is the root issue to track progress of the rework of 
cluster management (FLIP-6) 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077  
(was: This is the head issue to track progress of the rework of cluster 
management (FLIP-6) 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)

> Rework Cluster Management (FLIP-6)
> --
>
> Key: FLINK-4319
> URL: https://issues.apache.org/jira/browse/FLINK-4319
> Project: Flink
>  Issue Type: Improvement
>  Components: Cluster Management
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>
> This is the root issue to track progress of the rework of cluster management 
> (FLIP-6) 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077





[jira] [Created] (FLINK-4343) Implement new TaskManager

2016-08-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4343:
---

 Summary: Implement new TaskManager
 Key: FLINK-4343
 URL: https://issues.apache.org/jira/browse/FLINK-4343
 Project: Flink
  Issue Type: New Feature
  Components: Cluster Management
Reporter: Stephan Ewen


This is the parent issue for the efforts to implement the {{TaskManager}} 
changes based on FLIP-6 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077)

Because of the breadth of changes, we should implement a new version of the 
{{TaskManager}} (let's call it {{TaskExecutor}}) rather than updating the 
current {{TaskManager}}. That will allow us to keep a working master branch.

At the point when the new cluster management is on par with the current 
implementation, we will drop the old {{TaskManager}} and rename the 
{{TaskExecutor}} to {{TaskManager}}.





[jira] [Commented] (FLINK-4341) Checkpoint state size grows unbounded when task parallelism not uniform

2016-08-09 Thread Scott Kidder (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414301#comment-15414301
 ] 

Scott Kidder commented on FLINK-4341:
-

I also noticed that when checkpointing is enabled and I'm using a parallelism 
of 2 the processing speed is extremely slow compared to that of Flink 18995c8. 
I disabled checkpointing altogether and the speed returned to previous levels.

I'm currently building Flink from source to pull in hotfixes added to the 
release-1.1 branch since commit 45f7825. I'll update this issue with my 
findings.

> Checkpoint state size grows unbounded when task parallelism not uniform
> ---
>
> Key: FLINK-4341
> URL: https://issues.apache.org/jira/browse/FLINK-4341
> Project: Flink
>  Issue Type: Bug
>  Components: Core
>Affects Versions: 1.1.0
>Reporter: Scott Kidder
>
> This issue was first encountered with Flink release 1.1.0 (commit 45f7825). I 
> was previously using a 1.1.0 snapshot (commit 18995c8) which performed as 
> expected.  This issue was introduced somewhere between those commits.
> I've got a Flink application that uses the Kinesis Stream Consumer to read 
> from a Kinesis stream with 2 shards. I've got 2 task managers with 2 slots 
> each, providing a total of 4 slots.  When running the application with a 
> parallelism of 4, the Kinesis consumer uses 2 slots (one per Kinesis shard) 
> and 4 slots for subsequent tasks that process the Kinesis stream data. I use 
> an in-memory store for checkpoint data.
> Yesterday I upgraded to Flink 1.1.0 (45f7825) and noticed that checkpoint 
> states were growing unbounded when running with a parallelism of 4, 
> checkpoint interval of 10 seconds:
> {code}
> ID  State Size
> 1   11.3 MB
> 2   20.9 MB
> 3   30.6 MB
> 4   41.4 MB
> 5   52.6 MB
> 6   62.5 MB
> 7   71.5 MB
> 8   83.3 MB
> 9   93.5 MB
> {code}
> The first 4 checkpoints generally succeed, but then fail with an exception 
> like the following:
> {code}
> java.lang.RuntimeException: Error triggering a checkpoint as the result of 
> receiving checkpoint barrier at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:768)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:758)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
>  at 
> org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
>  at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:183)
>  at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at 
> java.lang.Thread.run(Thread.java:745) Caused by: java.io.IOException: Size of 
> the state is larger than the maximum permitted memory-backed state. 
> Size=12105407 , maxSize=5242880 . Consider using a different state backend, 
> like the File System State backend. at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend.checkSize(MemoryStateBackend.java:146)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetBytes(MemoryStateBackend.java:200)
>  at 
> org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetHandle(MemoryStateBackend.java:190)
>  at 
> org.apache.flink.runtime.state.AbstractStateBackend$CheckpointStateOutputView.closeAndGetHandle(AbstractStateBackend.java:447)
>  at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.snapshotOperatorState(WindowOperator.java:879)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:598)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:762)
>  ... 8 more
> {code}
> Or:
> {code}
> 2016-08-09 17:44:43,626 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask - Restoring checkpointed state to task Fold: property_id, player -> 10-minute Sliding-Window Percentile Aggregation -> Sink: InfluxDB (2/4)
> 2016-08-09 17:44:51,236 ERROR akka.remote.EndpointWriter - Transient association error (association remains live)
> akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.tcp://flink@10.55.2.212:6123/user/jobmanager#510517238]: max allowed size 10485760 bytes, actual size of encoded class org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint was 10891825 bytes.
> {code}
> This can be fixed by simply submitting the job with a parallelism of 2. I 
> suspect there was a regression introduced relating to 

[GitHub] flink pull request #2346: [FLINK-4342] [build] Fix dependencies of flink-con...

2016-08-09 Thread StephanEwen
GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/2346

[FLINK-4342] [build] Fix dependencies of flink-connector-filesystem

This fixes dependencies of the `flink-connector-filesystem` project

  - Remove unneeded Guava dependency
  - Set hadoop-shaded-artifact dependency to 'provided' as in other 
connectors, because the dependency should not go into the user's jar.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink 
fix_filesystem_dependencies

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2346.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2346


commit 097c13833a7f643f6c665c66bae5e7db6ad57ea7
Author: Stephan Ewen 
Date:   2016-08-09T20:49:55Z

[FLINK-4342] [build] Fix dependencies of flink-connector-filesystem

  - Remove unneeded Guava dependency
  - Set hadoop-shaded-artifact dependency to 'provided'




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4342) Fix dependencies of flink-connector-filesystem

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414219#comment-15414219
 ] 

ASF GitHub Bot commented on FLINK-4342:
---

GitHub user StephanEwen opened a pull request:

https://github.com/apache/flink/pull/2346

[FLINK-4342] [build] Fix dependencies of flink-connector-filesystem

This fixes dependencies of the `flink-connector-filesystem` project

  - Remove unneeded Guava dependency
  - Set hadoop-shaded-artifact dependency to 'provided' as in other 
connectors, because the dependency should not go into the user's jar.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StephanEwen/incubator-flink 
fix_filesystem_dependencies

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2346.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2346


commit 097c13833a7f643f6c665c66bae5e7db6ad57ea7
Author: Stephan Ewen 
Date:   2016-08-09T20:49:55Z

[FLINK-4342] [build] Fix dependencies of flink-connector-filesystem

  - Remove unneeded Guava dependency
  - Set hadoop-shaded-artifact dependency to 'provided'




> Fix dependencies of flink-connector-filesystem
> --
>
> Key: FLINK-4342
> URL: https://issues.apache.org/jira/browse/FLINK-4342
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>
> The {{flink-connector-filesystem}} has inconsistent dependencies
>   - The Guava dependency is unused and can be removed
>   - The hadoop-shaded dependency is in 'compile' scope, but should be in 
> 'provided' scope, because it must not go into the user code jar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4342) Fix dependencies of flink-connector-filesystem

2016-08-09 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-4342:
---

 Summary: Fix dependencies of flink-connector-filesystem
 Key: FLINK-4342
 URL: https://issues.apache.org/jira/browse/FLINK-4342
 Project: Flink
  Issue Type: Bug
  Components: Streaming Connectors
Affects Versions: 1.1.0
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.2.0


The {{flink-connector-filesystem}} has inconsistent dependencies

  - The Guava dependency is unused and can be removed
  - The hadoop-shaded dependency is in 'compile' scope, but should be in 
'provided' scope, because it must not go into the user code jar.
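For illustration, the scope change described above amounts to a POM entry along these lines (coordinates are assumptions for the sketch, not copied from the actual pom.xml):

```xml
<!-- Sketch: 'provided' scope keeps the shaded Hadoop classes on the compile
     classpath but off the user's packaged (fat) jar. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-hadoop2</artifactId>
    <version>${project.version}</version>
    <scope>provided</scope>
</dependency>
```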



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-09 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414182#comment-15414182
 ] 

Ismaël Mejía commented on FLINK-4326:
-

Well, that's the question I was wondering about before my previous PR, but then I 
realized that having a centralized point for all the changes (the current 
flink-daemon.sh) was less error-prone. That's the reason I ended up mixing 
flink-daemon with an action like 'start-foreground'. On the other hand, we could 
rename flink-daemon to flink-service; it would do the same thing with a less 
confusing name.


> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.
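The fork-vs-exec point quoted above can be demonstrated directly in a shell. This is a minimal standalone demo (not Flink's actual scripts): with `exec`, the final program replaces the wrapper and keeps its PID, so a supervisor that remembers the wrapper's PID is watching the real process.

```shell
#!/bin/sh
# Without exec: the inner shell is a forked child, so the two PIDs differ.
forked=$(sh -c 'echo $$; sh -c "echo \$\$"')
# With exec: the inner shell replaces the outer one and reuses its PID.
execed=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

forked_outer=$(printf '%s\n' "$forked" | sed -n 1p)
forked_inner=$(printf '%s\n' "$forked" | sed -n 2p)
execed_outer=$(printf '%s\n' "$execed" | sed -n 1p)
execed_inner=$(printf '%s\n' "$execed" | sed -n 2p)

[ "$forked_outer" != "$forked_inner" ] && echo "fork: supervisor would watch the wrong PID"
[ "$execed_outer" = "$execed_inner" ] && echo "exec: PID preserved"
```

With `exec` at every hop of the chain (jobmanager.sh to flink-daemon.sh to java), the PID the supervisor sees is the JVM's own.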



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4341) Checkpoint state size grows unbounded when task parallelism not uniform

2016-08-09 Thread Scott Kidder (JIRA)
Scott Kidder created FLINK-4341:
---

 Summary: Checkpoint state size grows unbounded when task 
parallelism not uniform
 Key: FLINK-4341
 URL: https://issues.apache.org/jira/browse/FLINK-4341
 Project: Flink
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Scott Kidder


This issue was first encountered with Flink release 1.1.0 (commit 45f7825). I 
was previously using a 1.1.0 snapshot (commit 18995c8) which performed as 
expected.  This issue was introduced somewhere between those commits.

I've got a Flink application that uses the Kinesis Stream Consumer to read from 
a Kinesis stream with 2 shards. I've got 2 task managers with 2 slots each, 
providing a total of 4 slots.  When running the application with a parallelism 
of 4, the Kinesis consumer uses 2 slots (one per Kinesis shard) and 4 slots for 
subsequent tasks that process the Kinesis stream data. I use an in-memory store 
for checkpoint data.

Yesterday I upgraded to Flink 1.1.0 (45f7825) and noticed that checkpoint 
states were growing unbounded when running with a parallelism of 4, checkpoint 
interval of 10 seconds:

{code}
ID  State Size
1   11.3 MB
2   20.9 MB
3   30.6 MB
4   41.4 MB
5   52.6 MB
6   62.5 MB
7   71.5 MB
8   83.3 MB
9   93.5 MB
{code}

The first 4 checkpoints generally succeed, but then fail with an exception like 
the following:

{code}
java.lang.RuntimeException: Error triggering a checkpoint as the result of receiving checkpoint barrier
    at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:768)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:758)
    at org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:203)
    at org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:129)
    at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:183)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Size of the state is larger than the maximum permitted memory-backed state. Size=12105407, maxSize=5242880. Consider using a different state backend, like the File System State backend.
    at org.apache.flink.runtime.state.memory.MemoryStateBackend.checkSize(MemoryStateBackend.java:146)
    at org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetBytes(MemoryStateBackend.java:200)
    at org.apache.flink.runtime.state.memory.MemoryStateBackend$MemoryCheckpointOutputStream.closeAndGetHandle(MemoryStateBackend.java:190)
    at org.apache.flink.runtime.state.AbstractStateBackend$CheckpointStateOutputView.closeAndGetHandle(AbstractStateBackend.java:447)
    at org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.snapshotOperatorState(WindowOperator.java:879)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:598)
    at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent(StreamTask.java:762)
    ... 8 more
{code}

Or:

{code}
2016-08-09 17:44:43,626 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask - Restoring checkpointed state to task Fold: property_id, player -> 10-minute Sliding-Window Percentile Aggregation -> Sink: InfluxDB (2/4)
2016-08-09 17:44:51,236 ERROR akka.remote.EndpointWriter - Transient association error (association remains live)
akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.tcp://flink@10.55.2.212:6123/user/jobmanager#510517238]: max allowed size 10485760 bytes, actual size of encoded class org.apache.flink.runtime.messages.checkpoint.AcknowledgeCheckpoint was 10891825 bytes.
{code}

This can be fixed by simply submitting the job with a parallelism of 2. I 
suspect there was a regression introduced relating to assumptions about the 
number of sub-tasks associated with a job stage (e.g. assuming 4 instead of a 
value ranging from 1-4). This is currently preventing me from using all 
available Task Manager slots.
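Until the regression is tracked down, the exception's own suggestion (a file-system state backend, which is not subject to the memory backend's 5 MB default cap) can be applied in configuration. A sketch for flink-conf.yaml, with a placeholder checkpoint directory:

```yaml
# flink-conf.yaml (Flink 1.1): switch from the memory-backed state backend
# to the filesystem backend. The checkpoint path below is a placeholder.
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
```

This works around the size cap but does not explain the unbounded growth itself.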



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-09 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414167#comment-15414167
 ] 

Greg Hogan commented on FLINK-4326:
---

The configuration parsing in {{jobmanager.sh}}, {{taskmanager.sh}}, etc. will 
be useful for starting processes in foreground mode. Would it be better to 
leave {{flink-daemon.sh}} unchanged and create a new lightweight 
foreground-only start script?

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3623) Adjust MurmurHash algorithm

2016-08-09 Thread Stefan Richter (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414110#comment-15414110
 ] 

Stefan Richter commented on FLINK-3623:
---

In my opinion, our implementation of Murmur might be too complicated and 
computationally expensive for simply hashing an int.  Most parts of the used 
algorithm are only required for arbitrary byte[] keys. What matters for mixing 
the bits of an int is the finalizer part of Murmur3, which is:
  
  h ^= h >> 16;
  h *= 0x85ebca6b;
  h ^= h >> 13;
  h *= 0xc2b2ae35;
  h ^= h >> 16;

If I remember correctly, the purpose of the remaining code is to generate the 
intermediate hash from blocks over variable size byte[], whereas the purpose of 
the finalizer is to actually increase entropy and reduce collisions of the 
intermediate hash. Restricting the code to these lines could speed up hashing 
significantly, which might be even more relevant in case hash calculations are 
also used to determine keygroups.
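For reference, the five lines quoted above are Murmur3's 32-bit finalizer, commonly called fmix32. A standalone Java sketch (the class and method names are ours; note that Java needs the unsigned shift `>>>`, where the snippet above writes `>>`):

```java
public final class Murmur3Fmix {
    /**
     * Murmur3's 32-bit finalizer ("fmix32"): mixes the bits of a single int.
     * The unsigned shift {@code >>>} is required in Java; a signed {@code >>}
     * would drag the sign bit through negative intermediate values.
     */
    public static int fmix32(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```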

> Adjust MurmurHash algorithm
> ---
>
> Key: FLINK-3623
> URL: https://issues.apache.org/jira/browse/FLINK-3623
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>Priority: Trivial
> Fix For: 1.1.0
>
>
> Flink's MurmurHash implementation differs from the published algorithm.
> From Flink's MathUtils.java:
> {code}
> code *= 0xe6546b64;
> {code}
> The Murmur3_32 algorithm as described by 
> [Wikipedia|https://en.wikipedia.org/wiki/MurmurHash]:
> {code}
> m ← 5
> n ← 0xe6546b64
> hash ← hash × m + n
> {code}
> and in Guava's Murmur3_32HashFunction.java:
> {code}
> h1 = h1 * 5 + 0xe6546b64;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15414086#comment-15414086
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
@wenlong88 Can you explain this a bit more? How exactly are you creating 
the local checkpoint?



> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
@wenlong88 Can you explain this a bit more? How exactly are you creating 
the local checkpoint?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (FLINK-4309) Potential null pointer dereference in DelegatingConfiguration#keySet()

2016-08-09 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-4309:
--
Description: 
{code}
final int prefixLen = this.prefix == null ? 0 : this.prefix.length();

for (String key : this.backingConfig.keySet()) {
  if (key.startsWith(this.prefix)) {
{code}

If this.prefix == null, we would get NPE in startsWith():
{code}
public boolean startsWith(String prefix, int toffset) {
char ta[] = value;
int to = toffset;
char pa[] = prefix.value;
{code}

  was:
{code}
final int prefixLen = this.prefix == null ? 0 : this.prefix.length();

for (String key : this.backingConfig.keySet()) {
  if (key.startsWith(this.prefix)) {
{code}
If this.prefix == null, we would get NPE in startsWith():
{code}
public boolean startsWith(String prefix, int toffset) {
char ta[] = value;
int to = toffset;
char pa[] = prefix.value;
{code}
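For illustration, a null-safe variant of the quoted loop could guard the prefix once, consistent with the existing prefixLen computation (a null prefix behaves like the empty prefix, which every key matches). This is a sketch with assumed names, not Flink's actual code:

```java
import java.util.HashSet;
import java.util.Set;

public final class PrefixedKeys {
    // Null-safe version of the loop from the issue: a null prefix is treated
    // as "", so startsWith() never sees null. Names here are illustrative.
    public static Set<String> keysWithPrefix(Set<String> backingKeys, String prefix) {
        final String p = (prefix == null) ? "" : prefix;
        final Set<String> result = new HashSet<>();
        for (String key : backingKeys) {
            if (key.startsWith(p)) {
                result.add(key.substring(p.length())); // strip the prefix
            }
        }
        return result;
    }
}
```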


> Potential null pointer dereference in DelegatingConfiguration#keySet()
> --
>
> Key: FLINK-4309
> URL: https://issues.apache.org/jira/browse/FLINK-4309
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> {code}
> final int prefixLen = this.prefix == null ? 0 : this.prefix.length();
> for (String key : this.backingConfig.keySet()) {
>   if (key.startsWith(this.prefix)) {
> {code}
> If this.prefix == null, we would get NPE in startsWith():
> {code}
> public boolean startsWith(String prefix, int toffset) {
> char ta[] = value;
> int to = toffset;
> char pa[] = prefix.value;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (FLINK-4337) Remove unnecessary Scala suffix from Hadoop1 artifact

2016-08-09 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-4337.
---

> Remove unnecessary Scala suffix from Hadoop1 artifact
> -
>
> Key: FLINK-4337
> URL: https://issues.apache.org/jira/browse/FLINK-4337
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>
> The hadoop1 artifacts have a "_2.10" Scala suffix, but have no Scala 
> dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2343: [FLINK-4337] [build] Remove unnecessary Scala Suff...

2016-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2343


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4337) Remove unnecessary Scala suffix from Hadoop1 artifact

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413919#comment-15413919
 ] 

ASF GitHub Bot commented on FLINK-4337:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2343


> Remove unnecessary Scala suffix from Hadoop1 artifact
> -
>
> Key: FLINK-4337
> URL: https://issues.apache.org/jira/browse/FLINK-4337
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>
> The hadoop1 artifacts have a "_2.10" Scala suffix, but have no Scala 
> dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (FLINK-4337) Remove unnecessary Scala suffix from Hadoop1 artifact

2016-08-09 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-4337.
-
Resolution: Fixed

Fixed via aed7a2872a475f1d69b0fa11d837d0d56c06e825

> Remove unnecessary Scala suffix from Hadoop1 artifact
> -
>
> Key: FLINK-4337
> URL: https://issues.apache.org/jira/browse/FLINK-4337
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.1.0
>Reporter: Stephan Ewen
>Assignee: Stephan Ewen
> Fix For: 1.2.0
>
>
> The hadoop1 artifacts have a "_2.10" Scala suffix, but have no Scala 
> dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4253) Rename "recovery.mode" config key to "high-availability"

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413886#comment-15413886
 ] 

ASF GitHub Bot commented on FLINK-4253:
---

Github user ramkrish86 commented on a diff in the pull request:

https://github.com/apache/flink/pull/2342#discussion_r74105158
  
--- Diff: 
flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingUtils.scala
 ---
@@ -502,7 +503,8 @@ object TestingUtils {
   configuration: Configuration)
   : ActorGateway = {
 
-configuration.setString(ConfigConstants.HIGH_AVAILABILITY, 
ConfigConstants.DEFAULT_HIGH_AVAILABILTY)
+configuration.setString(ConfigConstants.HIGH_AVAILABILITY,
--- End diff --

After wrapping here 
`[INFO] force-shading .. SUCCESS [ 11.954 s]
[INFO] flink .. SUCCESS [01:01 min]
[INFO] flink-annotations .. SUCCESS [  7.724 s]
[INFO] flink-shaded-hadoop  SUCCESS [  0.364 s]
[INFO] flink-shaded-hadoop2 ... SUCCESS [ 15.982 s]
[INFO] flink-shaded-include-yarn-tests  SUCCESS [ 16.367 s]
[INFO] flink-shaded-curator ... SUCCESS [  0.278 s]
[INFO] flink-shaded-curator-recipes ... SUCCESS [  1.953 s]
[INFO] flink-shaded-curator-test .. SUCCESS [  0.738 s]
[INFO] flink-metrics .. SUCCESS [  0.326 s]
[INFO] flink-metrics-core . SUCCESS [  8.961 s]
[INFO] flink-test-utils-parent  SUCCESS [  0.302 s]
[INFO] flink-test-utils-junit . SUCCESS [  7.145 s]
[INFO] flink-core . SUCCESS [01:27 min]
[INFO] flink-java . SUCCESS [01:03 min]
[INFO] flink-runtime .. SUCCESS [03:31 min]
[INFO] flink-optimizer  SUCCESS [ 43.211 s]
[INFO] flink-clients .. SUCCESS [ 33.062 s]
[INFO] flink-streaming-java ... SUCCESS [01:11 min]
[INFO] flink-test-utils ... SUCCESS [ 29.337 s]
[INFO] flink-scala  SUCCESS [01:48 min]
[INFO] flink-runtime-web .. SUCCESS [ 28.635 s]
[INFO] flink-examples . SUCCESS [  3.177 s]
[INFO] flink-examples-batch ... SUCCESS [01:09 min]
[INFO] flink-contrib .. SUCCESS [  0.847 s]
[INFO] flink-statebackend-rocksdb . SUCCESS [ 11.992 s]
[INFO] flink-tests  SUCCESS [03:27 min]
[INFO] flink-streaming-scala .. SUCCESS [01:21 min]
[INFO] flink-streaming-connectors . SUCCESS [  0.532 s]
[INFO] flink-connector-flume .. SUCCESS [  8.794 s]
[INFO] flink-libraries  SUCCESS [  0.285 s]
[INFO] flink-table  SUCCESS [02:22 min]
[INFO] flink-metrics-jmx .. SUCCESS [  6.208 s]
[INFO] flink-connector-kafka-base . SUCCESS [ 14.832 s]
[INFO] flink-connector-kafka-0.8 .. SUCCESS [ 10.985 s]
[INFO] flink-connector-kafka-0.9 .. SUCCESS [  7.464 s]
[INFO] flink-connector-elasticsearch .. SUCCESS [ 11.186 s]
[INFO] flink-connector-elasticsearch2 . SUCCESS [ 11.327 s]
[INFO] flink-connector-rabbitmq ... SUCCESS [  5.484 s]
[INFO] flink-connector-twitter  SUCCESS [  6.641 s]
[INFO] flink-connector-nifi ... SUCCESS [  6.188 s]
[INFO] flink-connector-cassandra .. SUCCESS [ 11.955 s]
[INFO] flink-connector-redis .. SUCCESS [  7.246 s]
[INFO] flink-connector-filesystem . SUCCESS [  8.218 s]
[INFO] flink-batch-connectors . SUCCESS [  0.283 s]
[INFO] flink-avro . SUCCESS [  9.166 s]
[INFO] flink-jdbc . SUCCESS [  7.117 s]
[INFO] flink-hadoop-compatibility . SUCCESS [  9.769 s]
[INFO] flink-hbase  SUCCESS [  8.976 s]

[GitHub] flink pull request #2342: FLINK-4253 - Rename "recovery.mode" config key to ...

2016-08-09 Thread ramkrish86
Github user ramkrish86 commented on a diff in the pull request:

https://github.com/apache/flink/pull/2342#discussion_r74105158
  
--- Diff: 
flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingUtils.scala
 ---
@@ -502,7 +503,8 @@ object TestingUtils {
   configuration: Configuration)
   : ActorGateway = {
 
-configuration.setString(ConfigConstants.HIGH_AVAILABILITY, 
ConfigConstants.DEFAULT_HIGH_AVAILABILTY)
+configuration.setString(ConfigConstants.HIGH_AVAILABILITY,
--- End diff --

After wrapping here 
`[INFO] force-shading .. SUCCESS [ 11.954 s]
[INFO] flink .. SUCCESS [01:01 min]
[INFO] flink-annotations .. SUCCESS [  7.724 s]
[INFO] flink-shaded-hadoop  SUCCESS [  0.364 s]
[INFO] flink-shaded-hadoop2 ... SUCCESS [ 15.982 s]
[INFO] flink-shaded-include-yarn-tests  SUCCESS [ 16.367 s]
[INFO] flink-shaded-curator ... SUCCESS [  0.278 s]
[INFO] flink-shaded-curator-recipes ... SUCCESS [  1.953 s]
[INFO] flink-shaded-curator-test .. SUCCESS [  0.738 s]
[INFO] flink-metrics .. SUCCESS [  0.326 s]
[INFO] flink-metrics-core . SUCCESS [  8.961 s]
[INFO] flink-test-utils-parent  SUCCESS [  0.302 s]
[INFO] flink-test-utils-junit . SUCCESS [  7.145 s]
[INFO] flink-core . SUCCESS [01:27 min]
[INFO] flink-java . SUCCESS [01:03 min]
[INFO] flink-runtime .. SUCCESS [03:31 min]
[INFO] flink-optimizer  SUCCESS [ 43.211 s]
[INFO] flink-clients .. SUCCESS [ 33.062 s]
[INFO] flink-streaming-java ... SUCCESS [01:11 min]
[INFO] flink-test-utils ... SUCCESS [ 29.337 s]
[INFO] flink-scala  SUCCESS [01:48 min]
[INFO] flink-runtime-web .. SUCCESS [ 28.635 s]
[INFO] flink-examples . SUCCESS [  3.177 s]
[INFO] flink-examples-batch ... SUCCESS [01:09 min]
[INFO] flink-contrib .. SUCCESS [  0.847 s]
[INFO] flink-statebackend-rocksdb . SUCCESS [ 11.992 s]
[INFO] flink-tests  SUCCESS [03:27 min]
[INFO] flink-streaming-scala .. SUCCESS [01:21 min]
[INFO] flink-streaming-connectors . SUCCESS [  0.532 s]
[INFO] flink-connector-flume .. SUCCESS [  8.794 s]
[INFO] flink-libraries  SUCCESS [  0.285 s]
[INFO] flink-table  SUCCESS [02:22 min]
[INFO] flink-metrics-jmx .. SUCCESS [  6.208 s]
[INFO] flink-connector-kafka-base . SUCCESS [ 14.832 s]
[INFO] flink-connector-kafka-0.8 .. SUCCESS [ 10.985 s]
[INFO] flink-connector-kafka-0.9 .. SUCCESS [  7.464 s]
[INFO] flink-connector-elasticsearch .. SUCCESS [ 11.186 s]
[INFO] flink-connector-elasticsearch2 . SUCCESS [ 11.327 s]
[INFO] flink-connector-rabbitmq ... SUCCESS [  5.484 s]
[INFO] flink-connector-twitter  SUCCESS [  6.641 s]
[INFO] flink-connector-nifi ... SUCCESS [  6.188 s]
[INFO] flink-connector-cassandra .. SUCCESS [ 11.955 s]
[INFO] flink-connector-redis .. SUCCESS [  7.246 s]
[INFO] flink-connector-filesystem . SUCCESS [  8.218 s]
[INFO] flink-batch-connectors . SUCCESS [  0.283 s]
[INFO] flink-avro . SUCCESS [  9.166 s]
[INFO] flink-jdbc . SUCCESS [  7.117 s]
[INFO] flink-hadoop-compatibility . SUCCESS [  9.769 s]
[INFO] flink-hbase  SUCCESS [  8.976 s]
[INFO] flink-hcatalog . SUCCESS [ 15.990 s]
[INFO] flink-examples-streaming ... SUCCESS [ 35.892 s]
[INFO] flink-gelly  SUCCESS [ 

[jira] [Updated] (FLINK-4329) Fixes Streaming File Source Timestamps/Watermarks Handling

2016-08-09 Thread Kostas Kloudas (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kostas Kloudas updated FLINK-4329:
--
Summary: Fixes Streaming File Source Timestamps/Watermarks Handling  (was: 
Streaming File Source Must Correctly Handle Timestamps/Watermarks)

> Fixes Streaming File Source Timestamps/Watermarks Handling
> --
>
> Key: FLINK-4329
> URL: https://issues.apache.org/jira/browse/FLINK-4329
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Kostas Kloudas
> Fix For: 1.1.1
>
>
> The {{ContinuousFileReaderOperator}} does not correctly deal with watermarks, 
> i.e. they are just passed through. This means that when the 
> {{ContinuousFileMonitoringFunction}} closes and emits a {{Long.MAX_VALUE}} 
> that watermark can "overtake" the records that are to be emitted in the 
> {{ContinuousFileReaderOperator}}. Together with the new "allowed lateness" 
> setting in window operator this can lead to elements being dropped as late.
> Also, {{ContinuousFileReaderOperator}} does not correctly assign ingestion 
> timestamps since it is not technically a source but looks like one to the 
> user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-3929) Support for Kerberos Authentication with Keytab Credential

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413789#comment-15413789
 ] 

ASF GitHub Bot commented on FLINK-3929:
---

Github user vijikarthi commented on the issue:

https://github.com/apache/flink/pull/2275
  
@mxm - Could you please let me know if you are okay with the modifications 
to the integration test case scenarios that I have mentioned. I am open to 
keeping just 3 classes, one per scenario (HDFS, YARN & Kafka), as you have 
suggested, but in my opinion that would defeat the idea of reusing the existing 
test program. Please let me know either way and I will fix the code accordingly.


> Support for Kerberos Authentication with Keytab Credential
> --
>
> Key: FLINK-3929
> URL: https://issues.apache.org/jira/browse/FLINK-3929
> Project: Flink
>  Issue Type: New Feature
>Reporter: Eron Wright 
>Assignee: Vijay Srinivasaraghavan
>  Labels: kerberos, security
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> _This issue is part of a series of improvements detailed in the [Secure Data 
> Access|https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing]
>  design doc._
> Add support for a keytab credential to be associated with the Flink cluster, 
> to facilitate:
> - Kerberos-authenticated data access for connectors
> - Kerberos-authenticated ZooKeeper access
> Support both the standalone and YARN deployment modes.
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-09 Thread Milosz Tanski (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413790#comment-15413790
 ] 

Milosz Tanski commented on FLINK-4326:
--

[~greghogan] A lot of the things UNIX services traditionally did themselves 
(daemonizing, logging, log rotation) are being taken over by the system via init 
systems, supervisors, and containers.

The systemd supervisor uses Linux features similar to Docker's: it can create a 
new cgroup for a daemon's processes, which lets it monitor and shut down the 
whole group and apply limits such as CPU slices and memory caps. It also takes 
over log management (provided the service writes to stdout/stderr), so the 
system admin can manage all logs in one place on the box or ship them to a 
log-aggregation service.

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most process 
> supervision tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.





[GitHub] flink issue #2275: FLINK-3929 Support for Kerberos Authentication with Keyta...

2016-08-09 Thread vijikarthi
Github user vijikarthi commented on the issue:

https://github.com/apache/flink/pull/2275
  
@mxm - Could you please let me know if you are okay with the modifications 
to the integration test case scenarios that I have mentioned? I am open to keeping 
just three classes, one for each scenario (HDFS, YARN & Kafka), as you have suggested, 
but in my opinion that would defeat the idea of reusing the existing test program. 
Please let me know either way and I will fix the code accordingly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink issue #2002: Support for bz2 compression in flink-core

2016-08-09 Thread mtanski
Github user mtanski commented on the issue:

https://github.com/apache/flink/pull/2002
  
@StephanEwen would you like me to use the same version as the one currently 
included (transitively), or bump it to a recent version that includes Snappy?




[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-08-09 Thread Aditi Viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413733#comment-15413733
 ] 

Aditi Viswanathan commented on FLINK-4282:
--

Yes - exactly. So it's just the window that will shift, right? That is,
every element that comes in with a UTC timestamp will be put into the
window which starts and ends at the time corresponding to its time zone,
using the offset that is provided.

How would you suggest this problem be approached?





> Add Offset Parameter to WindowAssigners
> ---
>
> Key: FLINK-4282
> URL: https://issues.apache.org/jira/browse/FLINK-4282
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>
> Currently, windows are always aligned to EPOCH, which basically means days 
> are aligned with GMT. This is somewhat problematic for people living in 
> different timezones.
> An offset parameter would allow adapting the window assigner to the timezone.





[GitHub] flink issue #2289: [FLINK-3866] StringArraySerializer type should be mutable

2016-08-09 Thread mushketyk
Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2289
  
Hi, Could someone please review this small bug-fix?




[jira] [Commented] (FLINK-3866) StringArraySerializer claims type is immutable; shouldn't

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413711#comment-15413711
 ] 

ASF GitHub Bot commented on FLINK-3866:
---

Github user mushketyk commented on the issue:

https://github.com/apache/flink/pull/2289
  
Hi, Could someone please review this small bug-fix?


> StringArraySerializer claims type is immutable; shouldn't
> -
>
> Key: FLINK-3866
> URL: https://issues.apache.org/jira/browse/FLINK-3866
> Project: Flink
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 1.0.3
>Reporter: Tatu Saloranta
>Assignee: Ivan Mushketyk
>Priority: Minor
>
> Looking at default `TypeSerializer` instances I noticed what looks like a 
> minor flaw, unless I am missing something.
> Whereas all other array serializers indicate that type is not immutable 
> (since in Java, arrays are not immutable), `StringArraySerializer` has:
> ```
>   @Override
>   public boolean isImmutableType() {
>   return true;
>   }
> ```
> and I think it should instead return `false`. I could create a PR, but it seems 
> like a small enough thing that an issue report makes more sense.
> I tried looking for dependencies to see if there's a test for this, but couldn't 
> find one; otherwise I could submit a test fix.
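The risk behind this report is easy to demonstrate outside of Flink: if a serializer wrongly claims its type is immutable, a framework may skip defensive copies, and a later mutation of the original array corrupts the stored value. A minimal Python sketch, with lists standing in for Java arrays and `copy_if_needed` as a hypothetical helper (not Flink code):

```python
def copy_if_needed(value, is_immutable_type):
    """Mimics a framework that skips defensive copies for immutable types."""
    return value if is_immutable_type else list(value)

record = ["a", "b", "c"]

# Wrong: isImmutableType() == true, so no copy is made and the stored
# "snapshot" aliases the caller's array.
stored_wrong = copy_if_needed(record, is_immutable_type=True)

# Correct: isImmutableType() == false forces a defensive copy.
stored_right = copy_if_needed(record, is_immutable_type=False)

record[0] = "mutated"

print(stored_wrong[0])  # mutated -- snapshot silently corrupted
print(stored_right[0])  # a      -- the copy is unaffected
```

Since Java arrays are always mutable, `StringArraySerializer` returning `true` exposes exactly the aliasing shown in the first case.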





[jira] [Commented] (FLINK-4271) There is no way to set parallelism of operators produced by CoGroupedStreams

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413665#comment-15413665
 ] 

ASF GitHub Bot commented on FLINK-4271:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2305
  
We could "deprecate" the `apply()` method and redirect to `with(...)`. That 
would be easier to find than a comment about casting.


> There is no way to set parallelism of operators produced by CoGroupedStreams
> 
>
> Key: FLINK-4271
> URL: https://issues.apache.org/jira/browse/FLINK-4271
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Wenlong Lyu
>Assignee: Jark Wu
>
> Currently, CoGroupedStreams packages the map/keyBy/window operators behind a 
> human-friendly interface, like 
> dataStreamA.cogroup(streamB).where(...).equalTo(...).window(...).apply(). Both the 
> intermediate operators and the final window operator cannot be accessed by 
> users, and we cannot set attributes of the operators, which makes co-group 
> hard to use in a production environment.





[GitHub] flink issue #2305: [FLINK-4271] [DataStreamAPI] Enable CoGroupedStreams and ...

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2305
  
We could "deprecate" the `apply()` method and redirect to `with(...)`. That 
would be easier to find than a comment about casting.




[jira] [Closed] (FLINK-3779) Add support for queryable state

2016-08-09 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi closed FLINK-3779.
--
   Resolution: Implemented
Fix Version/s: 1.2.0

Implemented in a909adb~1..490e7eb (master).

> Add support for queryable state
> ---
>
> Key: FLINK-3779
> URL: https://issues.apache.org/jira/browse/FLINK-3779
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
> Fix For: 1.2.0
>
>
> Flink offers state abstractions for user functions in order to guarantee 
> fault-tolerant processing of streams. Users can work with both 
> non-partitioned (Checkpointed interface) and partitioned state 
> (getRuntimeContext().getState(ValueStateDescriptor) and other variants).
> The partitioned state interface provides access to different types of state 
> that are all scoped to the key of the current input element. This type of 
> state can only be used on a KeyedStream, which is created via stream.keyBy().
> Currently, all of this state is internal to Flink and used in order to 
> provide processing guarantees in failure cases (e.g. exactly-once processing).
> The goal of Queryable State is to expose this state outside of Flink by 
> supporting queries against the partitioned key value state.
> This will help to eliminate the need for distributed operations/transactions 
> with external systems such as key-value stores which are often the bottleneck 
> in practice. Exposing the local state to the outside moves a good part of the 
> database work into the stream processor, allowing both high throughput 
> queries and immediate access to the computed state.
> This is the initial design doc for the feature: 
> https://docs.google.com/document/d/1NkQuhIKYmcprIU5Vjp04db1HgmYSsZtCMxgDi_iTN-g.
>  Feel free to comment.
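The core idea, querying the partitioned key-value state inside the stream processor instead of mirroring it into an external store, can be sketched in a few lines. This is a toy in-memory model, not Flink's implementation; names like `QueryableKeyedState` are invented for illustration:

```python
class QueryableKeyedState:
    """Toy model: state is partitioned by key inside the stream processor,
    and external queries read it directly instead of hitting a separate
    key-value store."""

    def __init__(self):
        self._state = {}  # key -> value, normally scoped per keyed operator

    def update(self, key, fn, default):
        # What the streaming job does for each element's current key.
        self._state[key] = fn(self._state.get(key, default))

    def query(self, key):
        # What an external client would do instead of querying a database.
        return self._state.get(key)

state = QueryableKeyedState()
for user in ["alice", "bob", "alice"]:
    state.update(user, lambda count: count + 1, default=0)

print(state.query("alice"))  # 2
print(state.query("bob"))    # 1
```

The real feature additionally has to route a query to the task manager holding the key's partition and serve reads concurrently with checkpointing, which is where most of the actual work in the design doc lies.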





[jira] [Commented] (FLINK-4271) There is no way to set parallelism of operators produced by CoGroupedStreams

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413649#comment-15413649
 ] 

ASF GitHub Bot commented on FLINK-4271:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2305
  
Wouldn't this suffer from the same problem as the "casting solution"? 
People would use `apply` and then wonder why there is no `setParallelism`, not 
bothering to read the Javadoc to find out that there is also `with`.


> There is no way to set parallelism of operators produced by CoGroupedStreams
> 
>
> Key: FLINK-4271
> URL: https://issues.apache.org/jira/browse/FLINK-4271
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Wenlong Lyu
>Assignee: Jark Wu
>
> Currently, CoGroupedStreams packages the map/keyBy/window operators behind a 
> human-friendly interface, like 
> dataStreamA.cogroup(streamB).where(...).equalTo(...).window(...).apply(). Both the 
> intermediate operators and the final window operator cannot be accessed by 
> users, and we cannot set attributes of the operators, which makes co-group 
> hard to use in a production environment.





[GitHub] flink issue #2305: [FLINK-4271] [DataStreamAPI] Enable CoGroupedStreams and ...

2016-08-09 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2305
  
Wouldn't this suffer from the same problem as the "casting solution"? 
People would use `apply` and then wonder why there is no `setParallelism`, not 
bothering to read the Javadoc to find out that there is also `with`.




[jira] [Commented] (FLINK-3779) Add support for queryable state

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413647#comment-15413647
 ] 

ASF GitHub Bot commented on FLINK-3779:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2051


> Add support for queryable state
> ---
>
> Key: FLINK-3779
> URL: https://issues.apache.org/jira/browse/FLINK-3779
> Project: Flink
>  Issue Type: Improvement
>  Components: Distributed Coordination
>Reporter: Ufuk Celebi
>Assignee: Ufuk Celebi
>
> Flink offers state abstractions for user functions in order to guarantee 
> fault-tolerant processing of streams. Users can work with both 
> non-partitioned (Checkpointed interface) and partitioned state 
> (getRuntimeContext().getState(ValueStateDescriptor) and other variants).
> The partitioned state interface provides access to different types of state 
> that are all scoped to the key of the current input element. This type of 
> state can only be used on a KeyedStream, which is created via stream.keyBy().
> Currently, all of this state is internal to Flink and used in order to 
> provide processing guarantees in failure cases (e.g. exactly-once processing).
> The goal of Queryable State is to expose this state outside of Flink by 
> supporting queries against the partitioned key value state.
> This will help to eliminate the need for distributed operations/transactions 
> with external systems such as key-value stores which are often the bottleneck 
> in practice. Exposing the local state to the outside moves a good part of the 
> database work into the stream processor, allowing both high throughput 
> queries and immediate access to the computed state.
> This is the initial design doc for the feature: 
> https://docs.google.com/document/d/1NkQuhIKYmcprIU5Vjp04db1HgmYSsZtCMxgDi_iTN-g.
>  Feel free to comment.





[GitHub] flink pull request #2051: [FLINK-3779] Add support for queryable state

2016-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2051




[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-08-09 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413645#comment-15413645
 ] 

Aljoscha Krettek commented on FLINK-4282:
-

The {{WindowAssigner}} should not try to shift the timestamp around or do 
anything fancy. Specifying a timezone should just be another way of specifying 
where hours start with respect to UTC. (I brought this up just because there 
are some "exotic" time zones that are not shifted by exact hours compared to 
UTC.) 
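Concretely, the offset only moves where window boundaries fall relative to UTC. A sketch of the usual start-of-window arithmetic (the function name is illustrative, not Flink's API), using India's UTC+5:30 as an example of a zone not shifted by whole hours:

```python
def window_start(timestamp, size, offset):
    """Start of the window containing `timestamp`, for windows of length
    `size` whose boundaries are shifted by `offset` from the epoch."""
    return timestamp - (timestamp - offset) % size

DAY = 24 * 60 * 60 * 1000  # ms

# Without an offset, day windows are aligned to UTC midnight.
print(window_start(5 * DAY + 123, DAY, 0) == 5 * DAY)  # True

# Shifting boundaries by -5.5h makes day windows cover IST calendar days.
IST = -(5 * 60 + 30) * 60 * 1000
start = window_start(5 * DAY, DAY, IST)
print((5 * DAY - start) == int(5.5 * 3600 * 1000))  # True: boundary 5.5h earlier
```

The element's timestamp itself is never touched; only the grid of window boundaries shifts, which matches the comment above that the assigner should not "shift the timestamp around".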

> Add Offset Parameter to WindowAssigners
> ---
>
> Key: FLINK-4282
> URL: https://issues.apache.org/jira/browse/FLINK-4282
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>
> Currently, windows are always aligned to EPOCH, which basically means days 
> are aligned with GMT. This is somewhat problematic for people living in 
> different timezones.
> An offset parameter would allow adapting the window assigner to the timezone.





[GitHub] flink pull request #2344: [hotfix][build] Remove Scala suffix from Hadoop1 s...

2016-08-09 Thread rmetzger
Github user rmetzger closed the pull request at:

https://github.com/apache/flink/pull/2344




[GitHub] flink issue #2344: [hotfix][build] Remove Scala suffix from Hadoop1 shading ...

2016-08-09 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2344
  
True. Thank you, I'll close this one.




[jira] [Commented] (FLINK-4271) There is no way to set parallelism of operators produced by CoGroupedStreams

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413641#comment-15413641
 ] 

ASF GitHub Bot commented on FLINK-4271:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2305
  
I just thought of a different trick: We could add a second variant of the 
`apply(..)` function (for example called `with(...)` as in the DataSet API) and 
have the proper return type there (calling `apply()` internally and casting).

We can then immediately deprecate the `with()` function to indicate that it 
is a temporary workaround and is to be replaced by `apply(...)` in Flink 2.0.
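The trick is purely about declared return types. A toy Python model of the pattern (all names invented for illustration, not the Flink classes): `with_` delegates to `apply` but hands back the more specific type, so operator settings such as parallelism become reachable without breaking the binary-compatible `apply` signature:

```python
class DataStream:
    """Base type that the binary-compatible apply() is declared to return."""

class SingleOutputStreamOperator(DataStream):
    """Subtype exposing operator settings such as parallelism."""
    def __init__(self):
        self.parallelism = None

    def set_parallelism(self, n):
        self.parallelism = n
        return self

class WindowedCoGroup:
    def apply(self, fn) -> DataStream:
        # For binary compatibility the declared return type stays DataStream,
        # even though the object really is a SingleOutputStreamOperator.
        op = SingleOutputStreamOperator()
        op.fn = fn
        return op

    def with_(self, fn) -> SingleOutputStreamOperator:
        # The temporary workaround: same behavior, more specific return type.
        return self.apply(fn)

result = WindowedCoGroup().with_(lambda a, b: (a, b)).set_parallelism(4)
print(result.parallelism)  # 4
```

In Java the "cast" in `with(...)` is real, whereas `apply(...)` keeps its old erased signature; the runtime object is identical either way.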


> There is no way to set parallelism of operators produced by CoGroupedStreams
> 
>
> Key: FLINK-4271
> URL: https://issues.apache.org/jira/browse/FLINK-4271
> Project: Flink
>  Issue Type: Bug
>  Components: DataStream API
>Reporter: Wenlong Lyu
>Assignee: Jark Wu
>
> Currently, CoGroupedStreams packages the map/keyBy/window operators behind a 
> human-friendly interface, like 
> dataStreamA.cogroup(streamB).where(...).equalTo(...).window(...).apply(). Both the 
> intermediate operators and the final window operator cannot be accessed by 
> users, and we cannot set attributes of the operators, which makes co-group 
> hard to use in a production environment.





[GitHub] flink issue #2344: [hotfix][build] Remove Scala suffix from Hadoop1 shading ...

2016-08-09 Thread uce
Github user uce commented on the issue:

https://github.com/apache/flink/pull/2344
  
Duplicate of #2343?




[GitHub] flink issue #2305: [FLINK-4271] [DataStreamAPI] Enable CoGroupedStreams and ...

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2305
  
I just thought of a different trick: We could add a second variant of the 
`apply(..)` function (for example called `with(...)` as in the DataSet API) and 
have the proper return type there (calling `apply()` internally and casting).

We can then immediately deprecate the `with()` function to indicate that it 
is a temporary workaround and is to be replaced by `apply(...)` in Flink 2.0.




[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413637#comment-15413637
 ] 

ASF GitHub Bot commented on FLINK-4334:
---

Github user rehevkor5 commented on the issue:

https://github.com/apache/flink/pull/2341
  
You got it @StephanEwen 


> Shaded Hadoop1 jar not fully excluded in Quickstart
> ---
>
> Key: FLINK-4334
> URL: https://issues.apache.org/jira/browse/FLINK-4334
> Project: Flink
>  Issue Type: Bug
>  Components: Quickstarts
>Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3
>Reporter: Shannon Carey
> Fix For: 1.1.2
>
>
> The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink 
> 1.0.0 (see 
> https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837),
>  but the quickstart POMs both refer to it as flink-shaded-hadoop1.
> If using "-Pbuild-jar", the problem is not encountered.





[GitHub] flink issue #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoop1 in q...

2016-08-09 Thread rehevkor5
Github user rehevkor5 commented on the issue:

https://github.com/apache/flink/pull/2341
  
You got it @StephanEwen 




[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413638#comment-15413638
 ] 

ASF GitHub Bot commented on FLINK-4334:
---

Github user rehevkor5 closed the pull request at:

https://github.com/apache/flink/pull/2341


> Shaded Hadoop1 jar not fully excluded in Quickstart
> ---
>
> Key: FLINK-4334
> URL: https://issues.apache.org/jira/browse/FLINK-4334
> Project: Flink
>  Issue Type: Bug
>  Components: Quickstarts
>Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3
>Reporter: Shannon Carey
> Fix For: 1.1.2
>
>
> The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink 
> 1.0.0 (see 
> https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837),
>  but the quickstart POMs both refer to it as flink-shaded-hadoop1.
> If using "-Pbuild-jar", the problem is not encountered.





[GitHub] flink pull request #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoo...

2016-08-09 Thread rehevkor5
Github user rehevkor5 closed the pull request at:

https://github.com/apache/flink/pull/2341




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413596#comment-15413596
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user wenlong88 commented on the issue:

https://github.com/apache/flink/pull/2345
  
@aljoscha we use the RocksDB checkpoint mechanism to do the semi-async 
checkpoint, which uses hard links to make a checkpoint and costs very little IO and 
time in the synchronous phase. This works well even when the state is large, and 
using the checkpoint dir to restore is also very fast, since no extra IO is needed 
if the task manager didn't change.


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread wenlong88
Github user wenlong88 commented on the issue:

https://github.com/apache/flink/pull/2345
  
@aljoscha we use the RocksDB checkpoint mechanism to do the semi-async 
checkpoint, which uses hard links to make a checkpoint and costs very little IO and 
time in the synchronous phase. This works well even when the state is large, and 
using the checkpoint dir to restore is also very fast, since no extra IO is needed 
if the task manager didn't change.
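The hard-link technique mentioned here (a cheap synchronous snapshot that copies no data) can be demonstrated generically. This uses plain `os.link` on a temp file as a stand-in for RocksDB's checkpoint of its immutable SST files; it is not the RocksDB implementation:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    db_dir = os.path.join(root, "db")
    ckpt_dir = os.path.join(root, "checkpoint-1")
    os.makedirs(db_dir)
    os.makedirs(ckpt_dir)

    # An immutable state file, as RocksDB's SST files are once written.
    sst = os.path.join(db_dir, "000001.sst")
    with open(sst, "wb") as f:
        f.write(b"key-value data" * 1000)

    # "Checkpoint": hard-link the file instead of copying its bytes.
    # This is near-free in IO regardless of state size, which is why the
    # synchronous phase stays short even for large state.
    link = os.path.join(ckpt_dir, "000001.sst")
    os.link(sst, link)

    same_inode = os.stat(sst).st_ino == os.stat(link).st_ino
    print(same_inode)  # True: both names point at the same data blocks
```

Restore is then just reopening the linked files in place, which matches the observation above that no extra IO is needed when the task manager hasn't moved.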




[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread gyfora
Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
Hi,

Isn't this way of checkpointing much, much slower than the semi-async 
version?




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413579#comment-15413579
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
But I wonder what would happen in a scenario with a lot of states:

Semi-async: short local copy time at every snapshot + very fast restore
Fully async: no copy time + very slow restore (puts, sorting data, recreating 
the index, etc.)


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread gyfora
Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
But I wonder what would happen in a scenario with a lot of states:

Semi-async: short local copy time at every snapshot + very fast restore
Fully async: no copy time + very slow restore (puts, sorting data, recreating 
the index, etc.)




[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread gyfora
Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
The good thing about the way fully async checkpoints are restored, though, is 
that it is very trivial to insert some state adaptor code :) 




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413575#comment-15413575
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2345
  
Jip, that's also good.  


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2345
  
Jip, that's also good. 😃 




[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground

2016-08-09 Thread Greg Hogan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413551#comment-15413551
 ] 

Greg Hogan commented on FLINK-4326:
---

Using {{exec}} solves the problem of not knowing the PID until the daemon has 
launched but doesn't allow for removing the PID after termination. What level 
of monitoring is performed by the supervisor? Is this simply an "is this process 
still alive" check, or more complicated, like tracking CPU and memory usage?

> Flink start-up scripts should optionally start services on the foreground
> -
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been 
> addressed.  Flink start-up scripts start the job and task managers in the 
> background.  This makes it difficult to integrate Flink with most processes 
> supervisory tools and init systems, including Docker.  One can get around 
> this via hacking the scripts or manually starting the right classes via Java, 
> but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start up scripts 
> should use exec instead of running the commands, so as to avoid forks.  Many 
> supervisory tools assume the PID of the process to be monitored is that of 
> the process it first executes, and fork chains make it difficult for the 
> supervisor to figure out what process to monitor.  Specifically, 
> jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and 
> flink-daemon.sh should exec java.
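The fork-chain problem described above can be sketched as follows. This is a minimal Python stand-in, not Flink code: `subprocess.Popen` plays the role of a script launching a child without `exec`, so the PID a supervisor would hold (the wrapper's) differs from the PID doing the work.

```python
import os
import subprocess
import sys

# Sketch: a "wrapper" process (this script) forks a child instead of
# exec-ing it. A supervisor that started the wrapper tracks wrapper_pid,
# while the real daemon runs under a different child_pid.
wrapper_pid = os.getpid()
child = subprocess.Popen(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    stdout=subprocess.PIPE, text=True,
)
child_pid = int(child.communicate()[0])

# Replacing the fork with os.execvp() (the Python analogue of the shell
# builtin `exec`) would replace the wrapper's process image in place,
# making the two PIDs identical.
print(wrapper_pid != child_pid)  # True: the supervisor watches the wrong PID
```

In flink-daemon.sh terms, the proposal is that the final launch line becomes `exec java ...`, so the JVM inherits the script's PID.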





[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413563#comment-15413563
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
A good thing about the way fully async checkpoints are restored, though, is 
that it is trivial to insert some state adaptor code :) 


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413536#comment-15413536
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
+1 for this

There seems to be an issue with the RocksDB backup engine, so we should 
probably discourage that mode even in current releases.

I would also remove the `HDFSCopyFromLocal` and `HDFSCopyToLocal` utils. We 
should not have dead code in the repository and we can always re-add them from 
the git history if needed.


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
+1 for this

There seems to be an issue with the RocksDB backup engine, so we should 
probably discourage that mode even in current releases.

I would also remove the `HDFSCopyFromLocal` and `HDFSCopyToLocal` utils. We 
should not have dead code in the repository and we can always re-add them from 
the git history if needed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413548#comment-15413548
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
But you are right, it is probably more important to keep the latency down 
for the running programs, and for that the fully async seems to be strictly 
better.


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread gyfora
Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
But you are right, it is probably more important to keep the latency down 
for the running programs, and for that the fully async seems to be strictly 
better.




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413547#comment-15413547
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
Some of the benefits we lose on restore. Especially for very large states 
this can be pretty serious.

Maybe this is required for the sharding to some extent, but I don't see this 
as completely straightforward in terms of which one is better in production.


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread gyfora
Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
Some of the benefits we lose on restore. Especially for very large states 
this can be pretty serious.

Maybe this is required for the sharding to some extent, but I don't see this 
as completely straightforward in terms of which one is better in production.




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413542#comment-15413542
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
@gyfora You mean if the "full async" is slower than the "semi async"?


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413537#comment-15413537
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user gyfora commented on the issue:

https://github.com/apache/flink/pull/2345
  
Hi,

Isn't this way of checkpointing much slower than the semi-async 
version?


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
The "full async" takes more time but runs completely in the background, so 
performs better in most cases than "semi async".




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413544#comment-15413544
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
The "full async" takes more time but runs completely in the background, so 
performs better in most cases than "semi async".


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2345
  
@gyfora You mean if the "full async" is slower than the "semi async"?




[GitHub] flink issue #2002: Support for bz2 compression in flink-core

2016-08-09 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/2002
  
I think this looks good. +1 to merge.

As a followup, we can upgrade the `commons-compression` version via 
dependency management (if it is backwards compatible, which apache commons libs 
usually are).




[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413521#comment-15413521
 ] 

ASF GitHub Bot commented on FLINK-2090:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2323


> toString of CollectionInputFormat takes long time when the collection is huge
> -
>
> Key: FLINK-2090
> URL: https://issues.apache.org/jira/browse/FLINK-2090
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
> Fix For: 1.2.0
>
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}. Thus, {{toString}} is called for each element 
> of the collection. If the {{Collection}} contains many elements or the 
> individual {{toString}} calls for each element take a long time, then the 
> string generation can take a considerable amount of time. [~mikiobraun] 
> noticed that when he inserted several jBLAS matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}} 
> or if it's not enough to print only the first 3 elements in the {{toString}} 
> method.
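The proposed fix of printing only the first few elements can be sketched like this. This is a generic illustration of a bounded-prefix preview, not the actual {{CollectionInputFormat}} code; the name `preview` is hypothetical.

```python
from itertools import islice

def preview(collection, limit=3):
    # Build a toString-style string from at most `limit` elements, so
    # the cost of a log statement is O(limit) rather than O(n) even
    # for huge collections or elements with expensive repr().
    head = list(islice(iter(collection), limit + 1))  # fetch limit+1 to detect truncation
    shown = ", ".join(repr(x) for x in head[:limit])
    suffix = ", ..." if len(head) > limit else ""
    return f"[{shown}{suffix}]"

print(preview(range(10**6)))  # [0, 1, 2, ...] without touching the other elements
print(preview([1, 2]))        # [1, 2]
```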





[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners

2016-08-09 Thread Aditi Viswanathan (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413519#comment-15413519
 ] 

Aditi Viswanathan commented on FLINK-4282:
--

Also, when we want to assign time zones, System.currentTimeMillis() will
always give the epoch time, irrespective of time zone. So we'd have to use
the epoch time and the correct window size and triggering time, as well as
assign the element to the specified time zone.

This is the same case as the scenario I mentioned with the 5 seconds,
because the processing time will be in UTC and won't directly fall into the
other time zone's buckets. That's why I've modified the triggering so that,
whatever the offset, it will still trigger correctly after the specified
window size.






> Add Offset Parameter to WindowAssigners
> ---
>
> Key: FLINK-4282
> URL: https://issues.apache.org/jira/browse/FLINK-4282
> Project: Flink
>  Issue Type: Improvement
>  Components: Streaming
>Reporter: Aljoscha Krettek
>
> Currently, windows are always aligned to EPOCH, which basically means days 
> are aligned with GMT. This is somewhat problematic for people living in 
> different timezones.
> An offset parameter would allow adapting the window assigner to the timezone.
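The alignment being discussed can be sketched with the standard offset-window formula. This is a simplified model of tumbling-window start computation, not the actual Flink assigner API; the function name and timestamps are illustrative.

```python
DAY_MS = 24 * 60 * 60 * 1000

def window_start(timestamp_ms, size_ms, offset_ms=0):
    # Start of the tumbling window containing `timestamp_ms`, with
    # window boundaries aligned to epoch plus `offset_ms`.
    # offset_ms=0 reproduces the current GMT-aligned behaviour.
    return timestamp_ms - (timestamp_ms - offset_ms) % size_ms

ts = 1470754800000  # 2016-08-09T15:00:00Z in epoch millis
# GMT-aligned daily window starts at 2016-08-09T00:00:00Z ...
print(window_start(ts, DAY_MS))
# ... while an IST (UTC+5:30) aligned day starts 5.5 hours earlier:
print(window_start(ts, DAY_MS, offset_ms=-(5 * 3600 + 30 * 60) * 1000))
```

The triggering concern above follows from the same formula: whatever the offset, consecutive window starts are still exactly `size_ms` apart, so firing remains periodic.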





[jira] [Closed] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge

2016-08-09 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen closed FLINK-2090.
---

> toString of CollectionInputFormat takes long time when the collection is huge
> -
>
> Key: FLINK-2090
> URL: https://issues.apache.org/jira/browse/FLINK-2090
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
> Fix For: 1.2.0
>
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}. Thus, {{toString}} is called for each element 
> of the collection. If the {{Collection}} contains many elements or the 
> individual {{toString}} calls for each element take a long time, then the 
> string generation can take a considerable amount of time. [~mikiobraun] 
> noticed that when he inserted several jBLAS matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}} 
> or if it's not enough to print only the first 3 elements in the {{toString}} 
> method.





[jira] [Commented] (FLINK-4035) Bump Kafka producer in Kafka sink to Kafka 0.10.0.0

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413518#comment-15413518
 ] 

ASF GitHub Bot commented on FLINK-4035:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2231
  
Sorry for joining this discussion late. I've been on vacation.
I also stumbled across the code duplicates. I'll check out the code from 
this pull request and see if there's a good way of re-using most of the 0.9 
connector code.


> Bump Kafka producer in Kafka sink to Kafka 0.10.0.0
> ---
>
> Key: FLINK-4035
> URL: https://issues.apache.org/jira/browse/FLINK-4035
> Project: Flink
>  Issue Type: Bug
>  Components: Kafka Connector
>Affects Versions: 1.0.3
>Reporter: Elias Levy
>Priority: Minor
>
> Kafka 0.10.0.0 introduced protocol changes related to the producer.  
> Published messages now include timestamps and compressed messages now include 
> relative offsets.  As it is now, brokers must decompress publisher-compressed 
> messages, assign offsets to them, and recompress them, which is wasteful and 
> makes it less likely that compression will be used at all.





[GitHub] flink issue #2231: [FLINK-4035] Bump Kafka producer in Kafka sink to Kafka 0...

2016-08-09 Thread rmetzger
Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2231
  
Sorry for joining this discussion late. I've been on vacation.
I also stumbled across the code duplicates. I'll check out the code from 
this pull request and see if there's a good way of re-using most of the 0.9 
connector code.




[GitHub] flink pull request #2323: [FLINK-2090] toString of CollectionInputFormat tak...

2016-08-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2323




[jira] [Resolved] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge

2016-08-09 Thread Stephan Ewen (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephan Ewen resolved FLINK-2090.
-
   Resolution: Fixed
Fix Version/s: 1.2.0

Fixed via b5d58934d7124e0076e588e74485a60e7c1f484b

Thank you for the contribution

> toString of CollectionInputFormat takes long time when the collection is huge
> -
>
> Key: FLINK-2090
> URL: https://issues.apache.org/jira/browse/FLINK-2090
> Project: Flink
>  Issue Type: Improvement
>Reporter: Till Rohrmann
>Assignee: Ivan Mushketyk
>Priority: Minor
> Fix For: 1.2.0
>
>
> The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on 
> its underlying {{Collection}}. Thus, {{toString}} is called for each element 
> of the collection. If the {{Collection}} contains many elements or the 
> individual {{toString}} calls for each element take a long time, then the 
> string generation can take a considerable amount of time. [~mikiobraun] 
> noticed that when he inserted several jBLAS matrices into Flink.
> The {{toString}} method is mainly used for logging statements in 
> {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and 
> in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it 
> is necessary to print the complete content of the underlying {{Collection}} 
> or if it's not enough to print only the first 3 elements in the {{toString}} 
> method.





[jira] [Closed] (FLINK-4212) Lock PID file when starting daemons

2016-08-09 Thread Greg Hogan (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Hogan closed FLINK-4212.
-
Resolution: Implemented

Implemented in 46b427fac9cfceca7839fc93f06ba758101f4fee

> Lock PID file when starting daemons
> ---
>
> Key: FLINK-4212
> URL: https://issues.apache.org/jira/browse/FLINK-4212
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>
> As noted on the mailing list (0), when multiple TaskManagers are started in 
> parallel (using pdsh) there is a race condition on updating the pid: 1) the 
> pid file is first read to parse the process' index, 2) the process is 
> started, and 3) on success the daemon pid is appended to the pid file.
> We could use a tool such as {{flock}} to lock on the pid file while starting 
> the Flink daemon.
> 0: 
> http://mail-archives.apache.org/mod_mbox/flink-user/201607.mbox/%3CCA%2BssbKXw954Bz_sBRwP6db0FntWyGWzTyP7wJZ5nhOeQnof3kg%40mail.gmail.com%3E
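The locking idea can be sketched as follows. Python's `fcntl.flock` stands in here for the shell `flock` tool the issue suggests; the pid-file path and function name are hypothetical, and in flink-daemon.sh the lock would wrap the same read-modify-write sequence.

```python
import fcntl
import os
import tempfile

PID_FILE = os.path.join(tempfile.gettempdir(), "flink-demo.pid")  # hypothetical path

def register_daemon(pid):
    # Hold an exclusive lock across the whole read-modify-write so that
    # TaskManagers started in parallel cannot race between parsing the
    # process index (step 1) and appending their pid (step 3).
    with open(PID_FILE, "a+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)     # blocks until we own the pid file
        f.seek(0)
        index = len(f.readlines())        # step 1: parse the process index
        f.write(f"{pid}\n")               # step 3: append the daemon pid
        return index                      # lock is released when the file closes

if os.path.exists(PID_FILE):
    os.remove(PID_FILE)
print(register_daemon(1111), register_daemon(2222))  # 0 1
```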



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4212) Lock PID file when starting daemons

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413516#comment-15413516
 ] 

ASF GitHub Bot commented on FLINK-4212:
---

Github user greghogan closed the pull request at:

https://github.com/apache/flink/pull/2251


> Lock PID file when starting daemons
> ---
>
> Key: FLINK-4212
> URL: https://issues.apache.org/jira/browse/FLINK-4212
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>
> As noted on the mailing list (0), when multiple TaskManagers are started in 
> parallel (using pdsh) there is a race condition on updating the pid: 1) the 
> pid file is first read to parse the process' index, 2) the process is 
> started, and 3) on success the daemon pid is appended to the pid file.
> We could use a tool such as {{flock}} to lock on the pid file while starting 
> the Flink daemon.
> 0: 
> http://mail-archives.apache.org/mod_mbox/flink-user/201607.mbox/%3CCA%2BssbKXw954Bz_sBRwP6db0FntWyGWzTyP7wJZ5nhOeQnof3kg%40mail.gmail.com%3E





[GitHub] flink pull request #2251: [FLINK-4212] [scripts] Lock PID file when starting...

2016-08-09 Thread greghogan
Github user greghogan closed the pull request at:

https://github.com/apache/flink/pull/2251




[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413498#comment-15413498
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

GitHub user aljoscha reopened a pull request:

https://github.com/apache/flink/pull/2345

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.

R: @StephanEwen for review, should be fairly easy, though

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2345


commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd
Author: Aljoscha Krettek 
Date:   2016-08-09T13:16:59Z

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.




> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413497#comment-15413497
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user aljoscha closed the pull request at:

https://github.com/apache/flink/pull/2345


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413496#comment-15413496
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2345
  
Technically, `HDFSCopyFromLocal` and `HDFSCopyToLocal` are now unused. 
Should we remove them? They might be useful for some stuff in the future.


> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.





[GitHub] flink pull request #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint ...

2016-08-09 Thread aljoscha
Github user aljoscha closed the pull request at:

https://github.com/apache/flink/pull/2345




[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/2345
  
Technically, `HDFSCopyFromLocal` and `HDFSCopyToLocal` are now unused. 
Should we remove them? They might be useful for some stuff in the future.




[GitHub] flink pull request #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint ...

2016-08-09 Thread aljoscha
GitHub user aljoscha reopened a pull request:

https://github.com/apache/flink/pull/2345

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.

R: @StephanEwen for review, should be fairly easy, though

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2345


commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd
Author: Aljoscha Krettek 
Date:   2016-08-09T13:16:59Z

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.






[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413493#comment-15413493
 ] 

ASF GitHub Bot commented on FLINK-4340:
---

GitHub user aljoscha opened a pull request:

https://github.com/apache/flink/pull/2345

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.

R: @StephanEwen for review, should be fairly easy, though

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2345


commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd
Author: Aljoscha Krettek 
Date:   2016-08-09T13:16:59Z

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.




> Remove RocksDB Semi-Async Checkpoint Mode
> -
>
> Key: FLINK-4340
> URL: https://issues.apache.org/jira/browse/FLINK-4340
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.1.0
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> This seems to be causing too many problems and is also incompatible with the 
> upcoming key-group/sharding changes that will allow rescaling of keyed state.
> Once this is done we can also close FLINK-4228.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint ...

2016-08-09 Thread aljoscha
GitHub user aljoscha opened a pull request:

https://github.com/apache/flink/pull/2345

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.

R: @StephanEwen for review, should be fairly easy, though

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2345


commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd
Author: Aljoscha Krettek 
Date:   2016-08-09T13:16:59Z

[FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode

This was causing too many problems and is also incompatible with the
upcoming key-group/sharding changes.






[GitHub] flink issue #2002: Support for bz2 compression in flink-core

2016-08-09 Thread mtanski
Github user mtanski commented on the issue:

https://github.com/apache/flink/pull/2002
  
Now that 1.1 one is out, is it possible to get this in?




[jira] [Commented] (FLINK-4212) Lock PID file when starting daemons

2016-08-09 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413473#comment-15413473 ]

ASF GitHub Bot commented on FLINK-4212:
---

Github user greghogan commented on the issue:

https://github.com/apache/flink/pull/2251
  
Merging this ...


> Lock PID file when starting daemons
> ---
>
> Key: FLINK-4212
> URL: https://issues.apache.org/jira/browse/FLINK-4212
> Project: Flink
>  Issue Type: Improvement
>  Components: Startup Shell Scripts
>Affects Versions: 1.1.0
>Reporter: Greg Hogan
>Assignee: Greg Hogan
>
> As noted on the mailing list (0), when multiple TaskManagers are started in 
> parallel (using pdsh) there is a race condition on updating the pid: 1) the 
> pid file is first read to parse the process' index, 2) the process is 
> started, and 3) on success the daemon pid is appended to the pid file.
> We could use a tool such as {{flock}} to lock on the pid file while starting 
> the Flink daemon.
> 0: 
> http://mail-archives.apache.org/mod_mbox/flink-user/201607.mbox/%3CCA%2BssbKXw954Bz_sBRwP6db0FntWyGWzTyP7wJZ5nhOeQnof3kg%40mail.gmail.com%3E
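The locking approach proposed above can be sketched as follows. This is a hypothetical minimal example, not Flink's actual startup script: it assumes util-linux `flock(1)` is available (Linux), and the function name and pid-file layout are made up for illustration.

```shell
#!/bin/sh
# Serialize the read-then-append sequence on the pid file with flock(1),
# so daemons started in parallel (e.g. via pdsh) cannot race on the index.
PID_FILE=$(mktemp)

record_daemon_pid() {  # $1 = pid of the daemon just started (hypothetical helper)
  (
    flock 200                          # take an exclusive lock on fd 200
    index=$(($(wc -l < "$PID_FILE")))  # next process index = current line count
    echo "$1 $index" >> "$PID_FILE"    # append "pid index" while holding the lock
  ) 200>>"$PID_FILE"                   # fd 200 is backed by the pid file itself
}

# Two concurrent starters: without the lock, both could read index 0.
record_daemon_pid 1111 &
record_daemon_pid 2222 &
wait
```

Using the pid file itself as the lock file (via the `200>>` redirection) avoids a separate lock-file cleanup step; the lock is released automatically when the subshell's fd closes.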



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

