Re: Incremental checkpoint branch

2017-03-03 Thread SHI Xiaogang
Hi Vishnu, We have obtained an initial design of incremental checkpointing [1] and will start working on it next week. You can watch the issue FLINK-5053 [2] to get timely notifications of updates. All suggestions are welcome. [1]

Re: Machine Learning on Flink - Next steps

2017-03-03 Thread Roberto Bentivoglio
Hi All, I'd like to start working on: - Offline learning with Streaming API - Online learning I also think that using a new organisation on GitHub, as Theodore proposed, to keep an initial independence and speed up the prototyping and development phases is really interesting. I totally agree

Re: Incremental checkpoint branch

2017-03-03 Thread Shaoxuan Wang
Vishnu, You can find the latest design discussion for incremental checkpointing at http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-Incremental-Checkpointing-in-Flink-td15931.html @Stefan Richter

[jira] [Created] (FLINK-5963) Remove preparation mapper of DataSetAggregate

2017-03-03 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5963: Summary: Remove preparation mapper of DataSetAggregate Key: FLINK-5963 URL: https://issues.apache.org/jira/browse/FLINK-5963 Project: Flink Issue Type:

Incremental checkpoint branch

2017-03-03 Thread Vishnu Viswanath
Hi, Can someone point me to the branch where the work on incremental checkpointing is going on? I would like to try it out even if the work is not complete. I have a use case where the state size increases by about 1 GB every 5 minutes. Thanks, Vishnu

Re: Machine Learning on Flink - Next steps

2017-03-03 Thread amir bahmanyari
Great points to start: - Online learning - Offline learning with the streaming API Thanks + have a great weekend. From: Katherin Eri To: dev@flink.apache.org Sent: Friday, March 3, 2017 7:41 AM Subject: Re: Machine Learning on Flink - Next steps Thank

[jira] [Created] (FLINK-5962) Cancel checkpoint canceller tasks in CheckpointCoordinator

2017-03-03 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5962: Summary: Cancel checkpoint canceller tasks in CheckpointCoordinator Key: FLINK-5962 URL: https://issues.apache.org/jira/browse/FLINK-5962 Project: Flink

[jira] [Created] (FLINK-5961) Queryable State is broken for HeapKeyedStateBackend

2017-03-03 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5961: Summary: Queryable State is broken for HeapKeyedStateBackend Key: FLINK-5961 URL: https://issues.apache.org/jira/browse/FLINK-5961 Project: Flink Issue

[jira] [Created] (FLINK-5960) Make CheckpointCoordinator less blocking

2017-03-03 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5960: Summary: Make CheckpointCoordinator less blocking Key: FLINK-5960 URL: https://issues.apache.org/jira/browse/FLINK-5960 Project: Flink Issue Type:

[jira] [Created] (FLINK-5959) Verify that mesos-appmaster.sh respects env.java.opts(.jobmanager)

2017-03-03 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5959: Summary: Verify that mesos-appmaster.sh respects env.java.opts(.jobmanager) Key: FLINK-5959 URL: https://issues.apache.org/jira/browse/FLINK-5959 Project: Flink

Machine Learning on Flink - Next steps

2017-03-03 Thread Theodore Vasiloudis
Hello all, From our previous discussion started by Stavros, we decided to start a planning document [1] to figure out possible next steps for ML on Flink. Our concerns were mainly ensuring active development while satisfying the needs of the community. We have listed a number of proposals for

Re: [DISCUSS] Flink ML roadmap

2017-03-03 Thread Theodore Vasiloudis
It seems like a relatively new project, backed by Intel. My impression from the doc Roberto linked is that they might switch to using Beam instead of Spark (?) I'm cc'ing Soila, who is a developer of TAP and has worked on FlinkML in the past; perhaps she has some input on how they plan to work with

Re: [DISCUSS] Flink ML roadmap

2017-03-03 Thread Stavros Kontopoulos
Interesting thanx @Roberto. I see that only TAP Analytics Toolkit supports streaming. I am not aware of its market share, anyone? Best, Stavros On Fri, Mar 3, 2017 at 11:50 AM, Theodore Vasiloudis < theodoros.vasilou...@gmail.com> wrote: > Thank you for the links Roberto I did not know that

Re: Dataset and select/split functionality

2017-03-03 Thread CPC
Hi Fabian, Thank you for your explanation. Also can you give an example on how the optimizer behaves on the assumption that the outputs of a function are replicated? Thank you... On 3 March 2017 at 13:52, Fabian Hueske wrote: > Hi CPC, > > we had several requests in the

[jira] [Created] (FLINK-5958) Asynchronous snapshots for heap-based keyed state backends

2017-03-03 Thread Stefan Richter (JIRA)
Stefan Richter created FLINK-5958: Summary: Asynchronous snapshots for heap-based keyed state backends Key: FLINK-5958 URL: https://issues.apache.org/jira/browse/FLINK-5958 Project: Flink

Re: Dataset and select/split functionality

2017-03-03 Thread Fabian Hueske
Hi CPC, we had several requests in the past to add this feature. However, adding select/split for DataSet is much more work than you would expect. As you pointed out, we have to go through the optimizer, which assumes that the outputs of a function are replicated. This is pretty much wired in

[jira] [Created] (FLINK-5957) Remove `getAccumulatorType` method from `AggregateFunction`

2017-03-03 Thread sunjincheng (JIRA)
sunjincheng created FLINK-5957: Summary: Remove `getAccumulatorType` method from `AggregateFunction` Key: FLINK-5957 URL: https://issues.apache.org/jira/browse/FLINK-5957 Project: Flink

Re: [DISCUSS] Flink ML roadmap

2017-03-03 Thread Theodore Vasiloudis
Thank you for the links, Roberto. I did not know that Beam was working on an ML abstraction as well. I'm sure we can learn from that. I'll start another thread today where we can discuss next steps and action points now that we have a few different paths to follow listed on the shared doc, since

[jira] [Created] (FLINK-5956) Add retract method into the aggregateFunction

2017-03-03 Thread Shaoxuan Wang (JIRA)
Shaoxuan Wang created FLINK-5956: Summary: Add retract method into the aggregateFunction Key: FLINK-5956 URL: https://issues.apache.org/jira/browse/FLINK-5956 Project: Flink Issue Type: