[jira] [Created] (FLINK-5873) KeyedStream: expose an user defined function for key group assignment
Ovidiu Marcu created FLINK-5873:
-----------------------------------

             Summary: KeyedStream: expose an user defined function for key group assignment
                 Key: FLINK-5873
                 URL: https://issues.apache.org/jira/browse/FLINK-5873
             Project: Flink
          Issue Type: New Feature
          Components: DataStream API
            Reporter: Ovidiu Marcu
            Priority: Minor

http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/KeyGroupRangeAssignment-td16041.html

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
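To make the request in FLINK-5873 concrete, here is a minimal sketch of what a pluggable key group assignment could look like. It is an illustration only, not Flink's API: the names `KeyGroupAssigner`, `default_assigner`, `operator_index`, and `route` are hypothetical, and `zlib.crc32` is merely a stand-in hash chosen so the example is deterministic. The sketch assumes that key groups are mapped to parallel operator instances in contiguous ranges.

```python
import zlib
from typing import Callable, Hashable

# Hypothetical type: maps (key, max_parallelism) to a key group id
# in the range [0, max_parallelism).
KeyGroupAssigner = Callable[[Hashable, int], int]


def default_assigner(key: Hashable, max_parallelism: int) -> int:
    """Hash-based assignment; crc32 is a stand-in hash for illustration."""
    digest = zlib.crc32(repr(key).encode("utf-8"))
    return digest % max_parallelism


def operator_index(key_group: int, max_parallelism: int, parallelism: int) -> int:
    """Map a key group to an operator instance using contiguous ranges."""
    return key_group * parallelism // max_parallelism


def route(
    key: Hashable,
    max_parallelism: int,
    parallelism: int,
    assigner: KeyGroupAssigner = default_assigner,
) -> int:
    """Return the operator instance for a key, using a pluggable assigner."""
    key_group = assigner(key, max_parallelism)
    if not 0 <= key_group < max_parallelism:
        raise ValueError(f"key group {key_group} is out of range")
    return operator_index(key_group, max_parallelism, parallelism)
```

A user-defined assigner would then be passed in place of the default, for example `route(5, 128, 4, assigner=lambda key, mp: key % mp)` for integer keys. Validating the returned key group matters because a custom function can easily produce an id outside the allowed range.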
[jira] [Commented] (FLINK-4815) Automatic fallback to earlier checkpoints when checkpoint restore fails
[ https://issues.apache.org/jira/browse/FLINK-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15571398#comment-15571398 ]

Ovidiu Marcu commented on FLINK-4815:
-------------------------------------

Hi,

I am currently working on a very similar problem. I am interested in any documentation that describes in detail what the framework currently does. Will further discussion of these issues take place on the dev/user lists? I would like to join.

Thanks,
Ovidiu

> Automatic fallback to earlier checkpoints when checkpoint restore fails
> -----------------------------------------------------------------------
>
>                 Key: FLINK-4815
>                 URL: https://issues.apache.org/jira/browse/FLINK-4815
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>            Reporter: Stephan Ewen
>
> Flink should keep multiple completed checkpoints.
> When the restore of one completed checkpoint fails for a certain number of
> times, the CheckpointCoordinator should fall back to an earlier checkpoint to
> restore.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
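The fallback behaviour described in FLINK-4815 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CheckpointCoordinator's actual logic: the function name `restore_with_fallback`, the `restore` callback, and the `max_attempts` parameter are all hypothetical, and checkpoints are assumed to be supplied newest first.

```python
from typing import Callable, Sequence, Tuple, TypeVar

S = TypeVar("S")


def restore_with_fallback(
    checkpoint_ids: Sequence[int],
    restore: Callable[[int], S],
    max_attempts: int = 3,
) -> Tuple[int, S]:
    """Try checkpoints newest first, falling back after repeated failures.

    `checkpoint_ids` is assumed to be ordered from newest to oldest.
    `restore` is a hypothetical callback that raises on failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for checkpoint_id in checkpoint_ids:
        for _ in range(max_attempts):
            try:
                return checkpoint_id, restore(checkpoint_id)
            except Exception as error:  # any restore failure counts as an attempt
                last_error = error
        # All attempts for this checkpoint failed; fall back to the next older one.
    raise RuntimeError("no completed checkpoint could be restored") from last_error
```

The function returns both the checkpoint id and the restored state so that the caller knows which checkpoint was actually used. Keeping several completed checkpoints, as the issue proposes, is what gives the outer loop something to fall back to.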
[jira] [Commented] (FLINK-1404) Add support to cache intermediate results
[ https://issues.apache.org/jira/browse/FLINK-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233507#comment-15233507 ]

Ovidiu Marcu commented on FLINK-1404:
-------------------------------------

Thanks. I am trying to understand the engine, including its runtime, but I don't 'see' what kind of abstractions you have arrived at. Runtime intermediate results are described in many different ways in Flink, but they also rely on network buffers.

> Add support to cache intermediate results
> -----------------------------------------
>
>                 Key: FLINK-1404
>                 URL: https://issues.apache.org/jira/browse/FLINK-1404
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Distributed Runtime
>            Reporter: Ufuk Celebi
>            Assignee: Maximilian Michels
>
> With blocking intermediate results (FLINK-1350) and proper partition state
> management (FLINK-1359) it is necessary to allow the network buffer pool to
> request eviction of historic intermediate results when not enough buffers are
> available. With the currently available pipelined intermediate partitions
> this is not an issue, because buffer pools can be released as soon as a
> partition is consumed.
> We need to be able to trigger the recycling of buffers held by historic
> intermediate results when not enough buffers are available for new local
> pools.
[jira] [Commented] (FLINK-1404) Add support to cache intermediate results
[ https://issues.apache.org/jira/browse/FLINK-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232639#comment-15232639 ]

Ovidiu Marcu commented on FLINK-1404:
-------------------------------------

Hi, is this related to the 'persistent-blocking-no back pressure intermediate result partition variant' specified in https://github.com/apache/flink/pull/254?
[jira] [Commented] (FLINK-2099) Add a SQL API
[ https://issues.apache.org/jira/browse/FLINK-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15048601#comment-15048601 ]

Ovidiu Marcu commented on FLINK-2099:
-------------------------------------

Thanks, I will look into FLINK-3152.
Could you please create an umbrella task that links all currently open SQL JIRA issues?
I am also investigating ways to use the TPC-DS benchmark. This will probably surface other missing features.
Regarding the Table API, this is the only documentation I can find: https://ci.apache.org/projects/flink/flink-docs-release-0.10/libs/table.html
If there is something else I am missing, could you point it out?

Thanks.

> Add a SQL API
> -------------
>
>                 Key: FLINK-2099
>                 URL: https://issues.apache.org/jira/browse/FLINK-2099
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>
> From the mailing list:
> Fabian: Flink's Table API is pretty close to what SQL provides. IMO, the best
> approach would be to leverage that and build a SQL parser (maybe together
> with a logical optimizer) on top of the Table API. Parser (and optimizer)
> could be built using Apache Calcite which is providing exactly this.
> Since the Table API is still a fairly new component and not very feature
> rich, it might make sense to extend and strengthen it before putting
> something major on top.
> Ted: It would also be relatively simple (I think) to retarget Drill to Flink
> if Flink doesn't provide enough typing meta-data to do traditional SQL.
[jira] [Commented] (FLINK-2099) Add a SQL API
[ https://issues.apache.org/jira/browse/FLINK-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15046707#comment-15046707 ]

Ovidiu Marcu commented on FLINK-2099:
-------------------------------------

Hi Timo,

Thank you for your detailed response.
Regarding the benchmark you want to use, I am not sure TPC-H is the best one to follow: the newer TPC-DS benchmark (next-generation decision support) was developed to address the deficiencies of TPC-H (reference: http://www.tpc.org/tpcds/presentations/The_Making_of_TPCDS.pdf).
I would like to help and contribute. I will look into your branch and come back to you.
I think this is one of the most important features of Flink, and I don't know why the Flink developers don't want to raise its priority and give it more attention.

Best regards,
Ovidiu
[jira] [Commented] (FLINK-2099) Add a SQL API
[ https://issues.apache.org/jira/browse/FLINK-2099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15045170#comment-15045170 ]

Ovidiu Marcu commented on FLINK-2099:
-------------------------------------

Hi Timo,

I would be interested in the SQL prototype. What are your objectives for this presentation? Will it be available soon? This is a very important feature; can you give more details about it?

Thank you!
[jira] [Commented] (FLINK-1731) Add kMeans clustering algorithm to machine learning library
[ https://issues.apache.org/jira/browse/FLINK-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15044136#comment-15044136 ]

Ovidiu Marcu commented on FLINK-1731:
-------------------------------------

Will you consider this issue for the next release?

> Add kMeans clustering algorithm to machine learning library
> -----------------------------------------------------------
>
>                 Key: FLINK-1731
>                 URL: https://issues.apache.org/jira/browse/FLINK-1731
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Peter Schrott
>              Labels: ML
>
> The Flink repository already contains a kMeans implementation but it is not
> yet ported to the machine learning library. I assume that only the used data
> types have to be adapted and then it can be more or less directly moved to
> flink-ml.
> The kMeans++ [1] and the kMeans|| [2] algorithms constitute a better
> implementation because they improve the initial seeding phase to achieve near
> optimal clustering. It might be worthwhile to implement kMeans||.
> Resources:
> [1] http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf
> [2] http://theory.stanford.edu/~sergei/papers/vldb12-kmpar.pdf
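As an illustration of the improved seeding phase mentioned in FLINK-1731, the following is a minimal single-machine sketch of kMeans++ seeding: the first center is picked uniformly at random, and each later center is picked with probability proportional to its squared distance from the nearest center chosen so far. It is not the flink-ml implementation and does not cover the parallel kMeans|| variant; the function names and the use of plain Python sequences are assumptions made for the example.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points: Sequence[Point], k: int, rng: random.Random) -> List[List[float]]:
    """Choose k initial centers using kMeans++ seeding."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # First center: uniform random choice.
    centers = [list(rng.choice(points))]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with an existing center; pick uniformly.
            chosen = rng.choice(points)
        else:
            # Sample proportionally to squared distance from the nearest center.
            index = rng.choices(range(len(points)), weights=d2, k=1)[0]
            chosen = points[index]
        centers.append(list(chosen))
        d2 = [min(d, squared_distance(p, chosen)) for d, p in zip(d2, points)]
    return centers
```

The returned centers would then be handed to the usual kMeans iteration as its starting point. Passing in a `random.Random` instance keeps the seeding reproducible in tests.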