I think it would make sense to also move "State Backends" out from
"Runtime". This is also quite complex on its own. I would of course
volunteer for this and I think Stephan, who is the current proposal for
"Runtime", would also be good.
On Wed, 8 Jun 2016 at 19:22 Stephan Ewen
> As far as I know, the reason why the broadcast variables are implemented
that way is that the senders would have to know which sub-tasks are
deployed to which TMs.
As the broadcast variables are realized as additionally attached "broadcast
channels", I am assuming that the same behavior will
Vladislav Pernin created FLINK-4034:
---
Summary: Dependency convergence on com.101tec:zkclient and
com.esotericsoftware.kryo:kryo
Key: FLINK-4034
URL: https://issues.apache.org/jira/browse/FLINK-4034
I am adding a dedicated component for "Checkpointing". It would include the
checkpoint coordinator, barriers, threads, state handles and recovery.
I think that part is big and complex enough to warrant its own shepherd. I
would volunteer for that and be happy to also have a second shepherd.
On
Tzu-Li (Gordon) Tai created FLINK-4033:
--
Summary: Missing Scala example snippets for the Kinesis Connector
documentation
Key: FLINK-4033
URL: https://issues.apache.org/jira/browse/FLINK-4033
Hi Till,
thanks for the fast answer.
I'll think about a concrete way of implementing this and open a JIRA.
Best
Andreas
From: Till Rohrmann
Sent: Wednesday, 8 June 2016 15:53
To: dev@flink.apache.org
Subject: Re: Broadcast data
Chesnay Schepler created FLINK-4032:
---
Summary: Replace all usage of Guava Preconditions
Key: FLINK-4032
URL: https://issues.apache.org/jira/browse/FLINK-4032
Project: Flink
Issue Type:
Hello Julius,
I don't think there is any real roadmap for the Python API, regardless
of batch or streaming.
Off the top of my head I can think of the following issue:
The batch Python API makes heavy use of MapPartitions to transfer data
in batches,
I'm not sure how well this could be done
Hi,
I am interested in using Flink as part of a research project. We normally
use Python as a programming language. The Python support for the Batch API
is already quite good. But I couldn't find any information on the future
roadmap regarding Python support in Flink.
Are there plans to add
Hi Andreas,
your observation is correct. The data is sent to each slot and the
receiving TM only materializes one copy of the data. The rest of the data
is discarded.
As far as I know, the reason why the broadcast variables are implemented
that way is that the senders would have to know which
Maximilian Michels created FLINK-4030:
-
Summary: ScalaShellITCase
Key: FLINK-4030
URL: https://issues.apache.org/jira/browse/FLINK-4030
Project: Flink
Issue Type: Bug
Hi,
we are seeing an unexpected increase in the data sent over the network for
broadcasts as the number of slots per TaskManager increases.
We provided a benchmark [1]. It not only increases the size of data sent over
the network but also hurts performance as seen in the preliminary results
Rami created FLINK-4029:
---
Summary: Multi-field "sum" function just like "keyBy"
Key: FLINK-4029
URL: https://issues.apache.org/jira/browse/FLINK-4029
Project: Flink
Issue Type: Improvement
Hi,
the directed output via the split and select methods is indeed only
available in the DataStream API. Thus, in order to achieve the same with
the DataSet API, you would have to apply multiple filters, as you've
already written.
The result of the select call will only be sent to the same task