Stephan and I discussed this for a bit and we came to the conclusion that there
are actually two different (orthogonal) settings at play here: "object reuse"
and "object forwarding behaviour". The former specifies whether we reuse
objects when deserialising records from the network, the latter
Thanks for the feedback. I will leave this open for a few more days and then
adopt it as a FLIP, taking Greg's and Aljoscha's comments into account.
On Sun, Jul 2, 2017 at 10:13 PM, Ufuk Celebi wrote:
> Thanks for the write up and illustrations. :-) +1 to do this.
>
> I'm fine with
Hi Zhenzhong!
The difference is as follows:
DEFAULT means that at the beginning of a chain, one object is created per
record, and that object travels through the chain. The total number of
instantiated objects equals the number of records, but only one of them is
alive at any given time.
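To make the object count concrete, here is a minimal sketch in plain Java (no Flink dependency). The names `ChainCopySketch`, `Event`, and `countInstances` are hypothetical and exist only for this illustration; the copy-per-hop variant is an assumption about what copying between chained operators costs, not the actual runtime code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class ChainCopySketch {

    /** Hypothetical mutable record; the static counter tracks instantiations. */
    static final class Event {
        static int instances = 0;
        long value;

        Event(long value) {
            this.value = value;
            instances++;
        }

        Event copy() {
            return new Event(value);
        }
    }

    /**
     * Pushes `records` records through a chain of `chainLength` operators and
     * returns how many Event objects were instantiated in total.
     */
    static int countInstances(int records, int chainLength, boolean copyBetweenOperators) {
        // Build a chain of simple operators that mutate the record in place.
        List<UnaryOperator<Event>> chain = new ArrayList<>();
        for (int i = 0; i < chainLength; i++) {
            chain.add(e -> { e.value += 1; return e; });
        }

        Event.instances = 0;
        for (int r = 0; r < records; r++) {
            // One object is created per record at the head of the chain.
            Event record = new Event(r);
            for (UnaryOperator<Event> op : chain) {
                if (copyBetweenOperators) {
                    // Assumed copying behaviour: a fresh copy before every operator.
                    record = record.copy();
                }
                record = op.apply(record);
            }
        }
        return Event.instances;
    }

    public static void main(String[] args) {
        System.out.println(countInstances(4, 3, false)); // prints 4
        System.out.println(countInstances(4, 3, true));  // prints 16
    }
}
```

With 4 records and 3 chained operators, passing the same object along creates 4 instances (one per record, as described for DEFAULT), while copying before each operator creates 4 + 4 * 3 = 16.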
FULL_REUSE is only
Thank you for the reply and for the support!
@Greg, controlling object reuse on a per-operator basis is definitely a good
way to follow up. My first thought would be to keep this proposal slim and
deal with the "default" logic, and have a follow-up effort to make this
controllable per operator.
+1 for changing the default if so many people encountered problems with
serialisation costs.
The first two modes don’t require any code changes, correct? Only the last one
would require changes to the stream input processors.
We should also keep this issue in mind:
Stephan,
Fully supporting this FLIP. We originally encountered pretty big surprises
observing the object copy behavior causing significant performance degradation
for our massively parallel use case.
In our case (and I think this SHOULD be the assumption for all
streaming use
Hi Stephan,
Would this be an appropriate time to discuss allowing reuse to be a
per-operator configuration? Object reuse for chained operators has led to
considerable surprise for some users of the DataSet API. This came up during
the rework of the object reuse documentation for the DataSet
Hi all!
I would like to propose the following FLIP:
FLIP-21 - Improve object Copying/Reuse Mode for Streaming Runtime:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=71012982
The FLIP is motivated by the fact that many users run into an unnecessary
kind of performance problem