wondering whether the state inside iterations still makes
sense without these in-flight elements. But I also don't know the King
use-case, that's why I thought an example could be helpful.
On Wed, Jun 10, 2015 at 12:37 PM, Gyula Fóra gyula.f...@gmail.com wrote:
I don't understand the question, I vote
-clustering/blob/master/src/main/scala/stream/clustering/StreamClustering.scala
(checkpointing is not turned on but you will get the point)
Gyula Fóra gyula.f...@gmail.com wrote (on Wed, 10 Jun 2015, 12:47):
You are right: to have consistent results, persisting the records that correspond to this state, or that created this state, is desirable. Maybe an example could help understand this better?
On Wed, Jun 10, 2015 at 11:27 AM, Gyula Fóra gyula.f...@gmail.com wrote:
The other tests verify that the checkpointing algorithm runs properly.
That
also ensures that it runs
within the loop and also slightly change the current
protocol since it works only for DAGs. Before going into that direction
though I would propose we first see whether there is a nice way to make
iterations more structured.
Paris
From: Gyula Fóra
of feedback data should not
matter.
Let us not block prominent users on this until the next release.
On Wed, Jun 10, 2015 at 8:09 AM, Gyula Fóra gyula.f...@gmail.com
wrote:
As for people currently suffering from it:
An application King is developing requires iterations, and they need
...@apache.org
wrote:
We could add a method on the ExecutionConfig but mark it as deprecated
and explain that it will go away once the interplay of iterations,
state and so on is properly figured out.
On Wed, Jun 10, 2015 at 2:36 PM, Ufuk Celebi u...@apache.org wrote:
On 10 Jun 2015, at 14:29, Gyula
Done, I will merge it after travis passes.
Maximilian Michels m...@apache.org wrote (on Wed, 10 Jun 2015, 15:25):
Let's mark the method of the environment as deprecated like Aljoscha
suggested. Then I think we could merge it.
On Wed, Jun 10, 2015 at 2:50 PM, Gyula Fóra gyula.f
vote to postpone this.
– Ufuk
On 10 Jun 2015, at 00:19, Gyula Fóra gyf...@apache.org wrote:
Hey all,
It is currently impossible to enable state checkpointing for iterative
jobs, because an exception is thrown when creating the JobGraph. This
behaviour is motivated by the lack of precise
not yet a clear understanding about how we actually want it to
behave, exactly.
What do you think?
On Wed, Jun 3, 2015 at 2:14 PM, Gyula Fóra gyula.f...@gmail.com wrote:
I am talking of course about global delta windows. On the full stream not
on a partition. Delta windows per partition
as a
coordinator
between vertices on the same level.
On Thu, Jun 4, 2015 at 2:55 PM, Gyula Fóra gyula.f...@gmail.com wrote:
Thank you!
I was aware of the iterations as a possibility, but I was wondering if we
might have lateral communications.
Ufuk Celebi u...@apache.org wrote (on Thu, 4 Jun 2015, 13:29):
On 04 Jun 2015, at 12:46, Stephan Ewen se...@apache.org wrote:
There is no lateral communication
and then it doesn't make sense to explore
complicated workarounds for the current system.
On Wed, Jun 3, 2015 at 11:07 AM, Gyula Fóra gyula.f...@gmail.com wrote:
There are simple ways of implementing it in a non-distributed or
inconsistent fashion.
On Wed, Jun 3, 2015 at 8:55 AM Aljoscha
that starts the
window can be processed on a different machine than the element that
finishes the window?
On Wed, Jun 3, 2015 at 12:11 PM, Gyula Fóra gyula.f...@gmail.com wrote:
This is not connected to the current implementation, so let's not talk about that.
This is about theoretical limits
, Gyula Fóra gyula.f...@gmail.com wrote:
Hi Ufuk,
In the concrete use case I have in mind I only want to send events to
another subtask of the same task vertex.
Specifically: if we want to do distributed delta based windows we need to
send after every trigger the element that has triggered
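The delta-window mechanics described here can be sketched in plain Java. This is a hedged single-JVM toy model, not Flink's API; the class and method names (DeltaWindow, add) are invented for illustration. The key point is that the element that fires the trigger also starts the next window, which is exactly why, in a distributed variant, it would have to be sent to the other subtasks:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleBiFunction;

/** Toy single-JVM sketch of a global delta window (illustrative names, not Flink's API). */
public class DeltaWindow<T> {
    private final double threshold;
    private final ToDoubleBiFunction<T, T> delta;
    private final List<T> buffer = new ArrayList<>();
    private T start;

    public DeltaWindow(double threshold, ToDoubleBiFunction<T, T> delta) {
        this.threshold = threshold;
        this.delta = delta;
    }

    /** Returns the completed window when the element triggers, otherwise null. */
    public List<T> add(T element) {
        if (start == null) {
            start = element;
        }
        if (delta.applyAsDouble(start, element) > threshold) {
            List<T> fired = new ArrayList<>(buffer);
            buffer.clear();
            // The triggering element starts the next window; in a distributed setting
            // it would also have to be sent to the other subtasks of the same task.
            start = element;
            buffer.add(element);
            return fired;
        }
        buffer.add(element);
        return null;
    }
}
```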
Hi,
I am wondering, what is the suggested way to send some events directly to
another parallel instance in a flink job? For example from one mapper to
another mapper (of the same operator).
Do we have any internal support for this? The first thing that we thought
of is iterations but that is
at 10:56 PM, Gyula Fóra gyula.f...@gmail.com
wrote:
Hey,
It seems like both interfaces are pretty much capable of doing the same
thing but work on slightly different assumptions.
Isn't there a way that the kafka source can work with the interruptions? I
think the reachedEnd/next interface is slightly easier to grasp than the
run() with the locks.
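The contrast between the two source styles can be sketched in plain Java. This is an assumption-laden toy model, not the actual Flink SourceFunction or Kafka consumer API; all names here (PollingSource, drain, fromList) are invented. With reachedEnd/next the framework owns the loop, so the user code needs no locking:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Sketch of a pull-style reachedEnd/next source contract (illustrative, not Flink's API). */
public class PollingSourceDemo {

    /** Pull-style contract: the caller drives the loop, so no user-managed locks. */
    public interface PollingSource<T> {
        boolean reachedEnd();
        T next();
    }

    /** A bounded source over a fixed list, standing in for e.g. one Kafka partition. */
    public static <T> PollingSource<T> fromList(List<T> data) {
        final Iterator<T> it = data.iterator();
        return new PollingSource<T>() {
            public boolean reachedEnd() { return !it.hasNext(); }
            public T next() { return it.next(); }
        };
    }

    /** The framework-side driver loop that replaces a user-written run() with locks. */
    public static <T> List<T> drain(PollingSource<T> source) {
        List<T> out = new ArrayList<>();
        while (!source.reachedEnd()) {
            out.add(source.next());
        }
        return out;
    }
}
```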
.
Best regards,
Gabor
2015-05-28 19:23 GMT+02:00 Gyula Fóra gyula.f...@gmail.com:
Hi,
Indeed a good catch, and a valid issue exactly because of the stateful
nature of the trigger and eviction policies.
I agree with the suggested approach that this should be configurable
Hey,
I would also strongly prefer the second option; users need to have the option to force-cancel a program in case of unwanted behaviour.
Cheers,
Gyula
Matthias J. Sax mj...@informatik.hu-berlin.de wrote (on Wed, 27 May 2015, 1:20):
Hi,
currently, the only way to
and not break the correct/expected
behaviour.
Paris?
On 20 May 2015, at 11:02, Márton Balassi mbala...@apache.org wrote:
+1 for copying.
On May 20, 2015 10:50 AM, Gyula Fóra gyf...@apache.org wrote:
Hey,
The latest streaming operator rework removed the copying
: Gyula Fóra gyf...@apache.org
Sent: Wednesday, May 20, 2015 2:06 PM
To: dev@flink.apache.org
Subject: Re: [DISCUSS] Re-add record copy to chained operator calls
Copy before putting it into a window buffer and any other group buffer.
Exactly my point. Any stateful operator should be able
operator in a chain is repeatedly
emitting the same object, and the succeeding operator is gathering the
objects, then it is a problem
Or are there cases where the system itself repeatedly emits the same
objects?
On Wed, May 20, 2015 at 3:07 PM, Gyula Fóra gyf...@apache.org wrote:
We
emit (or mutate) the same object for example
in
Spark, you get an RDD with completely messed up contents.
On Wed, May 20, 2015 at 3:27 PM, Gyula Fóra gyf...@apache.org wrote:
If the preceding operator is emitting a mutated object, or does
something
with the output object afterwards
a well-informed decision, as this behavior is as much part
of the API as the classname of the DataStream.
On Wed, May 20, 2015 at 3:41 PM, Gyula Fóra gyula.f...@gmail.com wrote:
I would go for the Failsafe option as a default behaviour with a clearly
documented lightweight (no-copy) setting
you are very keen on the failsafe variant. That is fine, I'd say
let's go ahead.
Then let us introduce a switch. The switch needs to work on copies for user
functions only. Until the window buffers are serialized, we need to keep
the copies there.
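The object-reuse hazard this thread is about can be demonstrated in a few lines of plain Java. This is a hedged sketch with invented names (Record, withoutCopy, withCopy), not Flink code; it shows why a buffering operator downstream of a chained operator that reuses its output object sees corrupted contents unless a defensive copy is made:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the chained-operator object-reuse hazard (illustrative, not Flink code). */
public class ObjectReuseDemo {
    static class Record {
        int value;
        Record(int v) { value = v; }
    }

    /** Buffers references: mutating the reused object corrupts earlier buffer entries. */
    public static List<Integer> withoutCopy() {
        List<Record> buffer = new ArrayList<>();
        Record reused = new Record(0);            // upstream operator reuses one object
        for (int v = 1; v <= 3; v++) {
            reused.value = v;                     // mutate in place, as a chained emit would
            buffer.add(reused);                   // downstream gathers the reference
        }
        List<Integer> seen = new ArrayList<>();
        for (Record r : buffer) seen.add(r.value);
        return seen;                              // every entry shows only the last value
    }

    /** Failsafe variant: copy before handing the record to the next operator. */
    public static List<Integer> withCopy() {
        List<Record> buffer = new ArrayList<>();
        Record reused = new Record(0);
        for (int v = 1; v <= 3; v++) {
            reused.value = v;
            buffer.add(new Record(reused.value)); // defensive copy preserves each value
        }
        List<Integer> seen = new ArrayList<>();
        for (Record r : buffer) seen.add(r.value);
        return seen;
    }
}
```

This is the trade-off behind the proposed switch: copying by default is failsafe, while the documented no-copy setting trades that safety for speed.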
On Wed, May 20, 2015 at 3:55 PM, Gyula Fóra
I think this is that thread :)
But as I said it is just a matter of what we want to add, and we can
already do it.
On Wed, May 13, 2015 at 11:37 AM, Stephan Ewen se...@apache.org wrote:
This is a pretty central question, actually (timestamping the results of
windows). Let us kick off a
Hi,
This was the exact need that motivated me to rework the windowing and
introduce the StreamWindow abstraction which can hold any metadata that
represents the current window.
At this moment it only contains a unique id but this could be extended
easily.
When the user created a
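A minimal sketch of what such a window-with-metadata container could look like. This is an illustration only, not the real StreamWindow class; it carries just the unique id mentioned above, with room for further metadata:

```java
import java.util.ArrayList;
import java.util.concurrent.atomic.AtomicLong;

/** Toy sketch of a window abstraction carrying metadata (not the actual StreamWindow). */
public class StreamWindowSketch<T> extends ArrayList<T> {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // Currently only a unique id; timestamps or other metadata could be added here.
    private final long windowId;

    public StreamWindowSketch() {
        this.windowId = NEXT_ID.incrementAndGet();
    }

    public long getWindowId() {
        return windowId;
    }
}
```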
Hi,
Checkpoint barriers are handled directly on top of the network layer and
you are right they work similarly, by blocking input channels until it gets
the barrier from all of them.
A way of implementing this on the operator level would be by adding a way
to ask the inputreader the channel
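The barrier-alignment behaviour described above (blocking each input channel until the barrier has arrived from all of them) can be sketched as a small state machine. This is a hedged toy model with invented names (BarrierAligner, onElement, onBarrier), not Flink's network-layer code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Toy sketch of checkpoint-barrier alignment over multiple input channels. */
public class BarrierAligner {
    private final int numChannels;
    private final Set<Integer> blocked = new HashSet<>();
    private final Queue<int[]> buffered = new ArrayDeque<>(); // {channel, value} held back
    private final List<Integer> output = new ArrayList<>();

    public BarrierAligner(int numChannels) {
        this.numChannels = numChannels;
    }

    /** A data element arrives on a channel; blocked channels get buffered. */
    public void onElement(int channel, int value) {
        if (blocked.contains(channel)) {
            buffered.add(new int[] {channel, value});
        } else {
            output.add(value);
        }
    }

    /** A barrier arrives; the channel blocks until every channel has delivered one. */
    public boolean onBarrier(int channel) {
        blocked.add(channel);
        if (blocked.size() == numChannels) { // barrier seen on every input
            blocked.clear();                 // checkpoint would be taken here; unblock
            while (!buffered.isEmpty()) {
                output.add(buffered.poll()[1]);
            }
            return true;                     // alignment complete
        }
        return false;
    }

    public List<Integer> getOutput() {
        return output;
    }
}
```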
Generally I am in favor of making these name changes. My only concern is
regarding to the one-input and multiple inputs operators.
There is a general problem with the n-ary operators regarding type safety; that's why we now have SingleInput and Co (two-input) operators. I think we
should keep
there would need to be some
functionality there at some point.
Or maybe it is not required there and can be handled in the
StreamTask. Others might know this better than I do right now.
On Tue, May 5, 2015 at 3:24 PM, Gyula Fóra gyula.f...@gmail.com
javascript:; wrote:
What would
I think this a good idea in general. I would try to minimize the methods we
include and make the ones that we keep very concrete. For instance i would
not have the receive barrier method as that is handled on a totally
different level already. And instead of punctuation I would directly add a
to the TaskContext and ExecutionConfig
and Serializers. The operator would emit values using an emit() method
or the Collector interface, not sure about that yet.
On Tue, May 5, 2015 at 3:12 PM, Gyula Fóra gyf...@apache.org
javascript:; wrote:
I think this a good idea in general. I would try
:15 PM, Gyula Fóra gyf...@apache.org wrote:
Hey,
They actually work :P Although I have to admit I need to do some
refactoring of the method names and parameters.
I made some quick refactoring and added some comments for the key
methods:
https://github.com/mbalassi/flink/blob
On 20.04.2015 22:32, Gyula Fóra wrote:
Hey all,
I think we are missing a quite useful feature that could be
implemented (with some slight modifications) on top of the current
windowing api.
We currently provide 2 ways of aggregating (or reducing) over
streams: doing a continuous
, and max where you need to pass only a value
to the next aggregation or also more complex data structures, e.g., a
synopsis of the full stream, to compute an aggregation such as an
approximate count distinct (item count)?
Cheers,
Bruno
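The continuous-aggregation mode discussed in this thread can be sketched in plain Java. This is an assumption-laden toy (RollingAggregator and its methods are invented names, not the streaming API): a rolling reduce that emits the aggregate-so-far after every element, as with a running max:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

/** Toy sketch of continuous (rolling) aggregation over a stream (not the Flink API). */
public class RollingAggregator<T> {
    private final BinaryOperator<T> reducer;
    private T current;

    public RollingAggregator(BinaryOperator<T> reducer) {
        this.reducer = reducer;
    }

    /** Feed one element, get the aggregate-so-far, emitted per element. */
    public T apply(T element) {
        current = (current == null) ? element : reducer.apply(current, element);
        return current;
    }

    /** Runs the rolling aggregation over a finite stand-in for a stream. */
    public static <T> List<T> aggregate(List<T> stream, BinaryOperator<T> reducer) {
        RollingAggregator<T> agg = new RollingAggregator<>(reducer);
        List<T> out = new ArrayList<>();
        for (T e : stream) {
            out.add(agg.apply(e));
        }
        return out;
    }
}
```

For a simple reducer like max only the running value needs to be kept, while something like an approximate count distinct would keep a synopsis of the full stream as its accumulator instead.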
On 21.04.2015 15:18, Gyula Fóra wrote:
You are right
))
I think that would be more consistent with the structure of the remaining
API.
Cheers, Fabian
2015-04-21 10:57 GMT+02:00 Gyula Fóra gyf...@apache.org:
Hi Bruno,
Of course you can do that as well. (That's the good part :p )
I will open a PR soon with the proposed changes (first
+1
On Mon, Apr 20, 2015 at 2:41 PM, Fabian Hueske fhue...@gmail.com wrote:
+1
2015-04-20 14:39 GMT+02:00 Maximilian Michels m...@apache.org:
+1 Let's merge it to flink-staging and get some people to use it.
On Mon, Apr 20, 2015 at 2:21 PM, Kostas Tzoumas ktzou...@apache.org
wrote:
). The timestamping for (intermediate) result tuples is also an important question to be answered.
-Matthias
On 04/07/2015 11:37 AM, Gyula Fóra wrote:
Hey,
I agree with Kostas, if we define the exact semantics how this works,
this
is not more ad-hoc than any other stateful
Dear All,
Today I did a major refactoring of some streaming components, with a lot of
class renamings and some package restructuring.
https://github.com/apache/flink/pull/594
1. I refactored the internal representation of the Streaming topologies
(StreamGraph) to a more straightforward and less
, Márton Balassi balassi.mar...@gmail.com
wrote:
Big +1 for the proposal for Peter and Gyula. I'm really for bringing
the
windowing and window join API in sync.
On Thu, Apr 2, 2015 at 6:32 PM, Gyula Fóra gyf...@apache.org wrote:
Hey guys,
As Aljoscha has highlighted earlier
This sounds amazing :) thanks Matthias!
Tomorrow I will spend some time to look through your work and give some
comments.
Also I would love to help with this effort so once we merge an initial
prototype let's open some Jiras and I will pick some up :)
Gyula
On Thursday, April 2, 2015, Márton
Hey guys,
As Aljoscha has highlighted earlier the current window join semantics in
the streaming api doesn't follow the changes in the windowing api. More
precisely, we currently only support joins over time windows of equal size
on both streams. The reason for this is that we now take a window
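The equal-size time-window join described above can be sketched in a self-contained way. This is a hedged toy model, not Flink's join implementation; the names (WindowJoinSketch, Event, join) are invented. Elements join when their keys match and their timestamps fall into the same tumbling window, which is what makes a non-blocking implementation possible:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy sketch of a tumbling time-window equi-join with equal window size on both streams. */
public class WindowJoinSketch {
    public static class Event {
        final String key;
        final int value;
        final long ts;
        public Event(String key, int value, long ts) {
            this.key = key;
            this.value = value;
            this.ts = ts;
        }
    }

    /** Joins events whose timestamps land in the same tumbling window of size windowSize. */
    public static List<int[]> join(List<Event> left, List<Event> right, long windowSize) {
        // window index -> key -> values, built from the left stream
        Map<Long, Map<String, List<Integer>>> leftIndex = new HashMap<>();
        for (Event e : left) {
            leftIndex.computeIfAbsent(e.ts / windowSize, w -> new HashMap<>())
                     .computeIfAbsent(e.key, k -> new ArrayList<>())
                     .add(e.value);
        }
        List<int[]> out = new ArrayList<>();
        for (Event e : right) {
            Map<String, List<Integer>> win = leftIndex.get(e.ts / windowSize);
            if (win == null) continue;
            for (int lv : win.getOrDefault(e.key, new ArrayList<>())) {
                out.add(new int[] {lv, e.value}); // matched pair within one window
            }
        }
        return out;
    }
}
```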
Amazing! :)
On Wed, Apr 1, 2015 at 9:41 AM, fhue...@gmail.com wrote:
My ruby skills are a bit rust(y) but I’d love to contribute.
Can you point me to a repository that I can fork?
From: Till Rohrmann
Sent: Wednesday, 1. April, 2015 09:29
To: dev@flink.apache.org
Hey Gabor,
Thank you for the proposal. It has many interesting ideas and a good
potential.
My comments:
We already have a large amount of ongoing work on the windowing
optimizations, covering your suggestions in section 1. It would be better
to drop that part from the project because that's very
Hey Aljoscha,
This is not an accident :D
The reason behind only supporting time-window joins at the moment, is
because time-window-joins can be implemented in a completely non-blocking
fashion arising from the global nature of time.
All other windows would need to be matched properly which
semantics
you would like to provide can hugely affect (probably limit) the
possibilities that you have afterwards in terms of optimizations.
Let me know if you have further questions regarding this :)
Gyula
On Tue, Mar 24, 2015 at 12:01 PM, Gyula Fóra gyf...@apache.org wrote:
Hey,
Give me an hour
Hey,
Give me an hour or so as I am in a meeting currently, but I will get back
to you afterwards.
Regards,
Gyula
On Tue, Mar 24, 2015 at 11:03 AM, Akshay Dixit akshayd...@gmail.com wrote:
Hi,
It'd really help if I got a reply soon. It'll be helpful in writing the
proposal since the deadline
Hey Matthias,
Let's see if I get these things for you :)
1) The difference between setup and open is that setup is there to set things like collectors, the runtime context and everything that will be used by the implemented invokable, and also by the rich functions. Open is called after setup, to actually
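The setup/open split described above can be sketched in plain Java. This is an illustration under assumptions, not the actual invokable API; the class and field names are made up. Setup wires in the runtime plumbing, and open then initializes the user code before any records flow:

```java
/** Toy sketch of the setup/open lifecycle split (illustrative, not the real invokable API). */
public class LifecycleSketch {
    protected Object collector;      // wiring injected in setup()
    protected Object runtimeContext;
    private boolean open = false;

    /** setup(): inject the collector, the runtime context, and everything the invokable uses. */
    public void setup(Object collector, Object runtimeContext) {
        this.collector = collector;
        this.runtimeContext = runtimeContext;
    }

    /** open(): called after setup, to actually initialize user code (rich functions). */
    public void open() {
        if (collector == null) {
            throw new IllegalStateException("setup() must run before open()");
        }
        open = true;
    }

    public boolean isOpen() {
        return open;
    }
}
```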
is fine and should not be changed. Before opening a JIRA for
it, you two should get in sync and decide what to do.
-Matthias
On 03/24/2015 05:38 PM, Gyula Fóra wrote:
+1 for lots of streaming tests
On Mon, Mar 23, 2015 at 11:23 AM, Márton Balassi balassi.mar...@gmail.com
wrote:
Thanks for looking into this, Stephan. +1 for the JIRAs.
On Mon, Mar 23, 2015 at 10:55 AM, Ufuk Celebi u...@apache.org wrote:
On 23 Mar 2015, at 10:44, Stephan Ewen
. :( I don't think that
anything is broken. Let's wait a little longer.
– Ufuk
On Wednesday, March 11, 2015, Gyula Fóra gyf...@apache.org wrote:
Hey,
I pushed some commits yesterday evening and it seems like the git repos
somehow became inconsistent, the https://github.com/apache
Hey All,
I am getting a weird error in Eclipse when trying to run local flink
programs, for example any batch or streaming example program. I tried cleaning, rebuilding, reimporting etc. but that didn't help.
Please see the log and stack trace:
16:55:04,965 INFO
Setting the JVM heap larger solves this issue, but I still think that this
is a misleading error.
With -Xmx256m it works.
On Sun, Mar 1, 2015 at 5:03 PM, Gyula Fóra gyf...@apache.org wrote:
They should actually return different values in many cases.
Datastream.env.getDegreeOfParallelism returns the environment parallelism
(default)
Datastream.getparallelism() returns the parallelism of the operator. There
is a reason why one or the other is used.
Please watch out when you try to
. The other ITCases ran
properly, so I figured, the problem is with the windowing.
Can you check it out for me? (WindowedDataStream, line 348)
Peter
2015-02-27 10:06 GMT+01:00 Gyula Fóra gyf...@apache.org:
They should actually return different values in many cases
We have a join operator that is defined over windows in the data stream. So
the problem is not with the join itself, but it seems that trying to apply
this window-join with itself throws an error.
On Fri, Feb 20, 2015 at 4:50 PM, Alexander Alexandrov
alexander.s.alexand...@gmail.com wrote:
I
So the current setup is to share results between the two APIs by files, and I don't see any reason why this couldn't work with the 2-cluster setup. It makes deployment a little trickier but still feasible.
On Tue, Feb 17, 2015 at 11:55 AM, Ufuk Celebi u...@apache.org wrote:
I think this separation
Great, thanks! It was fast :)
I will try to check it out in the afternoon!
Gyula
On Fri, Jan 23, 2015 at 11:37 AM, Ufuk Celebi u...@apache.org wrote:
On 22 Jan 2015, at 18:10, Stephan Ewen se...@apache.org wrote:
So the crux is that the JobManager has a location for the sender task
(and
Thank you! I will play around with it.
On Wed, Jan 21, 2015 at 3:50 PM, Ufuk Celebi u...@apache.org wrote:
Hey Gyula,
On 21 Jan 2015, at 15:41, Gyula Fóra gyf...@apache.org wrote:
Hey Guys,
I think it would make sense to turn lazy operator execution off for
streaming programs because it would make life simpler for windowing. I
also created a JIRA issue here
https://issues.apache.org/jira/browse/FLINK-1425.
Can anyone give me some quick pointers on how to do this? It's