> method of the buffer pool...
>
> Stephan
>
>
> On Mon, Dec 7, 2015 at 9:45 AM, Gyula Fóra <gyf...@apache.org> wrote:
>
> > Hey guys,
> >
> > Is there any way to monitor the backpressure in the Flink job? I find it
> > hard to debug slow operators because of t
Yes, I am not sure if this is the intentional behaviour. I think you are
supposed to be able to do the things you described.
stream.union(stream.map(..)) and things like this are fair operations. Also
maybe stream.union(stream) should just give stream instead of an error.
Could someone comment on
> > "stream.union(stream.map(..))" should definitely be possible. Not sure why
> > this is not permitted.
> >
> > "stream.union(stream)" would contain each element twice, so should either
> > give an error or actually union (or duplicate) elements.
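To make the point above concrete, here is a minimal sketch (plain Python, not the Flink API) of why `stream.union(stream)` would contain each element twice: union is simply a merge of its inputs.

```python
# Plain-Python sketch (not the Flink API): union merges its inputs, so
# unioning a stream with itself contains every element twice.
def union(*streams):
    merged = []
    for s in streams:
        merged.extend(s)
    return merged

stream = [1, 2, 3]
assert union(stream, [x * 10 for x in stream]) == [1, 2, 3, 10, 20, 30]
assert union(stream, stream) == [1, 2, 3, 1, 2, 3]  # duplicates, as noted above
```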
wrote (on Mon, Nov 23, 2015 at 22:03):
> Either is abstract already ;)
>
> On 23 November 2015 at 21:54, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > I think it is not too bad to only have the Right/Left classes. You can
> > then write it like this:
> >
> > Eith
Hi,
Regarding my previous comment for the Kafka/Zookeeper issue, let's discuss
if this is critical enough so we want to include it in this release or the
next bugfix.
I will try to further investigate the reason the job failed in the first
place (we suspect a broker failure).
Cheers,
Gyula
Hi,
I vote -1 for the RC due to the fact that the zookeeper deadlock issue was
not completely solved.
Robert could find the problem with the dependency management plugin and has
opened a PR:
[FLINK-3067] Enforce zkclient 0.7 for Kafka
https://github.com/apache/flink/pull/1399
Cheers,
Gyula
t and right values, no?
> How about renaming to getLeft() / getRight()?
>
> -V.
>
> On 23 November 2015 at 09:55, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey guys,
> >
> > I know this should have been part of the PR discussion but it kind of
> >
> Cheers,
>
> Konstantin
>
> On 23.11.2015 11:32, Gyula Fóra wrote:
> > Hi,
> >
> > Alright it seems there are multiple ways of doing this.
> >
> > I would do something like:
> >
> > ds.keyBy(key)
> > .timeWindow(w)
> > .reduce(...)
>
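The keyBy/timeWindow/reduce pipeline quoted above can be sketched in plain Python (not Flink code; the event layout `(timestamp, key, value)` and the window size are assumed for illustration): each key's values are reduced per tumbling window.

```python
from collections import defaultdict

# Sketch of ds.keyBy(key).timeWindow(w).reduce(...) semantics (not Flink code):
# events carry (timestamp, key, value); values are reduced per key per
# tumbling window of size w.
def keyed_tumbling_reduce(events, w, reduce_fn):
    windows = defaultdict(list)              # (window_start, key) -> values
    for ts, key, value in events:
        windows[(ts // w * w, key)].append(value)
    out = {}
    for (start, key), values in windows.items():
        acc = values[0]
        for v in values[1:]:
            acc = reduce_fn(acc, v)
        out[(start, key)] = acc
    return out

events = [(0, "a", 1), (3, "a", 2), (4, "b", 5), (7, "a", 4)]
result = keyed_tumbling_reduce(events, 5, lambda x, y: x + y)
assert result == {(0, "a"): 3, (0, "b"): 5, (5, "a"): 4}
```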
Hey guys,
I know this should have been part of the PR discussion but it kind of
slipped through the cracks :)
I think it might be useful to change the method name for Either.left(value)
to Either.Left(value) (or drop the method completely).
The reason is that it is slightly awkward to use it
aid in the PR, Left and Right make no sense on
> their own, they're helper classes for Either. Hence, I believe they should
> be private. Maybe we could rename the methods to createLeft() /
> createRight() ?
>
> On 23 November 2015 at 20:58, Gyula Fóra <gyula.f...@gmail.com> wrote:
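To make the naming debate concrete, here is a hypothetical Python sketch of an Either with capitalized static factories Left/Right, so callers never touch the helper subclasses directly (the real class under discussion is Flink's Java Either; this is only an illustration).

```python
# Hypothetical sketch (not Flink's actual class): capitalized static
# factories Left/Right keep the Left/Right helper types out of user code.
class Either:
    def __init__(self, left, right, is_left):
        self._left, self._right, self._is_left = left, right, is_left

    @staticmethod
    def Left(value):
        return Either(value, None, True)

    @staticmethod
    def Right(value):
        return Either(None, value, False)

    def isLeft(self):
        return self._is_left

    def left(self):
        if not self._is_left:
            raise ValueError("no left value")
        return self._left

    def right(self):
        if self._is_left:
            raise ValueError("no right value")
        return self._right

e = Either.Left(42)
assert e.isLeft() and e.left() == 42
assert Either.Right("x").right() == "x"
```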
Hi all,
Wouldn't you think that it would make sense to wait a week or so to find all
the hot issues with the current release?
To me it feels a little bit like rushing this out and we will have almost
the same situation afterwards.
I might be wrong but I think people should get a chance to try
that we want fixed). We can
> always do a new bug fix release.
>
> – Ufuk
>
> On Fri, Nov 20, 2015 at 11:25 AM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hi all,
> >
> > Wouldn't you think that it would make sense to wait a week or so to find
> > all
>
ue, Nov 17, 2015 at 10:43 AM, Ufuk Celebi <u...@apache.org> wrote:
> > >
> > >> https://issues.apache.org/jira/browse/KAFKA-824
> > >>
> > >> This has been fixed for Kafka’s 0.9.0 version.
> > >>
> > >> We should investiga
Should I open a JIRA for this?
Gyula Fóra <gyula.f...@gmail.com> wrote (on Tue, Nov 17, 2015 at 11:30):
> Thanks for the quick response and thorough explanation :)
>
> Gyula
>
> Robert Metzger <rmetz...@apache.org> wrote (on Tue, Nov 17, 2015,
Hey guys,
I ran into some issue with the kafka consumers.
I am reading from more than 50 topics with parallelism 1, and while running
the job I got the following exception during the checkpoint notification
(offset committing):
java.lang.RuntimeException: Error while confirming checkpoint
at
thrown by the new web interface. Was it
> > running with your job?
> >
> > My second best guess is that it was thrown by another component running
> > Netty (maybe a Hadoop client?).
> >
> > – Ufuk
> >
> > PS Thanks for sharing the logs with me. :)
Hi guys,
I have a Flink Streaming job running for about a day now without any errors
and then I got this in the job manager log:
15:37:49,905 WARN io.netty.channel.DefaultChannelPipeline
- An exceptionCaught() event was fired, and it reached at
the tail of the pipeline. It usually
>
> 2) An option to shut down with external checkpoint would also be important,
> to stop and resume from exactly there.
>
>
> Stephan
>
>
> On Wed, Nov 11, 2015 at 3:19 PM, Gyula Fóra <gyf...@apache.org> wrote:
>
> > Hey guys,
> >
> > With recen
aged memory is lazily allocated and the logged amount is an
> upper bound.
>
> Cheers, Fabian
>
> 2015-11-12 13:37 GMT+01:00 Gyula Fóra <gyf...@apache.org>:
>
> > Hey guys,
> >
> > Is it normal that when I start the cluster with
> start-cluster-streaming.sh
>
Hey,
I get a weird error when I try to execute my job on the cluster. Locally
this works fine but running it from the command line fails during
type extraction:
input1.union(input2, input3).map(Either::Left).returns(eventOrLongType);
This fails when trying to extract the output
Hey guys,
Is it normal that when I start the cluster with start-cluster-streaming.sh
out of the 16gb tm memory 10.6 gb becomes flink managed? (I get pretty much
the same number when I use start-cluster.sh)
I thought that Flink would only use a very small fraction in streaming mode.
Cheers,
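As a rough sanity check of the reported number, one can assume a managed-memory fraction of about 0.7 applied to the heap remaining after startup overhead; both values here are assumptions for illustration, not read from any Flink configuration.

```python
# Rough arithmetic sketch; the startup overhead and the ~0.7 managed-memory
# fraction are assumptions for illustration, not values read from Flink.
def managed_memory_gb(tm_heap_gb, startup_overhead_gb=0.85, fraction=0.7):
    return (tm_heap_gb - startup_overhead_gb) * fraction

estimate = managed_memory_gb(16.0)
assert 10.0 < estimate < 11.0  # in the ballpark of the reported 10.6 GB
```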
This seems to be an issue only occurring when using Java 8 lambdas, which is
still super annoying but may not be a release blocker.
Gyula Fóra <gyula.f...@gmail.com> wrote (on Thu, Nov 12, 2015 at 15:38):
> I am not sure if this issue affects the release or maybe I am j
Hey,
Yes, what you wrote should work. You can alternatively use
TypeExtractor.getForObject(modelMapInit) to extract the type information.
I also like to implement my custom type info for HashMaps and the other
types and use that.
Cheers,
Gyula
Martin Neumann wrote (at:
Hey All,
I am wondering what is the reason why Function input types are validated?
This might become an issue if the user wants to write his own TypeInfo for
a type that flink also handles natively.
Let's say I want to implement my own TupleTypeInfo that handles null
values, and I pass this
Hey guys,
I am trying to look at the throughput of my Flink Streaming job over time.
Is there any way to extract this information from the dashboard, or is it
only possible to view the cumulative statistics at given time points?
Also I am wondering whether there is any info about the latency in
special jobs where we could sample the records
> that after two re-partitionings return to the same JVM, so we would not
> have clock misalignment. Still thinking about good ways to have a general
> purpose latency measurement mechanism.
>
> If you have any ideas there, let me know
he.org> wrote:
> > Hi Gyula,
> >
> > Trying to reproduce this error now. I'm assuming this is 0.10-SNAPSHOT?
> >
> > Cheers,
> > Max
> >
> > On Wed, Nov 4, 2015 at 1:49 PM, Gyula Fóra <gyf...@apache.org> wrote:
> >> Hey,
> >
Hey,
Running the following simple application gives me an error:
//just counting by key, the
streamOfIntegers.keyBy(x -> x).timeWindow(Time.milliseconds(3000)).fold(0, (c, next) -> c + 1).print();
Executing this gives the following error:
"No initial value was serialized for the fold window
Hey,
I found an interesting failure in the KafkaITCase, I am not sure if this
happened before.
It received a duplicate record and failed on that (not the usual zookeeper
timeout thing)
Logs are here:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/89171477/log.txt
Cheers,
Gyula
done
Till Rohrmann <trohrm...@apache.org> wrote (on Wed, Nov 4, 2015 at 11:19):
> Could you please open or update the corresponding JIRA issue if existing.
>
> On Wed, Nov 4, 2015 at 11:14 AM, Gyula Fóra <gyf...@apache.org> wrote:
>
> > Hey,
> >
Hey guys,
Have we disabled the default input copying after all? I don't remember
seeing a Jira or PR for this (maybe I just missed it).
And if not, do we want this in the 0.10 release?
Cheers,
Gyula
On Fri, Oct 2, 2015 at 7:57 PM, Till Rohrmann wrote:
> Do we know what
ate. It might also be stuck on a
> lock, in which case it would be waiting for the lock holder to terminate.
>
> Do you have the traces from other threads as well, so we could look which
> one actually is stuck while holding the lock?
>
> Greetings,
> Stephan
>
>
>
Thanks Max for the effort, this is going to be huge :)
Unfortunately I have to say -1
FLINK-2888 and FLINK-2824 are blockers from my point of view.
Cheers,
Gyula
Vasiliki Kalavri wrote (on Wed, Oct 21, 2015 at 20:07):
> Awesome! Thanks Max :))
>
> I have a
I think the nice thing about a common codestyle is that everyone can set
the template in the IDE and use the formatting commands.
Matthias's suggestion makes this practically impossible so -1 for mixed
tabs/spaces from my side.
Matthias J. Sax wrote (on Oct
Hey All,
I think there is some serious issue with the checkpoints. Running a simple
program like this won't complete any checkpoints:
StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(2);
env.enableCheckpointing(5000);
> Greetings,
> Stephan
>
>
> On Wed, Oct 21, 2015 at 1:47 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey All,
> >
> > I think there is some serious issue with the checkpoints. Running a
> simple
> > program like this won't complete
+1 for both :)
Till Rohrmann wrote (on Tue, Oct 20, 2015 at 14:58):
> I like the idea to have a bit stricter code style which will increase code
> maintainability and makes it easier for people to go through the code.
> Furthermore, it will relieve us from code
Hey guys,
Has anyone ever got something similar working with the kafka sources?
11:52:48,838 WARN org.apache.flink.runtime.taskmanager.Task
- Task 'Source: Kafka[***] (3/4)' did not react to cancelling signal,
but is stuck in method:
hole?
>
> Greetings,
> Stephan
>
>
>
> On Thu, Oct 8, 2015 at 4:42 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > The feedback tuples might get rebalanced but the normal input should not.
> >
> > But still the main problem is the fact that partitio
the iteration head. I think this
> should break ordering as well, in your case.
>
> On Tue, 6 Oct 2015 at 10:39 Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hi,
> >
> > This is just a workaround, which actually breaks input order from my
> > source.
on one machine, the JVM often has not
> enough memory reserved for the stack space to create enough threads (1-2
> threads per task)...
>
> On Wed, Oct 7, 2015 at 2:13 PM, Gyula Fóra <gyf...@apache.org> wrote:
>
> > Hey guys,
> >
> > I am writing a job which inv
Hey guys,
I am writing a job which involves creating many different sources to read
data from (in this case 80 sources with a parallelism of 8 each, running
locally on my Mac). I cannot create fewer, unfortunately.
The problem is that the job fails while deploying the tasks with the
following
in.map(IdentityMap).setParallelism(2).iterate()
> DataStream mapped = it.map(...)
> it.closeWith(mapped.partitionByHash(someField))
>
> The input is rebalanced to the map inside the iteration as in your example
> and the feedback should be partitioned by hash.
>
> Cheers,
> Aljosc
Hey,
This question is mainly targeted towards Aljoscha but maybe someone can
help me out here:
I think the way feedback partitioning is handled does not work, let me
illustrate with a simple example:
IterativeStream it = ... (parallelism 1)
DataStream mapped = it.map(...) (parallelism 2)
//
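The feedback-partitioning question above can be sketched in plain Python (not Flink internals): hash partitioning routes each feedback record to a parallel head instance by the hash of its key, independently of how the original input was distributed.

```python
# Sketch of hash partitioning for a feedback channel (plain Python, not
# Flink internals): records are routed by the hash of their key, so all
# records sharing a key land on the same parallel instance.
def partition_by_hash(records, key_fn, parallelism):
    channels = [[] for _ in range(parallelism)]
    for record in records:
        channels[hash(key_fn(record)) % parallelism].append(record)
    return channels

channels = partition_by_hash([("a", 1), ("b", 2), ("a", 3)], lambda r: r[0], 2)
# records sharing a key always land on the same channel
target = next(i for i, c in enumerate(channels) if ("a", 1) in c)
assert ("a", 3) in channels[target]
```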
wrote (on Wed, Sep 9, 2015 at 20:25):
> Yes, pretty clear. I guess semantically it's still a co-group, but
> implemented slightly differently.
>
> Thanks!
>
> --
> Gianmarco
>
> On 9 September 2015 at 15:37, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Hey Gianmarc
sources that you are looking for, but maybe I am
> > also missing it. :)
> > It would be a convenient addition though.
> >
> > Best,
> >
> > Marton
> >
> > On Sun, Sep 13, 2015 at 8:59 PM, Gyula Fóra <gyf...@apache.org> wrote:
> >
> > > He
Hey All!
Is there a proper way of using a Flink Streaming source with event
timestamps and watermarks? What I mean here is instead of implementing a
custom SourceFunction, use an existing one and provide some Timestamp
extractor (like the one currently used for Time windows), which will also
This sounds good +1 from me as well :)
Till Rohrmann wrote (on Wed, Sep 9, 2015 at 10:40):
> +1 for a milestone release with the TypeInformation issues fixed. I'm
> working on it.
>
> On Tue, Sep 8, 2015 at 9:32 PM, Stephan Ewen wrote:
>
> >
is to see what
> > kind of throughput the partitioned state system can handle and also how
> it
> > behaves with larger numbers of keys. The KVStore is just an interface and
> > the really heavy lifting is done by the state system so this would be a
> > good test for it.
>
Hey All,
The last couple of days I have been playing around with the idea of
building a streaming key-value store abstraction using stateful streaming
operators that can be used within Flink Streaming programs seamlessly.
Operations executed on this KV store would be fault tolerant as it
same
> > users in an other subsystem; mapping from IDs of phone calls to lists
> > of IDs of participating users; etc.
> > So I imagine they would like this a lot. (At least, if they were
> > considering moving to Flink :))
> >
> > Best,
> > Gabor
> >
>
Welcome! :)
On Thu, Aug 20, 2015 at 12:34 PM Matthias J. Sax
mj...@informatik.hu-berlin.de wrote:
Congrats! The squirrel army is growing fast. :)
On 08/20/2015 11:18 AM, Robert Metzger wrote:
The Project Management Committee (PMC) for Apache Flink has asked Chesnay
Schepler to become a
Honestly I don't think the partitioned state changes have anything to do
with the stability, only the reworked test case, which now tests the proper
exactly-once behaviour that was missing before.
Stephan Ewen se...@apache.org wrote (on Tue, Aug 4, 2015 at 12:12):
Yes, the build stability is super
)
iter.closeWith(tail1.union(tail2))
(Which is also tricky with the parallelism of the input stream)
On Sun, 2 Aug 2015 at 21:22 Gyula Fóra gyula.f...@gmail.com wrote:
In a streaming program when we create an IterativeDataStream, we
practically mark the union point of some
iterations should still be possible...
On Mon, Aug 3, 2015 at 10:21 AM, Gyula Fóra gyula.f...@gmail.com wrote:
It is critical for many applications (such as SAMOA or Storm
compatibility)
to build arbitrary cyclic flows. If your suggestion covers all cases (for
instance nested iterations
the discussion here, can you help me with what you mean by
different iteration heads and tails?
An iteration does not have one parallel head and one parallel tail?
On Fri, Jul 31, 2015 at 6:52 PM, Gyula Fóra gyula.f...@gmail.com wrote:
Maybe you can reuse some of the logic that is currently
(at least IMHO). Maybe I'm
wrong there, though.
To me it seems intuitive that I get the feedback at the head the way I
specify it at the tail. But maybe that's also just me... :D
On Fri, 31 Jul 2015 at 14:00 Gyula Fóra gyf...@apache.org wrote:
Hey,
I am not sure what is the intuitive
this program:
https://gist.github.com/aljoscha/45aaf62b2a7957cfafd5
It works, and the implementation is very simple, actually.
On Fri, 31 Jul 2015 at 14:30 Gyula Fóra gyula.f...@gmail.com wrote:
I mean that the head operators have different parallelism:
IterativeDataStream ids = ...
ids.map
that it
does translate and run. Your observation is true. :D
I'm wondering whether it makes sense to allow users to have iteration heads
with differing parallelism, in fact.
On Fri, 31 Jul 2015 at 16:40 Gyula Fóra gyula.f...@gmail.com wrote:
I still don't get how it could possibly work
documented yet, but the base class for
transformations is StreamTransformation. From there anyone who want's to
check it out can find the other transformations.
On Fri, 31 Jul 2015 at 17:17 Gyula Fóra gyula.f...@gmail.com wrote:
There might be reasons why a user would want different
Hi Matthias,
I think Aljoscha is preparing a nice PR that completely reworks the
DataStream classes and the information they actually contain. I don't think
it's a good idea to mess things up before he gets a chance to open the PR.
Also I don't see a well supported reason for moving the
Hey,
I am not sure what is the intuitive behaviour here. As you are not applying
a transformation on the feedback stream but pass it to a closeWith method,
I thought it was somehow natural that it gets the partitioning of the
iteration input, but maybe it's not intuitive.
If others also think that
that
a) disables all type checks
b) creates serializers dynamically at runtime.
a) should be fairly straightforward, b) on the other hand
btw., the Python API itself doesn't require the type information, it
already does the b part.
On 30.07.2015 22:11, Gyula Fóra wrote
at 15:55 Gyula Fóra gyf...@apache.org wrote:
Hey!
I started putting together a guide/design document for the streaming
operator state interfaces and implementations. The idea would be to
create
a doc that contains all the details about the implementations so anyone
can
use
Hey!
Could anyone briefly tell me what exactly is the reason why we force the
users in the Python API to declare types for operators?
I don't really understand how this works in different systems but I am just
curious why Flink has types and why Spark doesn't for instance.
If you give me some
Hi Slim,
I totally agree with you that we should start working on supporting tools
like these.
StreamFlow is a really nice tool, I also like it a lot. I think it would
not be too hard to integrate it with Flink, as there are several projects
that have already used Flink streaming in a compositional
added a test here that is true.
With FLINK-2423 I was fixing someone else's mistake, who disregarded my
message when merging a PR. We could now revert the PR that introduced that bug,
but instead we are reverting my fix for that mistake.
Gyula Fóra gyula.f...@gmail.com wrote (on Jul
Hey,
I am sorry that you feel bad about this, I only did not add a test
case for FLINK-2419
because I am adding a test in my upcoming PR which verifies the behaviour.
As for FLINK-2423, it is actually very bad that issue is still there. You
introduced this in your PR
be subject
to consensus in a similar way as PRs. The right to commit does not mean
that consensus should not be reached, and this is a clear case of not
having consensus.
Kostas
On Tue, Jul 28, 2015 at 8:30 PM, Gyula Fóra gyula.f...@gmail.com wrote:
What concerns me here is that for FLINK-2419 I
at 9:09 PM, Gyula Fóra gyula.f...@gmail.com wrote:
I agree that consensus should be reached in all changes to the system.
Then Robert and you should reach consensus on FLINK-2419.
What is not clear to me is what is the subject of consensus in this case.
As for FLINK-2423
Hey,
The JobGraph that is executed cannot have cycles (due to scheduling
reasons), that's why there is no edge between the head and tail operators.
What we do instead is we add an extra source to the head and a sink to the
tail (with the same parallelism), and the feedback data is passed outside
runtime constructs...
On Fri, Jul 24, 2015 at 5:43 PM, Aljoscha Krettek aljos...@apache.org
wrote:
Yes, this might be nice. Till and I had similar ideas about using the
pattern to make broadcast variables more useable in Scala, in fact. :D
On Fri, 24 Jul 2015 at 17:39 Gyula Fóra
Hey,
I would like to propose a way to extend the standard Streaming Scala API
methods (map, flatmap, filter etc) with versions that take stateful
functions as lambdas. I think this would eliminate the awkwardness of
implementing RichFunctions in Scala and make statefulness more explicit:
*For
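The proposal above can be sketched in plain Python (names and signature are assumptions for illustration, not the actual proposed Scala API): a stateful map takes a function from `(value, state)` to `(output, new_state)`.

```python
# Sketch of a stateful map in the spirit of the proposal (API shape assumed):
# the user function takes (value, state) and returns (output, new_state).
def map_with_state(stream, fn, initial_state):
    state, out = initial_state, []
    for value in stream:
        result, state = fn(value, state)
        out.append(result)
    return out

# running sum kept as per-element state
assert map_with_state([1, 2, 3], lambda v, s: (v + s, v + s), 0) == [1, 3, 6]
```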
Hey!
I started putting together a guide/design document for the streaming
operator state interfaces and implementations. The idea would be to create
a doc that contains all the details about the implementations so anyone can
use it as a reference later.
at 09:46 Gyula Fóra gyula.f...@gmail.com wrote:
I agree, let's separate these topics from each other so we can get faster
resolution.
There is already a state discussion in the thread we started with Paris.
On Wed, Jun 24, 2015 at 9:24
useful for this purpose.
On Mon, Jul 13, 2015 at 6:30 PM, Gyula Fóra gyula.f...@gmail.com wrote:
+1
On Mon, Jul 13, 2015 at 6:23 PM Stephan Ewen se...@apache.org wrote:
If naming is the only concern, then we should go ahead, because we can
change names easily (before the release
, Jul 14, 2015 at 10:48 AM, Gyula Fóra gyula.f...@gmail.com wrote:
If we only want to have either keyBy or groupBy, why not keep groupBy?
That would be more consistent with the batch API.
On Tue, Jul 14, 2015 at 10:35 AM Stephan Ewen se...@apache.org wrote:
Concerning your
, Gyula Fóra gyula.f...@gmail.com wrote:
I think Marton has some good points here.
1) Is KeyedDataStream a better name if this is only a renaming?
2) the discretize semantics is unclear indeed. Are we operating on a single
dataset or a sequence of datasets? If the latter, why not call it something
this soon, because basically all
streaming patches will have to be revisited in light of this...
On Tue, Jul 7, 2015 at 3:41 PM, Gyula Fóra gyula.f...@gmail.com wrote:
You are right, that's an important issue.
And I think we should also do some renaming with the iterations because
. Not
part of the first prototype.
Do we agree on those points?
On Mon, Jul 13, 2015 at 4:50 PM, Gyula Fóra gyula.f...@gmail.com wrote:
In general I like it, although the main difference between the current and
the new one is the windowing, and that is still not very clear.
Where do we have
parallel windows. Pick what you need
and what works.
On Mon, Jul 13, 2015 at 6:13 PM, Gyula Fóra gyula.f...@gmail.com wrote:
I think we agree on everything, it's more of a naming issue :)
I thought it might be misleading that global time windows are
non-parallel windows. We don't want
This happens quite often :/
Matthias J. Sax mj...@informatik.hu-berlin.de wrote (on Sun, Jul 12, 2015 at 14:28):
Hi,
Github master is behind asf-git by two days. The last 6 commits are not
available at Github. Please compare:
- https://github.com/apache/flink/commits/master
-
I think the content is pretty good, much better than before. But the page
structure could be better (and this is very important in my opinion).
Now it just looks like a long list of features without any ways to navigate
between them. We should probably have something at the top that summarizes
the
:57 Gyula Fóra gyf...@apache.org wrote:
Hey,
Along with the suggested changes to the streaming API structure I think
we
should also rework the iteration api. Currently the iteration api tries
to mimic the syntax of the batch API while the runtime behaviour is quite
different.
What we
@Kostas:
This new API is, I believe, equivalent in expressivity to the current one.
We can define nested loops now as well.
And I also don't see nested loops as much worse, generally, than simple loops.
Gyula Fóra gyula.f...@gmail.com wrote (on Tue, Jul 7, 2015 at 16:14):
Sorry Stephan I
yes. I was just about to open an Issue for
changing the Streaming Iteration API. :D
Then we should also make the implementation very straightforward and
simple, right now, the implementation of the iterations is all over the
place.
On Tue, 7 Jul 2015 at 15:57 Gyula Fóra gyf...@apache.org wrote:
Hey,
Along with the suggested changes to the streaming API structure I think we
should also rework the iteration api. Currently the iteration api tries
to mimic the syntax of the batch API while the runtime behaviour is quite
different.
What we create instead of iterations is really just cyclic
approach, because it is explicit where streams
could be altered in hind-sight (after their definition).
On Tue, Jul 7, 2015 at 4:20 PM, Gyula Fóra gyula.f...@gmail.com wrote:
@Aljoscha:
Yes, that's basically my point as well. This is what happens now too but we
give this mutable datastream
wrote:
Wow, this looks pretty concise. I really like it!
On Mon, Jun 29, 2015 at 3:27 PM Gyula Fóra gyf...@apache.org
wrote:
Hey all!
Just to add something new to the end of the discussion list. After
some
discussion with Seif, and Paris, I have added a commit
tested.
On Wed, Jul 1, 2015 at 11:15 AM, Ufuk Celebi u...@apache.org wrote:
On 01 Jul 2015, at 10:57, Gyula Fóra gyula.f...@gmail.com wrote:
Hey,
Thanks for the feedback guys:
@Max: You are right, this is not a top-priority change, I was just
mocking up some alternatives
Hey all!
Just to add something new to the end of the discussion list. After some
discussion with Seif, and Paris, I have added a commit that replaces the
use of the Checkpointed interface with field annotations.
This is probably the most lightweight state declaration so far and it will
probably
You are right, one cannot use the current window-join implementation for
this.
A workaround is to implement your custom binary stream operator that will
wait until it receives the whole file, then starts joining.
For instance a filestream.connect(streamToJoinWith).flatMap(CustomCoFlatMap that
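The buffering co-flatmap idea above can be sketched in plain Python (not Flink's CoFlatMapFunction; the end-of-file marker and record shapes are assumptions for illustration): buffer the second input until the file side is complete, then join.

```python
# Sketch of the CustomCoFlatMap idea (plain Python, not the Flink API):
# buffer the joining stream until the file side signals completion.
class BufferingJoiner:
    def __init__(self):
        self.file_rows, self.buffered, self.file_done = {}, [], False

    def flat_map1(self, row):                 # file stream: (key, value)
        key, value = row
        if row == ("EOF", None):              # assumed end-of-file marker
            self.file_done = True
            drained, self.buffered = self.buffered, []
            return [(k, v, self.file_rows[k]) for k, v in drained
                    if k in self.file_rows]
        self.file_rows[key] = value
        return []

    def flat_map2(self, record):              # stream to join: (key, value)
        key, value = record
        if not self.file_done:
            self.buffered.append(record)
            return []
        return [(key, value, self.file_rows[key])] if key in self.file_rows else []

j = BufferingJoiner()
assert j.flat_map2(("a", 1)) == []            # buffered: file not complete yet
j.flat_map1(("a", 100))
assert j.flat_map1(("EOF", None)) == [("a", 1, 100)]   # drain on completion
assert j.flat_map2(("a", 2)) == [("a", 2, 100)]
```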
Hey,
I didn't look through the whole code so I probably don't get something but
why don't you just do what storm does? Keep a map from the field names to
indexes somewhere (make this accessible from the tuple) and then you can
just use a simple Flink tuple.
I think this is what's happening in
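The Storm-style approach described above can be sketched like this (plain Python; the class and field names are made up for illustration): one shared map from field names to positions lets a plain positional tuple be addressed by name.

```python
# Sketch of the field-name-to-index idea (names assumed): a shared index
# map turns name lookups into positional access on a plain tuple.
class FieldTuple:
    def __init__(self, field_names, values):
        self._index = {name: i for i, name in enumerate(field_names)}
        self._values = tuple(values)

    def get(self, name):
        return self._values[self._index[name]]

t = FieldTuple(["user", "count"], ["gyula", 7])
assert t.get("user") == "gyula"
assert t.get("count") == 7
```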
By declare I mean we assume a Flink Tuple datatype and the user declares
the name mapping (sorry, it's getting late).
Gyula Fóra gyula.f...@gmail.com wrote (on Mon, Jun 29, 2015 at 23:57):
Ah ok, now I get what I didn't get before :)
So you want to take some input stream , and execute
Hey!
Now that we are implementing more and more applications for streaming that
use iterations we realized a huge shortcoming of the current iteration api.
Currently it only allows feeding back data of the same type to the iteration
head.
This makes sense because the operators are typed but makes
useless.
For the API, I proposed to restructure the interactions between all the
different *DataStream classes and grouping/windowing. (See API section of
the doc I posted.)
On Mon, 22 Jun 2015 at 21:56 Gyula Fóra gyula.f...@gmail.com wrote:
Hi Aljoscha,
Thanks for the nice summary
Hey all,
Currently we have reduce and aggregation methods for non-grouped
DataStreams as well, which will produce local aggregates depending on the
parallelism of the operator.
This behaviour is neither intuitive nor useful as it only produces sensible
results if the user specifically sets the
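The behaviour described above can be sketched in plain Python (not Flink): with parallelism p, each subtask reduces only its own partition, so the user gets p partial aggregates rather than one global result.

```python
from functools import reduce

# Sketch (plain Python, not Flink) of local aggregation on a non-grouped
# stream: each parallel subtask reduces only its own partition.
def parallel_local_reduce(elements, parallelism, reduce_fn):
    partitions = [elements[i::parallelism] for i in range(parallelism)]
    return [reduce(reduce_fn, p) for p in partitions if p]

partials = parallel_local_reduce([1, 2, 3, 4], 2, lambda a, b: a + b)
assert partials == [4, 6]  # two local sums, not the global total 10
```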
I opened a PR https://github.com/apache/flink/pull/860 for this.
Stephan Ewen se...@apache.org wrote (on Mon, Jun 22, 2015 at 19:25):
+1 totally agreed
On Mon, Jun 22, 2015 at 5:32 PM, Gyula Fóra gyf...@apache.org wrote:
Hey all,
Currently we have reduce and aggregation methods
Hi Aljoscha,
Thanks for the nice summary, this is a very good initiative.
I added some comments to the respective sections (where I didn't fully
agree :)).
At some point I think it would be good to have a public hangout session on
this, which could make a more dynamic discussion.
Cheers,
Gyula
The checkpoint cleanup works for HDFS right? I assume the job manager
should see that as well.
This is not a trivial problem in general; the assumption we were making
now is that the JM can actually execute the cleanup logic.
Aljoscha Krettek aljos...@apache.org wrote (on Jun