It would be helpful if the documentation warned what type of equality it
expects of values used as keys in keyBy. I just got bitten by converting a
field from a string to a byte array. All of a sudden the windows were no
longer aggregating, so it seems Flink is not doing a deep
compare
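The symptom above can be reproduced with plain JDK semantics, without Flink at all. This is a minimal sketch showing why a byte[] key silently breaks grouping while a String key works: arrays inherit reference-based equals() and hashCode() from Object, so two arrays with identical contents are treated as different keys.

```java
import java.util.Arrays;

public class ArrayKeyEquality {
    public static void main(String[] args) {
        byte[] a = "user-1".getBytes();
        byte[] b = "user-1".getBytes();

        // Strings compare by value, so equal strings land in the same group.
        System.out.println("user-1".equals("user-1")); // true

        // Arrays compare by reference identity: same content, different objects,
        // so each array would form its own "key group".
        System.out.println(a.equals(b));               // false

        // Deep comparison must be requested explicitly.
        System.out.println(Arrays.equals(a, b));       // true
    }
}
```

A common workaround is to key on a value type instead, e.g. convert the byte array back to a String (or wrap it) inside the key selector.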
Hi Theofilos,
how exactly are you writing the application output?
Are you using a logging framework?
Are you writing the log statements from the open(), map(), or invoke()
methods, or from some constructors? (I'm asking since different parts are
executed on the cluster and locally.)
On Fri, Jun 10,
Thank you very much Matthias! Also, the link you provided is very helpful.
Cheers,
Nikos
On Fri, Jun 10, 2016 at 3:16 AM, Matthias J. Sax wrote:
> I just put an answer to SO.
>
> About the other questions: Flink processes tuple-by-tuple and does some
> internal buffering. You
Hi Shannon,
Some questions:
Which Flink version are you using?
Can you provide some more logs, in particular the log entries before this
event from the Kafka connector?
Also, is it possible that the Kafka broker was in an erroneous state?
Did the error happen after weeks of data
Hi again,
and again sorry for the late response.
Regarding your first question: You can use a Key Selector Function [1].
Regarding your second question: If I understand your requirement
correctly, this is already happening in my gist.
By taking the union of both streams the local and away max
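As a rough illustration of what the KeySelector function mentioned above does, here is a plain-Java sketch: the lambda plays the role of Flink's KeySelector (it maps each record to its grouping key), and the Event record and the per-team max aggregation are made-up example types, not taken from the thread.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class KeySelectorSketch {
    // Stand-in for a (team, score) event; in Flink this would be a POJO or Tuple2.
    record Event(String team, int score) {}

    // The Event::team reference is the "key selector": in Flink you would
    // write something like stream.keyBy(e -> e.team()).
    static Map<String, Integer> maxScorePerTeam(List<Event> events) {
        return events.stream()
                .collect(Collectors.toMap(Event::team, Event::score, Math::max));
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("home", 3), new Event("away", 5), new Event("home", 7));
        System.out.println(maxScorePerTeam(events));
    }
}
```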
Hi all,
Flink 1.0.3
Hadoop 2.4.0
When running a job on a Flink cluster on YARN, the application output is
not included in the YARN log. Instead, it is only printed to the stdout of
the machine from which I run my program. For the jobmanager, I'm using the
log4j.properties file from the flink/conf
Hi,
setting the unsplittable attribute in the constructor is fine. The field's
value will be sent to the cluster.
So what happens is that you initialize the input format in your client
program. Then it is serialized, sent over the network to the machines, and
deserialized again. So the value you've
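The serialize/ship/deserialize cycle described above can be sketched with plain Java serialization (a minimal stand-in; MyInputFormat and the unsplittable flag mirror the thread's example, but this is not Flink's actual shipping code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ShippedFieldDemo {
    // Stand-in for an input format whose field is set in the constructor.
    static class MyInputFormat implements Serializable {
        final boolean unsplittable;
        MyInputFormat(boolean unsplittable) { this.unsplittable = unsplittable; }
    }

    static MyInputFormat roundTrip(MyInputFormat f) throws Exception {
        // Serialize on the "client" ...
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(f);
        }
        // ... deserialize on the "worker": constructor-set state survives the trip.
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (MyInputFormat) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new MyInputFormat(true)).unsplittable); // true
    }
}
```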
I just put an answer to SO.
About the other questions: Flink processes tuple-by-tuple and does some
internal buffering. You might be interested in
https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks
-Matthias
On 06/09/2016 08:13 PM, Nikos R. Katsipoulakis wrote:
>
Hi,
I am replying to myself for the record and to provide an update on what I
am trying to do.
I have looked into Mahout's XmlInputFormat class but unfortunately it
doesn't solve my problem.
My exploratory work with Flink tries to reproduce the key steps that we
already perform in a quite
Q1:
Whether one of your classes requires the *env* parameter depends on
whether you want to create a new source or set an ExecutionEnvironment
parameter inside the class.
If you don't, you can of course not pass it :)
I can't see anything that would prevent it from running on a cluster.
Q2:
Hi Fabian, thank you for your help.
I want my Flink application to be distributed, and I also want the
facility to support updating the variable [Coefficients of
LinearRegression].
How would you do it in that case?
The problem with iteration is that it expects a DataSet of the same type to be
Hi,
yes, I was talking about a Flink bug. I forgot to mention the work-around
that Stephan mentioned.
On Thu, 9 Jun 2016 at 20:38 Stephan Ewen wrote:
> You can also make the KeySelector a static inner class. That should work
> as well.
>
> On Thu, Jun 9, 2016 at 7:00 PM,
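The static-inner-class advice above can be demonstrated with plain Java serialization. In this sketch the KeySelector interface is a local stand-in for Flink's, and the point is the hidden reference: a non-static (here, anonymous) inner class that uses outer state drags the non-serializable enclosing object into the closure, while a static inner class serializes cleanly.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class InnerClassSerialization {
    // Local stand-in for Flink's KeySelector; Flink functions must be serializable.
    interface KeySelector extends Serializable { String getKey(String value); }

    // A static inner class holds no reference to the enclosing instance,
    // so it serializes without dragging the outer object along.
    static class StaticSelector implements KeySelector {
        public String getKey(String value) { return value.substring(0, 1); }
    }

    int prefixLen = 1; // outer state; the outer class itself is NOT Serializable

    // An anonymous inner class that uses outer state captures `this`,
    // so serializing it fails with NotSerializableException.
    KeySelector nonStaticSelector() {
        return new KeySelector() {
            public String getKey(String value) { return value.substring(0, prefixLen); }
        };
    }

    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false; // NotSerializableException for the captured outer instance
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new StaticSelector()));                              // true
        System.out.println(serializes(new InnerClassSerialization().nonStaticSelector())); // false
    }
}
```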