Re: Repeated exceptions during system metrics registration

2017-09-29 Thread Reinier Kip
o that with Beam. On 29.09.2017 16:03, Reinier Kip wrote: Hi all, I'm running a Beam pipeline on Flink and sending metrics via the Graphite reporter. I get repeated exceptions on the slaves, which try to register the same metric multiple times. Jobmanager and taskmanager data is fine: I ca

Repeated exceptions during system metrics registration

2017-09-29 Thread Reinier Kip
Hi all, I'm running a Beam pipeline on Flink and sending metrics via the Graphite reporter. I get repeated exceptions on the slaves, which try to register the same metric multiple times. Jobmanager and taskmanager data is fine: I can see JVM stuff, but only one datapoint here and there for

EOFException related to memory segments during run of Beam pipeline on Flink

2017-08-30 Thread Reinier Kip
Hi all, I’ve been running a Beam pipeline on Flink. Depending on the dataset size and the heap memory configuration of the jobmanager and taskmanager, I may run into an EOFException, which causes the job to fail. You will find the stacktrace near the bottom of this post (data censored). I

Re: HDFS data locality and distribution

2018-03-12 Thread Reinier Kip
Relevant versions: Beam 2.1, Flink 1.3. From: Reinier Kip <r...@bol.com> Sent: 12 March 2018 13:45:47 To: user@flink.apache.org Subject: HDFS data locality and distribution Hey all, I'm trying to batch-process 30-ish files from HDFS, but I see tha

HDFS data locality and distribution

2018-03-12 Thread Reinier Kip
Hey all, I'm trying to batch-process 30-ish files from HDFS, but I see that data is distributed very badly across slots. 4 out of 32 slots get 4/5ths of the data, another 3 slots get about 1/5th and a last slot just a few records. This probably triggers disk spillover on these slots and slows

Re: HDFS data locality and distribution

2018-03-19 Thread Reinier Kip
a is distributed to each node, i.e. if 80% of your data has the same key (or rather hash), they will all end up on the same node. On 12.03.2018 13:49, Reinier Kip wrote: Relevant versions: Beam 2.1, Flink 1.3. From: Reinier Kip <r...@bol.com><mailto:r...@
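
The point made in the reply above (records with the same key hash all land on the same node) can be illustrated with a small sketch. This is not Flink's or Beam's actual partitioning code: the `partition` and `slot_loads` helpers, the CRC32 hash, the key names, and the record counts are all assumptions chosen only for illustration. The 32 slots and the 80% figure are taken from the thread.

```python
from collections import Counter
import zlib


def partition(key: str, num_slots: int) -> int:
    """Map a key to a slot with a stable hash (hypothetical scheme, not Flink's)."""
    return zlib.crc32(key.encode("utf-8")) % num_slots


def slot_loads(keys, num_slots: int) -> Counter:
    """Count how many records each slot receives."""
    return Counter(partition(k, num_slots) for k in keys)


# Assumed dataset: 80% of the records share a single key, the rest are unique.
records = ["hot"] * 800 + [f"key-{i}" for i in range(200)]
num_slots = 32

loads = slot_loads(records, num_slots)
hot_slot = partition("hot", num_slots)

# The slot that owns the hot key receives at least 80% of all records,
# no matter how many slots are available.
print(f"share on hot slot: {loads[hot_slot] / len(records):.2f}")
```

Under these assumptions, adding more slots does not help: every record with the hot key still hashes to the same slot, so only the remaining 20% of the data spreads out.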