Could you verify with a custom UDF that 1M records are actually being
produced?
Since 3 separate tasks report a consistent number of incoming/outgoing
records, I would rule out an issue in the metric system. These metrics
are all counted separately from each other; having the same
inconsistency everywhere is nigh impossible.
Is this reproducible, and is it possible for you to provide me with the
job used?
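For the custom UDF check, something along these lines might do. This is only a minimal sketch, not your actual job: the class name FirstCountCheck, the accumulator name and the generateSequence() source are placeholders I made up for illustration, and you would put the counting mapper right after the first() call in your own pipeline. It counts records with an accumulator, which is reported independently of the task metrics shown in the web UI.

```java
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.common.accumulators.LongCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.DiscardingOutputFormat;
import org.apache.flink.configuration.Configuration;

public class FirstCountCheck {

    // Hypothetical accumulator name, chosen for this sketch only.
    static final String ACC_NAME = "records-after-first";

    // Pass-through mapper that counts every record it forwards.
    public static class CountingMapper extends RichMapFunction<Long, Long> {
        private final LongCounter counter = new LongCounter();

        @Override
        public void open(Configuration parameters) {
            // Register the counter; partial counts of all parallel
            // subtasks are merged when the job finishes.
            getRuntimeContext().addAccumulator(ACC_NAME, counter);
        }

        @Override
        public Long map(Long value) {
            counter.add(1L);
            return value;
        }
    }

    // Runs a stand-in job (sequence source instead of the real input)
    // and returns how many records passed the mapper after first(n).
    public static long countAfterFirst(long total, int n) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        env.generateSequence(1, total)
           .first(n)
           .map(new CountingMapper())
           .output(new DiscardingOutputFormat<Long>());

        JobExecutionResult result = env.execute("first(n) count check");
        Long counted = result.getAccumulatorResult(ACC_NAME);
        return counted;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countAfterFirst(2_000_000L, 1_000_000));
    }
}
```

If the accumulator reports the full 1M while the web UI shows less, the records are being produced and the discrepancy is on the display side.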
On 05.04.2017 13:52, Flavio Pompermaier wrote:
My job is a batch one.
Here's an image of two different executions of the job.
The third line is where first(1M) is called. On the left side the
count is what I expect; in the second it is slightly less :(
Inline image 1
Any idea?
On Wed, Apr 5, 2017 at 12:51 PM, Chesnay Schepler <ches...@apache.org> wrote:
Hey Flavio,
it's unlikely that the counters skip a record.
For the webUI these metrics are transported in 2 different ways:
For running tasks they are fetched through the metric system; this
provides no guarantee that the final count is ever displayed.
For finished tasks the final count is stored in the ExecutionGraph
and should show an accurate final count.
So, the question is in which state your task is.
Regards,
Chesnay
On 05.04.2017 09:55, Flavio Pompermaier wrote:
Hi to all,
I'm using Flink 1.2.0 and I have a job that (at some point) calls
dataset.first(1M).
Sometimes the number of records sent displayed by the UI is less
than 1M (like 999709).
Is it possible that the UI (or the internal Flink counters) misses
some records?
Best,
Flavio
--
Flavio Pompermaier
Development Department
OKKAM S.r.l.
Tel. +(39) 0461 1823908