Could you verify with a custom UDF that 1M records are actually being produced?
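
To illustrate what I mean by a counting UDF, here is a minimal sketch in plain Java, without any Flink dependency, so treat it as an illustration of the idea rather than as job code. The names CountingIdentity, CountCheck and countAfterFirst are made up for this example. In an actual job the equivalent would presumably be a RichMapFunction placed right after first(1M) that registers a LongCounter accumulator through getRuntimeContext().addAccumulator(...), with the total read from the JobExecutionResult once the job has finished.

```java
import java.util.function.Function;
import java.util.stream.LongStream;

/** Pass-through function that counts every record it sees (stand-in for a counting UDF). */
final class CountingIdentity<T> implements Function<T, T> {
    // Hypothetical counter; in Flink this role would be played by an accumulator.
    private final java.util.concurrent.atomic.LongAdder seen =
            new java.util.concurrent.atomic.LongAdder();

    @Override
    public T apply(T record) {
        seen.increment(); // counted independently of the web UI metrics
        return record;    // forward the record unchanged
    }

    long count() {
        return seen.sum();
    }
}

final class CountCheck {
    /**
     * Mimics "take the first n records, then run them through the counting
     * function" and returns how many records the function actually saw.
     */
    static long countAfterFirst(long available, long n) {
        CountingIdentity<Long> counter = new CountingIdentity<>();
        LongStream.range(0, available)
                .boxed()
                .limit(n)              // plays the role of first(n)
                .map(counter)          // the counting pass-through step
                .forEach(r -> { });    // consume the records like a sink would
        return counter.count();
    }

    public static void main(String[] args) {
        // 1.5M records available, only the first 1M should be counted.
        System.out.println(countAfterFirst(1_500_000L, 1_000_000L));
    }
}
```

If a counter like this reports exactly 1M while the UI shows less, the discrepancy would be on the display side rather than in the data.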

Since 3 separate tasks report a consistent number of incoming/outgoing
records, I would rule out an issue in the metric system. These metrics are
all counted separately from each other; having the same inconsistency
everywhere is nigh impossible.

Is this reproducible, and is it possible for you to provide me with the job used?

On 05.04.2017 13:52, Flavio Pompermaier wrote:
My job is a batch one.
Here's an image of two different executions of the job.
The third line is where first(1M) is called. On the left side the count is what I expect; in the second execution it is slightly less :(

Inline image 1

Any idea?

On Wed, Apr 5, 2017 at 12:51 PM, Chesnay Schepler <ches...@apache.org> wrote:

    Hey Flavio,

    it's unlikely that the counters skip a record.

    For the webUI these metrics are transported in 2 different ways:
    For running tasks they are fetched through the metric system; this
    provides no guarantee that the final count is ever displayed.
    For finished tasks the final count is stored in the ExecutionGraph
    and should show an accurate final count.

    So, the question is in which state your task is.

    Regards,
    Chesnay


    On 05.04.2017 09:55, Flavio Pompermaier wrote:
    Hi to all,
    I'm using Flink 1.2.0 and I have a job that (at some point) calls
    dataset.first(1M).
    Sometimes the "records sent" count displayed by the UI is less than 1M
    (like 999709).
    Is it possible that the UI (or the internal Flink counters) misses
    some records?

    Best,
    Flavio

    --
    Flavio Pompermaier
    Development Department

    OKKAM S.r.l.
    Tel. +(39) 0461 1823908




