Re: Question on NEXMark

2020-06-10 Thread Kenneth Knowles
It sounds like it could be something worth addressing. I don't really know
the cost of this behavior. The pipeline is pretty easy to read. The
pipeline itself does not explicitly manage any state, so the reads would come
from the Flink runner's execution of the GroupByKey primitive transform. The
relevant code is probably in ReduceFnRunner/WatermarkHold, which is actually
shared across many runners.
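
To make the pattern concrete, here is a minimal sketch in Python (not Beam's
or Flink's actual code) of how session-window merging over a keyed store can
produce reads of entries that were never written. CountingStore and
add_to_sessions are hypothetical names, and the layout of the state keys is
an assumption made purely for illustration.

```python
from collections import Counter


class CountingStore:
    """Dict-backed stand-in for a keyed state backend that counts hits and misses."""

    def __init__(self):
        self._data = {}
        self.stats = Counter()

    def get(self, key):
        # A read of a key that was never written counts as a miss.
        if key in self._data:
            self.stats["hit"] += 1
            return self._data[key]
        self.stats["miss"] += 1
        return None

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def add_to_sessions(store, user, ts, gap):
    """Add one event to the user's session windows, merging overlapping ones."""
    # The first event for a user finds no window list yet: a miss.
    windows = store.get(("windows", user)) or []
    start, end, count = ts, ts + gap, 1
    remaining = []
    for w in windows:
        if w[0] <= end and start <= w[1]:
            # The window overlaps the new event: fold its count into the merge.
            count += store.get(("count", user, w))
            store.delete(("count", user, w))
            start, end = min(start, w[0]), max(end, w[1])
        else:
            remaining.append(w)
    merged = (start, end)
    store.put(("count", user, merged), count)
    store.put(("windows", user), remaining + [merged])
    return merged


if __name__ == "__main__":
    store = CountingStore()
    # Out-of-order events spread over several users, with a 10-unit session gap.
    for user, ts in [("a", 0), ("b", 1), ("a", 5), ("c", 2), ("a", 100)]:
        add_to_sessions(store, user, ts, gap=10)
    print(dict(store.stats))
```

In this toy model every key's first event costs one miss, so a workload with
many keys and short sessions shows a high share of misses. Whether
ReduceFnRunner behaves the same way would need to be checked against the code.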

Kenn

On Wed, Jun 10, 2020 at 11:25 AM Andrew Pilloud wrote:

> I think the author of this test is long gone, but the code originated
> inside Google. This query is not part of the original Nexmark suite but was
> designed to exercise corner cases caused by out-of-order events, so that is
> what you are probably seeing. Here are the relevant bits from the original
> commit messages:
>
> New query 11 to exercise session windows.
>
> Q11 started as a basic session windows test
> with out-of-order and delayed events.
> This refines the trigger to limit the number
> of events in sessions.
>
> Andrew
>
> On Wed, Jun 10, 2020 at 10:37 AM Sruthi S Kumar 
> wrote:
>
>> Hi,
>>
>> We are working on a Flink project and enhancing some state backend
>> functionality. We are using the NEXMark benchmark to compare the
>> performance of different Flink state backends. While running NEXMark
>> queries with Beam's Flink runner, we noticed quite a lot of reads of
>> non-existent entries from the state backend.
>>
>> For example, when running query 11 with the RocksDB state backend, we saw
>> around 368 successful reads and around 527 attempts to read non-existent
>> entries. We are curious whether that is intentional and, if so, what the
>> rationale behind it is.
>>
>>
>> --
>> Regards,
>>
>> Sruthi
>>
>


Re: Question on NEXMark

2020-06-10 Thread Andrew Pilloud
I think the author of this test is long gone, but the code originated
inside Google. This query is not part of the original Nexmark suite but was
designed to exercise corner cases caused by out-of-order events, so that is
what you are probably seeing. Here are the relevant bits from the original
commit messages:

New query 11 to exercise session windows.

Q11 started as a basic session windows test
with out-of-order and delayed events.
This refines the trigger to limit the number
of events in sessions.

Andrew

On Wed, Jun 10, 2020 at 10:37 AM Sruthi S Kumar 
wrote:

> Hi,
>
> We are working on a Flink project and enhancing some state backend
> functionality. We are using the NEXMark benchmark to compare the
> performance of different Flink state backends. While running NEXMark
> queries with Beam's Flink runner, we noticed quite a lot of reads of
> non-existent entries from the state backend.
>
> For example, when running query 11 with the RocksDB state backend, we saw
> around 368 successful reads and around 527 attempts to read non-existent
> entries. We are curious whether that is intentional and, if so, what the
> rationale behind it is.
>
>
> --
> Regards,
>
> Sruthi
>


Question on NEXMark

2020-06-10 Thread Sruthi S Kumar
Hi,

We are working on a Flink project and enhancing some state backend
functionality. We are using the NEXMark benchmark to compare the
performance of different Flink state backends. While running NEXMark
queries with Beam's Flink runner, we noticed quite a lot of reads of
non-existent entries from the state backend.

For example, when running query 11 with the RocksDB state backend, we saw
around 368 successful reads and around 527 attempts to read non-existent
entries. We are curious whether that is intentional and, if so, what the
rationale behind it is.
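
For scale, the two counts above mean that roughly 59% of all read attempts
found nothing. A small sketch of that arithmetic follows; miss_ratio is an
illustrative helper name, not part of NEXMark, Beam, or Flink.

```python
def miss_ratio(successful_reads: int, missed_reads: int) -> float:
    """Return the fraction of all read attempts that found no entry."""
    total = successful_reads + missed_reads
    # Guard against division by zero when no reads were recorded.
    return missed_reads / total if total else 0.0


if __name__ == "__main__":
    # Counts reported for query 11 with the RocksDB state backend.
    print(f"{miss_ratio(368, 527):.1%}")
```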


-- 
Regards,

Sruthi