My default advice here is to use the Direct Runner for small smoke tests,
and use the Flink LocalRunner for larger datasets that can still be run
locally. As Reuven points out, the Direct Runner is more of a validation
test in itself: it does many things designed to test pipelines against the
worst combinations of conditions they may encounter on other runners.
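
To make the cost of that validation concrete, here is a rough plain-Python
sketch of the two behaviours Reuven mentions (shuffling the input and
serializing/deserializing every element). It is only an illustration and not
Beam's actual implementation; the helper name direct_runner_style_check and
the use of pickle are assumptions chosen for the example.

```python
import pickle
import random


def direct_runner_style_check(elements, transform, seed=0):
    """Hypothetical helper: apply `transform` the way a test-oriented runner
    might, by shuffling the input and round-tripping every element through
    serialization. Not Beam code; pickle stands in for a real coder."""
    rng = random.Random(seed)
    shuffled = list(elements)
    rng.shuffle(shuffled)  # user code must not rely on input order
    outputs = []
    for element in shuffled:
        # Round-trip the input so unserializable elements fail here.
        copy = pickle.loads(pickle.dumps(element))
        result = transform(copy)
        # Round-trip the output too; this extra work is paid per element.
        outputs.append(pickle.loads(pickle.dumps(result)))
    return outputs


def word_length(word):
    return (word, len(word))


results = direct_runner_style_check(["beam", "flink", "direct"], word_length)
# Compare in sorted order, since the input order is deliberately not kept.
assert sorted(results) == [("beam", 4), ("direct", 6), ("flink", 5)]
```

Every element pays for two extra serialization round trips and a copy of the
input is held in memory, which is fine for a smoke test but adds up quickly
at load-test sizes.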

Andrew

On Thu, Jan 10, 2019 at 10:10 AM Reuven Lax <re...@google.com> wrote:

> The Direct Runner as currently implemented is purposely inefficient. It
> was designed for testing, and therefore does many things that are meant to
> expose bugs in user pipelines (e.g. randomly sorting PCollections,
> serializing/deserializing every element, etc.). So it's not surprising that
> it doesn't behave well under load tests.
>
> Reuven
>
> On Thu, Jan 10, 2019 at 5:55 AM Katarzyna Kucharczyk <
> ka.kucharc...@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> My name is Kasia and I contribute to Beam's tests. Currently, I am
>> working with Łukasz Gajowy on load tests, and we created a Jenkins
>> configuration to run the Synthetic Sources test on the DirectRunner. It was
>> decided to generate 1 000 000 000 records (bytes) for a small suite (you
>> can find details in this proposal [1]). Running this test on Beam's Jenkins
>> causes a runtime exception with the message:
>> "java.lang.OutOfMemoryError: GC overhead limit exceeded".
>>
>> Of course, this is not a surprise since it's a lot of data. That's why I
>> am asking for your advice/opinion:
>> Do you think this test should be smaller? On the other hand, if it
>> were smaller, would it still be worth running as a load test?
>> Maybe it would be better to wait for the UniversalLocalRunner (ULR) and
>> use it once it's available? What is the status of the ULR? Do you know if
>> the ULR will replace the DirectRunner?
>>
>> I created an issue [2] with details of this problem where you can find
>> the link to the example of a failing job.
>>
>> Thanks,
>> Kasia
>>
>> [1] https://s.apache.org/load-test-basic-operations
>> [2] https://issues.apache.org/jira/browse/BEAM-6351
>>
>
