I like the idea of creating separate project(s) for load tests so as
to not compete with other tests and the standard development cycle.

As for how many workers is too many, I would take the tack "what is
it we're trying to test?" Unless you're stress-testing the shuffle
itself, much of what Beam does is linearly parallelizable with the
number of machines. Of course one will still want to run over real,
large data sets, but not every load test needs this every time. More
interesting could be to try out running at 2x and 4x the data, with 2x
and 4x the machines, and seeing where we fail to be linear.
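
To make "fail to be linear" concrete, here is a rough sketch of how one
could score such runs. It assumes data and machines are scaled by the same
factor, so ideal wall-clock time stays flat and efficiency is simply
T(1x) / T(kx). The function names, the 0.8 tolerance, and the timings in
the usage note are all hypothetical, not taken from any existing Beam
tooling or real measurements.

```python
def weak_scaling_efficiency(runs):
    """Map scale factor -> efficiency relative to the 1x run.

    `runs` is a list of (scale_factor, wall_clock_seconds) pairs where both
    the data size and the worker count were multiplied by scale_factor.
    Perfectly linear scaling gives an efficiency of 1.0 at every scale.
    """
    timings = dict(runs)
    baseline = timings[1]  # the 1x run is the reference point
    return {scale: baseline / seconds for scale, seconds in timings.items()}


def first_nonlinear_scale(runs, tolerance=0.8):
    """Return the smallest scale whose efficiency drops below `tolerance`.

    Returns None if every run stays at or above the tolerance.
    The 0.8 default is an arbitrary, hypothetical threshold.
    """
    efficiency = weak_scaling_efficiency(runs)
    for scale in sorted(efficiency):
        if efficiency[scale] < tolerance:
            return scale
    return None
```

For example, made-up timings of [(1, 100.0), (2, 105.0), (4, 160.0)] would
flag the 4x run as the first point where scaling stops looking linear.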

(As an aside, 4 hours x 10 workers seems like a lot for 23GB of
data...or is it 230GB once you've fanned out?)
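
A back-of-the-envelope version of that arithmetic, as a sketch: it assumes
1 GB = 10^9 bytes and treats the job as taking exactly 4 hours (the
reported time "exceeds" 4 hours, so the real per-worker rate would be
somewhat lower). The helper name is hypothetical.

```python
# Figures quoted in the thread.
INPUT_BYTES = 23e9   # ~23 GB of synthetic records (assuming 1 GB = 1e9 bytes)
FANOUT = 10          # 10 fanouts
WORKERS = 10         # --maxNumWorkers
JOB_HOURS = 4        # assumption: exactly 4 h; the job actually ran longer


def per_worker_throughput_mb_s(total_bytes, workers, hours):
    """Average megabytes per second handled by each worker."""
    worker_seconds = workers * hours * 3600
    return total_bytes / worker_seconds / 1e6


# Case 1: only the raw 23 GB is counted.
raw = per_worker_throughput_mb_s(INPUT_BYTES, WORKERS, JOB_HOURS)

# Case 2: the fanout multiplies the data to ~230 GB.
fanned_out = per_worker_throughput_mb_s(INPUT_BYTES * FANOUT, WORKERS, JOB_HOURS)

print(f"raw: {raw:.2f} MB/s per worker, fanned out: {fanned_out:.2f} MB/s per worker")
```

Either way the average rate per worker is low, which is why the 40
worker-hours look high.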

On Wed, Jan 23, 2019 at 3:33 PM Łukasz Gajowy <[email protected]> wrote:
>
> Hi,
>
> pinging this thread (maybe some folks missed it). What do you think about 
> those concerns/ideas?
>
> Łukasz
>
> On Mon, Jan 14, 2019 at 5:11 PM Łukasz Gajowy <[email protected]> wrote:
>>
>> Hi all,
>>
>> one problem we need to solve while working on the load tests we are
>> currently developing is that we don't really know how many GCP/Jenkins
>> resources we can occupy. We did some initial testing with
>> beam_Java_LoadTests_GroupByKey_Dataflow_Small[1] and it seems that for:
>>
>> - 1 000 000 000 (~ 23 GB) synthetic records
>> - 10 fanouts
>> - 10 dataflow workers (--maxNumWorkers)
>>
>> the total job time exceeds 4 hours. That seems too long for such a small
>> load test. Additionally, we plan to add much bigger tests for other core
>> operations too. The proposal [2] describes only a few of them.
>>
>> The questions are:
>> 1. how many workers can we assign to this job without starving the other
>> jobs? Are 32 workers for a single Dataflow job fine? Would 64 workers for
>> such a job be fine too?
>> 2. given that we plan to add more and more load tests soon, do you think
>> it is a good idea to create a separate GCP project + separate Jenkins
>> workers for load testing purposes only? This would avoid starvation of
>> critical tests (post-commits, pre-commits, etc). Or maybe there is
>> another solution that would bring such isolation? Is such isolation needed?
>>
>> Ad 2: Please note that we will also need to host Flink/Spark clusters later 
>> on GKE/Dataproc (not decided yet).
>>
>> [1] 
>> https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_Java_LoadTests_GroupByKey_Dataflow_Small_PR/
>> [2] https://s.apache.org/load-test-basic-operations
>>
>>
>> Thanks,
>> Łukasz
>>
