I've run some PerfKit tests for [HDFS with this
PR](https://builds.apache.org/job/beam_PerformanceTests_TextIOIT_HDFS/744/console)
and [GCS with this
PR](https://builds.apache.org/job/beam_PerformanceTests_TextIOIT/1086/console).
The last 6 timings for each are:
### HDFS
```
08:33:54 run_time 530.203248 seconds
14:33:30 run_time 509.528935 seconds
20:32:50 run_time 442.749163 seconds
02:31:21 run_time 402.220293 seconds
08:31:30 run_time 413.367303 seconds
-- and with this PR:
14:30:40 run_time 378.867292 seconds
```
### GCS
```
14:32:25 run_time 390.459900 seconds
20:32:06 run_time 424.058663 seconds
02:31:45 run_time 412.350028 seconds
08:33:19 run_time 524.147186 seconds
14:31:16 run_time 364.864536 seconds
-- and with this PR:
16:19:23 run_time 387.196383 seconds
```
Given that these runs use only 1 million records and small clusters, I don't
think they tell us all that much, but I thought it worth capturing here regardless.
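
One rough way to put the single PR run in context is to compare it with the spread of the five runs before it. Below is a minimal sketch in Python; `RUNS`, `summarize`, and `summary` are names made up for illustration and are not part of PerfKit or Beam, and the numbers are copied from the listings above.

```python
from statistics import mean, stdev

# run_time values in seconds, copied from the PerfKit output above.
# "baseline" holds the five runs before the PR, "pr" the single run with it.
RUNS = {
    "HDFS": {
        "baseline": [530.203248, 509.528935, 442.749163, 402.220293, 413.367303],
        "pr": 378.867292,
    },
    "GCS": {
        "baseline": [390.459900, 424.058663, 412.350028, 524.147186, 364.864536],
        "pr": 387.196383,
    },
}


def summarize(baseline, pr):
    """Compare one PR timing against the mean and spread of the baseline runs."""
    avg = mean(baseline)
    return {
        "baseline_mean": avg,
        "baseline_stdev": stdev(baseline),  # sample standard deviation
        "pr_delta_pct": 100.0 * (pr - avg) / avg,  # negative means faster
        "pr_below_baseline_min": pr < min(baseline),
    }


summary = {name: summarize(r["baseline"], r["pr"]) for name, r in RUNS.items()}

for name, s in summary.items():
    print(
        f"{name}: baseline {s['baseline_mean']:.1f}s "
        f"(stdev {s['baseline_stdev']:.1f}s), "
        f"PR delta {s['pr_delta_pct']:+.1f}%, "
        f"faster than every baseline run: {s['pr_below_baseline_min']}"
    )
```

With only one PR sample per filesystem, this is a sanity check rather than a significance test; the baseline runs vary a lot among themselves.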
[ Full content available at: https://github.com/apache/beam/pull/6289 ]