I've run some PerfKit tests for [HDFS with this PR](https://builds.apache.org/job/beam_PerformanceTests_TextIOIT_HDFS/744/console) and [GCS with this PR](https://builds.apache.org/job/beam_PerformanceTests_TextIOIT/1086/console).

The last 6 timings for each are:

### HDFS 
```
08:33:54   run_time                            530.203248 seconds
14:33:30   run_time                            509.528935 seconds
20:32:50   run_time                            442.749163 seconds
02:31:21   run_time                            402.220293 seconds
08:31:30   run_time                            413.367303 seconds
-- and with this PR:
14:30:40   run_time                            378.867292 seconds
```

### GCS
```
14:32:25   run_time                            390.459900 seconds
20:32:06   run_time                            424.058663 seconds
02:31:45   run_time                            412.350028 seconds
08:33:19   run_time                            524.147186 seconds
14:31:16   run_time                            364.864536 seconds
-- and with this PR:
16:19:23   run_time                            387.196383 seconds
```

Given that these runs use only 1 million records and small clusters, I don't think they
tell us all that much, but I thought it worth capturing them here regardless.
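
For a rough sense of scale, here is a minimal Python sketch that compares each single PR run against the five earlier runs listed above. The run times are copied from the PerfKit output; the names `baseline`, `with_pr`, and `summarize` are made up for illustration and are not part of PerfKit or Beam.

```python
from statistics import mean, stdev

# Run times in seconds, copied from the PerfKit output above.
# The dictionary and function names are hypothetical, for illustration only.
baseline = {
    "HDFS": [530.203248, 509.528935, 442.749163, 402.220293, 413.367303],
    "GCS": [390.459900, 424.058663, 412.350028, 524.147186, 364.864536],
}
with_pr = {"HDFS": 378.867292, "GCS": 387.196383}


def summarize(name):
    """Compare the single PR run against the five earlier runs."""
    runs = baseline[name]
    avg = mean(runs)
    return {
        "mean": avg,
        "stdev": stdev(runs),  # sample standard deviation of the earlier runs
        "delta": with_pr[name] - avg,  # negative means the PR run was faster
        "within_range": min(runs) <= with_pr[name] <= max(runs),
    }


summary = {name: summarize(name) for name in baseline}

for name, s in summary.items():
    print(
        f"{name}: mean={s['mean']:.1f}s stdev={s['stdev']:.1f}s "
        f"delta={s['delta']:+.1f}s within_range={s['within_range']}"
    )
```

With only one PR run per backend this is descriptive only: the GCS run falls inside the spread of the earlier runs, and the HDFS run is faster than each of the five earlier runs, but a single sample cannot separate an effect of the PR from run-to-run noise.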

[ Full content available at: https://github.com/apache/beam/pull/6289 ]