n3world edited a comment on pull request #10662:
URL: https://github.com/apache/arrow/pull/10662#issuecomment-918737367
> Sorry for the wait for feedback. Have you run the parsing benchmarks in
> `parser_benchmark.cc`? Does capturing the offset have any noticeable effect on
> performance?

I was getting wide variations between runs, but these are the best
numbers I got for master and this branch.

master:
```
----------------------------------------------------------------------------------
Benchmark                         Time            CPU  Iterations UserCounters...
----------------------------------------------------------------------------------
ChunkCSVQuotedBlock          168985 ns      168981 ns        4117 bytes_per_second=959.423M/s
ChunkCSVEscapedBlock         155557 ns      155555 ns        4460 bytes_per_second=980.924M/s
ChunkCSVNoNewlinesBlock         147 ns         147 ns     4741333 bytes_per_second=0/s
ParseCSVQuotedBlock          263473 ns      263469 ns        2646 bytes_per_second=615.346M/s
ParseCSVEscapedBlock         209135 ns      209132 ns        3362 bytes_per_second=729.624M/s
ParseCSVFlightsExample      2175256 ns     2175243 ns         320 bytes_per_second=446.642M/s
ParseCSVVehiclesExample    15967256 ns    15967111 ns          44 bytes_per_second=718.222M/s
ParseCSVStocksExample       3463566 ns     3463298 ns         203 bytes_per_second=605.926M/s
```
This branch:
```
----------------------------------------------------------------------------------
Benchmark                         Time            CPU  Iterations UserCounters...
----------------------------------------------------------------------------------
ChunkCSVQuotedBlock          169009 ns      169006 ns        4093 bytes_per_second=959.283M/s
ChunkCSVEscapedBlock         156445 ns      156443 ns        4467 bytes_per_second=975.356M/s
ChunkCSVNoNewlinesBlock         149 ns         149 ns     4749759 bytes_per_second=0/s
ParseCSVQuotedBlock          369561 ns      369551 ns        1882 bytes_per_second=438.707M/s
ParseCSVEscapedBlock         367681 ns      367671 ns        1867 bytes_per_second=415.012M/s
ParseCSVFlightsExample      2538161 ns     2538102 ns         278 bytes_per_second=382.788M/s
ParseCSVVehiclesExample    16641194 ns    16639585 ns          42 bytes_per_second=689.196M/s
ParseCSVStocksExample       3119450 ns     3119364 ns         224 bytes_per_second=672.734M/s
```
No significant difference that I can see
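
For anyone who wants to compare the two runs side by side, here is a minimal sketch that computes the relative change in the Time column for each benchmark. The names `master_ns`, `branch_ns`, and `relative_change` are made up for illustration, and the values are transcribed by hand from the output above. Since the runs varied widely, the percentages should be read as rough indications from a single best run per branch, not as a careful measurement.

```python
# Wall-clock times in ns, transcribed from the "Time" column of each run.
master_ns = {
    "ChunkCSVQuotedBlock": 168985,
    "ChunkCSVEscapedBlock": 155557,
    "ChunkCSVNoNewlinesBlock": 147,
    "ParseCSVQuotedBlock": 263473,
    "ParseCSVEscapedBlock": 209135,
    "ParseCSVFlightsExample": 2175256,
    "ParseCSVVehiclesExample": 15967256,
    "ParseCSVStocksExample": 3463566,
}
branch_ns = {
    "ChunkCSVQuotedBlock": 169009,
    "ChunkCSVEscapedBlock": 156445,
    "ChunkCSVNoNewlinesBlock": 149,
    "ParseCSVQuotedBlock": 369561,
    "ParseCSVEscapedBlock": 367681,
    "ParseCSVFlightsExample": 2538161,
    "ParseCSVVehiclesExample": 16641194,
    "ParseCSVStocksExample": 3119450,
}


def relative_change(before, after):
    """Percentage change from `before` to `after`; positive means slower."""
    return (after - before) / before * 100.0


# Hypothetical helper output: one percentage per benchmark name.
changes = {
    name: relative_change(master_ns[name], branch_ns[name])
    for name in master_ns
}

for name, pct in changes.items():
    print(f"{name:<25} {pct:+7.1f}%")
```

Repeating the benchmark several times (for example with Google Benchmark's `--benchmark_repetitions` flag) and comparing medians would give a steadier picture than a single run.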