jonkeane commented on pull request #9615: URL: https://github.com/apache/arrow/pull/9615#issuecomment-833942444
Absolutely. Aside from factors, all of these differences are consistent with pure noise, i.e. no real change. If we don't see any speedup for types other than factors, I'm not totally surprised that the naturalistic datasets show no improvement: fannie and nyctaxi, when read in as data.frames, don't contain any factors, and the chi traffic dataset, which starts as a Parquet file, has only two factor columns.
