tustvold commented on PR #7962: URL: https://github.com/apache/arrow-datafusion/pull/7962#issuecomment-1789347016
FWIW for consistency we might want to do something closer to what we do for parquet where:

* We have an estimate of the size of the footer which we fetch
* We read the actual footer size
* We then fetch any extra data needed
* Once decoded the footer provides information on the schema and where the data blocks are located

This PR instead appears to read the first RecordBatch. Whilst I _think_ this should work (provided the file contains data), the more standard approach might be to read the footer.
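A rough sketch of the footer-first steps listed above, applied to an Arrow IPC file. It relies on the file layout from the Arrow columnar format spec, where the file ends with the footer, a 4-byte little-endian footer length, and the `ARROW1` magic. The names `fetch` and `read_footer` are hypothetical, and `fetch` reads from an in-memory slice as a stand-in for a ranged object store request. None of this is the DataFusion or arrow-rs API, and decoding the footer flatbuffer itself is left out.

```rust
use std::ops::Range;

/// Arrow IPC files end with: <footer> <footer length: i32 LE> <"ARROW1">.
const MAGIC: &[u8; 6] = b"ARROW1";
/// 4-byte footer length plus 6-byte magic.
const TRAILER_LEN: usize = 10;

/// Hypothetical stand-in for a ranged read against an object store.
fn fetch(file: &[u8], range: Range<usize>) -> Vec<u8> {
    file[range].to_vec()
}

/// Returns the raw footer bytes and the number of ranged reads used (1 or 2).
/// `size_hint` is the caller's estimate of footer + trailer size (an assumption).
fn read_footer(file: &[u8], size_hint: usize) -> Result<(Vec<u8>, usize), String> {
    let file_len = file.len();
    if file_len < TRAILER_LEN {
        return Err("file too small to contain an Arrow IPC trailer".to_string());
    }

    // 1. Speculatively fetch a suffix of the file based on the estimate.
    let guess = size_hint.max(TRAILER_LEN).min(file_len);
    let suffix_start = file_len - guess;
    let suffix = fetch(file, suffix_start..file_len);
    let mut fetches = 1;

    // 2. Read the actual footer size from the trailer.
    let trailer = &suffix[suffix.len() - TRAILER_LEN..];
    if trailer[4..] != MAGIC[..] {
        return Err("missing ARROW1 magic at end of file".to_string());
    }
    let footer_len = i32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]);
    if footer_len < 0 {
        return Err("negative footer length".to_string());
    }
    let footer_len = footer_len as usize;
    if footer_len + TRAILER_LEN > file_len {
        return Err("footer length exceeds file size".to_string());
    }
    let footer_start = file_len - TRAILER_LEN - footer_len;

    // 3. Fetch any extra data only if the estimate fell short.
    let footer = if footer_start >= suffix_start {
        let offset = footer_start - suffix_start;
        suffix[offset..offset + footer_len].to_vec()
    } else {
        fetches += 1;
        let mut buf = fetch(file, footer_start..suffix_start);
        buf.extend_from_slice(&suffix[..suffix.len() - TRAILER_LEN]);
        buf
    };

    // 4. The caller would now decode `footer` to obtain the schema and block locations.
    Ok((footer, fetches))
}
```

With a generous estimate the footer arrives in a single request, and a second request is only issued when the estimate is too small. Unlike reading the first RecordBatch, this path does not depend on the file containing any data.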
