juliuszsompolski commented on PR #38397: URL: https://github.com/apache/spark/pull/38397#issuecomment-1294754916
> Given that OrcFileFormat has no issue like _metadata columns

@dongjoon-hyun I think `OrcFileFormat` has exactly the same issue as `ParquetFileFormat`, as @cloud-fan pointed out. When a column like `_metadata.file_path` is requested for `OrcFileFormat`, that column is counted in `FileSourceScanExec.supportsBatch`, but it is not counted in `OrcFileFormat.supportsBatch` during `buildReaderWithPartitionValues`. The code changes I made to `OrcFileFormat` exactly mirror what I did to `ParquetFileFormat`. I updated the description to say "`ParquetFileFormat` or `OrcFileFormat`" in various places, but the issue seems exactly the same.
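To illustrate the mismatch described above, here is a minimal toy sketch. It does not use Spark's real classes or signatures: `supportsBatch`, `maxFields`, and the column lists are hypothetical stand-ins, and the field-count rule is an assumption chosen only to show how two call sites can disagree when one counts the metadata column and the other does not.

```scala
// Toy model only: the names below are hypothetical and do not mirror Spark's real API.
object BatchMismatchSketch {
  // Assumed rule: columnar batches are allowed only up to a field-count limit.
  def supportsBatch(columns: Seq[String], maxFields: Int): Boolean =
    columns.length <= maxFields

  def main(args: Array[String]): Unit = {
    val dataColumns = Seq("id", "name")
    val metadataColumns = Seq("_metadata.file_path")
    val maxFields = 2

    // Planning side: the check sees the data columns plus the requested metadata column.
    val scanExpectsBatch = supportsBatch(dataColumns ++ metadataColumns, maxFields)

    // Reader side: the check sees only the data columns.
    val readerProducesBatch = supportsBatch(dataColumns, maxFields)

    println(s"scan expects batches: $scanExpectsBatch")
    println(s"reader produces batches: $readerProducesBatch")
  }
}
```

In this sketch the two checks return different answers for the same query, which is the kind of disagreement the comment attributes to both file formats. Making both call sites evaluate the same column list removes the disagreement.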
