pan3793 commented on PR #42336: URL: https://github.com/apache/spark/pull/42336#issuecomment-1665362216
@wangyum @yaooqinn I agree that we should follow the Hive behavior as much as possible. Meanwhile, Spark also aims to reduce the differences between DS and Hive. As you can see, the file name pattern already differs from Hive's and resembles DS's:

- written via Spark: `hive_orc/part-00000-5a481e57-caf3-471c-9cf3-0ec26e94e7a3-c000`
- written via Hive: `hive_orc/000000_0`

WDYT about adding a configuration for this feature and disabling it by default?

For the Parquet/ORC formats, the file name does not affect decoding, since the compression information is part of the metadata in the file content. Given that DS's file names make it much easier for administrators to identify the format and compression codec, I would like Spark to have this ability.
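
To make the naming difference concrete, here is a minimal sketch that tells the two file name styles apart and extracts an optional extension suffix. The regular expressions are inferred only from the two example names quoted above, and the helper names `classify` and `suffix` are hypothetical. The `.snappy.orc` suffix in the usage lines is an assumed illustration of what a format/codec extension could look like, not output taken from this PR.

```python
import re

# Assumption: patterns are derived from the two example names in this comment,
# not from Spark's or Hive's actual naming code.
SPARK_STYLE = re.compile(
    r"^part-(?P<task>\d{5})"
    r"-(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"-c(?P<chunk>\d{3})"
    r"(?P<suffix>(?:\.[A-Za-z0-9]+)*)$"  # optional extensions, e.g. codec and format
)
HIVE_STYLE = re.compile(r"^(?P<task>\d{6})_(?P<attempt>\d+)$")


def classify(file_name: str) -> str:
    """Return 'spark', 'hive', or 'unknown' for a bare file name (no directory)."""
    if SPARK_STYLE.match(file_name):
        return "spark"
    if HIVE_STYLE.match(file_name):
        return "hive"
    return "unknown"


def suffix(file_name: str) -> str:
    """Return the extension suffix of a Spark-style name, or '' if there is none."""
    m = SPARK_STYLE.match(file_name)
    return m.group("suffix") if m else ""


if __name__ == "__main__":
    spark_name = "part-00000-5a481e57-caf3-471c-9cf3-0ec26e94e7a3-c000"
    print(classify(spark_name))                   # Spark-style name from the comment
    print(classify("000000_0"))                   # Hive-style name from the comment
    print(suffix(spark_name + ".snappy.orc"))     # hypothetical suffixed name
```

Only the file name is inspected here. As noted above, a reader decoding Parquet/ORC relies on the metadata inside the file, so a suffix like this would help a human administrator rather than change how the file is read.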
