deshanxiao commented on issue #1727: URL: https://github.com/apache/orc/issues/1727#issuecomment-1905532752
I think I understand @ljyss9's issue. As @cxzl25 mentioned, the Java side already enforces the relevant restriction, but csv-import (the C++ side) still has the problem, which means the C++ side has no such limit. As for the condition that the data must be hard to compress: it only matters because it keeps the original data in the block, since ORC does not store the compressed data once it finds that compression is not cost-effective. Therefore, the final solution is that once the C++ side finds that the compression chunk size exceeds 8M, it will refuse to write any data, as the Java side does.
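
For context, here is a minimal sketch of why roughly 8M is the boundary. It relies on the ORC format's 3-byte chunk header, which stores `length * 2 + isOriginal` as a little-endian value and so leaves 23 bits for the length. The names `kMaxChunkLength`, `validateCompressionBlockSize`, and `encodeChunkHeader` are hypothetical and are not the actual ORC C++ API; the exact bound and error type used in the real fix may differ.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical constant: 23 bits are available for the chunk length
// in the 3-byte header, so the largest representable length is 2^23 - 1.
constexpr uint64_t kMaxChunkLength = (1ULL << 23) - 1;

// Hypothetical up-front check: refuse a compression block size that could
// yield a chunk (for example, incompressible data stored as "original")
// too large to describe in the header.
void validateCompressionBlockSize(uint64_t blockSize) {
  if (blockSize > kMaxChunkLength) {
    throw std::invalid_argument(
        "compression block size " + std::to_string(blockSize) +
        " exceeds the maximum chunk length " +
        std::to_string(kMaxChunkLength));
  }
}

// Encode the 3-byte chunk header: (length * 2 + isOriginal), little endian.
std::array<uint8_t, 3> encodeChunkHeader(uint64_t length, bool isOriginal) {
  if (length > kMaxChunkLength) {
    throw std::invalid_argument("chunk length does not fit in 23 bits");
  }
  const uint32_t value =
      (static_cast<uint32_t>(length) << 1) | (isOriginal ? 1u : 0u);
  return {static_cast<uint8_t>(value & 0xff),
          static_cast<uint8_t>((value >> 8) & 0xff),
          static_cast<uint8_t>((value >> 16) & 0xff)};
}
```

Without such a check, a block larger than the limit that is stored uncompressed would have its length silently truncated when packed into three bytes, which matches the "hard to compress" condition described above.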
