Shivram Mani created HAWQ-1075:
----------------------------------
Summary: Make checksum verification configurable in PXF HdfsTextSimple profile
Key: HAWQ-1075
URL: https://issues.apache.org/jira/browse/HAWQ-1075
Project: Apache HAWQ
Issue Type: Improvement
Components: PXF
Reporter: Shivram Mani
Assignee: Goden Yao
Currently, the HdfsTextSimple profile, which is the optimized profile for
reading Text/CSV, uses ChunkRecordReader to read chunks of records (as opposed
to individual records). In this code path,
dfs.client.read.shortcircuit.skip.checksum is explicitly set to true to avoid
the delay of checksum verification when opening or reading a file or block.
This setting needs to be exposed as an option, and client-side checksum
verification must be enabled by default so that reads are resilient to data
corruption that is not caught internally by the datanode block reporting
mechanism (even fsck does not catch certain block corruption issues).
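
The requested behavior could be sketched roughly as follows. This is only an
illustration, not the PXF implementation: the class ChecksumOption and the
user-facing option name SKIP_CHECKSUM are hypothetical, and plain maps stand in
for the Hadoop client configuration. Only the property name
dfs.client.read.shortcircuit.skip.checksum comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of PXF or Hadoop.
final class ChecksumOption {
    // HDFS client property named in this issue.
    static final String SKIP_CHECKSUM_KEY =
            "dfs.client.read.shortcircuit.skip.checksum";
    // Assumed name for the user-facing option; the issue does not define one.
    static final String USER_OPTION = "SKIP_CHECKSUM";

    private ChecksumOption() {}

    // Returns true only when the user explicitly asks to skip checksums.
    // A missing or unrecognized value means "verify checksums".
    static boolean resolveSkipChecksum(Map<String, String> userOptions) {
        String raw = userOptions.get(USER_OPTION);
        return raw != null && Boolean.parseBoolean(raw.trim());
    }

    // Writes the resolved value into the (stand-in) client configuration
    // instead of hard-coding "true" as the reader does today.
    static void apply(Map<String, String> userOptions,
                      Map<String, String> clientConf) {
        clientConf.put(SKIP_CHECKSUM_KEY,
                Boolean.toString(resolveSkipChecksum(userOptions)));
    }
}
```

With no option supplied, the property resolves to false, so client-side
checksum verification stays on; a user who prefers the faster path would have
to opt in explicitly.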