Muhammad created CRUNCH-564:
-------------------------------

             Summary: Add support for using escape character same as open/close quote character
                 Key: CRUNCH-564
                 URL: https://issues.apache.org/jira/browse/CRUNCH-564
             Project: Crunch
          Issue Type: Improvement
          Components: Core
            Reporter: Muhammad
            Assignee: Josh Wills
            Priority: Trivial

As a user, I would like to use CSVInputFormat to handle CSV files that follow RFC 4180 (http://www.ietf.org/rfc/rfc4180.txt). Many developers use the Apache StringEscapeUtils.escapeCsv() method to escape their CSVs. That method escapes CSV values according to RFC 4180:
https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html

CSVLineReader throws an exception in such a case. We could enhance the code to support CSVs that use the same character for both the escape and the quote:
https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L152

I would appreciate a comment if someone has knowingly rejected this idea because of a technical limitation or a problem with allowing the escape and quote characters to be the same.

By the way, Apache HAWQ seems to get around this issue somehow and reads such CSVs without trouble.
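
To make the request concrete, here is a minimal sketch of how a parser can treat the quote character as its own escape, as RFC 4180 describes: inside a quoted field, two consecutive quotes stand for one literal quote, and a single quote closes the field. This is not the Crunch CSVLineReader code. The class name Rfc4180Sketch and the method parseLine are hypothetical, and the sketch only splits one complete record held in a single string, so it does not address how records that span lines would be split across input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Apache Crunch.
public class Rfc4180Sketch {

    // Splits one complete CSV record into fields. The quote character also
    // acts as the escape character: a doubled quote inside a quoted field
    // yields one literal quote.
    static List<String> parseLine(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    // Look ahead one character to tell an escaped quote
                    // from the closing quote.
                    if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                        current.append(quote);
                        i++; // skip the second quote of the pair
                    } else {
                        inQuotes = false;
                    }
                } else {
                    current.append(c);
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == delimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }

        if (inQuotes) {
            throw new IllegalArgumentException("Unterminated quoted field");
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Input record: a,"say ""hi""",c
        System.out.println(parseLine("a,\"say \"\"hi\"\"\",c", ',', '"'));
        // prints [a, say "hi", c]
    }
}
```

The one-character lookahead is what removes the ambiguity between an escaped quote and a closing quote, so a separate escape character is not needed for input of this shape.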