[ https://issues.apache.org/jira/browse/CRUNCH-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938172#comment-14938172 ]
Nathan Barry commented on CRUNCH-564:
-------------------------------------

There is a constructor that requires no options:
https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L106

And a constructor that requires all options:
https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L135

However, the all-options constructor neither enforces that the caller actually provided each value nor falls back to a default value when one is missing. We could add a fallback to the default for any option that isn't provided to the all-options constructor.

> Add support for using escape character same as open/close quote character
> -------------------------------------------------------------------------
>
>                 Key: CRUNCH-564
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-564
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Muhammad
>            Assignee: Josh Wills
>            Priority: Trivial
>              Labels: csv, csvparser
>
> As a user I would like to use CSVInputFormat to handle CSV files that
> follow RFC 4180: http://www.ietf.org/rfc/rfc4180.txt.
> Many developers use the Apache StringEscapeUtils.escapeCsv() method to escape
> their CSVs. The method escapes the CSV following RFC 4180.
> https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
> The CSVLineReader throws an exception in such a case. We can enhance the code
> to support CSVs that use the same character for escape and quote.
> https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L152
> I would appreciate a comment if someone has knowingly rejected the idea due
> to some technical limitation or a problem with allowing escape and quote to
> be the same character. By the way, Apache HAWQ seems to get around this issue
> somehow and reads such CSVs all right.
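For illustration, here is a minimal sketch of the RFC 4180 behaviour the issue asks for, where the escape character is the quote character itself and a doubled quote inside a quoted field stands for one literal quote. The class and method names (Rfc4180FieldSplitter, splitLine) are hypothetical and not part of Crunch. The sketch handles a single record with no embedded newlines, so it is not a proposed patch for CSVLineReader, only a picture of the lookahead such support would likely need.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Crunch: splits one CSV record following
// RFC 4180, where a quote inside a quoted field is escaped by doubling it.
final class Rfc4180FieldSplitter {

  private Rfc4180FieldSplitter() {
  }

  static List<String> splitLine(String line, char delimiter, char quote) {
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;

    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (inQuotes) {
        if (c == quote) {
          // Look ahead one character: a doubled quote is an escaped literal quote.
          if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
            current.append(quote);
            i++; // consume the second quote of the pair
          } else {
            inQuotes = false; // a lone quote closes the quoted section
          }
        } else {
          current.append(c); // delimiters inside quotes are plain data
        }
      } else if (c == quote) {
        inQuotes = true;
      } else if (c == delimiter) {
        fields.add(current.toString());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }

    if (inQuotes) {
      throw new IllegalArgumentException("Unterminated quoted field: " + line);
    }
    fields.add(current.toString());
    return fields;
  }
}
```

With this sketch, splitLine("a,\"b \"\"quoted\"\" c\",d", ',', '"') yields the three fields a, b "quoted" c, and d. The key design point is that the quote character is only ambiguous inside a quoted field, and a one-character lookahead resolves it, so using the same character for escape and quote does not make the grammar undecidable.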
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)