[ 
https://issues.apache.org/jira/browse/CRUNCH-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938172#comment-14938172
 ] 

Nathan Barry commented on CRUNCH-564:
-------------------------------------

There is a constructor that requires no options: 
https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L106
And a constructor that requires all options:  
https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L135

However, the all-options constructor neither enforces that the caller 
actually provided each value nor falls back to a default when one is missing. 
We could add a fallback to the default for any option that isn't provided to 
the all-options constructor.
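
As a rough illustration of that fallback idea, here is a minimal sketch. It is 
not the actual CSVLineReader code: the class name CsvReaderOptions, its fields, 
and the default values are all hypothetical. It also assumes boxed parameter 
types so that null can stand for "not provided"; the real constructor signature 
may differ.

```java
// Hypothetical sketch only; names and defaults are assumptions, not Crunch's API.
final class CsvReaderOptions {
    // Illustrative defaults (assumed values, not taken from CSVLineReader).
    static final int DEFAULT_BUFFER_SIZE = 64 * 1024;
    static final String DEFAULT_ENCODING = "UTF-8";
    static final char DEFAULT_QUOTE = '"';
    static final char DEFAULT_ESCAPE = '\\';

    final int bufferSize;
    final String encoding;
    final char quote;
    final char escape;

    // "All options" constructor: a null argument (or a non-positive buffer
    // size) is treated as "not provided" and replaced by the default.
    CsvReaderOptions(Integer bufferSize, String encoding, Character quote, Character escape) {
        this.bufferSize = (bufferSize == null || bufferSize <= 0) ? DEFAULT_BUFFER_SIZE : bufferSize;
        this.encoding = (encoding == null || encoding.isEmpty()) ? DEFAULT_ENCODING : encoding;
        this.quote = (quote == null) ? DEFAULT_QUOTE : quote;
        this.escape = (escape == null) ? DEFAULT_ESCAPE : escape;
    }

    // "No options" constructor simply delegates with everything unset.
    CsvReaderOptions() {
        this(null, null, null, null);
    }
}
```

With this shape, a caller could override only the escape character, for 
example new CsvReaderOptions(null, null, null, '"'), and still get defaults for 
everything else.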



> Add support for using escape character same as open/close quote character
> -------------------------------------------------------------------------
>
>                 Key: CRUNCH-564
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-564
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Muhammad
>            Assignee: Josh Wills
>            Priority: Trivial
>              Labels: csv, csvparser
>
> As a user, I would like to use CSVInputFormat to handle CSV files that 
> follow RFC 4180: http://www.ietf.org/rfc/rfc4180.txt.
> Many developers use the Apache StringEscapeUtils.escapeCsv() method to escape 
> their CSVs. The method escapes CSV values following RFC 4180. 
> https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringEscapeUtils.html
> The CSVLineReader throws an exception in such a case. We could enhance the 
> code to support CSVs that use the same character for escaping and quoting.
> https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVLineReader.java#L152
> I would appreciate a comment if someone has knowingly rejected the idea due 
> to some technical limitation or a problem with allowing the escape and quote 
> characters to be the same. By the way, Apache HAWQ seems to get around this 
> issue somehow and reads such CSVs correctly.
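
For reference, the RFC 4180 convention mentioned in the description (a double 
quote inside a quoted field is escaped by doubling it) could be handled roughly 
as in the sketch below. This is a standalone illustration with hypothetical 
names (Rfc4180Sketch, parseRecord), not a patch against CSVLineReader. It 
assumes a comma delimiter and that the whole record is already in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits one CSV record where the escape character
// is the same as the quote character, as in RFC 4180.
final class Rfc4180Sketch {
    static List<String> parseRecord(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    // Look ahead: a doubled quote is a literal quote,
                    // a single quote closes the quoted section.
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        current.append('"');
                        i++; // skip the second quote of the pair
                    } else {
                        inQuotes = false;
                    }
                } else {
                    current.append(c); // commas and newlines are literal here
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("Unterminated quoted field");
        }
        fields.add(current.toString());
        return fields;
    }
}
```

The key point is the one-character lookahead: when the escape and quote 
characters are the same, the parser cannot decide what a quote means until it 
has seen the character that follows it.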



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
