[
https://issues.apache.org/jira/browse/PIG-1924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Paepcke updated PIG-1924:
---------------------------------
Release Note: This module subsumes the current CSVLoader(). However, it escapes
an embedded double quote by prepending a second double quote, which is the
syntax honored by Excel 2007. In addition, this module's default field
delimiter is a comma, in part because Excel behaves inconsistently with
newlines embedded in fields when tab is used as the delimiter. This default
differs from the existing CSVLoader(), which defaults to tab for delimiting
fields. (was: This module subsumes the
current CSVLoader(). However, its syntax for escaping embedded double quotes is
to prepend a second double quote. This syntax is the one honored by Excel 2007.
In addition, this module's default field delimiter is a comma. In part, this
decision is based on Excel behaving inconsistently with newlines embedded in
fields when tab is used as the delimiter. )
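
The quoting rule described in the release note can be illustrated with a
minimal Java sketch. It is not taken from the attached CSVExcelStorage.java:
the class and method names (CsvQuoteSketch, quoteField, toRecord) are
hypothetical, and the decision to quote a field only when it contains the
delimiter, a double quote, or a line break is an assumption about typical
Excel-style output.

```java
// Hypothetical sketch of Excel-style CSV field quoting; not the code attached to PIG-1924.
final class CsvQuoteSketch {

    // Comma is the default delimiter named in the release note.
    static final char DEFAULT_DELIMITER = ',';

    // Quote a field if it contains the delimiter, a double quote, or a line break
    // (assumed trigger set). Embedded double quotes are escaped by doubling them.
    static String quoteField(String field, char delimiter) {
        boolean needsQuotes = field.indexOf(delimiter) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Quote each field as needed and join the results into one record
    // (no line terminator is appended).
    static String toRecord(String[] fields, char delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(delimiter);
            }
            sb.append(quoteField(fields[i], delimiter));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"plain", "a,b", "say \"hi\""};
        System.out.println(toRecord(fields, DEFAULT_DELIMITER));
    }
}
```

Running main prints plain,"a,b","say ""hi""": the first field is left bare,
the second is quoted because it contains the delimiter, and the third has its
embedded double quotes doubled.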
> CSV Loader/Store that handles newlines in fields, and other Excel CSV
> features.
> -------------------------------------------------------------------------------
>
> Key: PIG-1924
> URL: https://issues.apache.org/jira/browse/PIG-1924
> Project: Pig
> Issue Type: New Feature
> Components: tools
> Affects Versions: 0.8.0
> Reporter: Andreas Paepcke
> Attachments: CSVExcelStorage.java, TestCSVExcelStorage.java
>
> Original Estimate: 0h
> Remaining Estimate: 0h
>
> CSVExcelStorage() combines loading and storing of CSV-encoded data. It
> handles newlines within fields, escaped double quotes, and double quoting of
> fields with embedded field delimiters. Newline handling is optional and is
> controlled by a parameter. The module also offers an option to write output
> with Windows-style newlines (CRLF instead of the Unix LF). All CSV-related
> syntax decisions were made to match Excel 2007.
> The module comes with a test file, and javadoc produces proper documentation
> files.
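
The load-side behavior described above (newlines inside quoted fields, doubled
double quotes, quoted delimiters) can be sketched as a small state machine.
This is an illustration only, not the attached implementation: CsvParseSketch
and parse are hypothetical names, and accepting both LF and CRLF as record
terminators is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of reading Excel-style CSV; not the code attached to PIG-1924.
final class CsvParseSketch {

    // Parse CSV text into records. Inside a quoted field, "" is a literal quote,
    // and delimiters or line breaks are ordinary characters.
    static List<List<String>> parse(String text, char delimiter) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    field.append('"');   // doubled quote becomes one literal quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false;    // closing quote
                } else {
                    field.append(c);     // includes embedded newlines and delimiters
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote
            } else if (c == delimiter) {
                record.add(field.toString());
                field.setLength(0);
            } else if (c == '\n' || c == '\r') {
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;                 // treat CRLF as a single record terminator
                }
                record.add(field.toString());
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
            } else {
                field.append(c);
            }
        }
        // Flush a final record that has no trailing line break.
        if (field.length() > 0 || !record.isEmpty()) {
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        String csv = "id,note\r\n1,\"line one\nline two\"\r\n2,\"say \"\"hi\"\"\"\r\n";
        for (List<String> rec : parse(csv, ',')) {
            System.out.println(rec);
        }
    }
}
```

In the sample input, the second record keeps the line break inside its quoted
field instead of being split into two records, and the third record's field is
read back as say "hi".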