[ 
https://issues.apache.org/jira/browse/CSV-219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285409#comment-16285409
 ] 

Gary Gregory commented on CSV-219:
----------------------------------

Our quoting seems off IMO. Why not simply do:
{noformat}
diff --git a/src/main/java/org/apache/commons/csv/CSVFormat.java b/src/main/java/org/apache/commons/csv/CSVFormat.java
index 58948fd..dc7588b 100644
--- a/src/main/java/org/apache/commons/csv/CSVFormat.java
+++ b/src/main/java/org/apache/commons/csv/CSVFormat.java
@@ -1186,10 +1186,7 @@ public final class CSVFormat implements Serializable {
             } else {
                 char c = value.charAt(pos);

-                // RFC4180 (https://tools.ietf.org/html/rfc4180) TEXTDATA =  %x20-21 / %x23-2B / %x2D-7E
-                if (newRecord && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E)) {
-                    quote = true;
-                } else if (c <= COMMENT) {
+                if (c <= COMMENT) {
                     // Some other chars at the start of a value caused the parser to fail, so for now
                     // encapsulate if we start in anything less than '#'. We are being conservative
                     // by including the default comment char too.
diff --git a/src/test/java/org/apache/commons/csv/CSVPrinterTest.java b/src/test/java/org/apache/commons/csv/CSVPrinterTest.java
index ae7aae2..dde7c19 100644
--- a/src/test/java/org/apache/commons/csv/CSVPrinterTest.java
+++ b/src/test/java/org/apache/commons/csv/CSVPrinterTest.java
@@ -1037,7 +1037,7 @@ public class CSVPrinterTest {
         final StringWriter sw = new StringWriter();
         try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.RFC4180)) {
             printer.printRecord(EURO_CH, "Deux");
-            assertEquals("\"" + EURO_CH + "\",Deux" + recordSeparator, sw.toString());
+            assertEquals(EURO_CH + ",Deux" + recordSeparator, sw.toString());
         }
     }
{noformat}
I do not see why a record whose first char is outside TEXTDATA should force 
the first field to be quoted.
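
For reference, here is a minimal standalone sketch of what the removed condition tests. It is not the CSVFormat code itself: the class and method names are hypothetical, and it assumes EURO_CH is U+20AC. It only illustrates that the parenthesized part of the removed check is the complement of the RFC 4180 TEXTDATA range, so any non-ASCII first char (the euro sign, or a CJK char such as "あ") triggers quoting, but only at the start of a record.

```java
// Standalone sketch; TextDataCheck and its methods are hypothetical names,
// not part of Commons CSV.
final class TextDataCheck {

    // RFC 4180: TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
    static boolean isTextData(final char c) {
        return (c >= 0x20 && c <= 0x21)
            || (c >= 0x23 && c <= 0x2B)
            || (c >= 0x2D && c <= 0x7E);
    }

    // Mirrors the condition removed by the patch: quote when the value
    // starts a record and its first char is outside TEXTDATA.
    static boolean removedConditionQuotes(final boolean newRecord, final char c) {
        return newRecord
            && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E);
    }

    public static void main(final String[] args) {
        // 'A', the euro sign (assumed value of EURO_CH), and Hiragana "あ"
        final char[] samples = {'A', '\u20AC', '\u3042'};
        for (final char c : samples) {
            System.out.printf("U+%04X textData=%b firstInRecord=%b laterField=%b%n",
                (int) c,
                isTextData(c),
                removedConditionQuotes(true, c),
                removedConditionQuotes(false, c));
        }
    }
}
```

Under these assumptions the same char is treated differently depending on its position in the record, which matches the `"あ",い,う` output reported in the issue.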

Thoughts from others? With the above patch, all tests pass.

> The behavior of quote char using is not similar as Excel does when the first 
> string contains CJK char(s)
> --------------------------------------------------------------------------------------------------------
>
>                 Key: CSV-219
>                 URL: https://issues.apache.org/jira/browse/CSV-219
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Printer
>    Affects Versions: 1.5
>            Reporter: Zhang Hongda
>         Attachments: diff.patch
>
>
> When using CSVFormat.EXCEL to print a CSV file, the use of the quote char 
> does not match what Microsoft Excel does when the first string contains 
> Chinese, Japanese or Korean (CJK) char(s).
> e.g.
> There are 3 data members in a record, with Japanese chars: "あ", "い", "う":
>   Microsoft Excel outputs:
>   あ,い,う
>   Apache Commons CSV outputs:
>   "あ",い,う



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
