[ https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15571892#comment-15571892 ]
Kenneth MacArthur commented on HIVE-14679:
------------------------------------------

Section 2.6 of RFC 4180 says: "Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes."

It seems strange, then, to disable quoting for the csv2 output format by default.

What's also strange is that when quoting is disabled, values are in fact still 'quoted' with a null character (00), rather than no character at all (as described in [~ngangam]'s comment on HIVE-9788). This doesn't appear to be mentioned anywhere in RFC 4180.

May I suggest that:
- Quoting should be enabled by default for csv2, tsv2 and dsv.
- Disabling quoting should be possible using a beeline argument.
- Disabling quoting should not result in the output of a null character in place of a visible quote - there should simply be no quote character at all in this case.

> csv2/tsv2 output format disables quoting by default and it's difficult to enable
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-14679
>                 URL: https://issues.apache.org/jira/browse/HIVE-14679
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Brock Noland
>            Assignee: Jianguo Tian
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default, this should be there and in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a system property. We should not use a system property as it's non-standard so extremely hard for users to set. For example I must do: {{env HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
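
For reference, the RFC 4180 quoting rule cited in the comment above can be sketched roughly as follows. This is a minimal Python illustration of the rule only, not Beeline's actual implementation; the helper names `quote_field` and `format_row` are hypothetical and exist just for this example. A field is quoted only when it contains the delimiter, a double quote, or a line break, and otherwise it is written with no surrounding character at all (in particular, no null character).

```python
def quote_field(value: str, delimiter: str = ",", quote: str = '"') -> str:
    """Quote a field only when RFC 4180 calls for it (hypothetical helper)."""
    # Fields containing the delimiter, a quote, or a line break need quoting.
    needs_quoting = any(ch in value for ch in (delimiter, quote, "\r", "\n"))
    if not needs_quoting:
        # No surrounding character of any kind is emitted.
        return value
    # Embedded quote characters are escaped by doubling them.
    return quote + value.replace(quote, quote + quote) + quote


def format_row(fields, delimiter: str = ",") -> str:
    """Join already-stringified fields into one delimited output line."""
    return delimiter.join(quote_field(f, delimiter) for f in fields)


if __name__ == "__main__":
    print(format_row(["1", "a,b", 'say "hi"', "plain"]))
```

Passing `delimiter="\t"` gives the equivalent behaviour for a tab-separated format, where a comma inside a field no longer triggers quoting but a tab does.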