Kenneth MacArthur commented on HIVE-14679:

Section 2.6 of RFC 4180 says:
"Fields containing line breaks (CRLF), double quotes, and commas should be 
enclosed in double-quotes."

It seems strange, then, to disable quoting for the csv2 output format by 
default.

What's also strange is that when quoting is disabled, values are in fact still 
'quoted' with a null character (00), rather than no character at all (as 
described in [~ngangam]'s comment on HIVE-9788). This doesn't appear to be 
mentioned anywhere in RFC 4180.

May I suggest that:
- Quoting should be enabled by default for csv2, tsv2 and dsv.
- Disabling quoting should be possible using a beeline argument.
- Disabling quoting should not result in the output of a null character in 
place of a visible quote - there should simply be no quote character at all in 
this case.
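
To illustrate why the RFC 4180 rule quoted above matters, here is a minimal sketch. It uses Python's standard csv module purely as a stand-in for an RFC 4180-style writer; it is not Beeline's code, and the sample row and the helper name to_csv_line are made up for the example. It compares a minimally quoted line with a plain comma join, i.e. no quote character at all:

```python
import csv
import io


def to_csv_line(fields):
    """Serialize one record, quoting only the fields that need it."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerow(fields)
    return buf.getvalue()


# Hypothetical row: one field contains a comma, another contains double quotes.
row = ["1", "Smith, John", 'say "hi"']

# Fields with commas or quotes are enclosed; embedded quotes are doubled.
quoted = to_csv_line(row)

# Output with quoting disabled and no quote character emitted at all.
unquoted = ",".join(row) + "\r\n"

# Read both lines back to see which one preserves the field boundaries.
parsed_quoted = next(csv.reader(io.StringIO(quoted, newline="")))
parsed_unquoted = next(csv.reader(io.StringIO(unquoted, newline="")))

print(repr(quoted))
print(parsed_quoted == row)
print(len(parsed_unquoted))
```

The quoted line reads back as the original three fields, while the unquoted line splits on the embedded comma and yields an extra field. That is the argument for enabling quoting by default and making the unquoted form an explicit opt-in.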

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> --------------------------------------------------------------------------------
>                 Key: HIVE-14679
>                 URL: https://issues.apache.org/jira/browse/HIVE-14679
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Brock Noland
>            Assignee: Jianguo Tian
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default; this should be there and 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.
