Jefffrey commented on code in PR #20813:
URL: https://github.com/apache/datafusion/pull/20813#discussion_r3105120386


##########
datafusion/common/src/config.rs:
##########
@@ -2927,6 +2938,13 @@ config_namespace! {
         pub terminator: Option<u8>, default = None
         pub escape: Option<u8>, default = None
         pub double_quote: Option<bool>, default = None
+        /// Quote style for CSV writing.
+        /// One of: "Always", "Necessary", "NonNumeric", "Never"
+        pub quote_style: CsvQuoteStyle, default = CsvQuoteStyle::Necessary
+        /// Whether to ignore leading whitespace in string values when writing CSV.
+        pub ignore_leading_whitespace: Option<bool>, default = None
+        /// Whether to ignore trailing whitespace in string values when writing CSV.
+        pub ignore_trailing_whitespace: Option<bool>, default = None
         /// Specifies whether newlines in (quoted) values are supported.

Review Comment:
   Perhaps it's to do with the hierarchy of configs and/or SQL parsing 🤔
   
   @alamb do you happen to know a reason for having `Option<bool>` instead of 
plain `bool` for cases where it'll end up being true or false anyway? (i.e. 
`None` doesn't represent a third state, but eventually maps to either true or 
false)
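   
   To make the "hierarchy of configs" guess concrete, here is a minimal sketch. It assumes (this is not confirmed anywhere in this thread) that `None` means "not set at this level, defer to the next one". The `resolve` helper and the idea of ordered layers are hypothetical and are not DataFusion APIs.
   
   ```rust
   /// Hypothetical helper: walk config layers from most specific to least
   /// specific and return the first explicitly set value, or `fallback`
   /// when no layer set anything.
   fn resolve(layers: &[Option<bool>], fallback: bool) -> bool {
       layers.iter().copied().flatten().next().unwrap_or(fallback)
   }
   
   /// With plain `bool` there is no "unset" value, so the most specific
   /// layer always wins, even if the user never touched it.
   fn resolve_plain(layers: &[bool], fallback: bool) -> bool {
       layers.first().copied().unwrap_or(fallback)
   }
   ```
   
   With `Option<bool>`, `resolve(&[None, Some(false)], true)` picks up the `false` from the outer layer. With plain `bool`, an untouched inner layer cannot be told apart from one explicitly set to its default value.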



##########
datafusion/sqllogictest/test_files/csv_files.slt:
##########
@@ -380,3 +380,200 @@ SET datafusion.optimizer.repartition_file_min_size = 10485760;
 
 statement ok
 drop table stored_table_with_cr_terminator;
+
+# Test quote_style option
+
+statement ok
+CREATE TABLE quote_style_source (
+  int_col INT,
+  string_col TEXT,
+  float_col DOUBLE
+) AS VALUES
+(1, 'hello', 1.1),
+(2, 'world', 2.2),
+(3, 'comma,value', 3.3);
+
+# QuoteStyle::Always - all fields are quoted
+query I
+COPY quote_style_source TO 'test_files/scratch/csv_files/quote_style_always.csv'
+STORED AS csv
+OPTIONS ('format.has_header' 'true', 'format.quote_style' 'Always');
+----
+3
+
+statement ok
+CREATE EXTERNAL TABLE stored_quote_style_always (
+  int_col TEXT,
+  string_col TEXT,
+  float_col TEXT
+) STORED AS CSV
+LOCATION 'test_files/scratch/csv_files/quote_style_always.csv'
+OPTIONS ('format.has_header' 'true', 'format.quote_style' 'Never');
+
+# All values should have been quoted, but reading them back strips the quotes

Review Comment:
   Could do something like this:
   
   ```sql
   statement ok
   CREATE EXTERNAL TABLE stored_quote_style_nonnumeric (
     whole_file TEXT
   ) STORED AS CSV
   LOCATION 'test_files/scratch/csv_files/quote_style_nonnumeric.csv'
   OPTIONS ('format.has_header' 'true', 'format.delimiter' '@');
   
   query T
   select * from stored_quote_style_nonnumeric;
   ----
   1,"hello",1.1
   2,"world",2.2
   3,"comma,value",3.3
   ```
   
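   - This reads each line of the file as a single column by choosing a delimiter that doesn't appear in the file, so the quotes that were written are visible in the query output instead of being stripped on read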
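   
   The trick relies on the chosen delimiter really being absent from the written file. As a rough sketch of that precondition (the helper names and the candidate list are hypothetical and not part of the PR or of DataFusion):
   
   ```rust
   /// Hypothetical helper: return the first candidate byte that never
   /// occurs in `contents`, or `None` if every candidate is present.
   fn unused_delimiter(contents: &str, candidates: &[u8]) -> Option<u8> {
       candidates
           .iter()
           .copied()
           .find(|c| !contents.as_bytes().contains(c))
   }
   
   /// Naive split of each line on `delimiter` (no quote handling). When the
   /// delimiter is absent, every line comes back as exactly one field.
   fn split_lines(contents: &str, delimiter: u8) -> Vec<Vec<&str>> {
       contents
           .lines()
           .map(|line| line.split(delimiter as char).collect())
           .collect()
   }
   ```
   
   If `@` could ever show up in the test data, the single-column read would silently split a row, so a byte that cannot appear in the source table is the safer choice.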



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

