[ https://issues.apache.org/jira/browse/DRILL-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16681693#comment-16681693 ]
Mariano Ruiz commented on DRILL-6840:
-------------------------------------

Thanks [~arina]. Now that I better understand the difference between Drill and SqlLine, I see that you are right.

Still, maybe we should create another ticket. Thinking a bit more about how the exporter (not the recorder) works when you use a statement like:
{code:java}
CREATE TABLE dfs.tmp. ...
{code}
this is not a missing feature of the CSV exporter but a bug in itself, because it produces CSV files that are parsed incorrectly.

Let me explain. CSV does not force you to enclose strings in the " character; as far as I know there is no single clear standard. But when a cell contains the same character that is used to separate the columns, enclosing the text in " is required.

I detected the issue because I had a column with values like *_Smartwatch XYZ, Black_* (note the comma in the text), followed by other columns. Because Drill doesn't enclose this cell in the " character, any CSV reader, such as an Office tool or a Java library, interprets the value as two cells instead of one.

So I can understand that Drill doesn't have, and maybe won't have, a setting to configure whether cells are always enclosed. But it should always enclose any cell whose value contains the comma character (or whichever separator is used), without the need for any configuration.

What do you think? Do you know whether this was considered before?
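A minimal sketch of the parsing problem described above, using Python's standard csv module rather than Drill itself. Only the product name comes from the comment; the other two values in the row are made up for illustration, and the minimal-quoting writer stands in for the behavior being requested, not for anything Drill currently implements.

```python
import csv
import io

# Hypothetical row: only the product name is taken from the comment above.
row = ["1001", "Smartwatch XYZ, Black", "199.90"]

# Naive export: join the values with the separator and never quote.
naive_line = ",".join(row)

# Minimal quoting: enclose only the fields that contain the delimiter,
# the quote character, or a line break.
buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter=",",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL,
    lineterminator="\n",
)
writer.writerow(row)
quoted_line = buffer.getvalue().rstrip("\n")


def parse(line):
    """Parse a single CSV line into a list of cells."""
    return next(csv.reader([line]))


print(naive_line, "->", parse(naive_line))
print(quoted_line, "->", parse(quoted_line))
```

The unquoted line comes back with one cell more than was written, because the comma inside the product name is read as a column separator. The minimally quoted line round-trips to the original three cells.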
> Exporting to CSV using !set csvquotecharacter '"' not working in latest stable or snapshot versions
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-6840
>                 URL: https://issues.apache.org/jira/browse/DRILL-6840
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.14.0
>         Environment: * Tested with latest version *Apache Drill* 1.14.0, and building the latest version from master (Github repo), commit ad61c6bc1dd24994e50fe7dfed043d5e57dba8f9 at _Nov 5, 2018_.
> * *Linux* x64, Ubuntu 16.04
> * *OpenJDK* Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.17.10.1-b11)
> * Apache *Maven* 3.5.0
>            Reporter: Mariano Ruiz
>            Priority: Minor
>              Labels: csv, csvparser, export
>
> Using latest stable version and latest SNAPSHOT version, when I export to a
> CSV file the result of a query, the text fields aren't enclosed with double
> quotes as specified.
> Steps:
> {code:java}
> 0: jdbc:drill:zk=local> USE dfs.tmp;
> +-------+--------------------------------------+
> | ok    | summary                              |
> +-------+--------------------------------------+
> | true  | Default schema changed to [dfs.tmp]  |
> +-------+--------------------------------------+
> 1 row selected (0.126 seconds)
> 0: jdbc:drill:zk=local> ALTER SESSION SET `store.format`='csv';
> +-------+------------------------+
> | ok    | summary                |
> +-------+------------------------+
> | true  | store.format updated.  |
> +-------+------------------------+
> 1 row selected (0.117 seconds)
> 0: jdbc:drill:zk=local> !set csvquotecharacter '"'
> 0: jdbc:drill:zk=local> CREATE TABLE dfs.tmp.prods_without_brand AS SELECT * FROM dfs.`/tmp/prods.csv` WHERE brand = '';
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 112                        |
> +-----------+----------------------------+
> 1 row selected (0.198 seconds)
> 0: jdbc:drill:zk=local>
> {code}
> The CSV output doesn't have any field enclosed with *{color:red}"{color}*,
> even those that have values with the *{color:red},{color}* character, so the
> CSV is broken.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)