[jira] [Created] (SPARK-42237) change binary to unsupported dataType in csv format

Wei Guo (Jira) Mon, 30 Jan 2023 01:20:05 -0800

Wei Guo created SPARK-42237:
-------------------------------

             Summary: change binary to unsupported dataType in csv format
                 Key: SPARK-42237
                 URL: https://issues.apache.org/jira/browse/SPARK-42237
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.1, 2.4.8
            Reporter: Wei Guo
             Fix For: 3.4.0



When a binary column is written to csv files, the actual content of the column 
is {*}object.toString(){*}, which is meaningless. 
{code:java}
val df = Seq(Array[Byte](1,2)).toDF
df.write.csv("/Users/guowei19/Desktop/binary_csv")
{code}
The csv file's content is as follows:
!image-2023-01-30-17-18-16-372.png|width=104,height=21!
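As a rough illustration of why that output is meaningless, here is a minimal plain-Scala sketch (no Spark involved). It assumes, as described above, that the written value is simply the array's toString result; the object and method names are hypothetical and only serve the example.

```scala
object BinaryToStringDemo {
  // Arrays do not override toString on the JVM, so the result is the type
  // tag "[B" followed by "@" and an identity hash code in hex, not the bytes.
  def render(bytes: Array[Byte]): String = bytes.toString

  // The real content has to be rendered explicitly, here as a comma-separated list.
  def contents(bytes: Array[Byte]): String = bytes.mkString(",")

  def main(args: Array[String]): Unit = {
    val bytes = Array[Byte](1, 2)
    println(render(bytes))   // something of the form [B@<hex>, varies per run
    println(contents(bytes)) // the actual bytes: 1,2
  }
}
```

The toString value depends on the object's identity rather than on the bytes, so the original data cannot be recovered from it.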
Meanwhile, if a binary column is saved as a table with the csv file format, 
the table can't be read back successfully.
{code:java}
val df = Seq((1, Array[Byte](1,2))).toDF
df.write.format("csv").saveAsTable("binaryDataTable")
spark.sql("select * from binaryDataTable").show()
{code}
So I think it's better to make binary an unsupported dataType in the csv 
format, for both datasource v1 (CSVFileFormat) and v2 (CSVTable).
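The proposal amounts to a per-format type check that rejects binary up front. Below is a minimal sketch of that idea only. The type hierarchy is a simplified, hypothetical stand-in, not the real org.apache.spark.sql.types classes, and `supportsDataType` is an illustrative name, not Spark's actual API or the actual patch.

```scala
// Hypothetical, simplified stand-ins for a data type hierarchy;
// these are NOT the real Spark classes.
sealed trait DataType
case object IntegerType extends DataType
case object StringType extends DataType
case object BinaryType extends DataType
final case class ArrayType(elementType: DataType) extends DataType

object CsvTypeCheck {
  // Sketch of a csv type check: binary is rejected explicitly,
  // the other atomic types are accepted, nested types are rejected.
  def supportsDataType(dataType: DataType): Boolean = dataType match {
    case BinaryType               => false
    case IntegerType | StringType => true
    case ArrayType(_)             => false
  }

  def main(args: Array[String]): Unit = {
    val schema = Seq("id" -> IntegerType, "payload" -> BinaryType)
    // Collect the columns a csv write would have to refuse.
    val rejected = schema.collect { case (name, t) if !supportsDataType(t) => name }
    println(rejected.mkString(","))
  }
}
```

With a check like this, a write or read of a binary column would fail early with a clear error instead of silently producing unreadable output.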



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
