[GitHub] [spark] ulysses-you commented on issue #26831: [SPARK-30201][SQL] HiveOutputWriter standardOI should use ObjectInspectorCopyOption.DEFAULT

GitBox Wed, 11 Dec 2019 06:24:54 -0800

ulysses-you commented on issue #26831: [SPARK-30201][SQL] HiveOutputWriter 
standardOI should use ObjectInspectorCopyOption.DEFAULT
URL: https://github.com/apache/spark/pull/26831#issuecomment-564565979
 
 
   The problem is in the write path. When a column's type is string, Spark reads 
the bytes into a UTF8String. This step does not actually validate the UTF-8 
encoding; it just copies the bytes.
   During the write, the value is then converted with UTF8String.toString. This 
step decodes every byte as UTF-8. 
   As a result, bytes that are not valid UTF-8 are written incorrectly. 
   
   So the correct approach is to pass the bytes through directly, without calling toString.
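
   To illustrate the effect, here is a minimal sketch using only the JDK. It does not use Spark's UTF8String or the Hive writer; the class and method names (NonUtf8RoundTrip, viaString, direct) are hypothetical, and it assumes the toString step behaves like a plain UTF-8 decode of the underlying bytes. It shows that decoding non-UTF-8 bytes into a String and encoding them again changes the bytes, while passing them through does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NonUtf8RoundTrip {
    // Hypothetical stand-in for a bytes -> String -> bytes write path:
    // decode as UTF-8, then encode again. Invalid sequences are replaced
    // with U+FFFD during decoding, so the original bytes are lost.
    static byte[] viaString(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical stand-in for passing the bytes through untouched.
    static byte[] direct(byte[] raw) {
        return Arrays.copyOf(raw, raw.length);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, encoded as ISO-8859-1.
        // A lone 0xE9 byte is not a valid UTF-8 sequence.
        byte[] raw = {'c', 'a', 'f', (byte) 0xE9};

        System.out.println("via String preserves bytes: "
                + Arrays.equals(raw, viaString(raw)));
        System.out.println("direct copy preserves bytes: "
                + Arrays.equals(raw, direct(raw)));
    }
}
```

   In this sketch the String round trip replaces the trailing 0xE9 with the three-byte encoding of U+FFFD, which is the kind of change that writing the raw bytes would avoid.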


