ulysses-you commented on issue #26831: [SPARK-30201][SQL] HiveOutputWriter standardOI should use ObjectInspectorCopyOption.DEFAULT URL: https://github.com/apache/spark/pull/26831#issuecomment-564788840 > how do you get non-UTF-8 data here in the first place? As I said, we use the Hadoop API to write raw bytes and create a table on top of them. Normally these bytes are valid UTF-8, but sometimes they are not; it is hard to say why, because the data is historical. We then use sha1 or md5 to verify that the data is correct, and those functions also operate on the bytes directly. That is where we hit this problem.
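A minimal sketch of why the checksum check breaks (this is an illustration, not the PR's code): copying a value through a Java `String` (as `ObjectInspectorCopyOption.JAVA` effectively does) decodes the bytes as UTF-8, replacing any invalid sequence with U+FFFD, so re-encoding no longer yields the original bytes and an md5 computed over them changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;

public class Utf8RoundTrip {
    public static void main(String[] args) throws Exception {
        // Bytes that are not valid UTF-8: an incomplete 3-byte sequence plus a stray 0xFF.
        byte[] raw = new byte[] {(byte) 0xE4, (byte) 0xBD, (byte) 0xFF};

        // Round-tripping through String replaces invalid sequences with U+FFFD,
        // so the re-encoded bytes differ from the originals.
        byte[] roundTripped = new String(raw, StandardCharsets.UTF_8)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(raw, roundTripped)); // false

        // Consequently an md5 over the copied bytes no longer matches
        // an md5 over the bytes that were originally written.
        byte[] d1 = MessageDigest.getInstance("MD5").digest(raw);
        byte[] d2 = MessageDigest.getInstance("MD5").digest(roundTripped);
        System.out.println(Arrays.equals(d1, d2)); // false
    }
}
```

Using `ObjectInspectorCopyOption.DEFAULT` avoids the `String` round trip, so the bytes (and therefore the checksums) are preserved.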
