vagetablechicken commented on issue #5036:
URL: 
https://github.com/apache/incubator-doris/issues/5036#issuecomment-755848621


   I've hit a similar error. The cause is a charset mismatch:
   1. BE side: strings are encoded as UTF-8
   
https://github.com/apache/incubator-doris/blob/65d33cf43c837e56a2a36e78b358bfc0a9d1916b/be/src/util/arrow/row_batch.cpp#L80
   2. spark-doris-connector side: the bytes are decoded with the JVM default charset
   
https://github.com/apache/incubator-doris/blob/65d33cf43c837e56a2a36e78b358bfc0a9d1916b/extension/spark-doris-connector/src/main/java/org/apache/doris/spark/serialization/RowBatch.java#L271
   
   In my environment the default charset is US-ASCII, so Chinese characters come out garbled.
   It would be better to specify the charset `UTF_8` explicitly in `serialization/RowBatch`.
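
   To illustrate the mismatch, here is a minimal standalone sketch. It is not the connector's actual code: the class `CharsetDecodeSketch` and the helper `decode` are hypothetical names for this example only. It decodes the same UTF-8 bytes once with `UTF_8` and once with `US_ASCII`, which stands in for an unlucky default charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of spark-doris-connector.
class CharsetDecodeSketch {

    // Decode raw bytes with an explicitly chosen charset instead of
    // relying on the JVM default (which new String(byte[]) would use).
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so the
        // example does not depend on the source file encoding.
        String original = "\u4e2d\u6587";

        // Assumption: the sender encodes strings as UTF-8, as the BE does.
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        String explicitUtf8 = decode(raw, StandardCharsets.UTF_8);
        String asAscii = decode(raw, StandardCharsets.US_ASCII);

        System.out.println("UTF_8 round-trips:    " + original.equals(explicitUtf8));
        System.out.println("US_ASCII round-trips: " + original.equals(asAscii));
    }
}
```

   With an explicit `StandardCharsets.UTF_8` the result no longer depends on `file.encoding` or the locale of the executor machines.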




