Adding to what Jeff pointed out: I'm seeing the same issue when writing Parquet 
files with the ParquetIO module in Dataflow, even after forcing all String 
objects to UTF-8. It may be related to decoding/encoding that the module does 
behind the scenes, which causes those characters to be wrongly encoded in the 
output. I mention it in case you are doing some Parquet processing, or end up 
using any other module that may behave similarly.
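
To illustrate the kind of decoding/encoding mismatch I suspect, here is a 
minimal sketch in plain Python. It does not use ParquetIO at all, and I have 
not confirmed that this is what the module actually does. The function name 
and the choice of Latin-1 as the wrong charset are only assumptions for the 
example.

```python
def garble(text, write_encoding="utf-8", read_encoding="latin-1"):
    """Encode text with one charset, then decode the bytes with another.

    Hypothetical helper: it only mimics a writer and a reader that
    disagree on the charset. It is not part of any Beam or Parquet API.
    """
    return text.encode(write_encoding).decode(read_encoding)


original = "café"

# UTF-8 bytes read back as Latin-1: the two-byte "é" becomes two characters.
garbled = garble(original)

# If this is the only thing that went wrong, reversing the steps recovers it.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled), repr(repaired))
```

If the bad characters in your output look like this (an accented letter turned 
into two characters), a charset mismatch somewhere in the pipeline is a likely 
cause, whichever module introduces it.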
