RowCoder is not really intended to be an external encoding, i.e. it is
not a stable format for writing into files. While it's fine for your
write operation to take in PCollection<Row>, I would not recommend using
RowCoder to generate the bytes written to the file.
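
To make that concrete, here is a minimal sketch in plain Java (no Beam
dependency) of choosing the file charset at the point where a CSV line is
turned into bytes, independently of whatever coder the pipeline uses
between steps. The class and method names (CsvCharsetSketch, toCsvLine,
encodeLine, decodeLine) are hypothetical and only for illustration; this
is not how CsvIO does it, and the quoting rule is a simplified assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class CsvCharsetSketch {

    // Hypothetical helper: join already-stringified field values into one CSV line.
    static String toCsvLine(List<String> fields) {
        return fields.stream()
            .map(CsvCharsetSketch::quoteIfNeeded)
            .collect(Collectors.joining(","));
    }

    // Simplified quoting (an assumption): quote fields that contain a comma,
    // a double quote, or a newline, and double any embedded quotes.
    static String quoteIfNeeded(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // The file encoding is applied here, when the text becomes bytes.
    static byte[] encodeLine(String line, Charset charset) {
        return line.getBytes(charset);
    }

    // Reading is the mirror image: decode the file's bytes with the same charset.
    static String decodeLine(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // ISO-8859-1 is guaranteed on every JVM. Other charsets such as
        // windows-1250 can be looked up with Charset.forName(...), but their
        // availability depends on the runtime.
        Charset latin1 = StandardCharsets.ISO_8859_1;

        // Unicode escapes keep this source file ASCII-only.
        String line = toCsvLine(List.of("Jos\u00e9", "M\u00e1laga, ES"));

        byte[] bytes = encodeLine(line, latin1);
        System.out.println(decodeLine(bytes, latin1));
    }
}
```

In a pipeline, the same idea would mean converting each Row to text and
encoding that text with the target charset in the write step, and decoding
with that charset before building Rows in the read step. The Rows
themselves would stay as they are.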

On Wed, Nov 27, 2024 at 1:46 PM Facundo Tomatis <facundotoma...@gmail.com>
wrote:

> Hello everyone!
>
> I've been developing a CSV connector that wraps CsvIO. The read
> operation outputs PCollection<Row> and the write operation takes
> PCollection<Row>. I am having issues setting the encoding of the
> input and output files. For example, I would like to write a CSV with
> ISO-8859-1, windows-1250, or other encodings, and read from those
> encodings as well.
>
> Reading the source code, I found that Row's String fields (generated
> with RowCoder.of(schema)) have a StringUtf8Encoder associated. Is there
> a way to replace this encoder with a custom one while keeping
> PCollection<Row>?
>
> Thanks for your time.
>
> Facu.
>
>
