Re: DataOutputSerializer serializing long UTF Strings

2024-01-23 Thread Gyula Fóra
Hi Peter!

I think this is a good additional serialization utility for Flink that may benefit different data formats / connectors in the future. +1

Cheers,
Gyula

On Mon, Jan 22, 2024 at 8:04 PM Steven Wu wrote:
> I think this is a reasonable extension to `DataOutputSerializer`. Although
> 64

Re: DataOutputSerializer serializing long UTF Strings

2024-01-22 Thread Steven Wu
I think this is a reasonable extension to `DataOutputSerializer`. Although 64 KB is not small, it is still possible to have strings longer than that limit. There are already precedents for extended APIs in `DataOutputSerializer`. E.g. public void setPosition(int position) {

DataOutputSerializer serializing long UTF Strings

2024-01-19 Thread Péter Váry
Hi Team,

During the root cause analysis of an Iceberg serialization issue [1], we found that *DataOutputSerializer.writeUTF* has a hard limit on the length of the string (64 KB). This is inherited from the *DataOutput.writeUTF* method, where the JDK specifically defines this limit [2]. For
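
To make the limit discussed in this thread concrete, here is a minimal sketch using only JDK classes (not Flink's `DataOutputSerializer`). It shows `DataOutputStream.writeUTF` rejecting a string whose encoded form exceeds 65535 bytes, and one possible workaround: a 4-byte length prefix followed by standard UTF-8 bytes. The class `LongUtfSketch` and the methods `writeUtfAccepts`, `writeLongUtf`, `readLongUtf` and `repeat` are hypothetical names chosen for illustration; the thread does not confirm that the proposed Flink API uses this wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Flink or Iceberg.
class LongUtfSketch {

    // Returns true if the JDK's writeUTF accepts the string, false if it
    // fails because the encoded form is longer than 65535 bytes.
    static boolean writeUtfAccepts(String s) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        }
    }

    // Assumed format: int length prefix + standard UTF-8 bytes.
    // Note that writeUTF itself uses a 2-byte prefix and *modified* UTF-8.
    static byte[] writeLongUtf(String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(utf8.length);
            out.write(utf8);
        }
        return buffer.toByteArray();
    }

    // Reads back a string written by writeLongUtf.
    static String readLongUtf(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] utf8 = new byte[in.readInt()];
            in.readFully(utf8);
            return new String(utf8, StandardCharsets.UTF_8);
        }
    }

    // Builds a string consisting of `count` copies of `c`.
    static String repeat(char c, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String atLimit = repeat('a', 65_535);   // 65535 encoded bytes
        String overLimit = repeat('a', 65_536); // one byte too many for writeUTF

        System.out.println("writeUTF accepts 65535 bytes: " + writeUtfAccepts(atLimit));
        System.out.println("writeUTF accepts 65536 bytes: " + writeUtfAccepts(overLimit));
        System.out.println("long round trip ok: "
                + readLongUtf(writeLongUtf(overLimit)).equals(overLimit));
    }
}
```

The int prefix raises the ceiling to roughly 2 GB of encoded data, at the cost of two extra bytes per string and incompatibility with readers that expect the `writeUTF` format.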