Ok, thanks Gordon! It would be nice to have a benchmark on this as well ;)

Thanks a lot for the support,
Flavio

On Tue, May 16, 2017 at 9:41 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:

> If you don’t register the TBaseSerializer for your MyThriftObj (or, in
> general, don’t register any serializer for the Thrift class), I think Kryo’s
> default FieldSerializer will be used for it.
>
> The TBaseSerializer basically just uses TBase for de-/serialization as you
> normally would for the Thrift classes (see [1]), so there should be some
> specific optimization going on for that compared to Kryo’s generic
> FieldSerializer. I’m not entirely sure about the performance gain between the
> two, as I don’t really know the details of the serialization differences,
> but I would suggest using TBaseSerializer if they are Thrift classes.
>
> Cheers,
> Gordon
>
> [1] https://github.com/twitter/chill/blob/develop/chill-thrift/src/main/java/com/twitter/chill/thrift/TBaseSerializer.java
>
> On 16 May 2017 at 3:26:32 PM, Flavio Pompermaier (pomperma...@okkam.it) wrote:
>
> Hi Gordon,
> thanks for the link. Will the use of TBaseSerializer, as opposed to plain
> Kryo, lead to a performance gain?
>
> On Tue, May 16, 2017 at 7:32 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>
>> Hi Flavio!
>>
>> I believe [1] has what you are looking for. Have you taken a look at that?
>>
>> Cheers,
>> Gordon
>>
>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/custom_serializers.html
>>
>> On 15 May 2017 at 9:08:33 PM, Flavio Pompermaier (pomperma...@okkam.it) wrote:
>>
>> Hi to all,
>> in my Flink job I create a DataSet<MyThriftObj> using HadoopInputFormat
>> in this way:
>>
>> HadoopInputFormat<Void, MyThriftObj> inputFormat = new HadoopInputFormat<>(
>>     new ParquetThriftInputFormat<MyThriftObj>(), Void.class, MyThriftObj.class, job);
>> FileInputFormat.addInputPath(job, new org.apache.hadoop.fs.Path(inputPath));
>> DataSet<Tuple2<Void, MyThriftObj>> ds = env.createInput(inputFormat);
>>
>> Flink logs this message:
>>
>> - TypeExtractor - *class MyThriftObj contains custom serialization
>> methods we do not call.*
>>
>> Indeed, MyThriftObj has readObject/writeObject methods, and when I print
>> the type of ds I see:
>>
>> - Java Tuple2<Void, *GenericType<MyThriftObj>*>
>>
>> From my experience GenericType is a performance killer... what should I do
>> to improve the reading/writing of MyThriftObj?
>>
>> Best,
>> Flavio
>>
>> --
>> Flavio Pompermaier
>> Development Department
>>
>> OKKAM S.r.l.
>> Tel. +(39) 0461 1823908
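
For reference, the registration described in the Flink custom serializers
documentation linked in this thread can be sketched roughly as follows. This is
a minimal sketch rather than a tested setup: it assumes the chill-thrift and
libthrift libraries are on the job's classpath, that MyThriftObj is the
Thrift-generated class from the original question, and the class name
ThriftSerializerSetup is made up for illustration.

```java
import com.twitter.chill.thrift.TBaseSerializer;
import org.apache.flink.api.java.ExecutionEnvironment;

// Hypothetical class name, used only for this illustration.
public class ThriftSerializerSetup {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // MyThriftObj is assumed to be the Thrift-generated class from the
        // original question. This asks Kryo to use TBaseSerializer for it
        // instead of falling back to the generic FieldSerializer.
        env.getConfig().addDefaultKryoSerializer(MyThriftObj.class, TBaseSerializer.class);

        // From here, build the HadoopInputFormat and call
        // env.createInput(inputFormat) as in the original message.
    }
}
```

As far as I understand, the type would still be reported as
GenericType<MyThriftObj>, because it is still handled through Kryo; only the
serializer Kryo uses for that class changes. Whether this gives a measurable
gain would need to be benchmarked on the actual data.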