Giving the user a fallback option is a good thing and may come in handy when troubleshooting, so +1 (I realize, of course, that it may create more need for troubleshooting :-)
Ram

On Wed, May 18, 2016 at 10:15 AM, Bhupesh Chawda <[email protected]> wrote:

> I agree that the system behaviour must be predictable. However, this can
> still be an optional configuration which users can use if they need to get
> going even if it is suboptimal.
>
> Even Storm does not enforce it and has a config parameter:
> Config.TOPOLOGY_FALL_BACK_ON_JAVA_SERIALIZATION
>
> ~Bhupesh
>
> On Tue, May 17, 2016 at 1:17 PM, Vlad Rozov <[email protected]> wrote:
>
> > +1. Java serialization is much slower than Kryo serialization and using
> > Java serialization must be an explicit application designer choice.
> >
> > Thank you,
> > Vlad
> >
> > On 5/17/16 11:50, Thomas Weise wrote:
> >
> >> IMO automatically picking a serializer conflicts with predictable
> >> system behavior. If the serialization does not work I would want to
> >> know that instead of the system doing some trick and arriving at
> >> suboptimal or faulty behavior.
> >>
> >> That does not mean we cannot have optimizations though, as long as
> >> there is explicit user control.
> >>
> >> Thomas
> >>
> >> On Tue, May 17, 2016 at 11:34 AM, Bhupesh Chawda <[email protected]> wrote:
> >>
> >>> As Ram and Sandesh pointed out, we do have @Bind and @DefaultSerializer
> >>> annotations. However, these are tightly coupled with the field in
> >>> question and do require modifying external code. Additionally, it may
> >>> also break other systems, if we are binding it to a JavaSerializer and
> >>> perhaps there are systems which have other means of serializing the
> >>> field.
> >>>
> >>> My point was more to do with the user having to worry about what
> >>> serializer to use and how to serialize objects.
> >>> For example, I liked the approach that Storm takes by falling back to
> >>> Java serialization automatically in case the target class does not
> >>> have a default constructor.
> >>>
> >>> Of course, we can explore type based serialization.
> >>> But this email was more about the usability aspect: to handle classes
> >>> not having default constructors in general, not just POJO tuples.
> >>>
> >>> ~Bhupesh
> >>>
> >>> On Tue, May 17, 2016 at 9:53 AM, Pramod Immaneni <[email protected]> wrote:
> >>>
> >>>> Can we do a test where we hard code a codec for a POJO and compare
> >>>> performance against Kryo? Thereafter we can dynamically compose a
> >>>> codec via pojoutils and inject it.
> >>>>
> >>>> Thanks
> >>>>
> >>>> On May 17, 2016, at 8:16 AM, Vlad Rozov <[email protected]> wrote:
> >>>>
> >>>>> +1 for type based serialization. Tuples in most cases are flat
> >>>>> records/POJOs and it should be possible to programmatically
> >>>>> construct a codec that will significantly outperform Kryo. It should
> >>>>> also reduce the amount of data passed over the wire. I started to
> >>>>> look in that direction as well, as Kryo serialization is one of the
> >>>>> bottlenecks that limit Apex throughput when operators are deployed
> >>>>> into different containers, including the NODE_LOCAL case.
> >>>>>
> >>>>> Thank you,
> >>>>> Vlad
> >>>>>
> >>>>> On 5/17/16 07:13, Sandesh Hegde wrote:
> >>>>>
> >>>>>> If it is possible to serialize, the platform should do it
> >>>>>> automatically; it reduces the tribal knowledge required to use the
> >>>>>> platform. A couple of months back, I also sent out a similar email.
> >>>>>>
> >>>>>> Type based serialization may improve the performance.
> >>>>>>
> >>>>>> On Tue, May 17, 2016, 6:06 AM Munagala Ramanath <[email protected]> wrote:
> >>>>>>
> >>>>>>> Traditionally, we've recommended using
> >>>>>>> "@DefaultSerializer(JavaSerializer.class)" or
> >>>>>>> "@FieldSerializer.Bind(CustomSerializer.class)" as outlined at
> >>>>>>>
> >>>>>>> http://docs.datatorrent.com/troubleshooting/#application-throwing-following-kryo-exception
> >>>>>>>
> >>>>>>> Can you describe why those approaches are not adequate?
> >>>>>>>
> >>>>>>> Ram
> >>>>>>>
> >>>>>>> On Mon, May 16, 2016 at 11:46 PM, Bhupesh Chawda <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi All,
> >>>>>>>>
> >>>>>>>> While working on the integration of Apex with Apache Samoa, I am
> >>>>>>>> coming across some scenarios where I have to add default
> >>>>>>>> constructors in some external classes to make them Kryo
> >>>>>>>> serializable. Although this should be okay, we would like to
> >>>>>>>> avoid modifying external classes as far as possible.
> >>>>>>>>
> >>>>>>>> Some other streaming engines have taken different approaches
> >>>>>>>> towards serialization.
> >>>>>>>>
> >>>>>>>> I looked at Flink and Storm serialization mechanisms.
> >>>>>>>>
> >>>>>>>> Storm has a fallback mechanism to Java serialization. It does use
> >>>>>>>> Kryo for serialization due to performance. But, if the class is
> >>>>>>>> not serializable using Kryo, then it will try to serialize it
> >>>>>>>> using Java serialization. If even then it cannot serialize, then
> >>>>>>>> it throws an error. [1]
> >>>>>>>>
> >>>>>>>> Flink has its own serialization stack where it uses a serializer
> >>>>>>>> based on the type information known about the data.
> >>>>>>>> [2]
> >>>>>>>>
> >>>>>>>> What does the community think about the current state of
> >>>>>>>> serialization in Apex? Is there a need to explore some approaches
> >>>>>>>> which could avoid serialization issues such as the one described
> >>>>>>>> above? Are there any other approaches one could use?
> >>>>>>>>
> >>>>>>>> 1. http://storm.apache.org/releases/current/Serialization.html#java-serialization
> >>>>>>>> 2. https://cwiki.apache.org/confluence/display/FLINK/Type+System,+Type+Extraction,+Serialization
> >>>>>>>>
> >>>>>>>> ~Bhupesh
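
For reference, the opt-in fallback discussed in this thread (try the fast serializer first, and use Java serialization only when the user has explicitly enabled it) could be sketched roughly as below. This is only an illustration under stated assumptions, not how Storm or Apex implement it: `FallbackSerializer`, `Primary`, and the `fallBackOnJavaSerialization` flag are hypothetical names, and `Primary` merely stands in for a Kryo-backed codec so the sketch needs nothing beyond the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Hypothetical sketch: none of these names come from Apex, Storm, or Kryo.
final class FallbackSerializer {

    /** Stand-in for a fast primary serializer such as a Kryo-backed codec. */
    @FunctionalInterface
    interface Primary {
        byte[] write(Object obj) throws IOException;
    }

    private final Primary primary;
    // Explicit user choice; when false, a primary failure is never hidden.
    private final boolean fallBackOnJavaSerialization;

    FallbackSerializer(Primary primary, boolean fallBackOnJavaSerialization) {
        this.primary = primary;
        this.fallBackOnJavaSerialization = fallBackOnJavaSerialization;
    }

    byte[] serialize(Object obj) throws IOException {
        try {
            return primary.write(obj);
        } catch (IOException | RuntimeException primaryFailure) {
            if (!fallBackOnJavaSerialization) {
                // Strict mode: surface the failure so behavior stays predictable.
                throw primaryFailure;
            }
            try {
                return javaSerialize(obj);
            } catch (IOException fallbackFailure) {
                // Keep the original cause visible for troubleshooting.
                fallbackFailure.addSuppressed(primaryFailure);
                throw fallbackFailure;
            }
        }
    }

    private static byte[] javaSerialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            // Throws NotSerializableException if obj is not java.io.Serializable.
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Simulates a primary serializer that rejects the class,
        // e.g. because it has no default constructor.
        Primary rejecting = obj -> {
            throw new IOException("primary serializer cannot handle " + obj.getClass().getName());
        };
        FallbackSerializer lenient = new FallbackSerializer(rejecting, true);
        System.out.println("fallback produced " + lenient.serialize("tuple").length + " bytes");
    }
}
```

With the flag off, the primary failure propagates unchanged, which matches the "explicit application designer choice" position; with it on, the slower path is taken and the original failure is attached as a suppressed exception if the fallback fails too.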
