Re: Is AvroCoder the right coder for me?

Maximilian Michels Tue, 02 Apr 2019 08:53:00 -0700

Hey Augusto,

I haven't used @DefaultCoder, but it could be the problem here.


What if you specify the coder directly for your PCollection? For example:

  pCol.setCoder(AvroCoder.of(YourClazz.class));


Thanks,
Max

On 01.04.19 17:52, Augusto Ribeiro wrote:

Hi Max,
I tried to run the job again in a cluster, this is a thread dump fromone of the Spark executors (16 cores)
https://imgur.com/u2Gz0xY
As you can see, almost all threads are blocked on that single Avroreflection method.
Best regards,
Augusto
On 2019/03/27 07:43:17, Augusto Ribeiro <a...@gmail.com<http://gmail.com>> wrote:
 > Hi Max,>
 >
> Thanks for the answer I will give it another try after I sorted outsome other things. I will try to save more data next time (screenshots,thread dumps) so that if it happens again I will be more specific in myquestions.>
 >
 > Best regards,>
 > Augusto>
 >
> On 2019/03/26 12:31:54, Maximilian Michels <m....@apache.org<http://apache.org>> wrote: >
 > > Hi Augusto,> >
 > > >
> > Generally speaking Avro should provide very good performance. Thecalls > >> > you are seeing should not be significant because Avro caches theschema > >> > information for a type. It only creates a schema via Reflection the> >
 > > first time it sees a new type.> >
 > > >
> > You can optimize further by using your domain knowledge and createa > >> > custom coder. However, if you do not do anything fancy, I think theodds > >
 > > are low that you will see a performance increase.> >
 > > >
 > > Cheers,> >
 > > Max> >
 > > >
 > > On 26.03.19 09:35, Augusto Ribeiro wrote:> >
 > > > Hi again,> >
 > > > > >
> > > Sorry for bumping this thread but nobody really came withinsight.> >
 > > > > >
> > > Should I be defining my own coders for my objects or is it commonpractice to use the AvroCoder or maybe some other coder?> >
 > > > > >
 > > > Best regards,> >
 > > > Augusto> >
 > > > > >
> > > On 2019/03/21 07:35:07, au...@gmail.com <http://gmail.com><a....@gmail.com <http://gmail.com>> wrote:> >
 > > >> Hi>> >
 > > >>> >
> > >> I am trying out Beam to do some data aggregations. Many of theinputs/outputs of my transforms are complex objects (not super complex,but containing Maps/Lists/Sets sometimes) so when I was prompted todefined a coder to these objects I added the annotation@DefaultCoder(AvroCoder.class) and things worked in my developmentenvironment.>> >
 > > >>> >
> > >> Now that I am trying to run in on "real" data I notice thatafter I deployed it to a spark runner and looking at some thread dumps,many of the threads were blocked on the following method on the Avrolibrary (ReflectData.getAccessorsFor). So my question is, did I do thewrong thing by using the AvroCoder or is there some other coder thateasily can solve my problem?>> >
 > > >>> >
 > > >> Best regards,>> >
 > > >> Augusto>> >
 > > >>> >
 > > >>> >
 > > >>> >
 > > >

Re: Is AvroCoder the right coder for me?

Reply via email to