This sounds like AVRO-1760, fixed since Avro 1.8.0. https://issues.apache.org/jira/browse/AVRO-1760
What version of Avro are you using?

Doug

On Mon, Jan 22, 2018 at 9:45 AM, Nishanth S <[email protected]> wrote:

> Hi All,
>
> We have a process that reads data from a local file share, serializes it,
> and writes to HDFS in Avro format. Currently it is running as a
> single-threaded process. When converted to a parallel process we did get
> some performance improvement, but not the desired amount. Thread dumps are
> pasted below. I am just wondering if I am building the Avro objects
> correctly. For every record that is read from the binary file we create an
> equivalent Avro object in the format below. Our Avro schema is pretty big,
> around 1800 fields, and all of those have default values. After doing some
> profiling I could see that the most time-consuming method is
> org.apache.avro.generic.GenericData.getDefaultValue(). This is in fact
> taking more time than doing the actual reads/writes. Thanks for taking a
> look.
>
> Parent p = new Parent();
> LOGHDR hdr = LOGHDR.newBuilder().build();
> MSGHDR msg = MSGHDR.newBuilder().build();
> p.setHdr(hdr);
> p.setMsg(msg);
>
> Then all fields in p, and in all the nested types that p holds, such as
> LOGHDR and MSGHDR, are set.
>
> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328 waiting for monitor entry [0x00007fad52833000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>
> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327 waiting for monitor entry [0x00007fad52934000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>
> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325 runnable [0x00007fad52b36000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>
> On Fri, Jan 19, 2018 at 6:04 PM, Nishanth S <[email protected]> wrote:
>
>> Hi All,
>>
>> We have a process that reads data from a local file share, serializes it,
>> and writes to HDFS in Avro format. Currently it is running as a
>> single-threaded process.
>> When converted to a parallel process we did get some performance
>> improvement, but not the desired amount. Thread dumps show that at any
>> time only one thread has access to this method and the others are blocked.
>> I am just wondering if I am building the Avro objects correctly.
>>
>> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328 waiting for monitor entry [0x00007fad52833000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>
>> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327 waiting for monitor entry [0x00007fad52934000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>>
>> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325 runnable [0x00007fad52b36000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
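
For readers following the thread dumps: every blocked thread is waiting on the same java.util.Collections$SynchronizedMap monitor (<0x000000066a5e3460>), so the default-value lookups are serialized across workers. The sketch below reproduces that pattern with JDK classes only. It is not Avro's code and not the AVRO-1760 patch; the class DefaultValueCacheSketch, its methods, and the "fieldN" keys are hypothetical stand-ins, and a ConcurrentHashMap is used purely as a lock-free-read contrast to the synchronized wrapper.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class DefaultValueCacheSketch {

    // Hypothetical stand-in for a shared cache of per-field default values.
    // concurrent == false wraps a HashMap in Collections.synchronizedMap,
    // so every get() takes the same monitor, as in the thread dumps.
    static Map<String, Object> newCache(boolean concurrent, int fields) {
        Map<String, Object> cache = concurrent
                ? new ConcurrentHashMap<String, Object>()
                : Collections.synchronizedMap(new HashMap<String, Object>());
        for (int i = 0; i < fields; i++) {
            cache.put("field" + i, Integer.valueOf(i));
        }
        return cache;
    }

    // Each worker "builds" recordsPerThread records by looking up the
    // default of every field. Returns the total number of successful lookups.
    static long runLookups(final Map<String, Object> cache, int threads,
                           final int recordsPerThread, int fields) {
        final String[] keys = new String[fields];
        for (int i = 0; i < fields; i++) {
            keys[i] = "field" + i;
        }
        final AtomicLong total = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                long found = 0;
                for (int r = 0; r < recordsPerThread; r++) {
                    for (String key : keys) {
                        if (cache.get(key) != null) {
                            found++;
                        }
                    }
                }
                total.addAndGet(found);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        int fields = 1800; // roughly the schema size mentioned in the thread
        for (boolean concurrent : new boolean[] {false, true}) {
            Map<String, Object> cache = newCache(concurrent, fields);
            long start = System.nanoTime();
            long lookups = runLookups(cache, 8, 2000, fields);
            long millis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println((concurrent ? "ConcurrentHashMap" : "synchronizedMap")
                    + ": " + lookups + " lookups in " + millis + " ms");
        }
    }
}
```

Timings depend on the machine and JVM, so none are quoted here. A thread dump taken while the synchronizedMap variant runs should show the same BLOCKED-on-monitor frames as above. For the real application, the suggestion in this thread is to check the Avro version and upgrade to 1.8.0 or later rather than work around the cache.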
