Thanks Doug. That sounds like it. We are using 1.7.6. I will upgrade our version and let everyone know. Thanks for jumping in.
On Jan 23, 2018 5:19 PM, "Doug Cutting" <[email protected]> wrote:

> This sounds like AVRO-1760, fixed since Avro 1.8.0.
>
> https://issues.apache.org/jira/browse/AVRO-1760
>
> What version of Avro are you using?
>
> Doug
>
> On Mon, Jan 22, 2018 at 9:45 AM, Nishanth S <[email protected]> wrote:
>
>> Hi All,
>>
>> We have a process that reads data from a local file share, serializes it,
>> and writes it to HDFS in Avro format. Currently it runs as a
>> single-threaded process. When we converted it to a parallel process we did
>> get some performance improvement, but not as much as we wanted. Thread
>> dumps are pasted below. I am just wondering if I am building the Avro
>> objects correctly. For every record that is read from the binary file we
>> create an equivalent Avro object in the format below. Our Avro schema is
>> pretty big, around 1800 fields, and all of those have default values.
>> After doing some profiling I could see that the most time-consuming method
>> is org.apache.avro.generic.GenericData.getDefaultValue(). This is in fact
>> taking more time than the actual reads/writes. Thanks for taking a look.
>>
>>     Parent p = new Parent();
>>     LOGHDR hdr = LOGHDR.newBuilder().build();
>>     MSGHDR msg = MSGHDR.newBuilder().build();
>>     p.setHdr(hdr);
>>     p.setMsg(msg);
>>
>> Then all fields in p, and in the nested types that p holds such as LOGHDR
>> and MSGHDR, are set.
>>
>> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328 waiting for monitor entry [0x00007fad52833000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>
>> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327 waiting for monitor entry [0x00007fad52934000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>>
>> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325 runnable [0x00007fad52b36000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>
>> On Fri, Jan 19, 2018 at 6:04 PM, Nishanth S <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> We have a process that reads data from a local file share, serializes it,
>>> and writes it to HDFS in Avro format. Currently it runs as a
>>> single-threaded process. When we converted it to a parallel process we
>>> did get some performance improvement, but not as much as we wanted.
>>> Thread dumps show that at any time only one thread has access to this
>>> method and the others are blocked. I am just wondering if I am building
>>> the Avro objects correctly.
>>>
>>> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328 waiting for monitor entry [0x00007fad52833000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>>
>>> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327 waiting for monitor entry [0x00007fad52934000]
>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>>>
>>> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325 runnable [0x00007fad52b36000]
>>>    java.lang.Thread.State: RUNNABLE
>>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
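
The thread dumps in this thread show several worker threads blocked on one java.util.Collections$SynchronizedMap monitor inside getDefaultValue(). A minimal sketch of that kind of contention follows. It is not Avro's code: Field, computeDefault, both caches, and the timing helper are hypothetical stand-ins, and the ConcurrentHashMap variant is only one common way to avoid a single shared lock, not a description of how AVRO-1760 was actually fixed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class DefaultCacheSketch {
    // Hypothetical stand-in for a schema field (NOT an Avro class).
    public static final class Field {
        final String name;
        public Field(String name) { this.name = name; }
    }

    // Hypothetical stand-in for the work of decoding a field's default value.
    public static Object computeDefault(Field f) {
        return "default-of-" + f.name;
    }

    // Every get() and put() on this map takes the same monitor, so concurrent
    // readers queue behind one another, as in the BLOCKED threads of the dump.
    static final Map<Field, Object> SYNCHRONIZED_CACHE =
            Collections.synchronizedMap(new HashMap<>());

    // Reads on a ConcurrentHashMap do not share one global lock.
    static final Map<Field, Object> CONCURRENT_CACHE = new ConcurrentHashMap<>();

    public static Object lookupSynchronized(Field f) {
        Object value = SYNCHRONIZED_CACHE.get(f);
        if (value == null) {
            // Two threads may both compute here; harmless for this sketch.
            value = computeDefault(f);
            SYNCHRONIZED_CACHE.put(f, value);
        }
        return value;
    }

    public static Object lookupConcurrent(Field f) {
        return CONCURRENT_CACHE.computeIfAbsent(f, DefaultCacheSketch::computeDefault);
    }

    // Runs `threads` workers that each look up every field `rounds` times
    // and returns the elapsed wall-clock time in milliseconds.
    public static long timeMillis(Function<Field, Object> lookup, Field[] fields,
                                  int threads, int rounds) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int r = 0; r < rounds; r++) {
                    for (Field f : fields) {
                        lookup.apply(f);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        // 1800 fields, matching the schema size mentioned in the thread.
        Field[] fields = new Field[1800];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = new Field("f" + i);
        }
        long sync = timeMillis(DefaultCacheSketch::lookupSynchronized, fields, 8, 2000);
        long conc = timeMillis(DefaultCacheSketch::lookupConcurrent, fields, 8, 2000);
        System.out.println("synchronizedMap:   " + sync + " ms");
        System.out.println("ConcurrentHashMap: " + conc + " ms");
    }
}
```

The timings depend on the machine and JVM, so no numbers are given here. A thread dump taken while the first measurement runs would be expected to show workers waiting on the synchronized map's monitor, which is the same pattern as in the dumps quoted above.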
