This sounds like AVRO-1760, fixed since Avro 1.8.0.

https://issues.apache.org/jira/browse/AVRO-1760
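
For readers wondering why a cache lookup would show up in a profile like this: the thread dumps quoted below show every worker going through one java.util.Collections$SynchronizedMap, whose get() takes the same monitor on every call. The sketch below only illustrates that pattern. It is not Avro's code and not the AVRO-1760 patch. The class DefaultCacheContention, its methods, and the field0..fieldN keys are made up for the example, and ConcurrentHashMap is used simply as a familiar map whose reads do not all share a single lock.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class DefaultCacheContention {

    // Hypothetical stand-in for a cache of per-field default values.
    static Map<String, Integer> fill(Map<String, Integer> cache, int fields) {
        for (int i = 0; i < fields; i++) {
            cache.put("field" + i, i);
        }
        return cache;
    }

    // Starts `threads` workers that each perform `lookups` reads against the
    // shared map, and returns the total number of reads that found a value.
    static long runLookups(Map<String, Integer> cache, int fields, int threads, int lookups)
            throws InterruptedException {
        // Build the keys once so the workers spend their time in Map.get().
        String[] keys = new String[fields];
        for (int i = 0; i < fields; i++) {
            keys[i] = "field" + i;
        }
        AtomicLong hits = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                long local = 0;
                for (int i = 0; i < lookups; i++) {
                    // With a synchronizedMap, every get() locks the same monitor.
                    if (cache.get(keys[i % fields]) != null) {
                        local++;
                    }
                }
                hits.addAndGet(local);
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return hits.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int fields = 1800;       // roughly the schema size mentioned in the thread
        int threads = 8;         // arbitrary choice for the illustration
        int lookups = 1_000_000; // reads per worker

        Map<String, Integer> locked =
                fill(Collections.synchronizedMap(new HashMap<String, Integer>()), fields);
        Map<String, Integer> concurrent =
                fill(new ConcurrentHashMap<String, Integer>(), fields);

        long t0 = System.nanoTime();
        runLookups(locked, fields, threads, lookups);
        long t1 = System.nanoTime();
        runLookups(concurrent, fields, threads, lookups);
        long t2 = System.nanoTime();

        System.out.printf("synchronizedMap:   %d ms%n", (t1 - t0) / 1_000_000);
        System.out.printf("ConcurrentHashMap: %d ms%n", (t2 - t1) / 1_000_000);
    }
}
```

Running main prints rough wall-clock timings for both maps. The numbers depend on the machine and the JVM, and this is not a rigorous benchmark.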

What version of Avro are you using?

Doug

On Mon, Jan 22, 2018 at 9:45 AM, Nishanth S <[email protected]> wrote:

> Hi All,
>
> We have a process that reads data from a local file share, serializes it,
> and writes it to HDFS in Avro format. Currently it runs as a
> single-threaded process. When we converted it to a parallel process we got
> some performance improvement, but not as much as we wanted. Thread dumps are
> pasted below. I am just wondering if I am building the Avro objects
> correctly. For every record that is read from the binary file we create an
> equivalent Avro object in the format below. Our Avro schema is pretty big,
> around 1800 fields, and all of them have default values. After doing some
> profiling I could see that the most time-consuming method is
> org.apache.avro.generic.GenericData.getDefaultValue(). This is in fact
> taking more time than the actual reads/writes. Thanks for taking a look.
>
> Parent p = new Parent();
> LOGHDR hdr = LOGHDR.newBuilder().build();
> MSGHDR msg = MSGHDR.newBuilder().build();
> p.setHdr(hdr);
> p.setMsg(msg);
>
> Then all fields in p, and in the nested types that p holds such as LOGHDR
> and MSGHDR, are set.
>
> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328 waiting for monitor entry [0x00007fad52833000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>
> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327 waiting for monitor entry [0x00007fad52934000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>
> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325 runnable [0x00007fad52b36000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>
>
> On Fri, Jan 19, 2018 at 6:04 PM, Nishanth S <[email protected]>
> wrote:
>
>> Hi All,
>>
>> We have a process that reads data from a local file share, serializes it,
>> and writes it to HDFS in Avro format. Currently it runs as a
>> single-threaded process. When we converted it to a parallel process we got
>> some performance improvement, but not as much as we wanted. Thread dumps
>> show that at any time only one thread has access to this method and the
>> others are blocked. I am just wondering if I am building the Avro objects
>> correctly.
>>
>> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328 waiting for monitor entry [0x00007fad52833000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>
>> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327 waiting for monitor entry [0x00007fad52934000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>>
>> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325 runnable [0x00007fad52b36000]
>>    java.lang.Thread.State: RUNNABLE
>>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>>
>>
>>
>
