We have a process that reads data from a local file share, serializes it,
and writes it to HDFS in Avro format. Currently it runs as a single-threaded
process. When we converted it to a parallel process we got some performance
improvement, but not as much as we wanted. Thread dumps show that at any
given time only one thread has access to the method shown in the dumps below
(GenericData.getDefaultValue) and the others are blocked. I am wondering
whether I am building the Avro objects correctly. For every record that is
read from the binary file we create an equivalent Avro object in the format
below.

Parent p = new Parent();
LOGHDR hdr = LOGHDR.newBuilder().build();
MSGHDR msg = MSGHDR.newBuilder().build();
p.setHdr(hdr);
p.setMsg(msg);

All fields in p, and in the nested types that p holds (such as LOGHDR and
MSGHDR), are then set.
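
To make the pattern above concrete without depending on the generated
classes, here is a minimal stand-alone sketch. Everything in it (Hdr,
HdrBuilder, DEFAULTS, LOOKUPS) is hypothetical and only models what the stack
traces suggest, namely that Builder.build() ends up in
RecordBuilderBase.defaultValue. It assumes that a default is looked up only
for fields that were not set on the builder; that is an assumption about the
generated code, not something verified here, and this is not Avro's actual
implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a generated record/builder pair; NOT Avro's code.
class BuilderDefaultsSketch {
    // Stand-in for a shared default-value cache guarded by a single monitor.
    static final Map<String, Object> DEFAULTS =
            Collections.synchronizedMap(new HashMap<>());
    // Counts how often build() had to consult the shared cache.
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    static {
        DEFAULTS.put("id", 0L);
        DEFAULTS.put("source", "");
    }

    static final class Hdr {
        long id;
        String source;
    }

    static final class HdrBuilder {
        private Long id;
        private String source;

        HdrBuilder setId(long value) { this.id = value; return this; }
        HdrBuilder setSource(String value) { this.source = value; return this; }

        private Object defaultValue(String field) {
            LOOKUPS.incrementAndGet();
            return DEFAULTS.get(field); // every caller takes the same lock
        }

        Hdr build() {
            Hdr hdr = new Hdr();
            // Assumption: defaults are only looked up for fields left unset.
            hdr.id = (id != null) ? id : (Long) defaultValue("id");
            hdr.source = (source != null) ? source : (String) defaultValue("source");
            return hdr;
        }
    }

    // Mirrors the pattern in this mail: build an empty record, set fields after.
    static int lookupsForEmptyBuilder() {
        LOOKUPS.set(0);
        Hdr hdr = new HdrBuilder().build();
        hdr.id = 7L;
        hdr.source = "share";
        return LOOKUPS.get();
    }

    // Alternative: set the fields on the builder before calling build().
    static int lookupsForPopulatedBuilder() {
        LOOKUPS.set(0);
        new HdrBuilder().setId(7L).setSource("share").build();
        return LOOKUPS.get();
    }

    public static void main(String[] args) {
        System.out.println(lookupsForEmptyBuilder());     // prints 2
        System.out.println(lookupsForPopulatedBuilder()); // prints 0
    }
}
```

In this model an empty builder hits the shared cache once per field, while a
fully populated builder never does. If the real generated builders behave the
same way, setting the fields on the builder before build(), or creating the
nested records with new as is already done for Parent, might reduce the
contention, but that would need to be checked against the Avro version in
use.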

"pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328
waiting for monitor entry [0x00007fad52833000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
        - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
        at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
        at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)


"pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327
waiting for monitor entry [0x00007fad52934000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
        - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
        at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
        at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
        at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)

"pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325
runnable [0x00007fad52b36000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
        - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
        at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
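
In the dumps above, all three threads are inside
Collections$SynchronizedMap.get on the same monitor (0x000000066a5e3460): one
holds it and the others wait. The same thread state can be reproduced with
JDK classes only. The sketch below is an illustration under that reading of
the dumps; the class and method names are hypothetical and nothing here is
Avro code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; shows that get() on a synchronized map needs its monitor.
class SynchronizedMapContention {
    static Thread.State stateOfReaderWhileLockHeld() {
        Map<String, Object> cache = Collections.synchronizedMap(new HashMap<>());
        cache.put("field", "default");

        Thread reader = new Thread(() -> cache.get("field"), "reader");
        Thread.State observed;
        try {
            // The wrapper returned by synchronizedMap locks on itself, so
            // holding its monitor here plays the role of the RUNNABLE thread.
            synchronized (cache) {
                reader.start();
                long deadline = System.nanoTime() + 5_000_000_000L;
                while (reader.getState() != Thread.State.BLOCKED
                        && System.nanoTime() < deadline) {
                    Thread.sleep(1); // sleeping does not release the monitor
                }
                observed = reader.getState();
            }
            reader.join(); // the reader finishes once the monitor is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(stateOfReaderWhileLockHeld()); // prints BLOCKED
    }
}
```

The reader thread only performs a lookup, yet it sits in BLOCKED (on object
monitor) for as long as another thread owns the map's lock. That matches the
state reported for pool-6-thread-4 and pool-6-thread-5.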

On Fri, Jan 19, 2018 at 6:04 PM, Nishanth S <[email protected]> wrote:

> Hi All,
>
> We have a process that reads data from a local file share, serializes it,
> and writes it to HDFS in Avro format. Currently it runs as a single-threaded
> process. When we converted it to a parallel process we got some performance
> improvement, but not as much as we wanted. Thread dumps show that at any
> given time only one thread has access to this method and the others are
> blocked. I am wondering whether I am building the Avro objects correctly.
>
> "pool-6-thread-5" #53 prio=5 os_prio=0 tid=0x00007fad896c7800 nid=0x4328
> waiting for monitor entry [0x00007fad52833000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>
> "pool-6-thread-4" #52 prio=5 os_prio=0 tid=0x00007fad896c6000 nid=0x4327
> waiting for monitor entry [0x00007fad52934000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - waiting to lock <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
>         at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:135)
>         at com.model.avro.SEGMENT1B$Builder.build(SEGMENT1B.java:4362)
>
> "pool-6-thread-2" #50 prio=5 os_prio=0 tid=0x00007fad8953a800 nid=0x4325
> runnable [0x00007fad52b36000]
>    java.lang.Thread.State: RUNNABLE
>         at java.util.Collections$SynchronizedMap.get(Collections.java:2584)
>         - locked <0x000000066a5e3460> (a java.util.Collections$SynchronizedMap)
>         at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:981)
