[
https://issues.apache.org/jira/browse/HDFS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981176#action_12981176
]
Liyin Liang commented on HDFS-1583:
-----------------------------------
Hi Todd,
This is mainly caused by the serialization of the array. The work is done by:
{code}
ObjectWritable::writeObject(DataOutput out, Object instance,
                            Class declaredClass, Configuration conf)
{code}
This function traverses the array and serializes each element as an object.
In my test, a byte array with 8000 elements grew to 56008 bytes after
serialization (2.4 ms), whereas the wrapped object was only 8094 bytes after
serialization (0.03 ms).
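To make the overhead concrete, here is a minimal, self-contained sketch. It does not use the actual Hadoop classes; the format is a simplified stand-in, assuming ObjectWritable writes the declared class name before every array element, while a wrapper writes a single length prefix followed by the raw bytes (roughly what BytesWritable produces). The header sizes match the 56008-byte figure above; the small difference from the 8094-byte wrapped figure is presumably extra RPC framing.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for the two wire formats; not the Hadoop classes.
public class SerializationOverhead {

    // Per-element encoding: array header (class name + length), then for
    // each element the declared class name "byte" plus one payload byte,
    // mimicking how ObjectWritable serializes each element as an object.
    static int perElementSize(byte[] records) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeUTF("[B");           // array's declared class: 2 + 2 bytes
        out.writeInt(records.length); // array length: 4 bytes
        for (byte b : records) {
            out.writeUTF("byte");     // element class name, repeated: 6 bytes
            out.writeByte(b);         // the actual payload: 1 byte
        }
        out.flush();
        return buf.size();
    }

    // Wrapped encoding: one length prefix followed by the raw bytes,
    // roughly what a Writable wrapper around the array produces.
    static int wrappedSize(byte[] records) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(records.length);
        out.write(records);
        out.flush();
        return buf.size();
    }

    public static void main(String[] args) throws IOException {
        byte[] records = new byte[8000];
        System.out.println("per-element: " + perElementSize(records)); // 56008
        System.out.println("wrapped:     " + wrappedSize(records));    // 8004
    }
}
```

So the per-element format pays 7 bytes per payload byte, which is where the 7x blow-up comes from.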
By the way, there is already an array wrapper class:
{code}
public class ArrayWritable implements Writable
{code}
This class is used in FSEditLog to log operations, e.g.
FSEditLog::logMkDir(String path, INode newNode).
I'll update the patch to use ArrayWritable.
> Improve backup-node sync performance by wrapping RPC parameters
> ---------------------------------------------------------------
>
> Key: HDFS-1583
> URL: https://issues.apache.org/jira/browse/HDFS-1583
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: name-node
> Reporter: Liyin Liang
> Fix For: 0.23.0
>
> Attachments: HDFS-1583-1.patch
>
>
> The journal edit records are sent by the active name-node to the backup-node
> with RPC:
> {code}
> public void journal(NamenodeRegistration registration,
> int jAction,
> int length,
> byte[] records) throws IOException;
> {code}
> During the name-node throughput benchmark, the size of the byte array _records_
> is around *8000*, so the serialization and deserialization are
> time-consuming. I wrote a simple application to test RPC with a byte array
> parameter: when the size reaches 8000, each RPC call needs about 6 ms, while
> the name-node syncs 8 KB to local disk in only 0.3~0.4 ms.