[
https://issues.apache.org/jira/browse/HDFS-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312614#comment-16312614
]
Tsz Wo Nicholas Sze commented on HDFS-12853:
--------------------------------------------
Thanks for working on this. The patch looks good in general. Some comments:
- COMMIT_DATA should be sync'ed with WRITE_DATA so that COMMIT_DATA is always
after WRITE_DATA.
- OZONE_SCM_RATIS_NUM_WRITE_CHUNK_THREADS is a container conf so that it should
be moved to OzoneConfigKeys.DFS_CONTAINER_RATIS_...
- Let's change the format from "TMP-<checksum>-<chunkname>" to
"<chunkname>.<checksum>.tmp" so that it is easier for admin to do ls.
- Pass the original proto as a prototype to newBuilder. Then, we only need to
change the new fields.
{code}
public TransactionContext startTransaction(RaftClientRequest request)
throws IOException {
final org.apache.ratis.shaded.com.google.protobuf.ByteString content =
request.getMessage().getContent();
final ContainerCommandRequestProto proto =
ContainerCommandRequestProto.parseFrom(ShadedProtoUtil.asByteString(content));
final SMLogEntryProto log;
if (proto.getCmdType() == ContainerProtos.Type.WriteChunk) {
final ContainerProtos.WriteChunkRequestProto write =
proto.getWriteChunk();
final ContainerProtos.WriteChunkRequestProto dataWriteProto =
ContainerProtos.WriteChunkRequestProto.newBuilder(write)
.setStage(ContainerProtos.Stage.WRITE_DATA)
.build();
ContainerCommandRequestProto dataProto = ContainerCommandRequestProto
.newBuilder(proto)
.setWriteChunk(dataWriteProto)
.build();
org.apache.ratis.shaded.com.google.protobuf.ByteString dataByteString =
ShadedProtoUtil.asShadedByteString(dataProto.toByteArray());
final ContainerProtos.WriteChunkRequestProto commitWriteProto =
ContainerProtos.WriteChunkRequestProto.newBuilder(write)
.setStage(ContainerProtos.Stage.COMMIT_DATA)
.build();
ContainerCommandRequestProto commitProto = ContainerCommandRequestProto
.newBuilder(proto)
.setWriteChunk(commitWriteProto)
.build();
org.apache.ratis.shaded.com.google.protobuf.ByteString commitByteString =
ShadedProtoUtil.asShadedByteString(commitProto.toByteArray());
log = SMLogEntryProto.newBuilder()
.setData(commitByteString)
.setStateMachineData(dataByteString)
.build();
} else {
log = SMLogEntryProto.newBuilder().setData(content).build();
}
return new TransactionContext(this, request, log);
}
{code}
Other comments (all can be done separately):
- ChunkUtils.writeData should use CREATE_NEW instead of CREATE.
- ChunkUtils.writeData, verifyChecksum should not throw
NoSuchAlgorithmException. Just wrap it as IllegalArgumentException. Also use
and ThreadLocal for it.
{code}
//ChunkUtils
private static final ThreadLocal<MessageDigest> FILE_MESSAGE_DIGEST
= ThreadLocal.withInitial(() -> {
try {
return MessageDigest.getInstance(OzoneConsts.FILE_HASH);
} catch (NoSuchAlgorithmException e) {
throw new IllegalArgumentException("Failed to create MessageDigest
for "
+ OzoneConsts.FILE_HASH, e);
}
});
public static MessageDigest getMessageDigest() {
final MessageDigest md = FILE_MESSAGE_DIGEST.get();
md.reset();
return md;
}
{code}
- Pass data as ByteString instead of byte[] in order to avoid data copying.
- File.length() calls require local fs access so that length for the same file
should be stored.
> Ozone: Optimize chunk writes for Ratis by avoiding double writes
> ----------------------------------------------------------------
>
> Key: HDFS-12853
> URL: https://issues.apache.org/jira/browse/HDFS-12853
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Affects Versions: HDFS-7240
> Reporter: Mukul Kumar Singh
> Assignee: Mukul Kumar Singh
> Fix For: HDFS-7240
>
> Attachments: HDFS-12853-HDFS-7240.001.patch
>
>
> Ozone, in replicated mode writes the data twice, once to the raft log and
> then to the statemachine.
> This results in the data being written twice during a particular chunk write,
> this is subobtimal. With RATIS-122 the statemachine in Ozone can be optimized
> by writing the data to the statemachine only once.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]