[
https://issues.apache.org/jira/browse/HDFS-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391605#comment-17391605
]
liuyongpan edited comment on HDFS-16147 at 8/3/21, 4:18 AM:
------------------------------------------------------------
1. HDFS-16147.002.patch fixes the failing unit test
org.apache.hadoop.hdfs.server.namenode.TestFSImage#testNoParallelSectionsWithCompressionEnabled.
2. On closer examination, oiv does in fact work correctly, although I cannot explain
why it works.
You can verify this as follows:
In class TestOfflineImageViewer, method createOriginalFSImage,
add and then remove the code below, and compare the two runs.
Note: apply my patch HDFS-16147.002.patch first!
{code:java}
// turn on both parallelization and compression
conf.setBoolean(DFSConfigKeys.DFS_IMAGE_COMPRESS_KEY, true);
conf.set(DFSConfigKeys.DFS_IMAGE_COMPRESSION_CODEC_KEY,
    "org.apache.hadoop.io.compress.GzipCodec");
conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_LOAD_KEY, "true");
conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_INODE_THRESHOLD_KEY, "2");
conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_TARGET_SECTIONS_KEY, "2");
conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_THREADS_KEY, "2");
{code}
Then run the unit test testPBDelimitedWriter and you can see the result.
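As background for why parallel loading and compression conflict at all: a gzip stream can only be decompressed sequentially from its beginning, so per-sub-section file offsets recorded for parallel loading do not point at independently decodable data. A minimal, self-contained illustration of this property using plain java.util.zip (deliberately not Hadoop's codec classes; the class and method names here are my own):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipSequentialDemo {

  // Compress a string into a single gzip stream.
  static byte[] gzip(String s) {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
        gz.write(s.getBytes("UTF-8"));
      }
      return bos.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // The only way to reach data in the middle of a gzip stream is to
  // decompress everything before it, starting from offset 0.
  static String gunzip(byte[] data) {
    try (GZIPInputStream gz =
             new GZIPInputStream(new ByteArrayInputStream(data))) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = gz.read(buf)) > 0) {
        out.write(buf, 0, n);
      }
      return out.toString("UTF-8");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] compressed = gzip("section-1|section-2|section-3");
    // Everything is recovered, but only by scanning from the start;
    // there is no way to seek directly to "section-2" in compressed bytes.
    System.out.println(gunzip(compressed));
  }
}
```

This is why a patch combining the two features has to compress each sub-section boundary in a way the loader can find, rather than recording raw offsets into one continuous compressed stream.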
3. If I create a parallel, compressed image with this patch and then load it in a
NameNode without this patch and with parallel loading disabled, the NameNode is
still able to load it.
You can verify this as follows:
In class TestFSImageWithSnapshot, method setUp, add the following code:
{code:java}
public void setUp() throws Exception {
  conf = new Configuration();
  // *************add**************
  conf.setBoolean(DFSConfigKeys.DFS_IMAGE_COMPRESS_KEY, true);
  conf.set(DFSConfigKeys.DFS_IMAGE_COMPRESSION_CODEC_KEY,
      "org.apache.hadoop.io.compress.GzipCodec");
  conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_LOAD_KEY, "true");
  conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_INODE_THRESHOLD_KEY, "3");
  conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_TARGET_SECTIONS_KEY, "3");
  conf.set(DFSConfigKeys.DFS_IMAGE_PARALLEL_THREADS_KEY, "3");
  // *************add**************
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(REPLICATION)
      .build();
  cluster.waitActive();
  fsn = cluster.getNamesystem();
  hdfs = cluster.getFileSystem();
}
{code}
Then, in class FSImageFormatProtobuf, method loadInternal, change the INODE case as
follows:
{code:java}
// in FSImageFormatProtobuf#loadInternal
case INODE: {
  currentStep = new Step(StepType.INODES);
  prog.beginStep(Phase.LOADING_FSIMAGE, currentStep);
  stageSubSections = getSubSectionsOfName(
      subSections, SectionName.INODE_SUB);
  // if (loadInParallel && (stageSubSections.size() > 0)) {
  //   inodeLoader.loadINodeSectionInParallel(executorService,
  //       stageSubSections, summary.getCodec(), prog, currentStep);
  // } else {
  //   inodeLoader.loadINodeSection(in, prog, currentStep);
  // }
  inodeLoader.loadINodeSection(in, prog, currentStep);
}
{code}
Then run the unit test testSaveLoadImage and you can see the result.
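To sketch the intuition for why the serial path keeps working: if each top-level section in the image is written with its name and length, a sequential loader can walk the stream front to back and never needs the sub-section index that a parallel loader consults, so extra sub-section entries in the summary are harmless to it. The sketch below is my own toy format, not Hadoop's actual fsimage layout:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SequentialSectionDemo {

  // Write name / length / payload records, mimicking a summary-driven layout.
  static byte[] writeSections(Map<String, byte[]> sections) {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bos);
      for (Map.Entry<String, byte[]> e : sections.entrySet()) {
        out.writeUTF(e.getKey());
        out.writeInt(e.getValue().length);
        out.write(e.getValue());
      }
      return bos.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // A serial loader just walks the stream front to back; it never uses the
  // offsets a parallel loader would need to split the work between threads.
  static Map<String, byte[]> readSections(byte[] image) {
    try {
      DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(image));
      Map<String, byte[]> out = new LinkedHashMap<>();
      while (in.available() > 0) {
        String name = in.readUTF();
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        out.put(name, payload);
      }
      return out;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Map<String, byte[]> sections = new LinkedHashMap<>();
    sections.put("INODE", new byte[] {1, 2, 3});
    byte[] image = writeSections(sections);
    System.out.println(readSections(image).keySet());
  }
}
```

Under this reading, the commented-out change above simply forces the sequential walk, which is why the image loads even when sub-sections were written.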
> load fsimage with parallelization and compression
> -------------------------------------------------
>
> Key: HDFS-16147
> URL: https://issues.apache.org/jira/browse/HDFS-16147
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.3.0
> Reporter: liuyongpan
> Priority: Minor
> Attachments: HDFS-16147.001.patch, HDFS-16147.002.patch,
> subsection.svg
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)