[ https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329913#comment-16329913 ]
Misha Dmitriev commented on HDFS-12051: --------------------------------------- [~szetszwo] regarding the patch name: I believe your comments are not very constructive, because you repeatedly complain that the summary is misleading, but don't explain in more details what you would like to change. The summary cannot cover the details of all the things I changed in the code. If it was crucial for you that it just mentions "NameCache" (that's the change that you've just made), you could say so explicitly and/or make this change yourself right away. That would save both of us a lot of time. Regarding the numbers. I would really appreciate if you spent some time reading the beginning of this thread, where I gave the numbers indicating the significance of the problem (how much memory was wasted by duplicate byte[] arrays despite the presence of the old NameCache), and how much savings my new NameCache provided. But if you insist that I do it once again, I am copying this here for your convenience. "Analyzing one heap dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays result in 6.5% memory overhead, and most of these arrays are referenced by {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}} and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}" "What makes this case special is that the number of byte[] arrays is very high (~100M total arrays, ~25M unique arrays), but the average duplication factor is not very high (~4). Some byte[] arrays are replicated in an extremely high number, e.g. per the jxray report there are 3.5M copies of one 17-element array and so on. But that means that the vast majority of arrays actually don't have any duplicates." "I've redesigned the new NameCache so that its size adjusts depending on the size of the input data, within user-specified limits. It was tested using a synthetic workload simulating that of a big Hadoop installation. The result is an 8.5% reduction in the overhead due to duplicate byte[] arrays. Here are the results of the jxray analysis of the respective heap dumps: Before {code:java} 19. DUPLICATE PRIMITIVE ARRAYS Types of duplicate objects: Ovhd Num objs Num unique objs Class name 346,198K (12.6%) 12097893 3714559 byte[] ... Total arrays: 12,101,111 Unique arrays: 3,716,791 Duplicate values: 371,424 Overhead: 346,322K (12.6%) {code} After: {code:java} 19. DUPLICATE PRIMITIVE ARRAYS Types of duplicate objects: Ovhd Num objs Num unique objs Class name 100,440K (3.9%) 6208877 3855398 byte[] ... Total arrays: 6,212,104 Unique arrays: 3,857,624 Duplicate values: 727,662 Overhead: 100,566K (3.9%){code} " I hope very much that you will now spend some time and really read these numbers. As for "reasons to hurry" - this is not a hurry, this is just a change that's desperately behind the schedule. I made it in August 2017, and now it's January 2018. > Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly > those denoting file/directory names) to save memory > ----------------------------------------------------------------------------------------------------------------------------- > > Key: HDFS-12051 > URL: https://issues.apache.org/jira/browse/HDFS-12051 > Project: Hadoop HDFS > Issue Type: Improvement > Reporter: Misha Dmitriev > Assignee: Misha Dmitriev > Priority: Major > Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, > HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, > HDFS-12051.06.patch > > > When snapshot diff operation is performed in a NameNode that manages several > million HDFS files/directories, NN needs a lot of memory. Analyzing one heap > dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays > result in 6.5% memory overhead, and most of these arrays are referenced by > {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}} > and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}: > {code:java} > 19. DUPLICATE PRIMITIVE ARRAYS > Types of duplicate objects: > Ovhd Num objs Num unique objs Class name > 3,220,272K (6.5%) 104749528 25760871 byte[] > .... > 1,841,485K (3.7%), 53194037 dup arrays (13158094 unique) > 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 > of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, > 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, > 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), > 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 > of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, > 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, > 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...) > ... and 45902395 more arrays, of which 13158084 are unique > <-- > org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name > <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode > <-- {j.u.ArrayList} <-- > org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- > org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs > <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- > org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 > elements) ... <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0 > <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java > Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER > 409,830K (0.8%), 13482787 dup arrays (13260241 unique) > 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 350 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 340 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of > byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...) > ... and 13479257 more arrays, of which 13260231 are unique > <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.name <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- > org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0 > <-- j.l.Thread[] <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0 > <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java > Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER > .... > {code} > There are several other places in NameNode code which also produce duplicate > {{byte[]}} arrays. > To eliminate this duplication and reclaim memory, we will need to write a > small class similar to StringInterner, but designed specifically for byte[] > arrays. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org