[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889199#comment-16889199 ]

Hadoop QA commented on HDFS-7784:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} HDFS-7784 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-7784 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/27267/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> load fsimage in parallel
>
>                 Key: HDFS-7784
>                 URL: https://issues.apache.org/jira/browse/HDFS-7784
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Walter Su
>            Assignee: Walter Su
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-7784.001.patch, test-20150213.pdf
>
> When a single NameNode holds a huge number of files without using federation,
> startup/restart is slow. The fsimage loading step takes most of the time.
> fsimage loading can be separated into two parts: deserialization and object
> construction (mostly map insertion). Deserialization takes most of the CPU
> time, so we can deserialize in parallel and add to the hashmap serially.
> This will significantly reduce the NN start time.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796824#comment-15796824 ]

Gang Xie commented on HDFS-7784:

Regarding GC activity, the following is the jstat output. The load did cause some long GC pauses, but compared with the pauses seen during a full block report, it looks OK.

jstat -gcutil 10885 5000 1000
    S0     S1      E      O      P   YGC     YGCT  FGC    FGCT      GCT
  0.00 100.00  67.94  89.32  69.63   313  188.870    3   3.130  192.000
  0.00 100.00  67.94  89.32  69.63   313  188.870    3   3.130  192.000
  0.00 100.00  67.95  89.32  69.63   313  188.870    3   3.130  192.000
  0.00 100.00  81.32  89.32  70.61   313  188.870    3   3.130  192.000
100.00   0.00  19.44  89.68  70.62   314  192.495    3   3.130  195.626
  0.00  64.43  60.41  90.04  70.62   315  192.938    3   3.130  196.068
 56.75   7.26 100.00  90.27  70.62   317  193.167    3   3.130  196.297
  2.27   0.00  43.16  90.38  70.62   318  193.653    3   3.130  196.783
  0.00   0.68  91.15  90.38  70.62   319  193.729    3   3.130  196.859
  0.00   0.05  38.53  90.38  70.62   321  193.875    3   3.130  197.005
  0.01   0.00  82.04  90.38  70.62   322  193.951    3   3.130  197.081
  0.00   0.00  19.95  90.38  70.62   324  194.084    3   3.130  197.214
  0.00   0.00   0.00  90.38  70.62   326  194.235    4   3.130  197.365
  0.00   0.00  98.27  90.33  70.62   326  194.235    4   5.240  199.475
  0.00   0.00  40.11  90.27  70.62   328  194.372    4   5.240  199.612
  0.00   0.00  90.25  90.20  70.62   329  194.449    4   5.240  199.689
  0.00   0.00  30.08  90.13  70.62   331  194.605    4   5.240  199.845
  0.00   0.00  74.21  90.05  70.62   332  194.676    4   5.240  199.916
  0.00   0.00  14.04  89.95  70.62   334  194.819    4   5.240  200.059
  0.00   0.00  62.17  89.85  70.62   335  194.894    4   5.240  200.134
  0.00   0.00   4.01  89.79  70.62   337  195.042    4   5.240  200.282
  0.00   0.00  48.13  89.74  60.00   338  195.116    4   5.240  200.356
  0.00   0.00  80.22  89.74  60.00   339  195.192    5   5.241  200.433
  0.00   0.00   4.01  89.74  60.00   341  195.349    5   5.241  200.590
  0.00   0.00  24.07  89.74  60.00   342  195.423    5   5.241  200.664
  0.00   0.00  50.14  89.74  60.00   343  195.498    5   5.241  200.739
  0.00   0.00  96.27  89.74  60.00   344  195.571    5   5.241  200.813
  0.00   0.00  38.11  89.74  60.00   346  195.708    5   5.241  200.949
  0.00   0.00  86.24  89.74  60.00   347  195.785    5   5.241  201.026

Total time for which application threads were stopped: 1.6167710 seconds
Total time for which application threads were stopped: 9.6578530 seconds
Total time for which application threads were stopped: 1.0820690 seconds
Total time for which application threads were stopped: 1.1189530 seconds
Total time for which application threads were stopped: 1.2096840 seconds
Total time for which application threads were stopped: 8.6128080 seconds
Total time for which application threads were stopped: 7.5763860 seconds
Total time for which application threads were stopped: 2.1393520 seconds
Total time for which application threads were stopped: 1.9607400 seconds
Total time for which application threads were stopped: 3.0785030 seconds
Total time for which application threads were stopped: 2.7774960 seconds
Total time for which application threads were stopped: 4.5180250 seconds
Total time for which application threads were stopped: 1.9637590 seconds
Total time for which application threads were stopped: 1.8422970 seconds
Total time for which application threads were stopped: 1.9868880 seconds
Total time for which application threads were stopped: 2.2927440 seconds
Total time for which application threads were stopped: 2.7141160 seconds
Total time for which application threads were stopped: 2.9030460 seconds
Total time for which application threads were stopped: 5.2282350 seconds
Total time for which application threads were stopped: 3.6261510 seconds
Total time for which application threads were stopped: 2.1100760 seconds
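The "Total time for which application threads were stopped" lines in the comment above are the kind of output HotSpot prints with -XX:+PrintGCApplicationStoppedTime. A minimal sketch of how such lines could be summarized follows; the class name StoppedTimeSummary and its method are hypothetical helpers for illustration, not part of Hadoop or the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: summarizes stop-the-world pauses found in GC log text. */
class StoppedTimeSummary {
    // Matches the pause lines exactly as they appear in the log excerpt.
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: (\\d+\\.\\d+) seconds");

    /** Returns {pause count, total seconds, longest pause in seconds}. */
    static double[] summarize(String gcLog) {
        int count = 0;
        double total = 0.0;
        double longest = 0.0;
        Matcher m = STOPPED.matcher(gcLog);
        while (m.find()) {
            double seconds = Double.parseDouble(m.group(1));
            count++;
            total += seconds;
            longest = Math.max(longest, seconds);
        }
        return new double[] {count, total, longest};
    }
}
```

Passing the log text to StoppedTimeSummary.summarize gives the number of pauses, their sum, and the longest one, which makes a before/after comparison between two runs easier than reading the raw lines.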
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796809#comment-15796809 ]

Gang Xie commented on HDFS-7784:

The hardware info:

CPU: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz with 24 Cores

Mem: cat /proc/meminfo
MemTotal:       131749888 kB
MemFree:          9390596 kB
Buffers:           171080 kB
Cached:          23657816 kB
SwapCached:             0 kB
Active:         119711620 kB
Inactive:          381236 kB
Active(anon):    96186924 kB
Inactive(anon):     81452 kB
Active(file):    23524696 kB
Inactive(file):    299784 kB
Unevictable:            0 kB
Mlocked:                0 kB
SwapTotal:              0 kB
SwapFree:               0 kB
Dirty:                108 kB
Writeback:              0 kB
AnonPages:       96264056 kB
Mapped:             26604 kB
Shmem:               4412 kB
Slab:              728272 kB
SReclaimable:      673344 kB
SUnreclaim:         54928 kB
KernelStack:         5392 kB
PageTables:        192256 kB
NFS_Unstable:           0 kB
Bounce:                 0 kB
WritebackTmp:           0 kB
CommitLimit:     65874944 kB
Committed_AS:   107921484 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       488704 kB
VmallocChunk:   34289747040 kB
HardwareCorrupted:      4 kB
AnonHugePages:   90095616 kB
HugePages_Total:        0
HugePages_Free:         0
HugePages_Rsvd:         0
HugePages_Surp:         0
Hugepagesize:        2048 kB
DirectMap4k:         8192 kB
DirectMap2M:      2015232 kB
DirectMap1G:    132120576 kB

And it's HDD.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796804#comment-15796804 ]

Kai Zheng commented on HDFS-7784:

OOO today for customer visit, please expect delayed response. Thanks.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796797#comment-15796797 ]

Gang Xie commented on HDFS-7784:

The JVM settings:

-Xmx102400m -Xms102400m -Xmn5508m
-XX:MaxDirectMemorySize=3686m -XX:MaxPermSize=1024m
-XX:+PrintGCApplicationStoppedTime -XX:+UseConcMarkSweepGC
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:SurvivorRatio=6 -XX:+UseCMSCompactAtFullCollection
-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSParallelRemarkEnabled -XX:+UseNUMA -XX:+CMSClassUnloadingEnabled
-XX:CMSMaxAbortablePrecleanTime=1 -XX:TargetSurvivorRatio=80
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=100 -XX:GCLogFileSize=128m
-XX:CMSWaitDuration=8000 -XX:+CMSScavengeBeforeRemark
-XX:ConcGCThreads=16 -XX:ParallelGCThreads=16 -XX:+CMSConcurrentMTEnabled
-XX:+SafepointTimeout -XX:MonitorBound=16384 -XX:-UseBiasedLocking
-XX:MaxTenuringThreshold=3 -XX:+ParallelRefProcEnabled
-XX:-OmitStackTraceInFastThrow
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795396#comment-15795396 ]

Kihwal Lee commented on HDFS-7784:

[~xiegang112], when you have a chance to test the performance, please also share the JVM GC settings and the hardware spec (e.g. how many cores, as that affects GC performance). It would be even better if you could measure GC activity before and after. If everything looks positive, people will certainly be interested.
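One way to measure GC activity before and after a piece of work from inside the JVM is the standard java.lang.management API. The sketch below is only an illustration of that idea; the class GcDeltaSketch and its methods are hypothetical names, and it reports aggregate counts and times rather than the per-generation detail that jstat or GC logs give.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reports how much GC happened while a task ran. */
class GcDeltaSketch {
    /** Returns {total collections so far, total collection time in ms so far}. */
    static long[] snapshot() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, millis};
    }

    /** Runs the task and returns {collections during task, GC ms during task}. */
    static long[] measure(Runnable task) {
        long[] before = snapshot();
        task.run();
        long[] after = snapshot();
        return new long[] {after[0] - before[0], after[1] - before[1]};
    }
}
```

Wrapping the image load in GcDeltaSketch.measure for both the single-threaded and the parallel loader would give two comparable pairs of numbers for the same fsimage.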
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784660#comment-15784660 ]

Gang Xie commented on HDFS-7784:

After making AclStorage synchronized, it works. With 20 threads, the loading time drops from 29 minutes to about 12 minutes. I still need to check whether similar issues exist elsewhere.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784285#comment-15784285 ]

Gang Xie commented on HDFS-7784:

I found a potential issue while trying to backport this patch to 2.4. Please correct me if I'm wrong: when an ACL is enabled on a file, the loader calls addAclFeature to add the AclFeature to UNIQUE_ACL_FEATURES, which is a hashmap shared by all files. Since we introduced multithreading, I think this could be a problem. In my test, loading a 22 GB fsimage with 200M inodes did not finish within several hours (without the patch, it finishes in about 30 minutes), and jstack shows the threads busy in UNIQUE_ACL_FEATURES. I'm not sure whether the cache is corrupted. The image is huge and needs 100 GB of memory to profile, so the dump is hard to open, and I'm not 100% sure about this. Did you hit a similar issue during your tests?
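The hazard described above is a general one: java.util.HashMap is not thread-safe, and mutating one from several threads without synchronization can corrupt its internal structure and leave threads spinning inside it. A minimal sketch of a shared, reference-counted pool guarded by synchronized methods follows. SharedFeaturePoolSketch is a hypothetical stand-in written for illustration; it is not the AclStorage code and does not reproduce its API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a shared, reference-counted feature cache. */
class SharedFeaturePoolSketch<T> {
    /** A pooled value plus the number of owners currently referencing it. */
    private static final class Counted<V> {
        final V value;
        int refs;
        Counted(V value) { this.value = value; }
    }

    // Plain HashMap: every access below goes through a synchronized method.
    private final Map<T, Counted<T>> pool = new HashMap<>();

    /** Returns the canonical instance equal to candidate and bumps its count. */
    synchronized T intern(T candidate) {
        Counted<T> entry = pool.get(candidate);
        if (entry == null) {
            entry = new Counted<>(candidate);
            pool.put(candidate, entry);
        }
        entry.refs++;
        return entry.value;
    }

    /** Number of times an equal value has been interned, or 0 if absent. */
    synchronized int refCount(T value) {
        Counted<T> entry = pool.get(value);
        return entry == null ? 0 : entry.refs;
    }

    /** Number of distinct values held by the pool. */
    synchronized int uniqueCount() {
        return pool.size();
    }
}
```

A single lock keeps the sketch simple but serializes every lookup. Whether that matters depends on how many distinct entries the loader threads contend on; a finer-grained structure would be a separate design decision.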
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15777928#comment-15777928 ]

Gang Xie commented on HDFS-7784:

Hello, is there any update on this improvement? Loading a huge image really takes time, and this improvement seems quite necessary.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067073#comment-15067073 ]

Kihwal Lee commented on HDFS-7784:

bq. protobuf seems to generate a lot of garbage during startup, causing many full GCs which really consume a lot of time.

One of our large NNs used to do multiple full GCs during start-up, but mainly due to initial full block report processing. Ever since the young gen size was increased, it has stopped doing that. We initially feared the minor collection time would increase dramatically, but that wasn't the case. Along with the increased YG size, we set {{-XX:ParGCCardsPerStrideChunk=32768}}.

We will look into the javanano version. Thanks for the pointer.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066949#comment-15066949 ]

Colin Patrick McCabe commented on HDFS-7784:

Thanks, [~kihwal]. Unfortunately, that's what we've seen as well... protobuf seems to generate a lot of garbage during startup, causing many full GCs which really consume a lot of time. It used to be you could ignore temporary objects as long as you didn't create tenured objects, but it turns out that if there are too many temporaries, HotSpot promotes them into the old generation. At this point, it's not clear that parallelization is a win for fsimage loading unless we can mitigate that GC problem.

Have you guys looked into using the "javanano" version of protocol buffers? See here: https://github.com/google/protobuf/tree/master/javanano

It seems like this would generate a lot less garbage than the "official" PB library because it avoids builders in favor of mutable state, uses ints instead of enums, uses arrays instead of ArrayList, etc. I think we should probably adopt this on the server side, even if we keep the client side on the existing PB library. This would help with RPC as well, of course.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15064274#comment-15064274 ]

Kihwal Lee commented on HDFS-7784:

bq. ... find out that the bottleneck is deserialization taking too much cpu time, not disk I/O.

That's exactly what we see. Disk is never the bottleneck for loading the fsimage. It's the decoding of protobuf that is slow and creates a lot of garbage. Parallelizing will increase the garbage generation rate, and if the GC cannot keep up, loading can get even slower by incurring full GCs. As for time reduction, I would still say yes to even a 50% speedup.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14335929#comment-14335929 ]

Walter Su commented on HDFS-7784:

I used VisualVM to profile the loading process and found that the bottleneck is deserialization taking too much CPU time, not disk I/O. The test (test-20150213.pdf) uses three 7200 rpm hard disks in RAID 0. I tried single-threaded starts with and without clearing the buffer cache, and the difference is very small.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14325162#comment-14325162 ]

Colin Patrick McCabe commented on HDFS-7784:

At the end of the day, there are situations where you have to restart both NameNodes. For example, you might have hit a bug that causes both the standby and the active to crash. We've had bugs like that in the past. So I do think this is an important improvement. I think the discussion here has been a little too dismissive. Some people are regularly spending 10 minutes to load their big fsimages... I don't think those people would write off a 2x (or 2.5x) speedup as not good enough.

I do think [~wheat9]'s point about avoiding complexity is good. Can we get some benefit just from doing a really large amount of readahead? For example, if we had a background thread that ran concurrently and did nothing but read the FSImage from start to end, it would warm up the buffer cache for the other thread. This would mean that our single-threaded loading process would spend less time waiting for disk I/O. Maybe try that out and see what the numbers look like on a really big fsimage (something like 5-7 GB).
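The readahead idea in the comment above can be sketched as a background thread that reads the file sequentially and throws the bytes away, so that the operating system's page cache is warm when the real loader reaches the same region. This is a hedged illustration only: ReadaheadSketch, readThrough, and startReadahead are hypothetical names, and how much it helps depends entirely on whether the loader is actually waiting on disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a cache-warming reader that runs beside the loader. */
class ReadaheadSketch {
    /** Reads the whole file sequentially, discards the data, returns bytes read. */
    static long readThrough(Path file, int bufferBytes) {
        byte[] buffer = new byte[bufferBytes];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Starts a daemon thread that reads the file once from start to end. */
    static Thread startReadahead(Path file, int bufferBytes) {
        Thread t = new Thread(() -> {
            try {
                readThrough(file, bufferBytes);
            } catch (UncheckedIOException e) {
                // Best effort only: the real loader reads the same file itself
                // and is the place where I/O errors should be reported.
            }
        }, "fsimage-readahead");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

The thread is a daemon so that it never keeps the process alive, and it does not coordinate with the loader at all, which keeps the added complexity small.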
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319717#comment-14319717 ]

Hadoop QA commented on HDFS-7784:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12698660/HDFS-7784.001.patch
  against trunk revision ba3c80a.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The test build failed in hadoop-hdfs-project/hadoop-hdfs

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9572//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9572//console

This message is automatically generated.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319840#comment-14319840 ]

Hadoop QA commented on HDFS-7784:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12698691/test-20150213.pdf
  against trunk revision ba3c80a.

    {color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9573//console

This message is automatically generated.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320201#comment-14320201 ]

Walter Su commented on HDFS-7784:

I agree with you. A single NameNode with 64 GB of memory can hold about 100M files (maybe a little more). In this situation, the startup time drops from 371s to 159s, and it's not good enough. Usually we don't restart the NameNode often, so I think it's OK to wait another 2 minutes for a restart. If people store 10x or 100x more than 100M files, they should consider federation. So I changed the priority to Minor, but I'll still upload the patch. Maybe it will help someone.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320210#comment-14320210 ]

Walter Su commented on HDFS-7784:

I mean the fsimage loading time drops from 371s to 159s. Processing block reports takes much more time than that.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320261#comment-14320261 ]

Walter Su commented on HDFS-7784:

In my testing, memory usage doesn't grow and GC doesn't get worse. I do use a small buffer to avoid frequent lock()/unlock() calls. How would a (small) buffer affect GC? Deserialization still creates the same amount of garbage; it's a matter of speed.
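The small-buffer idea can be sketched roughly as follows. This is a guess at the general shape, not the code in the patch: the buffer size of 64, the `decode` helper, and the class name are assumptions made for illustration. Each worker decodes into a bounded local buffer and takes the shared lock once per batch instead of once per entry.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only: names and sizes are assumptions, not taken from the patch.
class BufferedInsertSketch {
    static final int BUFFER_SIZE = 64; // assumed batch size

    private final Map<Long, String> map = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    // Hypothetical stand-in for the CPU-heavy deserialization of one entry.
    static String decode(long id) {
        return "inode-" + id;
    }

    // Runs on a worker thread: decode without the lock, flush in batches.
    void loadRange(long from, long to) {
        List<Map.Entry<Long, String>> buffer = new ArrayList<>(BUFFER_SIZE);
        for (long id = from; id < to; id++) {
            buffer.add(new AbstractMap.SimpleImmutableEntry<Long, String>(id, decode(id)));
            if (buffer.size() == BUFFER_SIZE) {
                flush(buffer);
            }
        }
        flush(buffer); // flush whatever is left over
    }

    // One lock()/unlock() pair per batch rather than per entry.
    private void flush(List<Map.Entry<Long, String>> buffer) {
        if (buffer.isEmpty()) {
            return;
        }
        lock.lock();
        try {
            for (Map.Entry<Long, String> e : buffer) {
                map.put(e.getKey(), e.getValue());
            }
        } finally {
            lock.unlock();
        }
        buffer.clear();
    }

    static Map<Long, String> run(int workers, long perWorker) throws InterruptedException {
        BufferedInsertSketch sketch = new BufferedInsertSketch();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final long from = i * perWorker;
            threads[i] = new Thread(() -> sketch.loadRange(from, from + perWorker));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // join() makes the workers' writes visible to this thread
        }
        return sketch.map;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("loaded entries: " + run(4, 10_000).size());
    }
}
```

In this shape, the extra live data at any moment is bounded by `BUFFER_SIZE` entries per worker, which is consistent with the point that the total garbage produced by deserialization stays the same.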
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321075#comment-14321075 ]

Kai Zheng commented on HDFS-7784:

Hi [~walter.k.su],

It's interesting, thanks!

bq. So I changed the priority to minor

I don't think it's minor. It does make sense, and I think it's a good discussion.

bq. One thing we might consider is a two-thread system, where one thread does deserialization and puts the results into a BlockingQueue read by the other FSN loading thread.

I think that's a good idea. We might consider it as well and give it a try?

So we have the current approach, the parallel approach proposed here, and the one above suggested by [~cmccabe]. Is it possible to make the fsimage loading approach pluggable? By default it would use the current method.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319115#comment-14319115 ]

Haohui Mai commented on HDFS-7784:

I've done some experiments in HDFS-5698. Parallelism does improve performance; however, my feeling is that the improvement is not significant enough to justify the complexity, especially since a single race or bug here could easily lead to data loss.
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319097#comment-14319097 ]

Colin Patrick McCabe commented on HDFS-7784:

Hi Walter, this is an interesting idea.

We have found that GC is a major part of NN startup time. Have you tested with FSImages larger than 3 GB? If we are doing a lot of buffering, my concern would be that GC could get worse.

One thing we might consider is a two-thread system, where one thread does deserialization and puts the results into a BlockingQueue read by the other FSN loading thread. This would avoid buffering an enormous amount of data, but still get 2x parallelism.
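A minimal sketch of the two-thread arrangement described above, assuming a bounded `ArrayBlockingQueue` between a deserializing thread and the loading thread. The class, the `Item` type, the `id:name` record format, and the end-of-stream sentinel are hypothetical; nothing here is taken from the NameNode code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: none of these names come from HDFS.
class TwoThreadLoadSketch {

    // Hypothetical stand-in for a deserialized fsimage entry.
    static final class Item {
        final long id;
        final String name;

        Item(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Sentinel compared by identity to mark the end of the stream.
    private static final Item END = new Item(-1, "");

    // Assumed record format "id:name".
    static Item deserialize(String raw) {
        int sep = raw.indexOf(':');
        return new Item(Long.parseLong(raw.substring(0, sep)), raw.substring(sep + 1));
    }

    static Map<Long, String> load(List<String> records, int queueCapacity)
            throws InterruptedException {
        // The bounded queue caps how many decoded items are buffered at once.
        BlockingQueue<Item> queue = new ArrayBlockingQueue<>(queueCapacity);

        Thread producer = new Thread(() -> {
            try {
                try {
                    for (String raw : records) {
                        queue.put(deserialize(raw)); // blocks while the queue is full
                    }
                } finally {
                    queue.put(END); // always signal the end so the consumer can stop
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // The calling thread is the only one that touches the map.
        Map<Long, String> map = new HashMap<>();
        for (Item item = queue.take(); item != END; item = queue.take()) {
            map.put(item.id, item.name);
        }
        producer.join();
        return map;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            records.add(i + ":file" + i);
        }
        System.out.println("loaded entries: " + load(records, 128).size());
    }
}
```

The queue capacity is the knob that trades buffering against stalls: a small capacity keeps few decoded items alive at once, while a larger one lets the deserializing thread run further ahead. A production version would also need to propagate a deserialization failure to the loading thread rather than returning a partial map.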
[jira] [Commented] (HDFS-7784) load fsimage in parallel
[ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319684#comment-14319684 ]

Walter Su commented on HDFS-7784:

I'll upload performance test results in 4 hours.