[ 
https://issues.apache.org/jira/browse/HBASE-21165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612489#comment-16612489
 ] 

stack commented on HBASE-21165:
-------------------------------

.001 is a simple log line that gives some sense of the amount of entries in 
WALs. Needs more logging for later stages of processing. In fact, let me commit 
this patch in a sub-issue and leave this open as an umbrella.

> During ProcedureStore load, there is no listing of progress...
> --------------------------------------------------------------
>
>                 Key: HBASE-21165
>                 URL: https://issues.apache.org/jira/browse/HBASE-21165
>             Project: HBase
>          Issue Type: Bug
>          Components: amv2, Operability
>            Reporter: stack
>            Assignee: stack
>            Priority: Major
>         Attachments: HBASE-21165.branch-2.1.001.patch
>
>
> I  have a Master that crashed on a large cluster with hundreds of outstanding 
> Procedure WALs and (probably --TBD) a few million Procedures to load. It is 
> taking a long time (two hours)... 
> There were STUCK procedures that were preventing clean-up of the old WALs.
> I can tell we are making progress by enabling TRACE on the Procedure Store. 
> Better would be an emission as we made progress through the files with an 
> output after every so many procedures loaded.
> Then, post-load, there is a long time spent sorting-out the Procedure 
> image... We are in finish for ages doing stuff like:
> {code}
> "master/vc0207:22001:becomeActiveMaster" #98 daemon prio=5 os_prio=0 
> tid=0x0000000000d31800 nid=0x1efc0 runnable [0x00007f0a3c17d000]
>    java.lang.Thread.State: RUNNABLE
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader$WalProcedureMap.removeFromMap(ProcedureWALFormatReader.java:837)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader$WalProcedureMap.fetchReady(ProcedureWALFormatReader.java:614)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader.finish(ProcedureWALFormatReader.java:201)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormat.load(ProcedureWALFormat.java:94)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.load(WALProcedureStore.java:426)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:382)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:663)
>         at 
> org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1335)
>         at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:878)
>         at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2119)
>         at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:567)
>         at 
> org.apache.hadoop.hbase.master.HMaster$$Lambda$42/1930759883.run(Unknown 
> Source)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> and....
> {code}
> "master/vc0207:22001:becomeActiveMaster" #98 daemon prio=5 os_prio=0 
> tid=0x0000000000d31800 nid=0x1efc0 runnable [0x00007f0a3c17d000]
>    java.lang.Thread.State: RUNNABLE
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at java.security.Provider$Service.newInstance(Provider.java:1595)
>         at sun.security.jca.GetInstance.getInstance(GetInstance.java:236)
>         at sun.security.jca.GetInstance.getInstance(GetInstance.java:164)
>         at java.security.Security.getImpl(Security.java:695)
>         at java.security.MessageDigest.getInstance(MessageDigest.java:167)
>         at org.apache.hadoop.hbase.util.MD5Hash.getMD5AsHex(MD5Hash.java:59)
>         at 
> org.apache.hadoop.hbase.client.RegionInfo.createRegionName(RegionInfo.java:560)
>         at 
> org.apache.hadoop.hbase.client.RegionInfo.createRegionName(RegionInfo.java:490)
>         at 
> org.apache.hadoop.hbase.client.RegionInfoBuilder$MutableRegionInfo.<init>(RegionInfoBuilder.java:243)
>         at 
> org.apache.hadoop.hbase.client.RegionInfoBuilder.build(RegionInfoBuilder.java:120)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toRegionInfo(ProtobufUtil.java:3132)
>         at 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure.deserializeStateData(ServerCrashProcedure.java:335)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:283)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader$Entry.convert(ProcedureWALFormatReader.java:359)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader$EntryIterator.next(ProcedureWALFormatReader.java:410)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:460)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:76)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:391)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore$2.load(WALProcedureStore.java:441)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormatReader.finish(ProcedureWALFormatReader.java:202)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFormat.load(ProcedureWALFormat.java:94)
>         at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.load(WALProcedureStore.java:426)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:382)
>         at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:663)
>         at 
> org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1335)
>         at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:878)
>         at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2119)
>         at 
> org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:567)
>         at 
> org.apache.hadoop.hbase.master.HMaster$$Lambda$42/1930759883.run(Unknown 
> Source)
> {code}
> In this case, it took 1.25 hours before we starting showing progress in the 
> log. Let me add progress output (Separately am looking at how to do speedup).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to