[ 
https://issues.apache.org/jira/browse/HBASE-26209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438451#comment-17438451
 ] 

shihuafeng edited comment on HBASE-26209 at 11/4/21, 2:10 AM:
--------------------------------------------------------------

*2. why did region assignment happen in split log*

      *i*  think  that it   trigger  history task  on /hbase/MasterProcWALs

021-11-02 11:39:14,081 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1635822375153
 *2021-11-02 12:00:45,919 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,{color:#ff0000}1627032643439{color}*
 2021-11-02 12:00:46,049 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,1631773418557
 2021-11-02 12:00:46,178 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,1628213135404
 2021-11-02 12:00:46,306 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy11.esgync.local,60020,1631328697683
 2021-11-02 12:00:46,435 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy21.esgync.local,60020,1623984229877
 2021-11-02 12:00:46,435 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1627032643439
 2021-11-02 12:00:46,570 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy21.esgync.local,60020,1627351005533
 2021-11-02 12:00:46,572 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1631773418557
 2021-11-02 12:00:46,701 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,1632296713023
 2021-11-02 12:00:46,702 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1628213135404
 2021-11-02 12:00:46,847 INFO

*region   assign log* 

2021-11-02 12:00:45,550 INFO org.apache.hadoop.hbase.master.RegionStates: 
Transition \{617e7194714d0952104a3626935495db state=OPEN, ts=1635825645550, 
server=gy14.esgync.local,60020,1635824505300} to 
\{617e7194714d0952104a3626935495db state=OFFLINE, ts=1635825645550, 
server=gy14.esgync.local,60020,1635824505300}
2021-11-02 12:01:05,917 INFO org.apache.hadoop.hbase.master.RegionStates: 
Transition \{617e7194714d0952104a3626935495db state=OFFLINE, ts=1635825665835, 
server=gy14.esgync.local,60020,1635824505300} to 
\{617e7194714d0952104a3626935495db state=PENDING_OPEN, ts=1635825665917, 
server=gy14.esgync.local,60020,1635825620829}
2021-11-02 12:01:07,110 INFO org.apache.hadoop.hbase.master.RegionStates: 
Transition \{617e7194714d0952104a3626935495db state=PENDING_OPEN, 
ts=1635825665917, server=gy14.esgync.local,60020,1635825620829} to 
\{617e7194714d0952104a3626935495db state=OPENING, ts=1635825667110, 
server=gy14.esgync.local,60020,1635825620829}
2021-11-02 12:01:32,226 INFO org.apache.hadoop.hbase.master.RegionStates: 
Transition \{617e7194714d0952104a3626935495db state=OPENING, ts=1635825667110, 
server=gy14.esgync.local,60020,1635825620829} to 
\{617e7194714d0952104a3626935495db state=OPEN, ts=1635825692226, 
server=gy14.esgync.local,60020,1635825620829}
2021-11-02 12:01:32,238 INFO org.apache.hadoop.hbase.master.RegionStates: 
Offlined 617e7194714d0952104a3626935495db from 
gy14.esgync.local,60020,1635824505300

 

org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy25.esgync.local,60020,1626313309249
 2021-11-02 12:00:46,847 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy11.esgync.local,60020,1631328697683


was (Author: shihuafeng):
*2. why did region assignment happen in split log*

      *i*  think  that it   trigger  history task  on /hbase/MasterProcWALs

021-11-02 11:39:14,081 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1635822375153
*2021-11-02 12:00:45,919 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,{color:#FF0000}1627032643439{color}*
2021-11-02 12:00:46,049 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,1631773418557
2021-11-02 12:00:46,178 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,1628213135404
2021-11-02 12:00:46,306 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy11.esgync.local,60020,1631328697683
2021-11-02 12:00:46,435 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy21.esgync.local,60020,1623984229877
2021-11-02 12:00:46,435 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1627032643439
2021-11-02 12:00:46,570 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy21.esgync.local,60020,1627351005533
2021-11-02 12:00:46,572 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1631773418557
2021-11-02 12:00:46,701 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy14.esgync.local,60020,1632296713023
2021-11-02 12:00:46,702 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy14.esgync.local,60020,1628213135404
2021-11-02 12:00:46,847 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start processing 
crashed gy25.esgync.local,60020,1626313309249
2021-11-02 12:00:46,847 INFO 
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Finished 
processing of crashed gy11.esgync.local,60020,1631328697683

> edit file loss result in  data loss  when restart hbase cluster
> ---------------------------------------------------------------
>
>                 Key: HBASE-26209
>                 URL: https://issues.apache.org/jira/browse/HBASE-26209
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>         Environment: Linux version 3.10.0-693.el7.x86_64 
> ([email protected]) (gcc version 4.8.5 20150623 (Red 
> Hat 4.8.5-16) (GCC) ) #1 SMP Thu Jul 6 19:56:57 EDT 2017
>            Reporter: shihuafeng
>            Priority: Blocker
>         Attachments: Repaly_edit.log, rename_edit.log
>
>
> {{      when i  restart  hbase cluster,i find  edit file loss when wal  
> repaly .}}
>          the number of edit files (00000000seqid.tmp to   00000000seqid )  is 
> 31 when split wal to  edit.  But when i read edits to repaly ,i foud the  sum 
> is 30.
>          i see rename file is sucessful,but i can not find the edit file .
> {panel:title=/var/log/message     i find  system exception}
>   ACPI Error: SMBus/IPMI/GenericSerialBus write requires Buffer of length 66, 
> found length 32 (20130517/exfield-389)
>  Aug 16 03:30:41 esgsh6 kernel: ACPI Error: Method parse/execution failed 
> [\_SB_.PMI0._PMM] (Node ffff8810e9eab258), AE_AML_BUFFER_LIMIT 
> (20130517/psparse-536)
> {panel}
>  #  *rename (00000000seqid.tmp to   00000000seqid is 31* 
>         
>          i can not find the follwing file , i confirm the edit file 
> (0*000000000001825010*)is not empty.
>   
> {panel:title=log}
> hbase-cmf-hbase-REGIONSERVER-gy11.esgync.local.log.out:2021-08-16 
> 17:56:28,956 INFO org.apache.hadoop.hbase.wal.WALSplitter: Rename 
> hdfs://nameservice1/hbase/data/default/TRAFODION.JAVABENCH2.OE_STOCK_INDEX_300/8a42de414d97b457da88bc4682dd7c52/recovered.edits/0000000000001810650.temp
>  to 
> hdfs://nameservice1/hbase/data/default/TRAFODION.JAVABENCH2.OE_STOCK_INDEX_300/8a42de414d97b457da88bc4682dd7c52/recovered.edits/0*000000000001825010*
> {panel}
> **
>          you can see attachment{color:#999999} {color} 
> *{color:#3366ff}r{color}*{color:#3366ff}*ename_edit.log*{color}
>      *2.  at replay phase , Reading the edits is 30* 
>   
> {panel:title=log}
>       hbase-cmf-hbase-REGIONSERVER-gy11.esgync.local.log.out:2021-08-16 
> 17:56:14,938      INFO org.apache.hadoop.hbase.regionserver.HRegion: after 
> replayRecoveredEdits Maximum sequenceid  1914955 and minimum sequenceid for 
> the region is 1916711, replay the file, 
> path=hdfs://nameservice1/hbase/data/default/TRAFODION.JAVABENCH2.OE_STOCK_INDEX_300/8a42de414d97b457da88bc4682dd7c52/recovered.edits/0000000000001914955,seqid=1916711,*size=30*
> {panel}
> **
>   
>   
> {code:java}
> org.apache.hadoop.hbase.regionserver.HRegion
>  NavigableSet<Path> files = WALSplitter.getSplitEditFilesSorted(fs, 
> regiondir);
> if (LOG.isDebugEnabled()) {
>  LOG.debug("Found " + (files == null ? 0 : files.size())
>  + " recovered edits file(s) under " + regiondir);
> }
> if (files == null || files.isEmpty()) return seqid;
> long start=System.currentTimeMillis();
> for (Path edits: files) {
>  if (edits == null || !fs.exists(edits)) {
>  LOG.warn("Null or non-existent edits file: " + edits);
>  continue;
>  }
>  if (isZeroLengthThenDelete(fs, edits)) continue;
>  long maxSeqId;
>  String fileName = edits.getName();
>  maxSeqId = Math.abs(Long.parseLong(fileName));
>  if (maxSeqId <= minSeqIdForTheRegion) {
>  if (LOG.isDebugEnabled()) {
>  String msg = "Maximum sequenceid for this wal is " + maxSeqId
>  + " and minimum sequenceid for the region is " + minSeqIdForTheRegion
>  + ", skipped the whole file, path=" + edits;
>  LOG.info(msg);
>  }
>  continue;
>  }
>  try {
>  seqid = Math.max(seqid, replayRecoveredEdits(edits, maxSeqIdInStores, 
> reporter));
>  // replay the edits. Replay can return -1 if everything is skipped, only 
> update
>  // if seqId is greater
>  String msg = "after replayRecoveredEdits Maximum sequenceid " + maxSeqId
>  + " and minimum sequenceid for the region is " + minSeqIdForTheRegion
>  + ", replay the file, path=" + edits
>  +",seqid="+seqid+",size="+files.size();
>  LOG.info(msg);{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to