[ 
https://issues.apache.org/jira/browse/HBASE-19343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-19343:
---------------------------------
    Description: 
Restoring a snapshot brings the already-split parent region back online, as 
shown in the attached snapshot.

Steps to reproduce
=====================
1. Create a table.
2. Insert a few records into the table.
3. Flush the table.
4. Split the table.
5. Create a snapshot before the catalog janitor clears the parent region entry 
from meta.
6. Restore the snapshot.


The problem is visible in the meta entries.

Meta content before restore snapshot:
{noformat}
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:regioninfo, timestamp=1511537565964, value={ENCODED => 077a12b0b3c91b053fa95223635f9543, NAME => 't1,,1511537529449.077a12b0b3c91b053fa95223635f9543.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true}
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:seqnumDuringOpen, timestamp=1511537530107, value=\x00\x00\x00\x00\x00\x00\x00\x02
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:server, timestamp=1511537530107, value=host-xx:16020
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:serverstartcode, timestamp=1511537530107, value=1511537511523
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:splitA, timestamp=1511537565964, value={ENCODED => 3c7c866d4df370c586131a4cbe0ef6a8, NAME => 't1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.', STARTKEY => '', ENDKEY => 'm'}
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:splitB, timestamp=1511537565964, value={ENCODED => dc7facd824c85b94e5bf6a2e6b5f5efc, NAME => 't1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.', STARTKEY => 'm', ENDKEY => ''}
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:regioninfo, timestamp=1511537566075, value={ENCODED => 3c7c866d4df370c586131a4cbe0ef6a8, NAME => 't1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.', STARTKEY => '', ENDKEY => 'm'}
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:seqnumDuringOpen, timestamp=1511537566075, value=\x00\x00\x00\x00\x00\x00\x00\x02
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:server, timestamp=1511537566075, value=host-xx:16020
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:serverstartcode, timestamp=1511537566075, value=1511537511523
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:regioninfo, timestamp=1511537566069, value={ENCODED => dc7facd824c85b94e5bf6a2e6b5f5efc, NAME => 't1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.', STARTKEY => 'm', ENDKEY => ''}
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:seqnumDuringOpen, timestamp=1511537566069, value=\x00\x00\x00\x00\x00\x00\x00\x08
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:server, timestamp=1511537566069, value=host-xx:16020
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:serverstartcode, timestamp=1511537566069, value=1511537511523
{noformat}

Meta content after restore snapshot:
{noformat}
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:regioninfo, timestamp=1511537667635, value={ENCODED => 077a12b0b3c91b053fa95223635f9543, NAME => 't1,,1511537529449.077a12b0b3c91b053fa95223635f9543.', STARTKEY => '', ENDKEY => ''}
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:seqnumDuringOpen, timestamp=1511537667635, value=\x00\x00\x00\x00\x00\x00\x00\x0A
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:server, timestamp=1511537667635, value=host-xx:16020
 t1,,1511537529449.077a12b0b3c91b053fa95223635f9543.
    column=info:serverstartcode, timestamp=1511537667635, value=1511537511523
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:regioninfo, timestamp=1511537667598, value={ENCODED => 3c7c866d4df370c586131a4cbe0ef6a8, NAME => 't1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.', STARTKEY => '', ENDKEY => 'm'}
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:seqnumDuringOpen, timestamp=1511537667598, value=\x00\x00\x00\x00\x00\x00\x00\x0B
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:server, timestamp=1511537667598, value=host-xx:16020
 t1,,1511537565718.3c7c866d4df370c586131a4cbe0ef6a8.
    column=info:serverstartcode, timestamp=1511537667598, value=1511537511523
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:regioninfo, timestamp=1511537667621, value={ENCODED => dc7facd824c85b94e5bf6a2e6b5f5efc, NAME => 't1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.', STARTKEY => 'm', ENDKEY => ''}
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:seqnumDuringOpen, timestamp=1511537667621, value=\x00\x00\x00\x00\x00\x00\x00\x0D
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:server, timestamp=1511537667621, value=host-xx:16020
 t1,m,1511537565718.dc7facd824c85b94e5bf6a2e6b5f5efc.
    column=info:serverstartcode, timestamp=1511537667621, value=1511537511523
{noformat}

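Root Cause:
We don't update the region split information in the .regioninfo file in HDFS, 
but while restoring the snapshot we set the regioninfo based on the .regioninfo 
entries. As a result the parent's regioninfo is written back to meta without 
OFFLINE => true and SPLIT => true, as the meta content after the restore shows.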
{code}
    // Identify which region are still available and which not.
    // NOTE: we rely upon the region name as: "table name, start key, end key"
    List<HRegionInfo> tableRegions = getTableRegions();
    if (tableRegions != null) {
      monitor.rethrowException();
      for (HRegionInfo regionInfo: tableRegions) {
        String regionName = regionInfo.getEncodedName();
        if (regionNames.contains(regionName)) {
          LOG.info("region to restore: " + regionName);
          regionNames.remove(regionName);
          metaChanges.addRegionToRestore(regionInfo);
        } else {
          LOG.info("region to remove: " + regionName);
          metaChanges.addRegionToRemove(regionInfo);
        }
      }
      // ... remainder of the block omitted in this excerpt
    }
{code}
Here the region list returned by getTableRegions() is read from HDFS, i.e. from 
the .regioninfo files.


There are two possible solutions:
1. Set the regioninfo based on the snapshot-manifest details.
2. Update the .regioninfo after a region split.
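
A minimal, stand-alone sketch of solution 1 follows. It is not HBase code: the Region class, the regionsToRestore method and the in-memory "manifest" map are hypothetical stand-ins used only for illustration, and it assumes the snapshot manifest still records the parent region as offline and split. The idea is that for every region found on disk that also belongs to the snapshot, the copy from the manifest is used for the meta update instead of the copy loaded from .regioninfo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone model of solution 1; none of these types are HBase classes. */
final class RestoreRegionInfoSketch {

  /** Hypothetical, simplified stand-in for HRegionInfo. */
  static final class Region {
    final String encodedName;
    final boolean offline;
    final boolean split;

    Region(String encodedName, boolean offline, boolean split) {
      this.encodedName = encodedName;
      this.offline = offline;
      this.split = split;
    }
  }

  /**
   * For every on-disk region that is also part of the snapshot, returns the
   * manifest's copy of the region info rather than the on-disk copy, so the
   * OFFLINE/SPLIT flags of a split parent are preserved.
   */
  static List<Region> regionsToRestore(List<Region> onDisk, Map<String, Region> manifest) {
    List<Region> result = new ArrayList<>();
    for (Region diskRegion : onDisk) {
      Region fromManifest = manifest.get(diskRegion.encodedName);
      if (fromManifest != null) {
        result.add(fromManifest); // not diskRegion, whose flags may be stale
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String parent = "077a12b0b3c91b053fa95223635f9543";
    String daughterA = "3c7c866d4df370c586131a4cbe0ef6a8";
    String daughterB = "dc7facd824c85b94e5bf6a2e6b5f5efc";

    // .regioninfo on HDFS: the parent's file was never rewritten after the split.
    List<Region> onDisk = Arrays.asList(
        new Region(parent, false, false),
        new Region(daughterA, false, false),
        new Region(daughterB, false, false));

    // Assumption: the manifest records the parent as offline and split.
    Map<String, Region> manifest = new LinkedHashMap<>();
    manifest.put(parent, new Region(parent, true, true));
    manifest.put(daughterA, new Region(daughterA, false, false));
    manifest.put(daughterB, new Region(daughterB, false, false));

    for (Region r : regionsToRestore(onDisk, manifest)) {
      System.out.println(r.encodedName + " OFFLINE=" + r.offline + " SPLIT=" + r.split);
    }
  }
}
```

With this choice the parent would go back into meta with its split state intact, while the daughters are unaffected. Solution 2 would instead fix the stale data at its source, at the cost of an extra HDFS write during every split.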


> Restore snapshot makes parent split region online 
> --------------------------------------------------
>
>                 Key: HBASE-19343
>                 URL: https://issues.apache.org/jira/browse/HBASE-19343
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Pankaj Kumar
>            Assignee: Pankaj Kumar
>



