[ https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188642#comment-13188642 ]
jirapos...@reviews.apache.org commented on HBASE-2600:
------------------------------------------------------

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3466/
-----------------------------------------------------------

(Updated 2012-01-18 19:21:37.796315)

Review request for hbase, Michael Stack and Lars Hofhansl.

Summary (updated)
-------

This is an idea that Ryan and I have been kicking around on and off for a while now.

If regionnames were made of tablename+endrow instead of tablename+startrow, then in the metatables, doing a search for the region that contains the wanted row, we'd just have to open a scanner using passed row and the first row found by the scan would be that of the region we need (If offlined parent, we'd have to scan to the next row).

If we redid the meta tables in this format, we'd be using an access that is natural to hbase, a scan as opposed to the perverse, expensive getClosestRowBefore we currently have that has to walk backward in meta finding a containing region.

This issue is about changing the way we name regions.

If we were using scans, prewarming client cache would be near costless (as opposed to what we'll currently have to do which is first a getClosestRowBefore and then a scan from the closestrowbefore forward).

Converting to the new method, we'd have to run a migration on startup changing the content in meta.

Up to this, the randomid component of a region name has been the timestamp of region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy too susceptible to hash clashes" proposes changing the randomid so that it contains actual name of the directory in the filesystem that hosts the region. If we had this in place, I think it would help with the migration to this new way of doing the meta because as is, the region name in fs is a hash of regionname...
changing the format of the regionname would mean we generate a different hash... so we'd need hbase-2531 to be in place before we could do this change.

This addresses bug HBASE-2600.
    https://issues.apache.org/jira/browse/HBASE-2600

Diffs
-----

  src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 74cb821
  src/main/java/org/apache/hadoop/hbase/HConstants.java 8370ef8
  src/main/java/org/apache/hadoop/hbase/HTableDescriptor.java 133759d
  src/main/java/org/apache/hadoop/hbase/KeyValue.java be7e2d8
  src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java e5e60a8
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 88c381f
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java d857538
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 57605e6
  src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 784fdc2
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java a5c198f
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java f0c6828
  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java 7a7b896
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java b1b5a78
  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java c0a4184
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 0431444
  src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 4307d89
  src/main/java/org/apache/hadoop/hbase/regionserver/GetClosestRowBeforeTracker.java 3a26bbb
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java c7cc402
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 5cb606f
  src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 50e7fe0
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java a3850e5
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 636e533
  src/main/java/org/apache/hadoop/hbase/rest/RegionsResource.java bf85bc1
  src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 3919985
  src/main/java/org/apache/hadoop/hbase/rest/model/TableRegionModel.java 67e7a04
  src/main/java/org/apache/hadoop/hbase/thrift/ThriftServerRunner.java b6a6349
  src/main/java/org/apache/hadoop/hbase/thrift/generated/Hbase.java 9e31c61
  src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift 5821d31
  src/test/java/org/apache/hadoop/hbase/TestKeyValue.java dc4ee8d
  src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 95ab8e6
  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java dacb936
  src/test/java/org/apache/hadoop/hbase/regionserver/TestGetClosestAtOrBefore.java 5f97167
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionInfo.java 6e1211b
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java 33c78ab
  src/test/java/org/apache/hadoop/hbase/rest/TestStatusResource.java cffdcb6
  src/test/java/org/apache/hadoop/hbase/rest/model/TestTableRegionModel.java b6f0ab5

Diff: https://reviews.apache.org/r/3466/diff

Testing
-------

Unit tests started table.

Tests in error:
  org.apache.hadoop.hbase.client.TestMetaMigrationRemovingHTD: Table 'TestTable we searched for the StartKey: TestTable ,, startKey lastChar's int value: 32 with the stopKey: TestTable#,, stopRow lastChar's int value: 35 with parentTable:.META.

I need to know how to update/recreate the tar ball which is the source for that test.
Thanks,

Alex


> Change how we do meta tables; from tablename+STARTROW+randomid to instead,
> tablename+ENDROW+randomid
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2600
>                 URL: https://issues.apache.org/jira/browse/HBASE-2600
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Alex Newman
>         Attachments:
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v8.1,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v9.patch,
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch,
> 2600-trunk-01-17.txt, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow,
> then in the metatables, doing a search for the region that contains the
> wanted row, we'd just have to open a scanner using passed row and the first
> row found by the scan would be that of the region we need (If offlined
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is
> natural to hbase, a scan as opposed to the perverse, expensive
> getClosestRowBefore we currently have that has to walk backward in meta
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as
> opposed to what we'll currently have to do which is first a
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of
> region creation. HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy
> too susceptible to hash clashes" proposes changing the randomid so that it
> contains actual name of the directory in the filesystem that hosts the
> region. If we had this in place, I think it would help with the migration to
> this new way of doing the meta because as is, the region name in fs is a hash
> of regionname... changing the format of the regionname would mean we generate
> a different hash... so we'd need hbase-2531 to be in place before we could do
> this change.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
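
For readers following the issue description, here is a minimal sketch of the lookup difference it describes. This is a toy model, not the HBase API and not the code in the patch: a java.util.TreeMap stands in for the sorted meta table, and the class name, the region labels, and the LAST sentinel are all hypothetical. It assumes region end keys are exclusive, so the forward lookup takes the first key strictly greater than the wanted row. It ignores the tablename and randomid parts of the region name and the offlined-parent case.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model of locating the region that contains a row.
 * A TreeMap stands in for the sorted meta table; this is NOT the HBase API.
 */
public class RegionLookupSketch {
    // Hypothetical sentinel for "end of table" (assumption for this sketch only;
    // it lets the last region sort after every other end key).
    static final String LAST = "\uffff";

    // Current scheme: meta rows keyed by region START key.
    static final TreeMap<String, String> BY_START = new TreeMap<>();
    // Proposed scheme: meta rows keyed by region END key (treated as exclusive).
    static final TreeMap<String, String> BY_END = new TreeMap<>();

    static {
        // Three regions: ["", "g"), ["g", "p"), ["p", end of table)
        BY_START.put("", "region-1");
        BY_START.put("g", "region-2");
        BY_START.put("p", "region-3");

        BY_END.put("g", "region-1");
        BY_END.put("p", "region-2");
        BY_END.put(LAST, "region-3");
    }

    /** Start-key naming: needs the closest key at or before the row (a backward search). */
    public static String lookupByStartKey(String row) {
        Map.Entry<String, String> e = BY_START.floorEntry(row);
        return e == null ? null : e.getValue();
    }

    /** End-key naming: the first key after the row, which is what a forward scan finds. */
    public static String lookupByEndKey(String row) {
        Map.Entry<String, String> e = BY_END.higherEntry(row);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        // Both schemes should name the same region for every row.
        for (String row : new String[] {"a", "g", "k", "z"}) {
            System.out.println(row + " -> " + lookupByStartKey(row)
                    + " / " + lookupByEndKey(row));
        }
    }
}
```

In this model floorEntry plays the role of getClosestRowBefore, while higherEntry plays the role of opening a scanner at the wanted row and taking the first result.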