[jira] [Commented] (HBASE-2600) Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

stack (Commented) (JIRA) Tue, 17 Jan 2012 17:36:08 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188181#comment-13188181
 ]


stack commented on HBASE-2600:
------------------------------

Why does this have to be hardcoded?

{code}
-        return locateRegionInMeta(HConstants.ROOT_TABLE_NAME, tableName, row,
-            useCache, metaRegionLock);
+
+        //HARD CODED TO POINT TO THE FIRST META TABLE
+        return locateRegionInMeta(HConstants.ROOT_TABLE_NAME,
+                                  HConstants.META_TABLE_NAME,
+                                  HConstants.EMPTY_BYTE_ARRAY,
+                                  useCache,
+                                  metaRegionLock);
{code}

It works, right?

I'm looking at NINES in HConnectionManager... we don't need this anymore now 
that we are scanning in the 'natural' direction?

Is this enough?

{code}
+            // We always try to get two rows just in case one of them is a split.
+            Result[] result = server.next(scannerId, 2);
{code}

What if the split has split?  Then you'd have two offlined regions in meta... 
so you'd have to scan a third to get the live one (and so on... if the split is 
split is split....)
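
A minimal sketch of the concern above, assuming meta rows sort by end key and then by region id, and that an offlined split parent shares its end key with its last daughter. MetaRow and locate are hypothetical names for illustration, not the patch's API, and the empty end key of a table's last region is ignored here:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class SplitScanSketch {
  /** Hypothetical, simplified stand-in for one region row in meta. */
  static final class MetaRow {
    final String endKey;
    final long regionId;
    final boolean offline;

    MetaRow(String endKey, long regionId, boolean offline) {
      this.endKey = endKey;
      this.regionId = regionId;
      this.offline = offline;
    }
  }

  /**
   * Scan forward from the first row whose end key is greater than the wanted
   * row, skipping every offlined region instead of fetching a fixed two rows.
   */
  static Optional<MetaRow> locate(List<MetaRow> meta, String row) {
    List<MetaRow> sorted = new ArrayList<>(meta);
    sorted.sort(Comparator.comparing((MetaRow r) -> r.endKey)
        .thenComparingLong(r -> r.regionId));
    for (MetaRow r : sorted) {
      if (r.endKey.compareTo(row) <= 0) {
        continue; // region ends at or before the wanted row
      }
      if (!r.offline) {
        return Optional.of(r); // first live region past the row
      }
      // offlined split parent: keep scanning
    }
    return Optional.empty();
  }
}
```

With a parent and a daughter both offlined under the same end key, the live region is the third row the scan reaches, so a fixed fetch of two rows would come back without it.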

Is this comment right?

{code}
-      // <tableName>,<startKey>,<regionIdTimeStamp>/encodedName/
+      // <tableName>,<endKey>,<regionIdTimeStamp>/encodedName/
{code}

Should there be a '!' in there?

This I think I follow, but it's kind of an important change so it should be 
crystal clear:

{code}
+  // It should say, the tablename encoded in the region ends with !,
+  // but the last region's tablename ends with "
+  public static final int END_OF_TABLE_NAME = 33;  // The ascii for !
+  public static final int END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY =
+          END_OF_TABLE_NAME + 1;
{code}

So, last region in a table has a '!' delimiter between it and its empty endrow 
rather than a ','?

Is the comment above complete?  What's the '"' about?

Oh, I see.  Let's discuss the actual characters used.  Hopefully there can be 
better ones than '!' and '"' (but this is minor).
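
To make the ordering explicit, here is a small sketch of why two delimiters might be needed. It assumes a key layout of <tableName><delimiter><endKey>,<id> (my guess at the layout, not confirmed by the patch) and an unsigned byte-wise comparison; regionKey and compare are hypothetical helpers:

```java
import java.nio.charset.StandardCharsets;

class DelimiterOrderSketch {
  static final int END_OF_TABLE_NAME = 33; // ascii '!'
  static final int END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY = END_OF_TABLE_NAME + 1; // ascii '"'

  /** Hypothetical layout: tableName, delimiter, endKey, ',', id. */
  static byte[] regionKey(String table, String endKey, String id) {
    int delim = endKey.isEmpty()
        ? END_OF_TABLE_NAME_FOR_EMPTY_ENDKEY
        : END_OF_TABLE_NAME;
    String key = table + (char) delim + endKey + "," + id;
    return key.getBytes(StandardCharsets.UTF_8);
  }

  /** Same layout but with a single ',' delimiter, for comparison only. */
  static byte[] commaKey(String table, String endKey, String id) {
    return (table + "," + endKey + "," + id).getBytes(StandardCharsets.UTF_8);
  }

  /** Unsigned lexicographic byte comparison. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) {
        return d;
      }
    }
    return a.length - b.length;
  }
}
```

With a single ',' delimiter, the last region's empty end key would sort ahead of every other region in the table. Giving the last region the next byte up ('"' after '!') pushes it behind all of that table's other regions, which is what a forward scan needs.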

You do this a bunch in your patch:


{code}
-    return createRegionName(tableName, startKey, Bytes.toBytes(id), newFormat);
+      final byte [] endKey, final String id, boolean newFormat) {
+    return createRegionName(tableName, endKey, Bytes.toBytes(id), newFormat);
   }
+    /**
+     * Make a region name of passed parameters.
+     *
{code}

I'm referring to the spacing.  It should be two spaces, not four.

How is this so?

{code}
-      final int metalength = 7; // '.META.' length
+      final int metalength = 8; // '.META.' length
{code}

Update the comment to explain the 8, I'd say.

> Change how we do meta tables; from tablename+STARTROW+randomid to instead, 
> tablename+ENDROW+randomid
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2600
>                 URL: https://issues.apache.org/jira/browse/HBASE-2600
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Alex Newman
>         Attachments: 
> 0001-Changed-regioninfo-format-to-use-endKey-instead-of-s.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v4.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v6.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen-v7.2.patch, 
> 0001-HBASE-2600.-Change-how-we-do-meta-tables-from-tablen.patch, 
> 2600-trunk-01-17.txt, jenkins.pdf
>
>
> This is an idea that Ryan and I have been kicking around on and off for a 
> while now.
> If regionnames were made of tablename+endrow instead of tablename+startrow, 
> then in the metatables, doing a search for the region that contains the 
> wanted row, we'd just have to open a scanner using passed row and the first 
> row found by the scan would be that of the region we need (If offlined 
> parent, we'd have to scan to the next row).
> If we redid the meta tables in this format, we'd be using an access that is 
> natural to hbase, a scan as opposed to the perverse, expensive 
> getClosestRowBefore we currently have that has to walk backward in meta 
> finding a containing region.
> This issue is about changing the way we name regions.
> If we were using scans, prewarming client cache would be near costless (as 
> opposed to what we'll currently have to do which is first a 
> getClosestRowBefore and then a scan from the closestrowbefore forward).
> Converting to the new method, we'd have to run a migration on startup 
> changing the content in meta.
> Up to this, the randomid component of a region name has been the timestamp of 
> region creation.   HBASE-2531 "32-bit encoding of regionnames waaaaaaayyyyy 
> too susceptible to hash clashes" proposes changing the randomid so that it 
> contains actual name of the directory in the filesystem that hosts the 
> region.  If we had this in place, I think it would help with the migration to 
> this new way of doing the meta because as is, the region name in fs is a hash 
> of regionname... changing the format of the regionname would mean we generate 
> a different hash... so we'd need hbase-2531 to be in place before we could do 
> this change.
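
For illustration only, the difference between the two naming schemes in the description above can be sketched with a sorted map standing in for meta. This assumes plain string keys and a "\uffff" sentinel for the last region's empty end key; the class and method names are hypothetical:

```java
import java.util.Map;
import java.util.TreeMap;

class MetaLookupSketch {
  /**
   * Start-key naming: the containing region is the greatest start key that is
   * less than or equal to the row, i.e. a backward lookup (the analogue of
   * getClosestRowBefore).
   */
  static String byStartKey(TreeMap<String, String> meta, String row) {
    Map.Entry<String, String> e = meta.floorEntry(row);
    return e == null ? null : e.getValue();
  }

  /**
   * End-key naming: the containing region is the smallest end key strictly
   * greater than the row, i.e. the first entry a forward scan would return.
   */
  static String byEndKey(TreeMap<String, String> meta, String row) {
    Map.Entry<String, String> e = meta.higherEntry(row);
    return e == null ? null : e.getValue();
  }

  /** Three regions A ["", "g"), B ["g", "p"), C ["p", end) keyed by start key. */
  static TreeMap<String, String> startKeyed() {
    TreeMap<String, String> m = new TreeMap<>();
    m.put("", "A");
    m.put("g", "B");
    m.put("p", "C");
    return m;
  }

  /** The same regions keyed by end key; "\uffff" is an assumed sentinel. */
  static TreeMap<String, String> endKeyed() {
    TreeMap<String, String> m = new TreeMap<>();
    m.put("g", "A");
    m.put("p", "B");
    m.put("\uffff", "C");
    return m;
  }
}
```

Both lookups resolve a row to the same region; only the end-key form does it by moving forward through the sorted keys.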

