Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by stack:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest

The comment on the change is:
Added comments

------------------------------------------------------------------------------
- This is a provisional spec for the Hbase-REST api.
+ This is a provisional spec for the Hbase-REST API done under the aegis of [https://issues.apache.org/jira/browse/HADOOP-2068 HADOOP-2068].
+ ~-''St.Ack comment: Bryan, I added comments inline. Remove them when you are done.''-~

== System Information ==
- GET /
+ '''GET /'''

Retrieve a list of all the tables in HBase.

Returns:
@@ -14, +15 @@
  <table name="first_table" uri="/first_table" />
  <table name="second_table" uri="/second_table" />
</tables>
+ ~-''St.Ack comment: FYI, there is an XHTML formatter in hbase under the shell package. If we used that for outputting metadata-type pages such as this one, then we'd have a leg up on implementation. It uses xmlenc, which is bundled w/ hadoop. xmlenc is fast and dumb (like me). IIRC, it doesn't do entities; it adds the entity to the closer element too. This is dumb. On the other hand, it makes it so we don't have to have the entities vs. elements argument (Smile).''-~

- GET /[table_name]
+ '''GET /[table_name]'''

Retrieve metadata about the table. This includes column family descriptors.

Returns:
@@ -27, +29 @@
    <columnFamily name="stats" />
  </columnFamilies>
</table>
- 
+ ~-''St.Ack comment: FYI, here is an example column descriptor: {name: triples, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}. We're also about to add the ability to attach arbitrary key/value pairs to both table and column descriptors.''-~
+ 
- GET /[table_name]/regions
+ '''GET /[table_name]/regions'''

- Retrieve a list of the regions for this table so that you can efficiently split up work (a la MapReduce).
+ Retrieve a list of the regions for this table so that you can efficiently split up the work (a la MapReduce).
Options:
start_key, end_key: Only return the list of regions that contain the range start_key...end_key
@@ -41, +44 @@
  <region start_key="0201" server="region_server_3" />
</regions>
+ ~-''St.Ack comment: This won't be needed if you use TableInputFormat in your mapper -- but no harm in having it in place.''-~

== Row Interaction ==
- GET /[table_name]/row/[row_key]/timestamps
+ '''GET /[table_name]/row/[row_key]/timestamps'''

Retrieve a list of all the timestamps available for this row key.

Returns:
@@ -54, +58 @@
  <timestamp value="20071115T000800" uri="/first_table/row/0001/20071115T000800" />
  <timestamp value="20071115T001200" uri="/first_table/row/0001/20071115T001200" />
</timestamps>
+ 
+ ~-''St.Ack comment: Currently not supported in the native hbase client, but we should add it.''-~
- 
- GET /[table_name]/row/[row_key]/
+ '''GET /[table_name]/row/[row_key]/'''
- GET /[table_name]/row/[row_key]/[timestamp]
+ '''GET /[table_name]/row/[row_key]/[timestamp]'''

Retrieve data from a row, constrained by an optional timestamp value.

Headers:
@@ -70, +76 @@
column values out of the data.

Options:
columns: A semicolon-delimited list of column names. If omitted, the result will contain all columns in the row.
+ 
+ ~-''St.Ack comment: +1 that MIME is the way to return rows. -1 on octet-stream being an option. Just expect XML, or MIME if a full row is specified.''-~
- POST/PUT /[table_name]/row/[row_key]/
+ '''POST/PUT /[table_name]/row/[row_key]/'''
- POST/PUT /[table_name]/row/[row_key]/[timestamp]
+ '''POST/PUT /[table_name]/row/[row_key]/[timestamp]'''

Set the value of one or more columns for a given row key with an optional timestamp.

Headers:
@@ -89, +97 @@
HTTP 201 (Created) if the column(s) could successfully be saved.
HTTP 415 (Unsupported Media Type) if the query string column options do not match the Content-type header, or if the binary data of either octet-stream or Multipart/related is unreadable.
+ 
+ ~-''St.Ack comment: -1 again on octet-stream.
It messes up your nice clean API. Might consider adding the column name as a MIME header if multipart, rather than having columns as an option IF multipart (ignored if XML). Might not make sense if this is the only time it's done (since everywhere else we need to be able to handle the columns option).''-~
- DELETE /[table_name]/row/[row_key]/
+ '''DELETE /[table_name]/row/[row_key]/'''
- DELETE /[table_name]/row/[row_key]/[timestamp]
+ '''DELETE /[table_name]/row/[row_key]/[timestamp]'''

Delete the specified columns from the row. If no columns are specified, then it will delete ALL columns. Optionally, specify a timestamp.

Options:
@@ -103, +113 @@

== Scanning ==
- POST/PUT /[table_name]/scanner
+ '''POST/PUT /[table_name]/scanner'''

Request that a scanner be created with the specified options. Returns a scanner ID that can be used to iterate over the results of the scanner.

Options:
columns: A semicolon-delimited list of column names. If omitted, each result will contain all columns in the row.
@@ -113, +123 @@
HTTP 201 (Created) with a Location header that references the scanner URI. Example: /first_table/scanner/1234348890231890

- GET /[table_name]/scanner/[scanner_id]/current
+ '''GET /[table_name]/scanner/[scanner_id]/current'''

Get the row and columns for the current item in the scanner without advancing the scanner. Equivalent to a queue peek operation. Multiple requests to this URI will return the same result.
@@ -132, +142 @@

If the scanner is used up, HTTP 404 (Not Found).

- DELETE /[table_name]/scanner/[scanner_id]/current
+ '''DELETE /[table_name]/scanner/[scanner_id]/current'''

Return the current item in the scanner and advance to the next one. Think of it as a queue dequeue operation.

Headers:
@@ -149, +159 @@
depends on the Accept header. See the documentation for getting an individual row for the data format.

If the scanner is used up, HTTP 404 (Not Found).
+ 
+ ~- Stack comment: DELETE to increment strikes me as wrong.
What about a POST/PUT to the URL /[table_name]/scanner/[scanner_id]/next? It would return the current item and move the scanner to the next one. -~
- DELETE /[table_name]/scanner/[scanner_id]
+ '''DELETE /[table_name]/scanner/[scanner_id]'''

Close a scanner. You must call this when you are done using a scanner, to deallocate it.

Returns:
HTTP 202 (Accepted) if it can be closed.
HTTP 404 (Not Found) if the scanner ID is invalid.
HTTP 410 (Gone) if the scanner is already closed or the lease time has expired.
+ 
+ == Exception Handling ==
+ Generally, exceptions will show up on the REST client side as 40Xs with a descriptive message, and possibly a body containing the Java stack trace.
+ 
+ TODO: Table of the types of exceptions a client could get and how they should react.
+ 
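As a rough illustration of how a client might consume the endpoints proposed above, here is a short Python sketch. It is a sketch only, not part of the spec: the host/port and helper names are assumptions, and rather than perform live HTTP calls it builds the URLs described in the spec and parses the sample `<tables>` response from the '''GET /''' section.

```python
import xml.etree.ElementTree as ET

# Assumed base URL for illustration; the spec does not fix a host or port.
BASE = "http://localhost:60010"

def endpoint_url(table, *parts):
    """Build one of the URLs described in the spec,
    e.g. /[table_name]/row/[row_key]/[timestamp]."""
    return "/".join([BASE, table] + list(parts))

# Sample response body from the "GET /" (list all tables) section.
SAMPLE_TABLES = """
<tables>
  <table name="first_table" uri="/first_table" />
  <table name="second_table" uri="/second_table" />
</tables>
"""

def parse_tables(xml_text):
    """Return (name, uri) pairs from a GET / response body."""
    root = ET.fromstring(xml_text)
    return [(t.get("name"), t.get("uri")) for t in root.findall("table")]

if __name__ == "__main__":
    # → [('first_table', '/first_table'), ('second_table', '/second_table')]
    print(parse_tables(SAMPLE_TABLES))
    # → http://localhost:60010/first_table/row/0001/20071115T000800
    print(endpoint_url("first_table", "row", "0001", "20071115T000800"))
```

A real client would additionally set the Accept/Content-type headers discussed above and, for scanning, keep the scanner URI returned in the Location header of the creation response.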