Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by stack:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest

The comment on the change is:
Added comments

------------------------------------------------------------------------------
- This is a provisional spec for the Hbase-REST api.
+ This is a provisional spec for the Hbase-REST API done under the aegis of [https://issues.apache.org/jira/browse/HADOOP-2068 HADOOP-2068].
+ ~-''St.Ack comment: Bryan, I added comments inline. Remove them when you are done.''-~

== System Information ==
- GET /
+ '''GET /'''

Retrieve a list of all the tables in HBase.

Returns:
@@ -14, +15 @@
  <table name="first_table" uri="/first_table" />
  <table name="second_table" uri="/second_table" />
</tables>
+ ~-''St.Ack comment: FYI, there is an XHTML formatter in hbase under the shell package. If we used that for outputting metadata-type pages such as this one, then we'd have a leg up on implementation. It uses xmlenc, which is bundled w/ hadoop. xmlenc is fast and dumb (like me). IIRC, it doesn't do entities; it adds the entity to the closer element too. This is dumb. On the other hand, it makes it so we don't have to have the entities vs. elements argument (Smile).''-~

- GET /[table_name]
+ '''GET /[table_name]'''

Retrieve metadata about the table. This includes column family descriptors.

Returns:
@@ -27, +29 @@
    <columnFamily name="stats" />
  </columnFamilies>
</table>
- 
+ ~-''St.Ack comment: FYI, here is an example column descriptor: {name: triples, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}. We're also about to add the ability to attach arbitrary key/value pairs to both table and column descriptors.''-~
+ 
- GET /[table_name]/regions
+ '''GET /[table_name]/regions'''

- Retrieve a list of the regions for this table so that you can efficiently split up work (a la MapReduce).
+ Retrieve a list of the regions for this table so that you can efficiently split up the work (a la MapReduce).
Options:
start_key, end_key: Only return the list of regions that contain the range start_key...end_key
@@ -41, +44 @@
  <region start_key="0201" server="region_server_3" />
</regions>
+ ~-''St.Ack comment: This won't be needed if you use TableInputFormat in your mapper -- but no harm in having it in place.''-~

== Row Interaction ==
- GET /[table_name]/row/[row_key]/timestamps
+ '''GET /[table_name]/row/[row_key]/timestamps'''

Retrieve a list of all the timestamps available for this row key.

Returns:
@@ -54, +58 @@
  <timestamp value="20071115T000800" uri="/first_table/row/0001/20071115T000800" />
  <timestamp value="20071115T001200" uri="/first_table/row/0001/20071115T001200" />
</timestamps>
+ 
+ ~-''St.Ack comment: Currently not supported in the native hbase client, but we should add it.''-~
- 
- GET /[table_name]/row/[row_key]/
+ '''GET /[table_name]/row/[row_key]/'''
- GET /[table_name]/row/[row_key]/[timestamp]
+ '''GET /[table_name]/row/[row_key]/[timestamp]'''

Retrieve data from a row, constrained by an optional timestamp value.

Headers:
@@ -70, +76 @@
column values out of the data.

Options:
columns: A semicolon-delimited list of column names. If omitted, the result will contain all columns in the row.
+ 
+ ~-''St.Ack comment: +1 that MIME is the way to return rows. -1 on octet-stream being an option. Just expect XML, or MIME if a full row is specified.''-~
- POST/PUT /[table_name]/row/[row_key]/
+ '''POST/PUT /[table_name]/row/[row_key]/'''
- POST/PUT /[table_name]/row/[row_key]/[timestamp]
+ '''POST/PUT /[table_name]/row/[row_key]/[timestamp]'''

Set the value of one or more columns for a given row key with an optional timestamp.

Headers:
@@ -89, +97 @@
HTTP 201 (Created) if the column(s) could successfully be saved.
HTTP 415 (Unsupported Media Type) if the query string column options do not match the Content-type header, or if the binary data of either octet-stream or Multipart/related is unreadable.
+ 
+ ~-''St.Ack comment: -1 again on octet-stream.
It messes up your nice clean API. Might consider adding the column name as a MIME header if multipart, rather than having columns as an option IF multipart (ignored if XML). Might not make sense if this is the only time it's done (since everywhere else we need to be able to handle the columns option).''-~
- DELETE /[table_name]/row/[row_key]/
+ '''DELETE /[table_name]/row/[row_key]/'''
- DELETE /[table_name]/row/[row_key]/[timestamp]
+ '''DELETE /[table_name]/row/[row_key]/[timestamp]'''

Delete the specified columns from the row. If no columns are specified, then it will delete ALL columns. Optionally, specify a timestamp.

Options:
@@ -103, +113 @@

== Scanning ==
- POST/PUT /[table_name]/scanner
+ '''POST/PUT /[table_name]/scanner'''

Request that a scanner be created with the specified options. Returns a scanner ID that can be used to iterate over the results of the scanner.

Options:
columns: A semicolon-delimited list of column names. If omitted, each result will contain all columns in the row.
@@ -113, +123 @@
HTTP 201 (Created) with a Location header that references the scanner URI. Example: /first_table/scanner/1234348890231890

- GET /[table_name]/scanner/[scanner_id]/current
+ '''GET /[table_name]/scanner/[scanner_id]/current'''

Get the row and columns for the current item in the scanner without advancing the scanner. Equivalent to a queue peek operation. Multiple requests to this URI will return the same result.
@@ -132, +142 @@

If the scanner is used up, HTTP 404 (Not Found).

- DELETE /[table_name]/scanner/[scanner_id]/current
+ '''DELETE /[table_name]/scanner/[scanner_id]/current'''

Return the current item in the scanner and advance to the next one. Think of it as a queue dequeue operation.

Headers:
@@ -149, +159 @@
depends on the Accept header. See the documentation for getting an individual row for the data format.

If the scanner is used up, HTTP 404 (Not Found).
+ 
+ ~- Stack comment: DELETE to increment strikes me as wrong.
What about a POST/PUT to the URL /[table_name]/scanner/[scanner_id]/next? It would return the current item and move the scanner to the next one. -~
- DELETE /[table_name]/scanner/[scanner_id]
+ '''DELETE /[table_name]/scanner/[scanner_id]'''

Close a scanner. You must call this when you are done using a scanner, to deallocate it.

Returns:
HTTP 202 (Accepted) if it can be closed.
HTTP 404 (Not Found) if the scanner ID is invalid.
HTTP 410 (Gone) if the scanner is already closed or the lease time has expired.
+ 
+ == Exception Handling ==
+ Generally, exceptions will show up on the REST client side as 40Xs with a descriptive message, and possibly a body containing the Java stack trace.
+ 
+ TODO: Table of the types of exceptions a client could get and how they should react.
+ 
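As a rough illustration of how a client might consume the endpoints proposed above, here is a short Python sketch. It is a sketch only, not part of the spec: the host/port and helper names are assumptions, and rather than perform live HTTP calls it builds the URLs described in the spec and parses the sample `<tables>` response from the '''GET /''' section.

```python
import xml.etree.ElementTree as ET

# Assumed base URL for illustration; the spec does not fix a host or port.
BASE = "http://localhost:60010"

def endpoint_url(table, *parts):
    """Build one of the URLs described in the spec,
    e.g. /[table_name]/row/[row_key]/[timestamp]."""
    return "/".join([BASE, table] + list(parts))

# Sample response body from the "GET /" (list all tables) section.
SAMPLE_TABLES = """
<tables>
  <table name="first_table" uri="/first_table" />
  <table name="second_table" uri="/second_table" />
</tables>
"""

def parse_tables(xml_text):
    """Return (name, uri) pairs from a GET / response body."""
    root = ET.fromstring(xml_text)
    return [(t.get("name"), t.get("uri")) for t in root.findall("table")]

if __name__ == "__main__":
    # → [('first_table', '/first_table'), ('second_table', '/second_table')]
    print(parse_tables(SAMPLE_TABLES))
    # → http://localhost:60010/first_table/row/0001/20071115T000800
    print(endpoint_url("first_table", "row", "0001", "20071115T000800"))
```

A real client would additionally set the Accept/Content-type headers discussed above and, for scanning, keep the scanner URI returned in the Location header of the creation response.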