Re: Hive/Accumulo

Josh Elser Fri, 24 Apr 2015 13:49:09 -0700

When you define a table that's backed by the AccumuloStorageHandler, you define a Hive column which is essentially the Accumulo rowID (":rowID" in the column mapping string).

You can include some filter in the WHERE clause over that Hive column. That portion is extracted and used to set a normal Accumulo Range.


Concretely:

CREATE TABLE my_table(uid string, name string, age int, height int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" =
  ":rowID,person:name,person:age,person:height");

SELECT * FROM my_table WHERE uid > "f" AND uid < "m";

In the above example, the "uid" Hive column maps to the ":rowID". The WHERE clause in this query would limit the Range used on the Scanner to ("f", "m").
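
If inclusive bounds are wanted instead, a minimal sketch of the same query with >= and <= follows. It reuses my_table and uid from the example above, and it assumes the storage handler pushes these two operators down to the Range the same way it does the strict comparisons, which is worth confirming against your Hive version.

```sql
-- Inclusive variant of the query above. Assumption: >= and <= on the
-- column mapped to :rowID are pushed down into the Accumulo Range
-- just like > and <.
SELECT * FROM my_table WHERE uid >= 'f' AND uid <= 'm';
```

Note that row IDs compare lexicographically, so an upper bound of 'm' admits the row "m" itself but not "mike" or any other longer key starting with "m"; pick the end row accordingly.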


Does that help?

THORMAN, ROBERT D wrote:

Does anyone know if there is a way to limit the rowID range that Hive
will scan on an Accumulo table? What I’m looking for is the equivalent
of ‘scan -b <start-row> -e <end-row>’ in an HQL statement.

v/r
Bob Thorman
Principal Big Data Engineer
AT&T Big Data CoE
2900 W. Plano Parkway
Plano, TX 75075
972-658-1714
