When you define a table backed by the AccumuloStorageHandler, one of the Hive columns maps to the Accumulo row ID (the ":rowID" entry in the column mapping string).

You can filter on that Hive column in the WHERE clause. That part of the predicate is extracted and used to set an ordinary Accumulo Range on the scan.

Concretely:

CREATE TABLE my_table(uid string, name string, age int, height int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" =
  ":rowID,person:name,person:age,person:height");

SELECT * FROM my_table WHERE uid > "f" AND uid < "m";

In the above example, the "uid" Hive column maps to ":rowID". The WHERE clause in this query limits the Range used on the Scanner to ("f", "m"), exclusive at both ends.
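
To make the bounds concrete, here is a rough sketch in Python of which row IDs an exclusive ("f", "m") range would admit. This is not Hive or Accumulo code: it only imitates Accumulo's byte-wise ordering of row IDs with Python bytes comparison, and the function name and sample row IDs are made up for illustration.

```python
# Stand-in for the exclusive range ("f", "m") derived from the WHERE clause.
# Accumulo sorts row IDs as raw bytes; Python's bytes comparison is used here
# as an approximation of that ordering. Nothing below talks to Hive or Accumulo.

def in_exclusive_range(row_id: bytes, start: bytes, end: bytes) -> bool:
    """Return True if start < row_id < end in byte-wise lexicographic order."""
    return start < row_id < end

# Hypothetical row IDs, for illustration only.
row_ids = [b"e", b"f", b"fa", b"george", b"lz", b"m", b"ma"]

# Keep only the row IDs that fall strictly between "f" and "m".
matched = [r for r in row_ids if in_exclusive_range(r, b"f", b"m")]
print(matched)
```

Note that "f" and "m" themselves are left out, while anything that sorts strictly between them (such as "fa") is kept.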

Does that help?

THORMAN, ROBERT D wrote:
Does anyone know if there is a way to limit the rowID range that Hive
will scan on an Accumulo table? What I’m looking for is the equivalent
of 'scan -b <start-row> -e <end-row>' in an HQL statement.

v/r
Bob Thorman
Principal Big Data Engineer
AT&T Big Data CoE
2900 W. Plano Parkway
Plano, TX 75075
972-658-1714

