When you define a table that's backed by the AccumuloStorageHandler, one
of the Hive columns maps to the Accumulo rowID (":rowID" in the column
mapping string).
If you put a filter on that Hive column in the WHERE clause, the storage
handler extracts that predicate and uses it to set an ordinary Accumulo
Range on the scan.
Concretely:

    CREATE TABLE my_table(uid string, name string, age int, height int)
    STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
    WITH SERDEPROPERTIES ("accumulo.columns.mapping" =
      ":rowID,person:name,person:age,person:height");

    SELECT * FROM my_table WHERE uid > "f" AND uid < "m";

In the above example, the "uid" Hive column maps to ":rowID". The
WHERE clause in this query limits the Range used on the Scanner to
("f", "m"), exclusive on both ends.
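
One caveat: as far as I recall, the shell's -b and -e options are
inclusive by default, so a closer match to 'scan -b f -e m' would use
>= and <=. A minimal sketch, reusing my_table from above and assuming
these operators are pushed down to the Range the same way as > and <
(worth verifying against your Hive version):

```sql
-- Inclusive on both ends; intended to mirror `scan -b f -e m`
SELECT * FROM my_table WHERE uid >= "f" AND uid <= "m";

-- Only a lower bound; intended to mirror `scan -b f`
SELECT * FROM my_table WHERE uid >= "f";
```

If the predicate is not pushed down, the query should still return the
same rows, only after a full table scan with Hive doing the filtering.
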
Does that help?
THORMAN, ROBERT D wrote:
Does anyone know if there is a way to limit the rowID range that Hive
will scan on an Accumulo table? What I'm looking for is the equivalent
of 'scan -b <start-row> -e <end-row>' in an HQL statement.
v/r
Bob Thorman
Principal Big Data Engineer
AT&T Big Data CoE
2900 W. Plano Parkway
Plano, TX 75075
972-658-1714