I was trying to map an existing HBase table to a view, following the instructions at:
https://phoenix.apache.org/faq.html#How_I_map_Phoenix_table_to_an_existing_HBase_table

One surprise when querying the view is that all values in one column are
null, even though they are clearly populated with data when viewed in the
HBase shell.

After some investigation, it seems the likely cause is that the timestamp
used when querying the view is not "the most current" one. It might be the
current server time, which would filter out any cell whose timestamp is
larger than that "current" value.
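
To make the suspected behavior concrete, here is a minimal Python sketch of it. It does not use Phoenix or HBase; it only models the assumption that the query applies an exclusive upper bound on cell timestamps equal to the server's current time. The function name visible_rows and the one-minute delay before the query are hypothetical, chosen only for illustration; the two timestamps are the ones from the test case in this message.

```python
# Timestamps taken from the test case (milliseconds since the epoch).
DEFAULT_TS = 1492705385114        # assigned by the server to row 'test1'
CUSTOM_TS = 852223752434352130    # supplied explicitly for row 'test2'

cells = [
    ("test1", "Hello1", DEFAULT_TS),
    ("test2", "Hello2", CUSTOM_TS),
]


def visible_rows(cells, upper_bound_ms):
    """Return (row, value) pairs whose timestamp is below the upper bound.

    Hypothetical model of the suspected filter: the bound is exclusive.
    """
    return [(row, value) for row, value, ts in cells if ts < upper_bound_ms]


# Assumption: the query runs about a minute after the first put.
query_time_ms = DEFAULT_TS + 60_000

# With the "current time" bound, only the default-timestamp row survives.
print(visible_rows(cells, query_time_ms))

# With the largest signed 64-bit value as the bound, both rows survive.
print(visible_rows(cells, 2**63 - 1))
```

If this model is right, the custom timestamp is simply far in the future relative to the server clock, which would explain why the cell is skipped.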

Here is my test case.

First, run the following commands in the HBase shell:

tableName = 'TEST'
create tableName, 'FAM'
put tableName,'test1','FAM:VAL',"Hello1"
put tableName,'test2','FAM:VAL',"Hello2",852223752434352130
scan tableName

Two rows are now populated; note that the second cell has a custom
timestamp:

ROW                             COLUMN+CELL
 test1                          column=FAM:VAL, timestamp=1492705385114, value=Hello1
 test2                          column=FAM:VAL, timestamp=852223752434352130, value=Hello2

After that, start "sqlline.py", and map the table as:
create view TEST (pk varchar primary key, FAM.VAL varchar);

and query the view by:
select * from TEST;

The result shows only one row with the default timestamp:

PK                                       VAL
---------------------------------------- ----------------------------------------
test1                                    Hello1

Time: 0.101 sec(s)

Apparently the second row is filtered out because of its custom timestamp.
So my question is whether there is any way to specify the timestamp used
for the query in such a case. This seems like undesirable behavior. Ideally,
the query should always return the latest value, as a scan or get operation
does in the HBase shell or API.
