Afshin Moazami created PHOENIX-2521:
---------------------------------------
Summary: Index rows are not updated when the index key updated
using bulk loader
Key: PHOENIX-2521
URL: https://issues.apache.org/jira/browse/PHOENIX-2521
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.5.2
Reporter: Afshin Moazami
I found that the MapReduce CSV bulk load tool doesn't behave the same as
UPSERTs. Is this by design or a bug?
Here are the queries for creating the table and index:
{code} CREATE TABLE mySchema.mainTable (
id varchar NOT NULL,
name varchar,
address varchar
CONSTRAINT pk PRIMARY KEY (id)); {code}
{code} CREATE INDEX myIndex
ON mySchema.mainTable (name, id)
INCLUDE (address); {code}
If I execute two upserts where the second one updates the name (which is the
leading key column of the index), everything works fine: the record is updated
in both the main table and the index table.
{code} UPSERT INTO mySchema.mainTable (id, name, address) values ('1', 'john',
'Montreal');{code}
{code}UPSERT INTO mySchema.mainTable (id, name, address) values ('1', 'jack',
'Montreal');{code}
{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable
where name = 'jack'; {code} ==> one record
{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable
where name = 'john'; {code} ==> zero records
But if I load the data into the main table using
org.apache.phoenix.mapreduce.CsvBulkLoadTool, it behaves differently. The main
table is updated, but the new record is appended to the index table and the
old index row is not removed:
HADOOP_CLASSPATH=/usr/lib/hbase/lib/hbase-protocol-1.1.2.jar:/etc/hbase/conf \
hadoop jar /usr/lib/hbase/phoenix-4.5.2-HBase-1.1-bin/phoenix-4.5.2-HBase-1.1-client.jar \
org.apache.phoenix.mapreduce.CsvBulkLoadTool -d',' -s mySchema -t mainTable \
-i /tmp/input.txt
input.txt:
2,tomas,montreal
2,george,montreal
(I have tried it both with and without the -it option and got the same result.)
{code}SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from mySchema.mainTable
where name = 'tomas' {code} ==> one record;
{code} SELECT /*+ INDEX(mySchema.mainTable myIndex) */ * from
mySchema.mainTable where name = 'george' {code} ==> one record;
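To make the reported difference concrete, here is a toy in-memory model of the two write paths. It is only a sketch of the observed behavior, not Phoenix code: the dictionaries and the function names `upsert` and `bulk_load` are hypothetical, and it assumes (as the queries above suggest) that the UPSERT path removes the stale index row while the bulk load path only adds new index rows.

```python
# Toy model of the behavior described in this report. Not Phoenix internals;
# all names are hypothetical and chosen for illustration only.

def upsert(table, index, row_id, name, address):
    """Model an UPSERT: remove the stale index row, then write the new one."""
    old = table.get(row_id)
    if old is not None:
        # The index key is (name, id), so a changed name means a new index key.
        index.pop((old[0], row_id), None)
    table[row_id] = (name, address)
    index[(name, row_id)] = address


def bulk_load(table, index, rows):
    """Model the reported bulk load: index rows are added, never removed."""
    for row_id, name, address in rows:
        table[row_id] = (name, address)   # main table row is overwritten
        index[(name, row_id)] = address   # old index row is left behind


table, index = {}, {}
upsert(table, index, "1", "john", "Montreal")
upsert(table, index, "1", "jack", "Montreal")
bulk_load(table, index, [("2", "tomas", "montreal"),
                         ("2", "george", "montreal")])

print(sorted(index))
```

In this model, id 1 ends up with a single index row (jack), while id 2 ends up with two (tomas and george) even though the main table holds only the george row, which matches the query results above.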