I am using Phoenix 5 as shipped with Hortonworks HDP 3.1. I am storing 3+ million file names in a table and then using the table to keep track of which files I have processed using a Storm topology. I have been doing some testing to make sure that everything is working correctly and as part of that, tried my code on single files to ensure that columns were being updated as expected and that my logic was working properly.
I have found that when I do an UPSERT to set a column to the current timestamp (now()) the first time, it works as expected, then I reset that column to NULL so that I can redo the test, but find that the exact same UPSERT statement fails to make a change even though I get a response that confirms one record was affected. In order to eliminate potential red herrings, here is an example of the problem that is reproducible using just the phoenix-sqlline interface (I get same exact behavior when running equivalent Scala code). Here is the DDL for the table: CREATE TABLE DEFAULT.FILE_INDEX( MTIME TIMESTAMP NOT NULL, FILENAME VARCHAR NOT NULL, TYPE VARCHAR NOT NULL, SUBTYPE VARCHAR NOT NULL, SENSOR VARCHAR NOT NULL, SIZE BIGINT NOT NULL, OWNER VARCHAR NOT NULL, GROUP_OWNER VARCHAR NOT NULL, PERMISSIONS VARCHAR NOT NULL, STARTED TIMESTAMP, PROCESSED TIMESTAMP, EVENT_COUNT BIGINT CONSTRAINT PK PRIMARY KEY(MTIME ROW_TIMESTAMP,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS)) COMPRESSION='SNAPPY',DEFAULT_COLUMN_FAMILY='F'; and here is an example set of queries to reproduce the issue I am seeing: 0: jdbc:phoenix:master-1.> UPSERT INTO DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED) SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,now() AS STARTED FROM DEFAULT.FILE_INDEX WHERE FILENAME='hdfs://filename.log';1 row affected (5.041 seconds)0: jdbc:phoenix:master-1.> select * from default.file_index where started is not null;+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+| MTIME | FILENAME | TYPE | SUBTYPE | SENSOR | SIZE | OWNER | GROUP_OWNER | PERM |+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+| 2018-11-01 00:00:00.000 | hdfs://filename.log | BRO | DNS | something | 224500 | somebody | hdfs | rw-r |+--------------------------+-----------------------------------------------------------------------------------------------------------------------------+-------+----------+-----------------+---------+------------+--------------+------+1 row selected (4.046 seconds)0: jdbc:phoenix:master-1.> UPSERT INTO DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED) SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,NULL AS STARTED FROM DEFAULT.FILE_INDEX WHERE FILENAME='hdfs://filename.log';1 row affected (4.541 seconds)0: jdbc:phoenix:master-1.> select * from default.file_index where started is not null;+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+| MTIME | FILENAME | TYPE | SUBTYPE | SENSOR | SIZE | OWNER | GROUP_OWNER | PERMISSIONS | STARTED | PROCESSED | EVENT_COUNT |+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------++--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+No rows selected (4.782 seconds)0: jdbc:phoenix:master-1.> UPSERT INTO DEFAULT.FILE_INDEX(MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,STARTED) SELECT MTIME,FILENAME,TYPE,SUBTYPE,SENSOR,SIZE,OWNER,GROUP_OWNER,PERMISSIONS,now() AS STARTED FROM DEFAULT.FILE_INDEX WHERE FILENAME='hdfs://filename.log';1 row affected (5.254 seconds)0: jdbc:phoenix:master-1.> select * from default.file_index where started is not null;+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+| MTIME | FILENAME | TYPE | SUBTYPE | SENSOR | SIZE | OWNER | GROUP_OWNER | PERMISSIONS | STARTED | PROCESSED | EVENT_COUNT |+--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------++--------+-----------+-------+----------+---------+-------+--------+--------------+--------------+----------+------------+--------------+No rows selected (4.389 seconds) I would greatly appreciate any insights into what I might be doing wrong here. My assumption is that this behavior, though undesirable, is expected and has to do with versions or perhaps autocommit settings...but for the life of me, I have tried every permutation of autocommit settings and have not been able to get around this behavior...