Hello,

Using a portion of exported production data, we have attempted to add secondary indexes in our testing environment. (This question is unrelated to my previous question on the mailing list, which used a different dataset.)
The primary row key in HBase is a 30-byte binary value:

  1 byte  - salting
  8 bytes - timestamp (Long)
  20 bytes - hash of the other record fields
  1 extra byte - reserved for future schema updates

While generating the indexes asynchronously, we hit the following error:

19/07/17 18:11:07 INFO mapreduce.Job: Task Id : attempt_1562553170901_0491_m_000007_2, Status : FAILED
Error: java.lang.RuntimeException: java.sql.SQLException: ERROR 218 (23018): Constraint violation. STATISTICS_HOUR_INDEX.:PK may not be null
    at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:122)
    at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:48)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: ERROR 218 (23018): Constraint violation. STATISTICS_HOUR_INDEX.:PK may not be null
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:422)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
    at org.apache.phoenix.schema.ConstraintViolationException.<init>(ConstraintViolationException.java:39)
    at org.apache.phoenix.schema.PTableImpl.newKey(PTableImpl.java:618)
    at org.apache.phoenix.compile.UpsertCompiler.setValues(UpsertCompiler.java:137)
    at org.apache.phoenix.compile.UpsertCompiler.access$500(UpsertCompiler.java:106)
    at org.apache.phoenix.compile.UpsertCompiler$3.execute(UpsertCompiler.java:917)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324)
    at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:245)
    at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:172)
    at org.apache.phoenix.jdbc.PhoenixPreparedStatement.execute(PhoenixPreparedStatement.java:177)
    at org.apache.phoenix.mapreduce.index.PhoenixIndexImportMapper.map(PhoenixIndexImportMapper.java:101)
    ... 9 more

Correct me if I'm wrong, but my understanding is that :PK in the index table should be the same as the PK of the main table, so that the other columns of that row in the original table can easily be read when querying the index table. However, there are no rows in the main table where the PK is null (is that even possible?), so it is very strange that Phoenix reports :PK as null somewhere. Is my understanding incorrect, or could something else be causing this issue?

Phoenix: 4.7.0 (CLABS_PHOENIX)
HBase: 1.2.0-cdh5.7.6

Thanks
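P.S. For context, here is a minimal sketch of how a row key with this layout is composed. The SHA-1 hash, the salting scheme, and the class/method names are illustrative assumptions for the email, not our exact production code:

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RowKeySketch {
    // 30-byte row key: 1 salt byte + 8-byte timestamp + 20-byte hash
    // of the other record fields + 1 byte reserved for future schema changes.
    static byte[] buildRowKey(long timestamp, byte[] otherFields)
            throws NoSuchAlgorithmException {
        // SHA-1 is assumed here purely because it yields 20 bytes.
        byte[] hash = MessageDigest.getInstance("SHA-1").digest(otherFields);
        // Illustrative salting: derive the salt byte from the hash.
        byte salt = (byte) Math.abs(hash[0] % 8);
        return ByteBuffer.allocate(30)
                .put(salt)          // 1 byte  - salting
                .putLong(timestamp) // 8 bytes - timestamp (Long)
                .put(hash)          // 20 bytes - hash of other fields
                .put((byte) 0)      // 1 byte  - reserved for future use
                .array();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] key = buildRowKey(System.currentTimeMillis(),
                "field1|field2".getBytes());
        System.out.println(key.length); // 30
    }
}
```

The point being: the reserved byte is always written, so every key in the exported data should be a full 30 bytes with no way for the PK to come out empty.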