[jira] [Updated] (PHOENIX-3555) Building async local index by IndexTool generates wrong data
[ https://issues.apache.org/jira/browse/PHOENIX-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chenzhiming updated PHOENIX-3555:
---------------------------------
    Description:

1. Create a salted table whose PK is a VARCHAR:

CREATE TABLE C_PICRECORD (
    ID VARCHAR NOT NULL PRIMARY KEY,
    "info".CAR_NUM VARCHAR(18) NULL,
    "info".CAP_DATE VARCHAR NULL,
    "info".ORG_ID BIGINT NULL,
    "info".ORG_NAME VARCHAR(255) NULL
) SALT_BUCKETS=3;

2. Upsert a row into the table:

UPSERT INTO C_PICRECORD (ID, CAR_NUM, CAP_DATE, ORG_ID, ORG_NAME)
VALUES ('1', 'car1', '2016-01-01 00:00:00', 11, 'orgname1');

3. Create an async local index:

CREATE LOCAL INDEX C_PICRECORD_IDX_1 ON C_PICRECORD ("info".CAR_NUM, "info".CAP_DATE) ASYNC;

4. Use IndexTool to build the index:

hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table C_PICRECORD --index-table C_PICRECORD_IDX_1 --output-path /tmp/C_PICRECORD_IDX_1

5. Enter the hbase shell and scan the salted table:

hbase(main):102:0> scan 'C_PICRECORD'
ROW                                                            COLUMN+CELL
 \x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00  column=L#0:_0, timestamp=1483108992853, value=x
 \x021                                                         column=info:CAP_DATE, timestamp=1483021375797, value=2016-01-01 00:00:00
 \x021                                                         column=info:CAR_NUM, timestamp=1483021375797, value=car1
 \x021                                                         column=info:ORG_ID, timestamp=1483021375797, value=\x80\x00\x00\x00\x00\x00\x00\x0B
 \x021                                                         column=info:ORG_NAME, timestamp=1483021375797, value=orgname1
 \x021                                                         column=info:_0, timestamp=1483021375797, value=x

Look here, the index row key is wrong:
\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00
The right index row key should be:
\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001

This is why I get null values for the columns that are not covered by the index:

0: jdbc:phoenix:master> SELECT ORG_ID,CAP_DATE,CAR_NUM,ORG_NAME FROM C_PICRECORD WHERE CAR_NUM='car1' AND CAP_DATE>='2016-01-01' AND CAP_DATE<='2016-05-02' LIMIT 10;
+---------+----------------------+----------+-----------+
| ORG_ID  | CAP_DATE             | CAR_NUM  | ORG_NAME  |
+---------+----------------------+----------+-----------+
| null    | 2016-01-01 00:00:00  | car1     |           |
+---------+----------------------+----------+-----------+

PS: I get the right index data if I change the PK's datatype to BIGINT or upsert a string PK such as 'abc'.
[jira] [Created] (PHOENIX-3555) Building async local index by IndexTool generates wrong data
chenzhiming created PHOENIX-3555:
------------------------------------

             Summary: Building async local index by IndexTool generates wrong data
                 Key: PHOENIX-3555
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3555
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.8.0
         Environment: phoenix 4.8.0
            Reporter: chenzhiming

1. Create a salted table whose PK is a VARCHAR:

CREATE TABLE C_PICRECORD (
    ID VARCHAR NOT NULL PRIMARY KEY,
    "info".CAR_NUM VARCHAR(18) NULL,
    "info".CAP_DATE VARCHAR NULL,
    "info".ORG_ID BIGINT NULL,
    "info".ORG_NAME VARCHAR(255) NULL
) SALT_BUCKETS=3;

2. Upsert a row into the table:

UPSERT INTO C_PICRECORD (ID, CAR_NUM, CAP_DATE, ORG_ID, ORG_NAME)
VALUES ('1', 'car1', '2016-01-01 00:00:00', 11, 'orgname1');

3. Create an async local index:

CREATE LOCAL INDEX C_PICRECORD_IDX_1 ON C_PICRECORD ("info".CAR_NUM, "info".CAP_DATE) ASYNC;

4. Use IndexTool to build the index:

hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table C_PICRECORD --index-table C_PICRECORD_IDX_1 --output-path /tmp/C_PICRECORD_IDX_1

5. Enter the hbase shell and scan the salted table:

hbase(main):102:0> scan 'C_PICRECORD'
ROW                                                            COLUMN+CELL
 \x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00  column=L#0:_0, timestamp=1483108992853, value=x
 \x021                                                         column=info:CAP_DATE, timestamp=1483021375797, value=2016-01-01 00:00:00
 \x021                                                         column=info:CAR_NUM, timestamp=1483021375797, value=car1
 \x021                                                         column=info:ORG_ID, timestamp=1483021375797, value=\x80\x00\x00\x00\x00\x00\x00\x0B
 \x021                                                         column=info:ORG_NAME, timestamp=1483021375797, value=orgname1
 \x021                                                         column=info:_0, timestamp=1483021375797, value=x

Look here, the index row key is wrong:
\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00
The right index row key should be:
\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001

This is why I get null values for the columns that are not covered by the index:

0: jdbc:phoenix:master> SELECT ORG_ID,CAP_DATE,CAR_NUM,ORG_NAME FROM C_PICRECORD WHERE CAR_NUM='car1' AND CAP_DATE>='2016-01-01' AND CAP_DATE<='2016-05-02' LIMIT 10;
+---------+----------------------+----------+-----------+
| ORG_ID  | CAP_DATE             | CAR_NUM  | ORG_NAME  |
+---------+----------------------+----------+-----------+
| null    | 2016-01-01 00:00:00  | car1     |           |
+---------+----------------------+----------+-----------+

PS: I get the right index data if I change the PK's datatype to BIGINT or upsert a string PK such as 'abc'.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
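[Editorial note] To make the four extra bytes concrete, the following minimal plain-Java sketch assembles both row keys from the scan output above. The component breakdown (salt byte, two-byte local index id, indexed VARCHAR values separated by \x00, then the data row's PK) is an assumption inferred from the bytes in the report, not taken from Phoenix source:

import java.nio.charset.StandardCharsets;

public class LocalIndexKeyDiff {

    // Concatenate byte arrays; a tiny stand-in for HBase's Bytes.add().
    static byte[] concat(byte[]... parts) {
        int len = 0;
        for (byte[] p : parts) len += p.length;
        byte[] out = new byte[len];
        int off = 0;
        for (byte[] p : parts) {
            System.arraycopy(p, 0, out, off, p.length);
            off += p.length;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] salt    = {0x02};           // salt bucket byte (\x02)
        byte[] indexId = {0x00, 0x0E};     // assumed local index identifier (\x00\x0E)
        byte[] sep     = {0x00};           // VARCHAR separator in composite keys
        byte[] carNum  = "car1".getBytes(StandardCharsets.UTF_8);
        byte[] capDate = "2016-01-01 00:00:00".getBytes(StandardCharsets.UTF_8);
        byte[] pk      = "1".getBytes(StandardCharsets.UTF_8);

        // The key the index row should have, per the report.
        byte[] expected = concat(salt, indexId, carNum, sep, capDate, sep, pk);
        // The key IndexTool actually wrote: four spurious trailing 0x00 bytes.
        byte[] actual = concat(expected, new byte[] {0x00, 0x00, 0x00, 0x00});

        System.out.println("expected key length: " + expected.length);
        System.out.println("actual key length:   " + actual.length); // 4 bytes longer
    }
}

If the data row key is recovered from the tail of the index key, the four extra bytes would make it point at a non-existent row, which is consistent with the null ORG_ID in the query output above.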
[jira] [Created] (PHOENIX-3554) Building async local index by IndexTool generates wrong data
chenzhiming created PHOENIX-3554:
------------------------------------

             Summary: Building async local index by IndexTool generates wrong data
                 Key: PHOENIX-3554
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3554
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.8.0
         Environment: phoenix 4.8.0
            Reporter: chenzhiming

1. Create a salted table whose PK is a VARCHAR:

CREATE TABLE C_PICRECORD (
    ID VARCHAR NOT NULL PRIMARY KEY,
    "info".CAR_NUM VARCHAR(18) NULL,
    "info".CAP_DATE VARCHAR NULL,
    "info".ORG_ID BIGINT NULL,
    "info".ORG_NAME VARCHAR(255) NULL
) SALT_BUCKETS=3;

2. Upsert a row into the table:

UPSERT INTO C_PICRECORD (ID, CAR_NUM, CAP_DATE, ORG_ID, ORG_NAME)
VALUES ('1', 'car1', '2016-01-01 00:00:00', 11, 'orgname1');

3. Create an async local index:

CREATE LOCAL INDEX C_PICRECORD_IDX_1 ON C_PICRECORD ("info".CAR_NUM, "info".CAP_DATE) ASYNC;

4. Use IndexTool to build the index:

hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table C_PICRECORD --index-table C_PICRECORD_IDX_1 --output-path /tmp/C_PICRECORD_IDX_1

5. Enter the hbase shell and scan the salted table:

hbase(main):102:0> scan 'C_PICRECORD'
ROW                                                            COLUMN+CELL
 \x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00  column=L#0:_0, timestamp=1483108992853, value=x
 \x021                                                         column=info:CAP_DATE, timestamp=1483021375797, value=2016-01-01 00:00:00
 \x021                                                         column=info:CAR_NUM, timestamp=1483021375797, value=car1
 \x021                                                         column=info:ORG_ID, timestamp=1483021375797, value=\x80\x00\x00\x00\x00\x00\x00\x0B
 \x021                                                         column=info:ORG_NAME, timestamp=1483021375797, value=orgname1
 \x021                                                         column=info:_0, timestamp=1483021375797, value=x

Look here, the index row key is wrong:
\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001\x00\x00\x00\x00
The right index row key should be:
\x02\x00\x0Ecar1\x002016-01-01 00:00:00\x001

This is why I get null values for the columns that are not covered by the index:

0: jdbc:phoenix:master> SELECT ORG_ID,CAP_DATE,CAR_NUM,ORG_NAME FROM C_PICRECORD WHERE CAR_NUM='car1' AND CAP_DATE>='2016-01-01' AND CAP_DATE<='2016-05-02' LIMIT 10;
+---------+----------------------+----------+-----------+
| ORG_ID  | CAP_DATE             | CAR_NUM  | ORG_NAME  |
+---------+----------------------+----------+-----------+
| null    | 2016-01-01 00:00:00  | car1     |           |
+---------+----------------------+----------+-----------+

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector finishes collecting stats
[ https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15787049#comment-15787049 ]

Hadoop QA commented on PHOENIX-3553:
------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12845150/PHOENIX-3553.patch
against master branch at commit 07f92732f9c6d2d9464012cebeb4cefc10da95d5.
ATTACHMENT ID: 12845150

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 42 warning messages.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100:
+ SchemaUtil.getPhysicalTableName(PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES, env.getConfiguration()));
+ get.addColumn(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES, PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES);
+ guidepostWidth = PLong.INSTANCE.getCodec().decodeLong(cell.getValueArray(), cell.getValueOffset(), SortOrder.getDefault());

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/712//testReport/
Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/712//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/712//console

This message is automatically generated.

> Zookeeper connection should be closed immediately after
> DefaultStatisticsCollector finishes collecting stats
> -----------------------------------------------------
>
>                 Key: PHOENIX-3553
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3553
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: Yeonseop Kim
>              Labels: stats, zookeeper
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3553.patch
>
> In every minor compaction job of HBase,
> org.apache.phoenix.schema.stats.DefaultStatisticsCollector.initGuidePostDepth()
> is called, and the SYSTEM.CATALOG table is opened to get the guidepost width via
> htable = env.getTable(
>     SchemaUtil.getPhysicalTableName(PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES,
>         env.getConfiguration()));
> This call creates one zookeeper connection to get the cluster id.
> DefaultStatisticsCollector doesn't close this zookeeper connection immediately
> after getting the guidepost width, so the connection remains alive until the
> HRegion is closed.
> This is not a problem with a small number of Regions, but when the number of
> Regions is large and upserts are frequent, the number of zookeeper connections
> gradually increases to hundreds, and the zookeeper server nodes run short of
> available TCP/IP ports.
> This zookeeper connection should be closed immediately after the guidepost width
> has been read.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-3333) Support Spark 2.0
[ https://issues.apache.org/jira/browse/PHOENIX-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15787001#comment-15787001 ]

James Taylor commented on PHOENIX-3333:
---------------------------------------

+1 to the patch. Nice work, [~jmahonin], and thanks for the testing, [~dalin...@gmail.com] & [~kalyanhadoop].

> Support Spark 2.0
> -----------------
>
>                 Key: PHOENIX-3333
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3333
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.9.1
>         Environment: spark 2.0, phoenix 4.8.0, os is centos 6.7, hadoop is hdp 2.5
>            Reporter: dalin qin
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3333-interim.patch, PHOENIX-3333.patch
[jira] [Commented] (PHOENIX-3333) Support Spark 2.0
[ https://issues.apache.org/jira/browse/PHOENIX-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786977#comment-15786977 ]

dalin qin commented on PHOENIX-3333:
------------------------------------

Hi Josh, yes, you are right: to let Spark load/write a Phoenix table, only the newly compiled phoenix-spark-4.9.0-HBase-1.1.jar and phoenix-4.9.0-HBase-1.1-client.jar are sufficient. I've also done Spark 2.0.2 write and Spark 1.6.2 read/write testing; all works fine. Thanks.

> Support Spark 2.0
> -----------------
>
>                 Key: PHOENIX-3333
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3333
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.9.1
>         Environment: spark 2.0, phoenix 4.8.0, os is centos 6.7, hadoop is hdp 2.5
>            Reporter: dalin qin
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3333-interim.patch, PHOENIX-3333.patch
[jira] [Updated] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector finishes collecting stats
[ https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yeonseop Kim updated PHOENIX-3553:
----------------------------------
    Attachment: PHOENIX-3553.patch

> Zookeeper connection should be closed immediately after
> DefaultStatisticsCollector finishes collecting stats
> -----------------------------------------------------
>
>                 Key: PHOENIX-3553
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3553
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: Yeonseop Kim
>              Labels: stats, zookeeper
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3553.patch
>
> In every minor compaction job of HBase,
> org.apache.phoenix.schema.stats.DefaultStatisticsCollector.initGuidePostDepth()
> is called, and the SYSTEM.CATALOG table is opened to get the guidepost width via
> htable = env.getTable(
>     SchemaUtil.getPhysicalTableName(PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES,
>         env.getConfiguration()));
> This call creates one zookeeper connection to get the cluster id.
> DefaultStatisticsCollector doesn't close this zookeeper connection immediately
> after getting the guidepost width, so the connection remains alive until the
> HRegion is closed.
> This is not a problem with a small number of Regions, but when the number of
> Regions is large and upserts are frequent, the number of zookeeper connections
> gradually increases to hundreds, and the zookeeper server nodes run short of
> available TCP/IP ports.
> This zookeeper connection should be closed immediately after the guidepost width
> has been read.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector finishes collecting stats
Yeonseop Kim created PHOENIX-3553:
-------------------------------------

             Summary: Zookeeper connection should be closed immediately after DefaultStatisticsCollector finishes collecting stats
                 Key: PHOENIX-3553
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3553
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.9.0
            Reporter: Yeonseop Kim

In every minor compaction job of HBase,
org.apache.phoenix.schema.stats.DefaultStatisticsCollector.initGuidePostDepth() is called,
and the SYSTEM.CATALOG table is opened to get the guidepost width via

htable = env.getTable(
    SchemaUtil.getPhysicalTableName(PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES,
        env.getConfiguration()));

This call creates one zookeeper connection to get the cluster id.
DefaultStatisticsCollector doesn't close this zookeeper connection immediately after
getting the guidepost width, so the connection remains alive until the HRegion is closed.

This is not a problem with a small number of Regions, but when the number of Regions is
large and upserts are frequent, the number of zookeeper connections gradually increases
to hundreds, and the zookeeper server nodes run short of available TCP/IP ports.

This zookeeper connection should be closed immediately after the guidepost width has
been read.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
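[Editorial note] A minimal sketch of the fix the reporter describes, extracted into a hypothetical helper (the method name readGuidePostWidth and the surrounding class are illustrative, not the actual patch). It assumes env.getTable(...) returns an org.apache.hadoop.hbase.client.Table, which is Closeable in HBase 1.x, so a try-with-resources block releases the underlying connection, and with it the ZooKeeper session, as soon as the width has been read:

import java.io.IOException;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.phoenix.jdbc.PhoenixDatabaseMetaData;
import org.apache.phoenix.schema.SortOrder;
import org.apache.phoenix.schema.types.PLong;
import org.apache.phoenix.util.SchemaUtil;

class GuidePostWidthReader {

    /**
     * Reads GUIDE_POSTS_WIDTH for one table row of SYSTEM.CATALOG and closes
     * the Table (and its ZooKeeper connection) immediately afterwards.
     * Returns a negative value when no explicit width is stored, so the caller
     * can fall back to the configured default.
     */
    static long readGuidePostWidth(RegionCoprocessorEnvironment env, byte[] tableKey)
            throws IOException {
        try (Table htable = env.getTable(
                SchemaUtil.getPhysicalTableName(
                        PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES,
                        env.getConfiguration()))) {
            Get get = new Get(tableKey);
            get.addColumn(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
                    PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES);
            Result result = htable.get(get);
            Cell cell = result.getColumnLatestCell(
                    PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES,
                    PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES);
            if (cell == null) {
                return -1;
            }
            return PLong.INSTANCE.getCodec().decodeLong(
                    cell.getValueArray(), cell.getValueOffset(), SortOrder.getDefault());
        } // try-with-resources closes htable here, long before the HRegion closes
    }
}

The Get/decodeLong calls mirror the lines quoted in the Hadoop QA report above; only the extraction into a closing helper is new here.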
[jira] [Commented] (PHOENIX-3333) Support Spark 2.0
[ https://issues.apache.org/jira/browse/PHOENIX-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786510#comment-15786510 ]

Josh Mahonin commented on PHOENIX-3333:
---------------------------------------

Thanks for testing [~dalin...@gmail.com]. One thing I'm curious about is whether all of those JARs are necessary in the spark classpath settings? In my experience, just the phoenix-<version>-client.jar is sufficient.

> Support Spark 2.0
> -----------------
>
>                 Key: PHOENIX-3333
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3333
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 4.9.1
>         Environment: spark 2.0, phoenix 4.8.0, os is centos 6.7, hadoop is hdp 2.5
>            Reporter: dalin qin
>             Fix For: 4.10.0
>
>         Attachments: PHOENIX-3333-interim.patch, PHOENIX-3333.patch
>
> spark version is 2.0.0.2.5.0.0-1245
> As mentioned by Josh, I believe Spark 2.0 changed its API in a way that breaks Phoenix. Please come up with an updated version that adapts to Spark's change.
> In [1]: df = sqlContext.read \
>    ...:     .format("org.apache.phoenix.spark") \
>    ...:     .option("table", "TABLE1") \
>    ...:     .option("zkUrl", "namenode:2181:/hbase-unsecure") \
>    ...:     .load()
> ---------------------------------------------------------------------------
> Py4JJavaError                             Traceback (most recent call last)
> in <module>()
> ----> 1 df = sqlContext.read .format("org.apache.phoenix.spark") .option("table", "TABLE1") .option("zkUrl", "namenode:2181:/hbase-unsecure") .load()
> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/readwriter.pyc in load(self, path, format, schema, **options)
>     151             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
>     152         else:
> --> 153             return self._df(self._jreader.load())
>     154
>     155     @since(1.4)
> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
>     931         answer = self.gateway_client.send_command(command)
>     932         return_value = get_return_value(
> --> 933             answer, self.gateway_client, self.target_id, self.name)
>     934
>     935         for temp_arg in temp_args:
> /usr/hdp/2.5.0.0-1245/spark2/python/pyspark/sql/utils.pyc in deco(*a, **kw)
>      61     def deco(*a, **kw):
>      62         try:
> ---> 63             return f(*a, **kw)
>      64         except py4j.protocol.Py4JJavaError as e:
>      65             s = e.java_exception.toString()
> /usr/hdp/2.5.0.0-1245/spark2/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
>     310             raise Py4JJavaError(
>     311                 "An error occurred while calling {0}{1}{2}.\n".
> --> 312                 format(target_id, ".", name), value)
>     313         else:
>     314             raise Py4JError(
> Py4JJavaError: An error occurred while calling o43.load.
> : java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
>   at java.lang.Class.getDeclaredMethods0(Native Method)
>   at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
>   at java.lang.Class.getDeclaredMethod(Class.java:2128)
>   at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
>   at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
>   at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>   at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>   at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>   at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>   at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
>   at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
>   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
>   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>   at org.apache.spark.SparkContext.clean(SparkContext.scala:2037)
>   at
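[Editorial note] The NoClassDefFoundError above stems from Spark 2.0 removing the org.apache.spark.sql.DataFrame class (it became a Scala type alias for Dataset[Row]), so a phoenix-spark binary built against Spark 1.x no longer links at runtime. As a sanity check for the patched module, a minimal Java read against Spark 2.x could look like the sketch below; the table name and zkUrl are the ones from the report, and it assumes the rebuilt phoenix-spark and phoenix client jars are on the driver and executor classpaths:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PhoenixSparkRead {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("phoenix-read")
                .getOrCreate();

        // In Spark 2.x, load() returns Dataset<Row>; the DataFrame class that
        // the old phoenix-spark binding linked against no longer exists.
        Dataset<Row> df = spark.read()
                .format("org.apache.phoenix.spark")
                .option("table", "TABLE1")
                .option("zkUrl", "namenode:2181:/hbase-unsecure")
                .load();

        df.show(10);
        spark.stop();
    }
}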
[jira] [Created] (PHOENIX-3552) JDBC connectivity is very slow with Phoenix Client driver
srinivas padala created PHOENIX-3552:
----------------------------------------

             Summary: JDBC connectivity is very slow with Phoenix Client driver
                 Key: PHOENIX-3552
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3552
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.7.0
            Reporter: srinivas padala

JDBC connectivity is very slow with the Phoenix Client driver.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-3453) Secondary index and query using distinct: Outer query results in ERROR 201 (22000): Illegal data. CHAR types may only contain single byte characters
[ https://issues.apache.org/jira/browse/PHOENIX-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-3453:
----------------------------------
    Assignee: chenglei

> Secondary index and query using distinct: Outer query results in ERROR 201
> (22000): Illegal data. CHAR types may only contain single byte characters
> --------------------------------------------------------------------------
>
>                 Key: PHOENIX-3453
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3453
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>            Reporter: Joel Palmert
>            Assignee: chenglei
>
> Steps to repro:
> CREATE TABLE IF NOT EXISTS TEST.TEST (
>     ENTITY_ID CHAR(15) NOT NULL,
>     SCORE DOUBLE,
>     CONSTRAINT TEST_PK PRIMARY KEY (
>         ENTITY_ID
>     )
> ) VERSIONS=1, MULTI_TENANT=FALSE, REPLICATION_SCOPE=1, TTL=31536000;
> CREATE INDEX IF NOT EXISTS TEST_SCORE ON TEST.TEST (SCORE DESC, ENTITY_ID DESC);
> UPSERT INTO test.test VALUES ('entity1', 1.1);
> SELECT DISTINCT entity_id, score
> FROM (
>     SELECT entity_id, score
>     FROM test.test
>     LIMIT 25
> );
> Output (in SQuirreL):
> ��� 1.1
> If you run it in SQuirreL, the entity_id column gets the above error value. Notice
> that if you remove the secondary index or the DISTINCT, you get the correct result.
> I've also run the query through the Phoenix Java API. Then I get the following
> exception:
> Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data. CHAR types may only contain single byte characters ()
>   at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:454)
>   at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
>   at org.apache.phoenix.schema.types.PDataType.newIllegalDataException(PDataType.java:291)
>   at org.apache.phoenix.schema.types.PChar.toObject(PChar.java:121)
>   at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:997)
>   at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
>   at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:608)
>   at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:621)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
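[Editorial note] The report mentions hitting the same failure through the Phoenix Java API; a minimal JDBC sketch of that path follows. The connection URL (jdbc:phoenix:localhost:2181) is a placeholder, and the repro assumes the table, index, and row from the steps above already exist; getString on ENTITY_ID is the call that surfaces ERROR 201 while the index is in place:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class Phoenix3453Repro {
    public static void main(String[] args) throws SQLException {
        // Placeholder ZooKeeper quorum; adjust to your cluster.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT DISTINCT entity_id, score FROM ("
                     + " SELECT entity_id, score FROM test.test LIMIT 25)")) {
            while (rs.next()) {
                // With the TEST_SCORE index in place, this getString call throws
                // ERROR 201 (22000) instead of returning 'entity1', per the report.
                System.out.println(rs.getString("ENTITY_ID") + " " + rs.getDouble("SCORE"));
            }
        }
    }
}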
[jira] [Created] (PHOENIX-3551) broken package
Flavius Nopcea created PHOENIX-3551:
---------------------------------------

             Summary: broken package
                 Key: PHOENIX-3551
                 URL: https://issues.apache.org/jira/browse/PHOENIX-3551
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 4.4.0
            Reporter: Flavius Nopcea

Hi, I want to let you know that the package located at
https://mvnrepository.com/artifact/org.apache.phoenix/phoenix/4.4.0-HBase-1.1
is broken. We cannot download it from a regular pom.xml file. Thanks.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
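[Editorial note] For reference, the dependency stanza implied by that URL would look like the sketch below. The coordinates come from the link in the report; the <type>pom</type> line is an assumption, since org.apache.phoenix:phoenix appears to be an aggregator artifact with pom packaging rather than a jar, which may be why a plain dependency reference fails:

<!-- Hypothetical consumer pom fragment for the artifact in the report.
     org.apache.phoenix:phoenix is assumed to be a pom-packaged aggregator,
     hence <type>pom</type>; most users depend on a concrete module such as
     phoenix-core instead. -->
<dependency>
  <groupId>org.apache.phoenix</groupId>
  <artifactId>phoenix</artifactId>
  <version>4.4.0-HBase-1.1</version>
  <type>pom</type>
</dependency>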