[ https://issues.apache.org/jira/browse/PHOENIX-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962388#comment-15962388 ]
Craig Roberts commented on PHOENIX-3755:
----------------------------------------
I've never commented here before, but I saw this through the mailing list, and
we had a similar problem which _may_ help. The mailing list thread archive for
our duplicate issue is at:
https://lists.apache.org/thread.html/ecf8ae1316c9c14885070a28054bbca4cd7ab929ca4d29a4ce0ca294@%3Cuser.phoenix.apache.org%3E
In our case, there seemed to be some issue with the SYSTEM.* tables. However,
since we didn't know how to fix it, and it was only a shared development
environment, we simply wiped HBase and started again. The problem never
recurred. The thread includes some checks that might quickly show whether you
are hitting a similar issue:
* We queried by a (guaranteed) unique ID, and got multiple results, but if we
included _all_ schema columns, we got only one result
* Using DISTINCT() on our UUID field also showed the correct number of results
* We could see from the HBase scan we definitely only had one underlying event
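
The first two checks can also be run client-side on whatever rows a query
returns. The following is only a rough sketch: the function name, the
assumption that key columns come first in each row, and the sample rows are
all made up for illustration and are not taken from the thread.

```python
from collections import Counter


def find_duplicate_keys(rows, key_width):
    """Return {key: count} for primary-key tuples that appear more than once.

    rows      -- iterable of row tuples, key columns first (assumed layout)
    key_width -- number of leading columns that form the primary key
    """
    counts = Counter(tuple(row[:key_width]) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample rows shaped like (DOMAIN_ID, ITEM, URL); not real data.
sample = [
    (17, "item-a", "/page-1"),
    (17, "item-a", "/page-1"),
    (17, "item-b", "/page-2"),
]

duplicates = find_duplicate_keys(sample, 2)
print(duplicates)
```

An empty result means every key was returned once. A non-empty result only
shows that the query layer returned repeated keys, so it still needs to be
compared against a raw HBase scan, as in the last check above.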
> Duplicate rows
> --------------
>
> Key: PHOENIX-3755
> URL: https://issues.apache.org/jira/browse/PHOENIX-3755
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.9.0
> Environment: Ubuntu 16.04
> HBase version 1.2.2
> Reporter: Viet Nguyen
>
> I have hit a major bug in Apache Phoenix version 4.9.0, as follows.
> The table was created with this statement:
> CREATE TABLE ANALYTIC_ITEM_URL_V2
> (
>     DOMAIN_ID INTEGER NOT NULL,
>     ITEM VARCHAR(40) NOT NULL,
>     URL VARCHAR(500),
>     CONSTRAINT PK PRIMARY KEY (DOMAIN_ID, ITEM)
> ) SALT_BUCKETS=4, COMPRESSION='SNAPPY', IMMUTABLE_ROWS=true;
> The table's primary key consists of two fields, domain_id and item. When I
> executed the query:
> "select * from ANALYTIC_ITEM_URL_V2 where domain_id=17 and
> item='435bbf4da995a9b618b3d00d536ba730'"
> I was quite surprised by the result:
> +------------+-----------------------------------+-----------------------------------------------------------------------------------------------------------------+
> | DOMAIN_ID  | ITEM                              | URL                                                                                                             |
> +------------+-----------------------------------+-----------------------------------------------------------------------------------------------------------------+
> | 17         | 435bbf4da995a9b618b3d00d536ba730  | /nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn  |
> | 17         | 435bbf4da995a9b618b3d00d536ba730  | /nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn  |
> | 17         | 435bbf4da995a9b618b3d00d536ba730  | /nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn  |
> +------------+-----------------------------------+-----------------------------------------------------------------------------------------------------------------+
> As you can see, there are 3 rows with the same primary key. I also executed
> two queries:
> - select count(*) from ANALYTIC_ITEM_URL_V2; in Phoenix
> and
> - count 'ANALYTIC_ITEM_URL_V2' in HBase
> The result is about 20M records in HBase and about 35M records in
> Phoenix. So I think Phoenix has a bug in its metadata store. Can anyone help
> me explain this?
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)