Re: Phoenix Index tool deletes data from main table and does not populate local index

2022-08-23 Thread Istvan Toth
Hi,

This list is for the Apache Phoenix releases.
If you use a vendor version, please contact your vendor for support.

regards
Istvan

On Mon, Aug 22, 2022 at 7:21 PM Pradheep Shanmugam via user <
user@phoenix.apache.org> wrote:

> Hi,
>
> I am using CDP 7.3.1.
> I am trying to create a Phoenix local index for an already existing table
> with ~14M rows.
> These are the steps I followed:
> 1. Take a snapshot of the table from HBase (the table is salted in Phoenix).
> 2. Clone the snapshot to a new test table.
> 3. Create the table in Phoenix to link Phoenix to the new HBase table
> created from the cloned snapshot. I can see all rows in the Phoenix table.
>
>
> 4. create the local index in async mode:
> CREATE LOCAL INDEX MILESTONE_LOCAL_INDEX ON MILESTONE_TEST
> (h.eventTimestamp) ASYNC;
>
>
> 5. Run the MR job Phoenix IndexTool:
>
> ./hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table
> MILESTONE_TEST --index-table MILESTONE_LOCAL_INDEX --output-path
> /hbase/data/default/MILESTONE_LOCAL_INDEX_HFILE
>
> The MR job reports success, but afterwards both the main test table and the
> local index contain 0 rows, even though the index is marked active.
> I don’t see any explicit error in the MR job.
> What could be the issue with the MR job?
>
> Another question, on local index usage.
> With about 5 rows in the table, I checked whether the local index is used:
>
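> explain select * from MILESTONE_TEST where eventTimestamp <=
> TO_TIMESTAMP('2022-07-25 14:03:22.559');
>
> +------------------------------------------------------------------------------------------+
> | PLAN                                                                                     |
> +------------------------------------------------------------------------------------------+
> | CLIENT 10-CHUNK 0 ROWS 0 BYTES PARALLEL 10-WAY ROUND ROBIN FULL SCAN OVER MILESTONE_TEST |
> | SERVER FILTER BY H.EVENTTIMESTAMP <= TIMESTAMP '2022-07-25 14:03:22.559'                 |
> +------------------------------------------------------------------------------------------+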
>
> I do not see the local index name in the explain output, which would
> indicate that it is being used.
> Does Phoenix skip the local index when the table has few rows? That is why
> I am trying with millions of rows, so that Phoenix will try to use the local
> index. Can I expect to see the local index name in the explain output if the
> table and the local index have a few million rows?
>
>
> Thanks,
> Pradheep
>


Phoenix Index tool deletes data from main table and does not populate local index

2022-08-22 Thread Pradheep Shanmugam via user
Hi,

I am using CDP 7.3.1.
I am trying to create a Phoenix local index for an already existing table with
~14M rows.
These are the steps I followed:
1. Take a snapshot of the table from HBase (the table is salted in Phoenix).
2. Clone the snapshot to a new test table.
3. Create the table in Phoenix to link Phoenix to the new HBase table created
from the cloned snapshot. I can see all rows in the Phoenix table.

4. create the local index in async mode:
CREATE LOCAL INDEX MILESTONE_LOCAL_INDEX ON MILESTONE_TEST (h.eventTimestamp) 
ASYNC;

5. Run the MR job (Phoenix IndexTool):

./hbase org.apache.phoenix.mapreduce.index.IndexTool \
  --data-table MILESTONE_TEST \
  --index-table MILESTONE_LOCAL_INDEX \
  --output-path /hbase/data/default/MILESTONE_LOCAL_INDEX_HFILE
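
For reference, steps 1, 2 and 5 can be written down as a small script. This is only a sketch that prints the commands instead of running them. The source table name MILESTONE and the snapshot name MILESTONE_SNAP are hypothetical (the thread does not give them); the other names and the output path are copied from the steps above.

```shell
#!/bin/sh
# Sketch only: prints the commands for steps 1, 2 and 5; it runs nothing.
# SOURCE_TABLE and SNAPSHOT are hypothetical names, not taken from the thread.
SOURCE_TABLE="MILESTONE"
SNAPSHOT="MILESTONE_SNAP"
TEST_TABLE="MILESTONE_TEST"
INDEX="MILESTONE_LOCAL_INDEX"
OUTPUT_PATH="/hbase/data/default/${INDEX}_HFILE"   # path as used in step 5

# Steps 1 and 2: HBase shell commands to snapshot the table and clone it.
hbase_shell_steps() {
  printf "snapshot '%s', '%s'\n" "$SOURCE_TABLE" "$SNAPSHOT"
  printf "clone_snapshot '%s', '%s'\n" "$SNAPSHOT" "$TEST_TABLE"
}

# Step 5: the IndexTool invocation with the same three options as above.
index_tool_cmd() {
  printf 'hbase org.apache.phoenix.mapreduce.index.IndexTool --data-table %s --index-table %s --output-path %s\n' \
    "$TEST_TABLE" "$INDEX" "$OUTPUT_PATH"
}

hbase_shell_steps
index_tool_cmd
```

The two printed HBase lines can be pasted into an hbase shell session. Steps 3 and 4 are Phoenix SQL and are left out here.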

The MR job reports success, but afterwards both the main test table and the
local index contain 0 rows, even though the index is marked active.
I don’t see any explicit error in the MR job.
What could be the issue with the MR job?
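
One check that may help narrow this down, offered as a sketch and not as a diagnosis: compare the row count with and without the index. It assumes the zero count came from a Phoenix query, which is not stated above. If so, the optimizer may have answered that query from the (empty) index, and a count taken with the NO_INDEX hint would show whether the data table itself still holds rows. The third statement reads the index state from SYSTEM.CATALOG.

```shell
#!/bin/sh
# Sketch only: prints three Phoenix SQL statements for checking the result
# of the IndexTool run. Table and index names are the ones from the steps.
TABLE="MILESTONE_TEST"
INDEX="MILESTONE_LOCAL_INDEX"

check_sql() {
  cat <<EOF
SELECT /*+ NO_INDEX */ COUNT(*) FROM ${TABLE};
SELECT COUNT(*) FROM ${TABLE};
SELECT TABLE_NAME, INDEX_STATE FROM SYSTEM.CATALOG WHERE TABLE_NAME = '${INDEX}' AND INDEX_STATE IS NOT NULL;
EOF
}

check_sql
```

If the first count is ~14M and the second is 0, the data table is intact and only the index is empty. The statements can be saved to a file and passed to sqlline.py as its second argument.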

Another question, on local index usage.
With about 5 rows in the table, I checked whether the local index is used:

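explain select * from MILESTONE_TEST where eventTimestamp <=
TO_TIMESTAMP('2022-07-25 14:03:22.559');

+------------------------------------------------------------------------------------------+
| PLAN                                                                                     |
+------------------------------------------------------------------------------------------+
| CLIENT 10-CHUNK 0 ROWS 0 BYTES PARALLEL 10-WAY ROUND ROBIN FULL SCAN OVER MILESTONE_TEST |
| SERVER FILTER BY H.EVENTTIMESTAMP <= TIMESTAMP '2022-07-25 14:03:22.559'                 |
+------------------------------------------------------------------------------------------+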

I do not see the local index name in the explain output, which would indicate
that it is being used.
Does Phoenix skip the local index when the table has few rows? That is why I am
trying with millions of rows, so that Phoenix will try to use the local index.
Can I expect to see the local index name in the explain output if the table and
the local index have a few million rows?
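
Two variations of the query above could help separate "the index cannot be used" from "the optimizer chose not to use it". This is a sketch, not a confirmed answer: the first statement adds Phoenix's INDEX hint naming the local index, and the second selects only the indexed column so that the index alone could answer it. Whether either plan mentions MILESTONE_LOCAL_INDEX on this cluster is not something the thread establishes.

```shell
#!/bin/sh
# Sketch only: prints two EXPLAIN statements to compare with the plan above.
TABLE="MILESTONE_TEST"
INDEX="MILESTONE_LOCAL_INDEX"
CUTOFF="2022-07-25 14:03:22.559"

explain_sql() {
  cat <<EOF
-- 1. Ask the optimizer to consider the local index explicitly.
EXPLAIN SELECT /*+ INDEX(${TABLE} ${INDEX}) */ *
FROM ${TABLE}
WHERE eventTimestamp <= TO_TIMESTAMP('${CUTOFF}');

-- 2. Select only the indexed column.
EXPLAIN SELECT eventTimestamp
FROM ${TABLE}
WHERE eventTimestamp <= TO_TIMESTAMP('${CUTOFF}');
EOF
}

explain_sql
```

The output can be saved to a file and passed to sqlline.py as its second argument, or the statements can be pasted into an interactive session one at a time.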

Thanks,
Pradheep