[ 
https://issues.apache.org/jira/browse/PHOENIX-7064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17777167#comment-17777167
 ] 

fanartoria commented on PHOENIX-7064:
-------------------------------------

[~vjasani] 

Here is my test case:
{code:java}
table columns: 3 pks + 30 columns 
indexes: 10 indexes 
test data: 100,000 rows{code}
DDL: global index [^ddl-global.sql]; local index [^ddl-local.sql]

Data generator shell script (100,000 rows): [^gen-data.sh]

Test result summary
{code:bash}
# upsert data using psql.py
./bin/psql.py -t TESTTABLE_LOCAL -d ' ' test-data.csv
# local
## first upsert
CSV Upsert complete. 100000 rows upserted
Time: 76.36 sec(s)
## second upsert
CSV Upsert complete. 100000 rows upserted
Time: 158.516 sec(s)

# global
## first
CSV Upsert complete. 100000 rows upserted
Time: 46.15 sec(s)
## second
CSV Upsert complete. 100000 rows upserted
Time: 61.516 sec(s)

# local with test patch
## first
CSV Upsert complete. 100000 rows upserted
Time: 39.279 sec(s)
## second
CSV Upsert complete. 100000 rows upserted
Time: 55.167 sec(s)
{code}
In every case the second upsert is slower than the first because there is 
extra index-prepare logic to process.
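
To make the comparison easier to read, the timings above can be reduced to throughput and second/first ratios. This is only a sketch: the dictionary keys and the helper names `throughput` and `slowdown` are made up for illustration, and the numbers are copied verbatim from the psql.py output above.

```python
ROWS = 100_000  # rows per upsert run, as in the test case

# (first upsert, second upsert) wall-clock times in seconds,
# copied from the psql.py output in this comment.
timings = {
    "local": (76.36, 158.516),
    "global": (46.15, 61.516),
    "local with test patch": (39.279, 55.167),
}


def throughput(seconds, rows=ROWS):
    """Rows upserted per second for one run."""
    return rows / seconds


def slowdown(first, second):
    """How many times slower the second upsert is than the first."""
    return second / first


for name, (first, second) in timings.items():
    print(
        f"{name}: first {throughput(first):.0f} rows/s, "
        f"second {throughput(second):.0f} rows/s, "
        f"second/first {slowdown(first, second):.2f}x"
    )
```

On these numbers the unpatched local index run roughly doubles between the first and second upsert (about 2.08x), compared with about 1.33x for global and about 1.40x for local with the test patch.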

> Prepare of local index mutations is extremely slow
> --------------------------------------------------
>
>                 Key: PHOENIX-7064
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7064
>             Project: Phoenix
>          Issue Type: Improvement
>    Affects Versions: 5.1.3
>            Reporter: fanartoria
>            Priority: Major
>         Attachments: ddl-global.sql, ddl-local.sql, gen-data.sh, 
> image-2023-10-09-17-29-47-856.png, image-2023-10-09-17-41-29-679.png
>
>
> When the data table has more than one index, preparing local index mutations 
> is much slower than preparing global index mutations. 
> Write performance should be better on local indexes.
> Here is the stack trace where most of the time is spent:
> !image-2023-10-09-17-29-47-856.png!
> It seems a LocalTableState object is created for each row when preparing 
> index mutations.
> Compared with other ValueGetter implementations, LazyValueGetter may perform 
> poorly.
> Why not use IndexMaintainer#createGetterFromKeyValues?
> Or combine the logic with the global index prepare?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
