[ 
https://issues.apache.org/jira/browse/PHOENIX-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330892#comment-15330892
 ] 

James Taylor commented on PHOENIX-2995:
---------------------------------------

Thanks, [~mujtabachohan]. What does a sample table/view DDL statement look 
like? Are the column names particularly long? You can take a look at the member 
variables in PTableImpl - does 7K or 11K per table add up? Where's all the 
space being used?
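
As a rough sanity check on those numbers, the sketch below multiplies the per-table figures by the view counts from the issue description. It assumes that "7K" and "11K" mean kilobytes per cached view (1 KB = 1024 bytes) and that the cache holds every view at once. The class and method names are made up for illustration and are not part of Phoenix.

```java
// Hypothetical back-of-the-envelope estimate; not Phoenix code.
public final class CacheFootprintSketch {

    // Bytes needed to keep `views` cached tables of `kbPerTable` KB each.
    public static long estimateBytes(long views, long kbPerTable) {
        return views * kbPerTable * 1024L;
    }

    public static void main(String[] args) {
        long[] viewCounts = {10_000L, 100_000L}; // view counts from the issue description
        long[] perTableKb = {7L, 11L};           // assumed per-table sizes in KB
        for (long views : viewCounts) {
            for (long kb : perTableKb) {
                double mb = estimateBytes(views, kb) / (1024.0 * 1024.0);
                System.out.printf("%d views x %d KB = %.1f MB%n", views, kb, mb);
            }
        }
    }
}
```

Under these assumptions, 10K views need somewhere between roughly 70 and 110 MB, which is in line with the 100MB+ cache size reported to restore the normal upsert rate. At 11 KB per view, 100K views would not fit in 1GB.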

Once PHOENIX-2940 is in, stats won't be stored in PTable any longer. We could 
potentially decrease the size further (probably by 1/2) if we don't store both 
the String and byte[] of column names, but then GC cost would go up a bit. We 
usually access by String. We also have a duplicate Map by byte[] and by String 
for column families. We could switch to a TreeMap, which uses less memory, or 
even drop the map and let the search be linear - this is probably fine for 
column families.
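
A minimal sketch of the last idea, assuming a table has only a handful of column families: keep just the byte[] form of each name, decode a String on demand, and find a family with a linear scan instead of two maps. The class and method names here are hypothetical and do not mirror the actual PTableImpl members.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; these are not Phoenix's actual classes.
public final class FamilyLookupSketch {

    // Stand-in for a column family that stores only the byte[] form of its name.
    public static final class Family {
        final byte[] nameBytes;

        public Family(String name) {
            this.nameBytes = name.getBytes(StandardCharsets.UTF_8);
        }

        // Decodes on every call instead of caching a String:
        // less retained memory, more short-lived garbage.
        public String name() {
            return new String(nameBytes, StandardCharsets.UTF_8);
        }
    }

    private final List<Family> families;

    public FamilyLookupSketch(List<Family> families) {
        this.families = families;
    }

    // Linear scan by byte[]; no map is kept at all.
    public Family byBytes(byte[] name) {
        for (Family f : families) {
            if (Arrays.equals(f.nameBytes, name)) {
                return f;
            }
        }
        return null;
    }

    // Lookup by String encodes once, then reuses the byte[] scan.
    public Family byName(String name) {
        return byBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        FamilyLookupSketch table = new FamilyLookupSketch(
                Arrays.asList(new Family("0"), new Family("CF1")));
        System.out.println(table.byName("CF1").name());
        System.out.println(table.byName("missing"));
    }
}
```

The trade-off is the one described above: each String lookup allocates a temporary byte[], so GC pressure rises while the retained size per table drops.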

Sounds like there's a discrepancy in the actual size versus estimated size that 
should be straightened out as well - would you mind filing a separate JIRA for 
that?

Do you know what the requirements are in terms of caching? There are likely 
views that are more frequently accessed than others which should mitigate this 
some, no? 

> Write performance severely degrades with large number of views 
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-2995
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2995
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Mujtaba Chohan
>            Assignee: James Taylor
>              Labels: Argus
>         Attachments: upsert_rate.png
>
>
> Write performance for each 1K batch degrades significantly when there are 
> *10K* views being written to at random with the default 
> {{phoenix.client.maxMetaDataCacheSize}}. With all views created, the upsert 
> time remains around 25 seconds per 1K batch, i.e. a ~2K rows/min upsert rate. 
> When {{phoenix.client.maxMetaDataCacheSize}} is increased to 100MB+, views 
> no longer need to be re-resolved and the upsert rate returns to the normal 
> ~60K rows/min.
> With *100K* views and {{phoenix.client.maxMetaDataCacheSize}} set to 1GB, I 
> wasn't able to create all 100K views, as the upsert time for each 1K batch 
> keeps steadily increasing. 
> The following graph shows the 1K batch upsert rate over time for varying 
> numbers of views. Rows are upserted to random views; {{CREATE VIEW IF NOT 
> EXISTS ... APPEND_ONLY_SCHEMA = true, UPDATE_CACHE_FREQUENCY=900000}} is 
> executed before the upsert statement.
> !upsert_rate.png!
> Base table is also created with {{APPEND_ONLY_SCHEMA = true, 
> UPDATE_CACHE_FREQUENCY = 900000, AUTO_PARTITION_SEQ}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
