[
https://issues.apache.org/jira/browse/PHOENIX-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330892#comment-15330892
]
James Taylor commented on PHOENIX-2995:
---------------------------------------
Thanks, [~mujtabachohan]. What does a sample table/view DDL statement look
like? Are the column names particularly long? You can take a look at the member
variables in PTableImpl - does 7K or 11K per table add up? Where's all the
space being used?
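As a rough back-of-the-envelope check, the sketch below multiplies the per-table figures by the 10K views from the issue description. It assumes that "7K" and "11K" mean KiB per cached PTable, which the thread does not state explicitly; the class and method names are hypothetical and not part of Phoenix.

```java
final class CacheEstimate {
    private CacheEstimate() {}

    /** Total bytes needed to cache metadata for every view at a fixed per-view size. */
    public static long totalBytes(long viewCount, long bytesPerView) {
        return Math.multiplyExact(viewCount, bytesPerView);
    }

    public static void main(String[] args) {
        long views = 10_000;       // view count taken from the issue description
        long low = 7L * 1024;      // assumption: "7K" means 7 KiB per PTable
        long high = 11L * 1024;    // assumption: "11K" means 11 KiB per PTable
        long mib = 1024L * 1024;
        System.out.printf("low estimate:  %d MiB%n", totalBytes(views, low) / mib);
        System.out.printf("high estimate: %d MiB%n", totalBytes(views, high) / mib);
    }
}
```

Under those assumptions the total lands at roughly 70 to 110 MB, which would be in line with the report that raising the cache to 100MB+ restores the upsert rate.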
Once PHOENIX-2940 is in, stats won't be stored in PTable any longer. We could
potentially decrease the size further (probably by 1/2) if we don't store both
the String and byte[] of column names, but then GC cost would go up a bit. We
usually access by String. We also have a duplicate Map by byte[] and by String
for column families. We could switch to TreeMap, which uses less memory, or even
drop the map and let the search be linear - this is probably fine for column
families.
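A minimal sketch of the linear-search idea for column families, assuming each family keeps only the byte[] form of its name and builds the String on demand. The Family and FamilyLookup types here are hypothetical stand-ins for illustration, not the actual PTableImpl or PColumnFamily classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class FamilyLookup {
    /** Hypothetical stand-in for a column family; stores the name once, as bytes. */
    static final class Family {
        final byte[] nameBytes;

        Family(String name) {
            this.nameBytes = name.getBytes(StandardCharsets.UTF_8);
        }

        /** Builds the String on demand: less retained memory, more short-lived garbage. */
        public String name() {
            return new String(nameBytes, StandardCharsets.UTF_8);
        }
    }

    private final List<Family> families;

    FamilyLookup(List<Family> families) {
        this.families = families;
    }

    /** Linear scan by byte[] name; returns null when no family matches. */
    public Family byBytes(byte[] name) {
        for (Family f : families) {
            if (Arrays.equals(f.nameBytes, name)) {
                return f;
            }
        }
        return null;
    }

    /** Lookup by String reuses the byte[] scan, so no second map is needed. */
    public Family byName(String name) {
        return byBytes(name.getBytes(StandardCharsets.UTF_8));
    }
}
```

With only a handful of column families per table, the scan touches a few short arrays per lookup, which is why dropping both maps is plausible here; the same trade would be harder to justify for columns, which are usually looked up by String.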
Sounds like there's a discrepancy in the actual size versus estimated size that
should be straightened out as well - would you mind filing a separate JIRA for
that?
Do you know what the requirements are in terms of caching? There are likely
views that are more frequently accessed than others which should mitigate this
some, no?
> Write performance severely degrades with large number of views
> ---------------------------------------------------------------
>
> Key: PHOENIX-2995
> URL: https://issues.apache.org/jira/browse/PHOENIX-2995
> Project: Phoenix
> Issue Type: Bug
> Reporter: Mujtaba Chohan
> Assignee: James Taylor
> Labels: Argus
> Attachments: upsert_rate.png
>
>
> Write performance for each 1K batch degrades significantly when there are
> *10K* views being written to at random with the default
> {{phoenix.client.maxMetaDataCacheSize}}. With all views created, upsert time
> remains around 25 seconds per 1K batch, i.e. ~2K rows/min upsert rate.
> When {{phoenix.client.maxMetaDataCacheSize}} is increased to 100MB+, views
> do not need to be re-resolved and the upsert rate gets back to the normal ~60K
> rows/min.
> With *100K* views and {{phoenix.client.maxMetaDataCacheSize}} set to 1GB, I
> wasn't able to create all 100K views, as the upsert time for each 1K batch
> keeps steadily increasing.
> The following graph shows the 1K batch upsert rate over time for varying
> numbers of views. Rows are upserted to random views; {{CREATE VIEW IF NOT EXISTS ...
> APPEND_ONLY_SCHEMA = true, UPDATE_CACHE_FREQUENCY=900000}} is executed before
> the upsert statement.
> !upsert_rate.png!
> The base table is also created with {{APPEND_ONLY_SCHEMA = true,
> UPDATE_CACHE_FREQUENCY = 900000, AUTO_PARTITION_SEQ}}.