[ https://issues.apache.org/jira/browse/KUDU-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17501086#comment-17501086 ]
ASF subversion and git services commented on KUDU-3197:
-------------------------------------------------------
Commit f4b6d8917b79b9de53957174ade1a7ffc76e0090 in kudu's branch
refs/heads/master from shenxingwuying
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=f4b6d89 ]
KUDU-3197 [tserver] optimal Schema's memory used, using std::shared_ptr
Change TabletMeta's Schema* member to std::shared_ptr<Schema>
to reduce memory usage when altering a schema.
TabletMeta keeps the older schemas in old_schemas when a schema is
altered, because they may still be in use by scanners or
compaction jobs. As KUDU-3197 describes, frequent schema alterations
can make the tserver's memory grow very large, much like a memory leak,
especially when the number of columns is very large.
The issue was filed by wangningito; I am continuing his work and
now use std::shared_ptr instead of scoped_refptr<Schema>, because
scoped_refptr<Schema> requires too many changes, as seen in:
https://gerrit.cloudera.org/c/18098/
Change-Id: Ic284dde108c49130419d876c6698b40c195e9b35
Reviewed-on: http://gerrit.cloudera.org:8080/18255
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <[email protected]>
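
A minimal sketch of the ownership change the commit message describes, not
the actual Kudu code. Schema here is a stand-in struct, and TabletMetaSketch,
SchemaPtr and SetSchema are hypothetical names chosen for illustration. The
idea is that readers hold a std::shared_ptr to the schema, so an old schema
is freed once the last reader drops it instead of being kept alive forever.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for kudu::Schema; the real class is far richer.
struct Schema {
  std::vector<std::string> columns;
};

using SchemaPtr = std::shared_ptr<const Schema>;

// Hypothetical, simplified metadata holder (not Kudu's TabletMetadata).
class TabletMetaSketch {
 public:
  explicit TabletMetaSketch(Schema initial)
      : schema_(std::make_shared<Schema>(std::move(initial))) {}

  // Readers receive a shared owner, so the schema they hold stays valid
  // even if the schema is altered while they are still using it.
  SchemaPtr schema() const {
    std::lock_guard<std::mutex> l(lock_);
    return schema_;
  }

  // Replacing the schema drops this object's reference to the old one.
  // The old schema is destroyed when its last remaining holder releases it.
  void SetSchema(Schema next) {
    SchemaPtr fresh = std::make_shared<Schema>(std::move(next));
    std::lock_guard<std::mutex> l(lock_);
    schema_ = std::move(fresh);
  }

 private:
  mutable std::mutex lock_;
  SchemaPtr schema_;
};
```

In this sketch a scanner that called schema() before an alter keeps reading
the old column list, and the old Schema is released as soon as that scanner
lets go of its pointer.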
> Tablet keeps all history schemas in memory may result in high memory
> consumption
> --------------------------------------------------------------------------------
>
> Key: KUDU-3197
> URL: https://issues.apache.org/jira/browse/KUDU-3197
> Project: Kudu
> Issue Type: Improvement
> Components: tablet
> Affects Versions: 1.12.0
> Reporter: wangningito
> Assignee: wangningito
> Priority: Minor
> Attachments: image-2020-09-25-14-45-33-402.png,
> image-2020-09-25-14-49-30-913.png, image-2020-09-25-15-05-44-948.png,
> image-2020-12-02-19-59-46-733.png, screenshot-1.png
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> When a table is updated at high frequency, the memory consumption of
> kudu-tserver may be very high, and that memory is not tracked on the memory
> page.
> This is the memory usage of a tablet: the peak memory consumption of
> tablet-xxx is 3.6G, but none of its children's memory comes close to that.
> !image-2020-09-25-14-45-33-402.png!
> So I used pprof to get a heap sample. The tserver had been running for a long
> time, but the memory was still held by TabletBootstrap::PlayAlterSchemaRequest.
> !image-2020-09-25-14-49-30-913.png!
> I changed `old_schemas_` in tablet_metadata.h (declared as follows) to a
> fixed-size vector:
> // Previous values of 'schema_'.
> // These are currently kept alive forever, under the assumption that
> // a given tablet won't have thousands of "alter table" calls.
> // They are kept alive so that callers of schema() don't need to
> // worry about reference counting or locking.
> std::vector<Schema*> old_schemas_;
> The heap sample then becomes:
> !image-2020-09-25-15-05-44-948.png!
> So, to make the application layer more flexible, it would be better to make
> the size of old_schemas configurable.
>
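
The bounded history proposed in the issue could look roughly like the sketch
below. It is an illustration under stated assumptions, not the change that
was made in Kudu: Schema is a stand-in struct, and SchemaHistorySketch,
Retire and max_old are hypothetical names. It also assumes shared ownership
of schemas, since evicting a raw Schema* that a scanner still uses would be
unsafe.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>

// Hypothetical stand-in for kudu::Schema.
struct Schema {
  int version;
};

// Keeps at most 'max_old' retired schemas and evicts the oldest first.
class SchemaHistorySketch {
 public:
  explicit SchemaHistorySketch(std::size_t max_old) : max_old_(max_old) {}

  // Record a schema that has just been replaced by a newer one.
  void Retire(std::shared_ptr<const Schema> old_schema) {
    old_.push_back(std::move(old_schema));
    // Drop the oldest entries once the configured limit is exceeded.
    // A dropped schema is only destroyed when no other holder remains.
    while (old_.size() > max_old_) {
      old_.pop_front();
    }
  }

  std::size_t size() const { return old_.size(); }

  const std::deque<std::shared_ptr<const Schema>>& entries() const {
    return old_;
  }

 private:
  std::size_t max_old_;
  std::deque<std::shared_ptr<const Schema>> old_;
};
```

A std::deque is used here only because it makes dropping the oldest entry
cheap; a vector with a size cap would behave the same way for small limits.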