[
https://issues.apache.org/jira/browse/THRIFT-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252303#comment-13252303
]
Yingfeng Zhang commented on THRIFT-1559:
----------------------------------------
As to llalloc (the lock-free memory allocator), it has a defragmentation
mechanism similar to jemalloc's: memory consumption keeps growing for some
time and is then reduced to some degree periodically.
The overall defragmentation performance of llalloc is better than jemalloc's:
fragmentation still causes about 1.5G of memory consumption.
As a result, unless the Thrift RPC layer itself can be refined further, no
available memory allocator can fully resolve the memory fragmentation problem.
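For reference, the growth pattern described above can be observed on Linux
(the reported environment) by sampling the resident-set size around each RPC
batch. A minimal sketch, where batch_once() is a hypothetical stand-in for
building and immediately destroying one result map:

    #include <cstdio>
    #include <unistd.h>

    // Resident-set size of the current process in KiB, taken from
    // /proc/self/statm (fields are in pages: size resident shared ...).
    static long resident_kb() {
      long resident = 0;
      std::FILE* f = std::fopen("/proc/self/statm", "r");
      if (f) {
        if (std::fscanf(f, "%*ld %ld", &resident) != 1) resident = 0;
        std::fclose(f);
      }
      return resident * (sysconf(_SC_PAGESIZE) / 1024);
    }

    int main() {
      for (int batch = 0; batch < 1000; ++batch) {
        // batch_once();  // hypothetical: one multiget_slice call, result dropped
        std::printf("batch %d: RSS %ld KiB\n", batch, resident_kb());
      }
      return 0;
    }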
> Provide memory pool for TBinaryProtocol to eliminate memory fragmentation
> -------------------------------------------------------------------------
>
> Key: THRIFT-1559
> URL: https://issues.apache.org/jira/browse/THRIFT-1559
> Project: Thrift
> Issue Type: Improvement
> Components: C++ - Library
> Affects Versions: 0.8
> Environment: Linux
> Reporter: Yingfeng Zhang
> Labels: memory
>
> We use the Thrift C++ client library (0.7/0.8) to communicate with Apache
> Cassandra (1.0), and we frequently need to fetch large amounts of data from
> Cassandra. The data returned by multiget_slice has the following type:
> std::map<std::string, std::vector<ColumnOrSuperColumn> >, where
> ColumnOrSuperColumn is a struct composed of several std::map members with
> std::string keys.
> Suppose we have 1M records and fetch 1k per call, so each call leaves 1k
> records in such a "std::map<std::string, std::vector<ColumnOrSuperColumn> >"
> and we need to call the Thrift RPC 1k times. We destroy that map immediately
> after each RPC, which means we do nothing but perform the RPC operation.
> During that period we found that the memory consumption keeps growing, even
> if we attach jemalloc to the process for memory defragmentation.
> No matter how we tune the batch size (the 1k above), ranging from 10 to 20k,
> the memory fragmentation stays at a high percentage. This means that with
> more data, say 10M records, the RPC operation alone will eat up the memory:
> in fact, our process was killed by the OS because it consumed too much
> memory.
> We believe the current memory-usage design of the Thrift C++ client causes
> too much memory fragmentation, and the issue appears to be more serious with
> more data as well as with more complicated structs such as those defined by
> Cassandra.
> I suggest providing a memory pool for the Thrift C++ library.
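> As a rough illustration of the memory-pool idea only (this is not Thrift
> code; FakeColumn and the surrounding names are hypothetical stand-ins, and a
> real generated struct would need allocator-aware members), a per-batch arena
> in the style of C++17 std::pmr::monotonic_buffer_resource releases a whole
> batch in one shot instead of returning many small blocks to the
> general-purpose heap:
>
>     #include <map>
>     #include <memory_resource>
>     #include <string>
>     #include <vector>
>
>     // Hypothetical stand-in for the generated ColumnOrSuperColumn struct,
>     // made allocator-aware so the arena also reaches its string members.
>     struct FakeColumn {
>       using allocator_type = std::pmr::polymorphic_allocator<char>;
>       std::pmr::string name;
>       std::pmr::string value;
>       explicit FakeColumn(allocator_type a = {}) : name(a), value(a) {}
>       FakeColumn(const FakeColumn& o, allocator_type a)
>           : name(o.name, a), value(o.value, a) {}
>     };
>
>     using SliceResult =
>         std::pmr::map<std::pmr::string, std::pmr::vector<FakeColumn>>;
>
>     void one_batch() {
>       // One arena per RPC batch: map nodes, keys, vectors and column
>       // strings are all carved out of this buffer and released together,
>       // so repeated batches cannot fragment the general-purpose heap.
>       std::pmr::monotonic_buffer_resource arena(1 << 20);  // 1 MB up front
>       SliceResult result(&arena);
>       // client.multiget_slice(result, ...);  // hypothetical RPC fill
>       // ... consume result ...
>     }  // arena and everything allocated from it are dropped here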
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira