Re: [bitcoin-dev] Memory leaks?

Jonathan Toomim via bitcoin-dev Tue, 20 Oct 2015 20:02:39 -0700

More notes:

1. I ran a side-by-side comparison with two bitcoind processes (Core, same 
recent git commit as before) on the same computer with the same settings 
running on different ports. With both processes, I logged RSS (via 
/proc/$pid/status) every 6 seconds. With one of those processes, I also ran 
bitcoin-cli getblocktemplate > /dev/null every 6 seconds. I let that run for 
about 30 hours. A graph and links to the CSVs of raw data are below. Results 
seem pretty clear: the getblocktemplate RPC is implicated in this issue.



http://toom.im/files/memlog8518.csv
http://toom.im/files/memlog-nogbt-8503.csv
http://toom.im/files/bitcoind_memory_usage_gbt.png


2. I ran valgrind twice, for about 6 hours each, on bitcoind while hitting it 
with getblocktemplate every 6 hours. Full valgrind output can be found at these 
two URLs:

http://toom.im/files/valgrind-gbt-1.log
http://toom.im/files/valgrind-gbt-2.log

The summary:

==4064== LEAK SUMMARY:
==4064==    definitely lost: 0 bytes in 0 blocks
==4064==    indirectly lost: 0 bytes in 0 blocks
==4064==      possibly lost: 288 bytes in 1 blocks
==4064==    still reachable: 527,594 bytes in 4,367 blocks
==4064==         suppressed: 0 bytes in 0 blocks
The main components of that still reachable section seem to just be one output 
of CreateNewBlock that's cached in case another getblocktemplate request is 
received before any new transactions come in:

==4064== 98,304 bytes in 1 blocks are still reachable in loss record 39 of 40
==4064==    at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324)
==4064==    by 0x28EAA1: 
__gnu_cxx::new_allocator<CTransaction>::allocate(unsigned long, void const*) 
(new_allocator.h:104)
==4064==    by 0x27EE44: __gnu_cxx::__alloc_traits<std::allocator<CTransaction> 
>::allocate(std::allocator<CTransaction>&, unsigned long) (alloc_traits.h:182)
==4064==    by 0x26DFB0: std::_Vector_base<CTransaction, 
std::allocator<CTransaction> >::_M_allocate(unsigned long) (stl_vector.h:170)
==4064==    by 0x2D5BDE: std::vector<CTransaction, std::allocator<CTransaction> 
>::_M_insert_aux(__gnu_cxx::__normal_iterator<CTransaction*, 
std::vector<CTransaction, std::allocator<CTransaction> > >, CTransaction 
const&) (vector.tcc:353)
==4064==    by 0x2D3FF8: std::vector<CTransaction, std::allocator<CTransaction> 
>::push_back(CTransaction const&) (stl_vector.h:925)
==4064==    by 0x2D113E: CreateNewBlock(CScript const&) (miner.cpp:298)
==4064==    by 0x442D78: getblocktemplate(UniValue const&, bool) 
(rpcmining.cpp:513)
==4064==    by 0x390CEB: CRPCTable::execute(std::string const&, UniValue 
const&) const (rpcserver.cpp:526)
==4064==    by 0x41C5AB: HTTPReq_JSONRPC(HTTPRequest*, std::string const&) 
(httprpc.cpp:125)
==4064==    by 0x3559BD: boost::detail::function::void_function_invoker2<bool 
(*)(HTTPRequest*, std::string const&), void, HTTPRequest*, std::string 
const&>::invoke(boost::detail::function::function_buffer&, HTTPRequest*, 
std::string const&) (function_template.hpp:112)
==4064==    by 0x422520: boost::function2<void, HTTPRequest*, std::string 
const&>::operator()(HTTPRequest*, std::string const&) const 
(function_template.hpp:767)

There are a few other similar loss records (mostly referring to pblock or 
pblocktemplate in CreateNewBlock(...), but I see nothing that can explain the 
multi-GB memory consumption.

3. One user on the bitcointalk p2pool thread 
(https://bitcointalk.org/index.php?topic=18313.msg12733791#msg12733791) claimed 
that he had this memory usage issue on Linux, but not on Mac OS X, under a GBT 
workload in both situations. If this is true, that would suggest this might be 
a fragmentation issue due to poor memory allocation. The other likely 
hypothesis is bloated caches. Looking into those two possibilities will be my 
next steps.



On Oct 20, 2015, at 5:39 AM, Jonathan Toomim <j...@toom.im> wrote:

> I did that Sunday twice. I'll report the results soon. Short version is that 
> it looks like valgrind is just finding 200 kB to 600 kB of pblocktemplate, 
> which is declared as a static pointer. Not exactly the multi-GB leak I'm 
> looking for, but possibly related.
> 
> I've also got two bitcoind processes running on the same machine that I 
> started at the same time, running on different ports, all with the same 
> settings, but one of which is serving getblocktemplate every 5-6 seconds and 
> the other is not, while logging RSS on both every 6 seconds. RSS for the 
> non-serving node is now 734 MB, and for the serving node 1997 MB. Graphs 
> coming soon.
> 
> 
> On Oct 20, 2015, at 3:12 AM, Mike Hearn <he...@vinumeris.com> wrote:
> 
>> OK, then running under Valgrind whilst sending gbt RPCs would be the next 
>> step.
>> 
>> On Mon, Oct 19, 2015 at 9:17 PM, Multipool Admin <ad...@multipool.us> wrote:
>> My nodes are continuously running getblocktemplate and getinfo, and I also 
>> suspected the issue is in either gbt or the rpc server.
>> 
>> The instance only takes a few hours to get up to that memory usage.
>> 
>> On Oct 18, 2015 8:59 AM, "Jonathan Toomim via bitcoin-dev" 
>> <bitcoin-dev@lists.linuxfoundation.org> wrote:
>> On Oct 14, 2015, at 2:39 AM, Wladimir J. van der Laan <laa...@gmail.com> 
>> wrote:
>>> This is *most likely* the mempool, but is just not reported correctly.
>> 
>> I did some testing with PR #6410's better mempool reporting. The improved 
>> reporting suggests that actual in-memory usage ("usage":) by CTxMemPool is 
>> about 2.5x to 3x higher than the serialized transaction sizes ("bytes":). 
>> The excess memory usage that I'm seeing is on the order of 100x higher than 
>> the mempool "bytes": value. As such, I think it's unlikely that this is the 
>> mempool, or at least not normal/correct mempool behavior.
>> 
>> Another user (ad...@multipool.us) reported 35 GB of RSS usage. I'm guessing 
>> his bitcoind has been running longer than any of mine. His server definitely 
>> has more RAM. I don't know which email list he is subscribed to (probably 
>> XT), so I'm sharing it with both lists to make sure you're all aware of how 
>> big an issue this can be.
>> 
>>> In the meantime you can mitigate the mempool growth by setting `-mintxfee`, 
>>> see
>>> https://github.com/bitcoin/bitcoin/blob/v0.11.0/doc/release-notes.md#transaction-flooding
>> 
>> I have mintxfee and minrelaytxfee set to about 0.00003, which is high enough 
>> to exclude essentially all of the of the 14700-14800 byte flood 
>> transactions. My nodes' mempools only contain about one or two blocks' worth 
>> of transactions. So I don't think this is correct either.
>> 
>> 
>> 
>> Some additional notes on this issue:
>> 
>> 1. I think it's related to CreateNewBlock() and getblocktemplate. I ran a 
>> Core bitcoind process (commit d78a880) overnight with no mining connected to 
>> it, and (IIRC -- my memory is fuzzy) when I woke up it was using around 400 
>> MB of RSS and the mempool was at around "bytes":10MB, "usage": 25MB. I ran 
>> ./bitcoin-cli getblocktemplate once, and IIRC the RSS shot up to around 800 
>> MB. I then ran getblocktemplate every 5 seconds for about 30 minutes, and 
>> RSS climbed to 1180 MB. An hour after that with more getblocktemplates, and 
>> now RSS is at 1350 MB. [Edit: 1490 MB about 30 minutes later.] 
>> getmempoolinfo is still showing "usage" around 25MB or less.
>> 
>> I'll do some more testing with this and see if I can make it repeatable, and 
>> record the results more carefully. Expect a follow-up from me in a day or 
>> two.
>> 
>> 2. valgrind did not show anything super promising. It did report this:
>> 
>> ==6880== LEAK SUMMARY:
>> ==6880==    definitely lost: 0 bytes in 0 blocks
>> ==6880==    indirectly lost: 0 bytes in 0 blocks
>> ==6880==      possibly lost: 288 bytes in 1 blocks
>> ==6880==    still reachable: 10,552 bytes in 39 blocks
>> ==6880==         suppressed: 0 bytes in 0 blocks
>> (Bitcoin Core commit d78a880)
>> 
>> and this:
>> ==6778== LEAK SUMMARY:
>> ==6778==    definitely lost: 0 bytes in 0 blocks
>> ==6778==    indirectly lost: 0 bytes in 0 blocks
>> ==6778==      possibly lost: 320 bytes in 1 blocks
>> ==6778==    still reachable: 10,080 bytes in 32 blocks
>> ==6778==         suppressed: 0 bytes in 0 blocks
>> (Bitcoin XT commit fe446d)
>> 
>> I haven't found anything in there yet that I think would produce the 
>> multi-GB memory usage after running for a few days, but I could be missing 
>> it. Email me if you want the full log.
>> 
>> I did not try running getblocktemplate while valgrind was running. I'll have 
>> to try that. I also have not let valgrind run for more than an hour.
>> 
>> 
>> 
>> P.S.: Sorry for all the cross-post confusion and consequent flamewar 
>> fallout. While it's probably too late for this thread, I'll make sure to 
>> post in a manner that keeps the threads clearly separate in the future (e.g. 
>> different subject lines).
>> 
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>> 
>> 
>> --
>> You received this message because you are subscribed to the Google Groups 
>> "bitcoin-xt" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to bitcoin-xt+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>> 
>

signature.asc
Description: Message signed with OpenPGP using GPGMail

_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Re: [bitcoin-dev] Memory leaks?

Reply via email to