Hey Doug,

Yes, that's exactly what was happening. I've since rebuilt everything with tcmalloc (google-perftools) according to the docs, and memory usage is now more manageable, but I still see high consumption and eventual memory exhaustion during heavy updates.
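For what it's worth, a quick way to confirm that a rebuilt binary actually picked up tcmalloc (assuming it was linked dynamically) is to check its shared library dependencies; the install path and binary names below are just examples of where they might live:

    ldd /opt/hypertable/current/bin/Hypertable.RangeServer | grep tcmalloc
    ldd /opt/hypertable/current/bin/ThriftBroker | grep tcmalloc

If libtcmalloc shows up in the output for each server binary, the rebuild linked it in; if not, that binary is still running the stock allocator.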
A new problem I've encountered with the tcmalloc-built binaries is that the ThriftBroker hangs shortly after it completes some random number of reads or updates, usually within a minute or two of activity. I tried using the non-tcmalloc ThriftBroker binary with the currently running tcmalloc master/rangeservers/kosmosbrokers and it still hung. I'm going to go back and start a fresh Hypertable instance with the non-tcmalloc binaries for everything to see if the problem goes away. It could also be changes to our app code causing the ThriftBroker hangs; we'll see.

Thanks for the update, btw! :-)

Josh

On Wed, Apr 15, 2009 at 9:31 PM, Doug Judd <[email protected]> wrote:
> Hi Josh,
>
> Is it possible that the system underwent heavy update activity during that
> time period? We don't have request throttling in place yet (it should be out
> next week), so it is possible for the RangeServer to exhaust memory under
> heavy update workloads. It looks like the commit log got truncated/corrupted
> when the machine died. You can tell the RangeServer to skip commit log
> errors with the following property:
>
> Hypertable.CommitLog.SkipErrors=true
>
> The data in the commit log that is being skipped will most likely be lost.
>
> - Doug
>
> On Mon, Apr 13, 2009 at 1:10 PM, Josh Adams <[email protected]> wrote:
>>
>> On Mon, Apr 13, 2009 at 9:58 AM, Doug Judd <[email protected]> wrote:
>> > No, it shouldn't. One thing that might help is to install tcmalloc
>> > (google-perftools) and then re-build. You'll need to have tcmalloc
>> > installed in all your runtime environments.
>>
>> Ok, thanks. I'll try that out, hopefully this week, and let you know.
>>
>> > 157 on it a while back. It would be interesting to know if the disk
>> > subsystems on any of your machines are getting saturated during this low
>> > throughput condition. If so, then there probably is not much we can do
>>
>> Good point, I'll keep an eye on that.
>>
>> I was out of town on a short trip over the weekend and wasn't watching
>> our Hypertable instance very closely. During the early morning hours on
>> Saturday it looks like each of the four machines running
>> RangeServer/kosmosBroker/ThriftBroker had its memory spike heavily for
>> about an hour. The root RangeServer started swapping and the machine
>> went down later that day. I can't start the instance back up at the
>> moment because the root RangeServer complains about the following error
>> and dies when I try to start it:
>>
>> 1239651998 ERROR Hypertable.RangeServer : load_next_valid_header
>> (/data/tmp/dev/src/hypertable/6d5fdd1/src/cc/Hypertable/Lib/CommitLogBlockStream.cc:148):
>> Hypertable::Exception: Error reading 34 bytes from DFS fd 1057 -
>> HYPERTABLE failed expectation
>>   at virtual size_t Hypertable::DfsBroker::Client::read(int32_t, void*, size_t)
>>     (/data/tmp/dev/src/hypertable/6d5fdd1/src/cc/DfsBroker/Lib/Client.cc:258)
>>   at size_t Hypertable::ClientBufferedReaderHandler::read(void*, size_t)
>>     (/data/tmp/dev/src/hypertable/6d5fdd1/src/cc/DfsBroker/Lib/ClientBufferedReaderHandler.cc:161):
>>     empty queue
>>
>> I've attached a file containing the relevant errors from the end of its
>> log, and also the whole kosmosBroker log file for that startup attempt.
>>
>> Cheers,
>> Josh
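(A side note on the property Doug mentions above: it's a regular Hypertable config property, so it can simply be dropped into the installation's config file before restarting the RangeServer. The path below just assumes a default install layout:

    # /opt/hypertable/current/conf/hypertable.cfg
    # Skip over truncated/corrupt commit log blocks at startup instead of
    # aborting; any data in the skipped blocks is most likely lost.
    Hypertable.CommitLog.SkipErrors=true

It's probably worth removing the setting again once the instance is back up, so later commit log corruption isn't silently skipped.)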
