Thank you Sriram for the speedy fix!  I've applied the update and haven't
seen the problem since.

Thanks again!
Josh


On Sat, Apr 25, 2009 at 1:56 PM, Sriram Rao <[email protected]> wrote:

> Thanks to Josh's logs we figured out what caused the problem.  There
> was a bug in the readahead code that was causing the client to send
> bogus values for read; we've fixed the client side and also fixed the
> server side to handle this case.  These fixes are in with revision 327
> of trunk.
>
> Sriram
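
The server-side hardening described above can be sketched roughly as follows. This is an illustrative guard with hypothetical names and limits, not the actual KFS change from r316/r327:

```cpp
#include <cstdint>

// Illustrative constants -- not the real KFS configuration values.
const int64_t kChunkSize   = 64LL << 20;  // 64 MB chunks
const int64_t kMaxReadSize = 1 << 20;     // cap on a single read op

// Reject a read whose offset/size pair cannot name valid data in a
// chunk, instead of queueing disk IO for it.
bool IsValidRead(int64_t offset, int64_t numBytes) {
    if (offset < 0 || numBytes <= 0)
        return false;
    if (numBytes > kMaxReadSize)
        return false;
    // Written as a subtraction to avoid signed overflow in
    // offset + numBytes when numBytes is huge (e.g. INT64_MAX).
    return offset <= kChunkSize - numBytes;
}
```

With a check like this, the 131072-byte reads from the log below pass, while the 9223372036854775807-byte request is failed back to the client.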
>
> On Thu, Apr 23, 2009 at 12:44 PM, Sriram Rao <[email protected]> wrote:
> > I have checked in a fix to the chunkserver to fail the read when it
> > sees bogus values (revision 316).  That was a bug in the code :-(
> > Would still be good to know what caused the client to send bogus
> > values for the size...
> >
> > Sriram
> >
> > On Thu, Apr 23, 2009 at 12:32 PM, Sriram Rao <[email protected]> wrote:
> >> Josh,
> >>
> >> Thanks.  Josh Adams had also seen the same bloat and from his logs, I
> >> had figured that the chunkserver was in a loop trying to read data and
> >> we couldn't quite figure out what was causing it.  This looks like the
> >> smoking gun :-)
> >>
> >> What was the test doing?  I am still curious why the client requested
> >> a bizarre amount of data.  I'll put in the fixes to fail the read.
> >>
> >> Sriram
> >>
> >> On Thu, Apr 23, 2009 at 12:17 PM, Joshua Taylor <[email protected]> wrote:
> >>> I have a bit more info about this problem.  It just happened to one of
> >>> my chunkservers as well.  I had replaced all log4cpp calls with fprintf
> >>> to avoid random log4cpp crashes (separate problem), so the chunkserver
> >>> is already doing debug logging.  Here's the log from the beginning of
> >>> the bloat/overload:
> >>>
> >>> 04-23-2009 05:42:26.625 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32636928 numBytes: 131072
> >>> 04-23-2009 05:42:26.630 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32768000 numBytes: 131072
> >>> 04-23-2009 05:42:26.633 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 32899072 numBytes: 131072
> >>> 04-23-2009 05:42:26.636 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33030144 numBytes: 131072
> >>> 04-23-2009 05:42:26.638 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33161216 numBytes: 131072
> >>> 04-23-2009 05:42:26.641 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 51823 chunkversion = 1 offset: 33292288 numBytes: 131072
> >>> 04-23-2009 05:42:26.648 DEBUG - (ClientSM.cc:301) Got command: read: chunkId = 49677 chunkversion = 1 offset: 16276417 numBytes: 9223372036854775807
> >>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5001) outstanding...overloaded
> >>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5002) outstanding...overloaded
> >>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5003) outstanding...overloaded
> >>> 04-23-2009 05:42:27.053 INFO - (DiskManager.cc:392) Too many disk IOs (5004) outstanding...overloaded
> >>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5005) outstanding...overloaded
> >>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5006) outstanding...overloaded
> >>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5007) outstanding...overloaded
> >>> 04-23-2009 05:42:27.054 INFO - (DiskManager.cc:392) Too many disk IOs (5008) outstanding...overloaded
> >>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5009) outstanding...overloaded
> >>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5010) outstanding...overloaded
> >>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5011) outstanding...overloaded
> >>> 04-23-2009 05:42:27.055 INFO - (DiskManager.cc:392) Too many disk IOs (5012) outstanding...overloaded
> >>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5013) outstanding...overloaded
> >>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5014) outstanding...overloaded
> >>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5015) outstanding...overloaded
> >>> 04-23-2009 05:42:27.056 INFO - (DiskManager.cc:392) Too many disk IOs (5016) outstanding...overloaded
> >>>
> >>> The line where numBytes == 2^63 - 1 is rather suspicious.  I'm not
> >>> running Heritrix/Hbase/etc.  This is just a few file copies.  I have
> >>> 8 chunkservers total.  Only one is having this problem.  The other
> >>> chunkservers are still logging that they're making progress (reading
> >>> and writing chunks).
> >>>
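
One observation about that suspicious value: 9223372036854775807 is exactly INT64_MAX, i.e. 2^63 - 1, the usual fingerprint of a signed 64-bit size that was left uninitialized or wrapped around in a size computation. A trivial check (illustrative only, not KFS code):

```cpp
#include <cstdint>
#include <limits>

// The bogus numBytes from the log equals the largest value an
// int64_t can hold, which is what an underflowed or uninitialized
// signed 64-bit readahead size would commonly look like.
inline bool isInt64Max(int64_t v) {
    return v == std::numeric_limits<int64_t>::max();  // 2^63 - 1
}
```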
> >>> Here's a stack trace from the server currently pegged at 100% CPU,
> >>> 200 GB VmSize:
> >>>
> >>> Thread 5 (Thread 0x40bd9950 (LWP 3838)):
> >>> #0  0x00000030df80af19 in pthread_cond_wait@@GLIBC_2.3.2 ()
> >>> #1  0x000000000047f2d7 in KFS::MetaThread::sleep ()
> >>> #2  0x000000000047f317 in KFS::MetaQueue<KFS::KfsOp>::dequeue_internal ()
> >>> #3  0x000000000047f414 in KFS::MetaQueue<KFS::KfsOp>::dequeue ()
> >>> #4  0x000000000047bdf3 in KFS::Logger::MainLoop ()
> >>> #5  0x000000000047bf79 in logger_main ()
> >>> #6  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
> >>> #7  0x00000030dece439d in clone () from /lib64/libc.so.6
> >>> Thread 4 (Thread 0x415da950 (LWP 3839)):
> >>> #0  0x000000000046fe1a in std::_List_const_iterator<boost::shared_ptr<KFS::DiskEvent_t> >::operator++ ()
> >>> #1  0x000000000046fe4a in std::__distance<std::_List_const_iterator<boost::shared_ptr<KFS::DiskEvent_t> > > ()
> >>> #2  0x000000000046fe93 in std::distance<std::_List_const_iterator<boost::shared_ptr<KFS::DiskEvent_t> > > ()
> >>> #3  0x000000000046fec3 in std::list<boost::shared_ptr<KFS::DiskEvent_t>, std::allocator<boost::shared_ptr<KFS::DiskEvent_t> > >::size ()
> >>> #4  0x000000000048db0a in KFS::DiskManager::IOInitiated ()
> >>> #5  0x000000000048e0ee in KFS::DiskManager::Read ()
> >>> #6  0x000000000048aeca in KFS::DiskConnection::Read ()
> >>> #7  0x0000000000452a74 in KFS::ChunkManager::ReadChunk ()
> >>> #8  0x0000000000469d04 in KFS::ReadOp::HandleChunkMetaReadDone ()
> >>> #9  0x0000000000458378 in KFS::ObjectMethod<KFS::ReadOp>::execute ()
> >>> #10 0x000000000045574e in KFS::KfsCallbackObj::HandleEvent ()
> >>> #11 0x0000000000452b9f in KFS::ChunkManager::ReadChunkMetadata ()
> >>> #12 0x0000000000467fe1 in KFS::ReadOp::Execute ()
> >>> #13 0x0000000000463ad3 in KFS::SubmitOp ()
> >>> #14 0x000000000046104e in KFS::ClientSM::HandleClientCmd ()
> >>> #15 0x000000000046163c in KFS::ClientSM::HandleRequest ()
> >>> #16 0x00000000004628e8 in KFS::ObjectMethod<KFS::ClientSM>::execute ()
> >>> #17 0x000000000045574e in KFS::KfsCallbackObj::HandleEvent ()
> >>> #18 0x0000000000493a20 in KFS::NetConnection::HandleReadEvent ()
> >>> #19 0x00000000004949ca in KFS::NetManager::MainLoop ()
> >>> #20 0x000000000045f2c0 in netWorker ()
> >>> #21 0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
> >>> #22 0x00000030dece439d in clone () from /lib64/libc.so.6
> >>> Thread 3 (Thread 0x415e2950 (LWP 9108)):
> >>> #0  0x00000030df80b19d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
> >>> #1  0x00000030e0003183 in handle_fildes_io () from /lib64/librt.so.1
> >>> #2  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
> >>> #3  0x00000030dece439d in clone () from /lib64/libc.so.6
> >>> Thread 2 (Thread 0x415ea950 (LWP 9508)):
> >>> #0  0x00000030df80b19d in pthread_cond_timedwait@@GLIBC_2.3.2 ()
> >>> #1  0x00000030e0003183 in handle_fildes_io () from /lib64/librt.so.1
> >>> #2  0x00000030df80729a in start_thread () from /lib64/libpthread.so.0
> >>> #3  0x00000030dece439d in clone () from /lib64/libc.so.6
> >>> Thread 1 (Thread 0x7fd83a463700 (LWP 3837)):
> >>> #0  0x00000030df807b75 in pthread_join () from /lib64/libpthread.so.0
> >>> #1  0x000000000045f87e in KFS::MetaThread::join ()
> >>> #2  0x000000000045f299 in KFS::ChunkServer::MainLoop ()
> >>> #3  0x000000000044e0b8 in main ()
> >>>
> >>> Everything looks blocked except Thread 4, so I'm guessing it's the
> >>> one going berserk.
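
That guess is consistent with the trace: in the libstdc++ of that era, `std::list::size()` is O(n), and frames #0-#3 of Thread 4 show it walking the whole `DiskEvent_t` list via `std::distance` from inside `DiskManager::IOInitiated`. With thousands of outstanding IOs queued, every new read recounts the entire list, which alone can peg a core. A sketch of the usual remedy, with hypothetical names (not the actual KFS fix), is to track the count alongside the list:

```cpp
#include <list>
#include <cstddef>

// Illustrative stand-in for a disk-event queue.  Keeping an explicit
// counter makes the "too many IOs outstanding" check O(1) instead of
// walking the list on every initiated IO.
class IoQueue {
    std::list<int> events_;   // stands in for the DiskEvent list
    std::size_t count_;       // maintained alongside the list
public:
    IoQueue() : count_(0) {}
    void push(int ev) { events_.push_back(ev); ++count_; }
    void pop()        { events_.pop_front();  --count_; }
    std::size_t size() const { return count_; }               // O(1)
    bool overloaded(std::size_t limit) const { return count_ > limit; }
};
```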
> >>>
> >>>
> >>> On Fri, Apr 17, 2009 at 10:22 AM, Sriram Rao <[email protected]> wrote:
> >>>>
> >>>> Hi Andy,
> >>>>
> >>>> Thanks for the info.  From what you are seeing, it looks like
> >>>> something is causing a memory bloat, causing too many IOs and
> >>>> eventually everything grinds to a halt.  Since it is easy to
> >>>> reproduce, I can give it a try on my machine.  Can you give me the
> >>>> inputs to Heritrix and any config info to Hbase?
> >>>>
> >>>> It'd also be good if you can get me the chunkserver logs.  In the
> >>>> Chunkserver.prp file, can you add the following line:
> >>>> chunkServer.loglevel = DEBUG
> >>>>
> >>>> and send me the file/upload it?  The easier thing to do is to file a
> >>>> bug about this issue on the KFS sourceforge page and then upload the
> >>>> chunkserver logs.
> >>>>
> >>>> Def. want to get to the bottom of this since this is blocking you/Ryan.
> >>>>
> >>>> Sriram
> >>>>
> >>>> On Fri, Apr 17, 2009 at 4:16 AM, Andrew Purtell <[email protected]> wrote:
> >>>> >
> >>>> > Hi Sriram,
> >>>> >
> >>>> >> Can you tell me the exact steps to repro the problem:
> >>>> >>  - What version of Hbase?
> >>>> >
> >>>> > SVN trunk, 0.20.0-dev
> >>>> >
> >>>> >>  - Which version of Heritrix?
> >>>> >
> >>>> > Heritrix 2.0, plus the HBase writer which can be found here:
> >>>> > http://code.google.com/p/hbase-writer/
> >>>> >
> >>>> >> What is happening is that the KFS chunkserver is sending
> >>>> >> writes down to disk and they aren't coming back "soon
> >>>> >> enough", causing things to backlog; the chunkserver is
> >>>> >> printing out the backlog status message.
> >>>> >
> >>>> > I wonder if this might be a secondary effect. Just before
> >>>> > these messages begin streaming into the log, the chunkserver
> >>>> > suddenly balloons its address space from ~200KB to ~100GB.
> >>>> > These two things have strong correlation and happen in the
> >>>> > same order in repeatable manner.
> >>>> >
> >>>> > Once the backlog messages begin, no further IO completes as
> >>>> > far as I can tell. The count of outstanding IOs
> >>>> > monotonically increases. Also, the metaserver declares the
> >>>> > chunkserver dead.
> >>>> >
> >>>> > I can take steps to help diagnose the problem. Please advise.
> >>>> > Would it help if I replicate the problem again with
> >>>> > chunkserver logging at DEBUG and then post the compressed
> >>>> > logs somewhere?
> >>>> >
> >>>> > [...]
> >>>> >> On Thu, Apr 16, 2009 at 12:27 AM, Andrew Purtell
> >>>> >> >
> >>>> >> > Hi,
> >>>> >> >
> >>>> >> > Like Ryan I have been trying to run HBase on top of
> >>>> >> > KFS. In my case I am running a SVN snapshot from
> >>>> >> > yesterday. I have a minimal installation of KFS
> >>>> >> > metaserver, chunkserver, and HBase master and
> >>>> >> > regionserver all running on one test host with 4GB of
> >>>> >> > RAM. Of course I do not expect more than minimal
> >>>> >> > function. To apply some light load, I run the Heritrix
> >>>> >> > crawler with 5 TOE threads which write on average
> >>>> >> > 200 Kbit/sec of data into HBase, which flushes this
> >>>> >> > incoming data in ~64MB increments and also runs
> >>>> >> > occasional compaction cycles where the 64MB flush
> >>>> >> > files will be compacted into ~256MB files.
> >>>> >> >
> >>>> >> > I find that for no obvious reason the chunkserver will
> >>>> >> > suddenly grab ~100 GIGAbytes of address space and emit
> >>>> >> > a steady stream of "(DiskManager.cc:392) Too many disk
> >>>> >> > IOs (N)" to the log at INFO level, where N is a
> >>>> >> > steadily increasing number. The host is under moderate
> >>>> >> > load at the time -- KFS is busy -- but is not in swap
> >>>> >> > and according to atop has some disk I/O and network
> >>>> >> > bandwidth to spare.
> >>>> > [...]
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Kosmosfs-users mailing list
> >>>> [email protected]
> >>>> https://lists.sourceforge.net/lists/listinfo/kosmosfs-users
> >>>
> >>>
> >>
> >
>
