I traced the problem through the source code. What I suspect is this: there
are only 100 workers in the RangeServer, but the clients have 5*50=250 threads
in total. When the data in the RangeServer gets large, all workers may be busy,
so the later client threads (the remaining 150) just send their data, and their
IOHandlers are then removed because no workers are free. So when the
RangeServer tries to call cb->response_ok(), it cannot find the IOHandler for
that CommAddr, just as the log shows:
Comm.cc:273) No connection for 10.20.130.101:54645 - COMM not connected

Does it matter if RangeServer.workers <= 1/5 * client_threads under heavy
insert and scan load?
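
To make the arithmetic behind this question concrete, here is a rough sketch
in Python (it is not Hypertable code). It assumes every client thread keeps
exactly one request in flight and that each request takes a fixed service
time. The names queued_requests and worst_case_wait and the 2-second service
time are made up for illustration only.

```python
def queued_requests(clients, threads_per_client, workers):
    """Requests left waiting when every client thread has one request in flight."""
    outstanding = clients * threads_per_client
    return max(0, outstanding - workers)


def worst_case_wait(clients, threads_per_client, workers, service_seconds):
    """Rough upper bound on how long the last queued request waits for a worker.

    Assumes (hypothetically) that every request takes service_seconds and that
    workers pick up queued requests in full batches.
    """
    waiting = queued_requests(clients, threads_per_client, workers)
    # Ceiling division: batches of busy workers the last request must sit through.
    rounds = -(-waiting // workers)
    return rounds * service_seconds


if __name__ == "__main__":
    # 5 clients * 50 threads against 100 workers, as in my setup.
    print(queued_requests(5, 50, 100))        # 150
    print(worst_case_wait(5, 50, 100, 2.0))   # 4.0
```

If the real wait under load grows past the client request timeout, the client
would give up before a worker ever picks up the request, which would be
consistent with the "request timeout" in the client log. This is only my
guess at the mechanism.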

I will try the 0.9.5.0 "pre4" release. Could you tell me which of the bugs you
fixed sounds like the problem I hit?





2011/5/10 Doug Judd <[email protected]>

> Hi Pany,
>
> Can you try with the 0.9.5.0 "pre4" release?  We overhauled the code
> specifically to fix a number of stability issues and the symptom that you
> describe sounds like one of the bugs that we fixed.
>
> - Doug
>
> On Mon, May 9, 2011 at 6:59 PM, Pany Yue <[email protected]> wrote:
>
>> Hi,
>>
>> I deployed a Hypertable cluster on one machine, using LocalBroker. There are
>> 5 clients, each with 50 threads, doing scans and inserts on two tables
>> concurrently.
>> Each table has 100 columns.
>> After the system had been running for one day, I found that the clients
>> could not insert or scan anything on the RangeServer. I then used the ht
>> shell to test a scan; I got the namespace and table listing,
>> but it blocked when I ran "select * from table";
>>
>> After a little while, all clients dumped core. A few minutes later, the
>> RangeServer dumped core and exited.
>>
>> Here is the backtrace for the client:
>> (gdb) bt
>> #0  0x00002aab0c77c3d0 in ?? ()
>> #1  0x00002aaeeeefc21a in Hypertable::IOHandler::deliver_event () from
>> /home/combo/usr/lib/libHyperComm.so
>> #2  0x00002aaeeeef995d in Hypertable::IOHandlerData::handle_message_body
>> () from /home/combo/usr/lib/libHyperComm.so
>> #3  0x00002aaeeeefa198 in Hypertable::IOHandlerData::handle_event () from
>> /home/combo/usr/lib/libHyperComm.so
>> #4  0x00002aaeeef10f6d in Hypertable::ReactorRunner::operator() () from
>> /home/combo/usr/lib/libHyperComm.so
>> #5  0x00002aaeeef0f900 in
>> boost::detail::thread_data<Hypertable::ReactorRunner>::run () from
>> /home/combo/usr/lib/libHyperComm.so
>> #6  0x00002aaeef782bbb in thread_proxy () from
>> /home/combo/iprocess_client/lib/libboost_thread.so.1.43.0
>> #7  0x00000036a9c0673d in start_thread () from /lib64/libpthread.so.0
>> #8  0x00000036a90d3f6d in clone () from /lib64/libc.so.6
>>
>> client log:
>>
>> 1304935833 WARN Hypertable :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/AsyncComm/IOHandlerData.cc:590)
>> Received response for non-pending event (id=4424064,version=1,total_len=42)
>> 1304935833 WARN Hypertable :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/AsyncComm/IOHandlerData.cc:590)
>> Received response for non-pending event (id=4424001,version=1,total_len=42)
>> 1304935833 WARN Hypertable :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/AsyncComm/IOHandlerData.cc:590)
>> Received response for non-pending event (id=4424007,version=1,total_len=42)
>> 1304939118 ERROR Hypertable :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/Lib/TableMutator.cc:51)
>> caught std::exception:
>> 1304939118 ERROR Hypertable : ~TableMutator
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/Lib/TableMutator.cc:80):
>> Hypertable::Exception:  - HYPERTABLE request timeout
>>     at void
>> Hypertable::TableMutator::wait_for_previous_buffer(Hypertable::Timer&)
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/Lib/TableMutator.cc:405)
>>     at bool
>> Hypertable::TableMutatorCompletionCounter::wait_for_completion(Hypertable::Timer&)
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/Lib/TableMutatorCompletionCounter.h:71): ,
>> final flush
>>
>>
>> Here is the RangeServer log:
>>
>> 1304938467 ERROR Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/AsyncComm/Comm.cc:273)
>> No connection for 10.20.130.101:54645 - COMM not connected
>> 1304938467 ERROR Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/RangeServer.cc:1939)
>> Problem sending OK response - COMM not connected
>> 1304938472 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/Lib/CommitLog.cc:255)
>> Purging commit log fragments with latest revision older than
>> 1304933365168229004
>> 1304938472 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/Lib/CommitLog.cc:275)
>> clgc LOG FRAGMENT PURGE breaking because 1304934571714547031 >=
>> 1304933365168229004
>> 1304938472 ERROR Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/AsyncComm/Comm.cc:273)
>> No connection for 10.20.130.101:54645 - COMM not connected
>> 1304938472 ERROR Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/RangeServer.cc:1939)
>> Problem sending OK response - COMM not connected
>> 1304938472 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:171)
>> Memory Statistics (MB): VM=8161.32, RSS=6644.96, tracked=3445.58,
>> computed=3445.58 limit=8032.00
>> 1304938472 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc:176)
>> Memory Allocation: BlockCache=87.06% BlockIndex=3.08% BloomFilter=0.44%
>> CellCache=7.96% ShadowCache=0.00% QueryCache=1.45%
>> 1304938477 ERROR Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/AsyncComm/Comm.cc:273)
>> No connection for 10.20.130.101:54645 - COMM not connected
>> ...
>> STAT 1/0[29657_
>> http://www.cartech8.com/forum-29308-1.html..29766_http://www.z0760.com/zhongshan-7179-1-1.html](default)cumulative_size
>>  84239734 <= prune_threshold 200000000
>> STAT 1/0[29766_
>> http://www.z0760.com/zhongshan-7179-1-1.html..29878_http://bbs.manzuo.com/redirect.php-tid-63928-goto-lastpost](default)cumulative_size
>>  84239734 <= prune_threshold 200000000
>> STAT 1/0[29878_
>> http://bbs.manzuo.com/redirect.php-tid-63928-goto-lastpost..ÿÿ](default)cumulative_size
>>  84239734 <= prune_threshold 200000000
>> STAT 1/1[..14993_2562e20f038e03a2ad08fe61f460c0c2](default)
>> cumulative_size 84239734 <= prune_threshold 200000000
>> STAT 1/1[14993_2562e20f038e03a2ad08fe61f460c0c2..ÿÿ](default)
>> cumulative_size 84239734 <= prune_threshold 200000000
>> 1304939767 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/RangeServer.cc:2817)
>> Memory Usage: 3460595399 bytes
>> 1304939778 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/TableInfo.cc:184)
>> Adding range 5/0[..ÿÿ] to TableInfo end row = ÿÿ
>> 1304939778 INFO Hypertable.RangeServer :
>> (/home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/RangeServer.cc:1183)
>> Successfully loaded range 5/0[..ÿÿ]
>> Hypertable.RangeServer:
>> /home/itlanger/hypertable/src_for_build/hypertable-0.9.4.3-alpha/src/cc/Hypertable/RangeServer/TableInfo.cc:181:
>> void Hypertable::TableInfo::add_range(Hypertable::RangePtr&): Assertion
>> `iter == m_range_map.end()' failed.
>>
>>
>> There are approximately 100,000 cells/s inserted, and 50,000 cells/s
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "Hypertable Development" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/hypertable-dev?hl=en.
>>
>
