I've pushed out the one-line fix for this one. Luke's solution is better,
but will require some refactoring. We actually had a regression test in
place for this incompressible-data case, but a subtle bug in the test
masked the problem. I've checked in a fix for the regression test as well
(see
http://github.com/nuggetwheat/hypertable/commit/347c45aebd4250cf9596f1b0714de21bff62095d
).
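
For anyone following along, here is a rough sketch of what the stored-block
(compression_type == NONE) path presumably needs to do, going by Luke's note
about the missing output.reserve and the gdb listing quoted further down. The
DynamicBuffer here is a simplified stand-in with only the fields visible in
the gdb output, not the real Hypertable class, and inflate_stored is a
hypothetical helper name; whether the real code also advances ptr is an
assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Simplified stand-in for Hypertable::DynamicBuffer (assumption: the real
// class has more members and different reserve() semantics).
struct DynamicBuffer {
  uint8_t *base = nullptr;  // start of the allocation
  uint8_t *ptr = nullptr;   // current write position
  size_t size = 0;          // allocated capacity in bytes
  bool own = true;          // whether we free base on destruction

  DynamicBuffer() = default;
  DynamicBuffer(const DynamicBuffer &) = delete;
  DynamicBuffer &operator=(const DynamicBuffer &) = delete;
  ~DynamicBuffer() { if (own) delete[] base; }

  // Make sure at least len bytes are available. This sketch discards any
  // previous contents when it has to grow.
  void reserve(size_t len) {
    if (len > size) {
      if (own) delete[] base;
      base = new uint8_t[len];
      size = len;
      own = true;
    }
    ptr = base;
  }
};

// Hypothetical helper: copy a block that was stored uncompressed.
// The crash in the thread was a memcpy into output.base == 0x0; reserving
// first gives memcpy a valid destination.
void inflate_stored(const uint8_t *ip, size_t outlen, DynamicBuffer &output) {
  if (outlen == 0)
    return;                          // nothing to copy, avoid memcpy on null
  output.reserve(outlen);            // the allocation the crashing path skipped
  std::memcpy(output.base, ip, outlen);
  output.ptr = output.base + outlen; // assumption: ptr marks end of data
}
```

Calling inflate_stored on a freshly constructed (empty) DynamicBuffer is the
case that matters here, since that is the state gdb showed at the time of the
segfault.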
- Doug
On Thu, Aug 28, 2008 at 10:57 AM, Joshua Taylor <[EMAIL PROTECTED]> wrote:
> I'm generating a smallish set of random printable rows, columns, and
> values, then inserting a large number of cells generated as combinations of
> the sets. So yes, the individual values are random, but there should be a
> lot of component-wise repetition between cells.
>
> On Thu, Aug 28, 2008 at 10:44 AM, Luke <[EMAIL PROTECTED]> wrote:
>
>
>> Yeah, bmz itself is better tested than the block compressor, which is
>> covered only by a simple regression test. The optimization was copied
>> from zlib's block compressor but missed an output.reserve...
>> for some reason. Looks like we should handle checksum and NONE in a
>> different layer and reduce the complexity of the per-codec
>> implementation. OTOH, what kind of data are you inserting, Josh?
>> Incompressible random bits?
>>
>> On Aug 28, 9:21 am, "Doug Judd" <[EMAIL PROTECTED]> wrote:
>> > I can see the bug in the code snippet you provided! :) Basically, there's
>> > an optimization where if the size of the compressed buffer is not
>> > significantly less than the uncompressed buffer, then the block compressor
>> > will store the uncompressed data as-is (i.e. compression_type == NONE). It
>> > looks like there is a bug inside the BMZ decompression logic where it is not
>> > properly handling this case. Luke wrote the BMZ compressor, so he'd be the
>> > best person to take care of this one.
>> >
>> > - Doug
>> >
>> > On Wed, Aug 27, 2008 at 7:33 PM, Joshua Taylor <[EMAIL PROTECTED]> wrote:
>> >
>> > > Is the BMZ compressor supported? The RangeServer crashed in the BMZ codec
>> > > when I was dumping 500k random cells into a table using COMPRESSOR="bmz":
>> >
>> > > Core was generated by
>> > > `/data1/home/cosmix/hypertable/bin/Hypertable.RangeServer
>> > > --pidfile=/data1/home/'.
>> > > Program terminated with signal 11, Segmentation fault.
>> > > #0 0x0000000000608723 in Hypertable::BlockCompressionCodecBmz::inflate
>> > > (this=0x174db00, [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED])
>> > > at /home/josh/hypertable/src/cc/Hypertable/Lib/BlockCompressionCodecBmz.cc:114
>> > > 114 /home/josh/hypertable/src/cc/Hypertable/Lib/BlockCompressionCodecBmz.cc: No
>> > > such file or directory.
>> > > in /home/josh/hypertable/src/cc/Hypertable/Lib/BlockCompressionCodecBmz.cc
>> > > (gdb) directory hypertable/src/cc/Hypertable/Lib
>> > > Source directories searched:
>> > > /data1/home/cosmix/hypertable/src/cc/Hypertable/Lib:$cdir:$cwd
>> > > (gdb) list
>> > > 109 Error::BLOCK_COMPRESSOR_CHECKSUM_MISMATCH);
>> > > 110
>> > > 111 size_t outlen = header.get_data_length();
>> > > 112
>> > > 113 if (header.get_compression_type() == NONE)
>> > > 114 memcpy(output.base, ip, outlen);
>> > > 115 else {
>> > > 116 output.reserve(outlen);
>> > > 117 m_workmem.reserve(bmz_unpack_worklen(outlen), true);
>> > > 118
>> > > (gdb) print output
>> > > $1 = (Hypertable::DynamicBuffer &) @0x1e0aac0: {base = 0x0, ptr = 0x0,
>> > > size = 0, own = true}
>> > > (gdb) print ip
>> > > $2 = (const uint8_t *) 0xbf281a ""
>> > > (gdb) print outlen
>> > > $3 = 76
>> > > (gdb) bt
>> > > #0 0x0000000000608723 in Hypertable::BlockCompressionCodecBmz::inflate
>> > > (this=0x174db00, [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED])
>> > > at /home/josh/hypertable/src/cc/Hypertable/Lib/BlockCompressionCodecBmz.cc:114
>> > > #1 0x000000000059f859 in Hypertable::CellStoreV0::load_index
>> > > (this=0x1e0aa00) at
>> > > /home/josh/hypertable/src/cc/Hypertable/RangeServer/CellStoreV0.cc:474
>> > > #2 0x000000000058b0f0 in Hypertable::AccessGroup::shrink (this=0xeefa00,
>> > > [EMAIL PROTECTED]) at
>> > > /home/josh/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:489
>> > > #3 0x000000000057911f in Hypertable::Range::split_compact_and_shrink
>> > > (this=0xeffc80) at
>> > > /home/josh/hypertable/src/cc/Hypertable/RangeServer/Range.cc:563
>> > > #4 0x000000000057ad95 in Hypertable::Range::split (this=0xeffc80) at
>> > > /home/josh/hypertable/src/cc/Hypertable/RangeServer/Range.cc:341
>> > > #5 0x0000000000576aad in Hypertable::MaintenanceTaskSplit::execute
>> > > (this=0x1c2e9c0) at
>> > > /home/josh/hypertable/src/cc/Hypertable/RangeServer/MaintenanceTaskSplit.cc:40
>> > > #6 0x0000000000561d0a in Hypertable::MaintenanceQueue::Worker::operator()
>> > > (this=0x4f017158) at
>> > > /home/josh/hypertable/src/cc/Hypertable/RangeServer/MaintenanceQueue.h:108
>> > > #7 0x0000000000561e73 in
>> > > boost::detail::function::void_function_obj_invoker0<Hypertable::MaintenanceQueue::Worker,
>> > > void>::invoke ([EMAIL PROTECTED])
>> > > at /home/josh/hypertable/src/cc/boost-1_34-fix/boost/function/function_template.hpp:158
>> > > #8 0x00002aaaab62551e in boost::function0<void,
>> > > std::allocator<boost::function_base> >::operator() () from
>> > > /usr/lib64/libboost_thread-mt.so.3
>> > > #9 0x00002aaaab624f32 in boost::thread_group::join_all () from
>> > > /usr/lib64/libboost_thread-mt.so.3
>> > > #10 0x0000003986106337 in start_thread () from /lib64/libpthread.so.0
>> > > #11 0x0000003790bcc38d in clone () from /lib64/libc.so.6
>> > > #12 0x0000000000000000 in ?? ()
>> > > (gdb)
>> >
>> > > Looks like a memcpy to a null output buffer. Here's the tail of the
>> > > RangeServer log:
>> >
>> > > 1219889447 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:700)
>> > > Successfully fetched 65453 bytes of scan data
>> > > RangeServer::fetch_scanblock
>> > > Scanner ID = 1
>> > > 1219889447 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:700)
>> > > Successfully fetched 65453 bytes of scan data
>> > > RangeServer::fetch_scanblock
>> > > Scanner ID = 1
>> > > 1219889447 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:700)
>> > > Successfully fetched 46970 bytes of scan data
>> > > 1219889447 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:656)
>> > > destroying scanner id=1
>> > > 1219889447 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/ConnectionHandler.cc:177)
>> > > Event: type=DISCONNECT from=10.10.1.162:43904
>> > > 1219889453 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1883)
>> > > Cleaning log (threshold=200000000)
>> > > 1219889456 INFO Hypertable.RangeServer :
>> > > (/home/josh/hypertable/src/cc/AsyncComm/IOHandler.h:81) Event:
>> > > type=CONNECTION_ESTABLISHED from=10.10.1.162:43909
>> > > 1219889456 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/ConnectionHandler.cc:167)
>> > > Event: type=CONNECTION_ESTABLISHED from=10.10.1.162:43909
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889457 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (0 split off) updates to 'perf'
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=4123295 items=67595 vm-est12234695
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889457 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (0 split off) updates to 'perf'
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=5123329 items=83989 vm-est15202009
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889457 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (0 split off) updates to 'perf'
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=6123363 items=100383 vm-est18169323
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > > (/home/josh/hypertable/src/cc/Hypertable/Lib/CommitLog.cc:79)
>> RollLimit =
>> > > 100000000
>> > > 1219889457 DEBUG Hypertable.RangeServer : write
>> > > (/home/josh/hypertable/src/cc/Hypertable/Lib/MetaLogDfsBase.cc:117):
>> > > checksum=12916318 timestamp=1219889457515443000 type=1 payload=149
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:288)
>> > > Starting Major Compaction of perf[..<FF><FF>](default)
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889457 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (8518 split off) updates to 'perf'
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=7123397 items=116777 vm-est21136637
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889457 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (9950 split off) updates to 'perf'
>> > > 1219889457 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=8123431 items=133171 vm-est24103951
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889458 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (8246 split off) updates to 'perf'
>> > > 1219889458 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=9123465 items=149565 vm-est27071265
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889458 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (7358 split off) updates to 'perf'
>> > > 1219889458 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=10123499 items=165959 vm-est30038579
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889458 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (10666 split off) updates to 'perf'
>> > > 1219889458 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=11123533 items=182353 vm-est33005893
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889458 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1188)
>> > > Added 16394 (7086 split off) updates to 'perf'
>> > > 1219889458 INFO Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/RangeServer.cc:1286)
>> > > drj mem=12123567 items=198747 vm-est35973207
>> > > RangeServer::update
>> > > {TableIdentifier: name='perf' id='19' generation='1'}1219889459 INFO
>> > > Hypertable.RangeServer :
>> > >
>> (/home/josh/hypertable/src/cc/Hypertable/RangeServer/AccessGroup.cc:430)
>> > > Finished Compaction of perf[..<FF><FF>](default)
>> > > 1219889459 INFO Hypertable.RangeServer :
>> >
>> > ...
>> >
>>
>>
>
> >
>