Hi Kuer,

As I mentioned before, the real issue here is why the "logging" access
group in the METADATA table is being compacted at all. That seems to be
the root of any corruption. As for the commit log replay issue, we
have a couple of log replay fixes in the soon-to-be-released 0.9.2.5.
You can apply the attached patches to see if they solve the
problem.
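
To make the question concrete, here is a minimal standalone sketch of the guard that should keep an empty access group out of a major compaction. It is modeled on the condition visible in the AccessGroup.cc diff quoted further down in this thread (empty immutable cache and at most one cell store means nothing to do). The struct and function names are made up for illustration and are not Hypertable APIs; the message text follows the extended HT_INFOF line from that diff.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the state AccessGroup::run_compaction looks at.
struct AccessGroupState {
  std::string range_name;             // e.g. the METADATA range
  std::string name;                   // access group name, e.g. "logging"
  std::size_t immutable_cache_bytes;  // memory used by the immutable cache
  std::size_t num_cell_stores;        // number of cell store files
};

// Same condition as the quoted guard: with an empty immutable cache and
// at most one cell store, a major compaction has nothing to merge.
bool should_run_major_compaction(const AccessGroupState &ag) {
  return !(ag.immutable_cache_bytes == 0 && ag.num_cell_stores <= 1);
}

// Builds the extended log message so the two inputs to the guard are visible.
std::string describe_compaction(const AccessGroupState &ag) {
  return "Starting Major Compaction of " + ag.range_name + "(" + ag.name +
         ") immutable cache mem=" + std::to_string(ag.immutable_cache_bytes) +
         ", num cell stores=" + std::to_string(ag.num_cell_stores);
}
```

If the guard behaves as sketched, an unused access group only reaches the compaction path when one of the two logged values is unexpectedly non-trivial, which is what the extra logging is meant to show.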

-Sanjit



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Hypertable Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/hypertable-dev?hl=en
-~----------~----~----~----~------~----~------~--~---

Attachment: 0001-Fixed-bug-in-CommitLogReader-caused-by-fragment-queu.patch
Description: Binary data


Attachment: 0002-Fixed-bug-in-CommitLog-that-was-causing-some-fragmen.patch
Description: Binary data




On Jul 22, 2009, at 6:20 PM, kuer wrote:

>
> Hi, Sanjit,
>
> I have patched the HT_INFOF() in AccessGroup.cc.
>
> However, I also modified CellStoreV1.cc to make sure that
> m_trailer.num_filter_items > 0 when creating the BloomFilter. I think
> this value is something like the capacity of the bloom filter, so
> enlarging it should not cause much trouble.
>
> After this modification and a restart, the RangeServer cannot finish
> log replay. It seems that the modified RangeServer has destroyed
> something. This time the error is "RANGE SERVER range not found".
>
> I posted the details in another thread:
> http://groups.google.com/group/hypertable-dev/browse_thread/thread/fd137bfa8e98281a
>
> Thanks
>
>   -- kuer
>
>
> On Jul 23, 7:38 AM, Sanjit Jhala <[email protected]> wrote:
>> Hi Kuer,
>>
>> I suspect the BloomFilter code is fine. Taking a look at your logs it
>> looks like the METADATA table is going through a split and the
>> AccessGroup "logging" is undergoing a major compaction, however since
>> it is empty, nothing gets inserted in BloomFilterItems and hence the
>> assert gets hit.
>> (From the log:
>> 2009-07-22 09:39:01,415 1351514432 Hypertable.RangeServer [INFO]
>> (RangeServer/AccessGroup.cc:379) Starting Major Compaction of
>> METADATA[0:<FF> <FF>..<FF><FF>](logging))
>>
>> This AccessGroup is currently not used by the system, so it should
>> be empty and should not undergo a compaction. Can you make the change
>> below to AccessGroup.cc so we can get a better idea of why it's
>> compacting? I suspect it might be a memory corruption issue (which we
>> have a fix for in the upcoming release).
>>
>> It would also be great if you could run the RangeServer under valgrind
>> to see if the valgrind log reveals anything further. To do this, add
>> the option --valgrind-rangeserver when calling the
>> start-all-servers.sh script, e.g.:
>> $HYPERTABLE_INSTALL_DIR/bin/start-all-servers.sh --valgrind-rangeserver
>>
>> -Sanjit
>>
>> --- a/src/cc/Hypertable/RangeServer/AccessGroup.cc
>> +++ b/src/cc/Hypertable/RangeServer/AccessGroup.cc
>> @@ -375,8 +375,8 @@ void AccessGroup::run_compaction(bool major) {
>>           if (m_immutable_cache->memory_used()==0 && m_stores.size() <= (size_t)1)
>>             HT_THROW(Error::OK, "");
>>           tableidx = 0;
>> -        HT_INFOF("Starting Major Compaction of %s(%s)",
>> -                 m_range_name.c_str(), m_name.c_str());
>> +        HT_INFOF("Starting Major Compaction of %s(%s) immutable cache mem=%llu, num cell stores=%d",
>> +                 m_range_name.c_str(), m_name.c_str(), m_immutable_cache->memory_used(), m_stores.size());
>>         }
>>         else {
>>           if (m_stores.size() > (size_t)Global::access_group_max_files) {
>>
>> On Jul 22, 2009, at 4:11 AM, kuer wrote:
>>
>>
>>
>>> Hi, all,
>>
>>> I found something interesting in
>>> cc/Hypertable/RangeServer/CellStoreV1.cc:
>>
>>> 168   if (m_bloom_filter_mode != BLOOM_FILTER_DISABLED) {
>>> 169     m_bloom_filter_items = new BloomFilterItems(); // approximator items
>>> 170   }
>>
>>> 367
>>> 368   // if bloom_items haven't been spilled to create a bloom filter yet, do it
>>> 369   if (m_bloom_filter_mode != BLOOM_FILTER_DISABLED) {
>>> 370     if (m_bloom_filter_items) {
>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> I think this does not guarantee that m_bloom_filter_items->size() > 0
>>
>>> 371       m_trailer.num_filter_items = m_bloom_filter_items->size();
>>> ^^^^^ How about adding the following lines?
>>> +
>>> +           if (m_trailer.num_filter_items <  1 ) {
>>> +               m_trailer.num_filter_items = m_max_entries;
>>> +           }
>>> +           if (m_trailer.num_filter_items < 1) {
>>> +               m_trailer.num_filter_items = 1;
>>> +           }
>>> +
>>> 372       create_bloom_filter();
>>> 373     }
>>> 374     assert(!m_bloom_filter_items && m_bloom_filter);
>>> 375
>>> 376     m_bloom_filter->serialize(send_buf);
>>> 377     m_filesys->append(m_fd, send_buf, 0, &m_sync_handler);
>>> 378
>>> 379     m_outstanding_appends++;
>>> 380     m_offset += m_bloom_filter->size();
>>> 381   }
>>> 382
>>
>>> thanks
>>
>>> -- kuer
>>
>>> On Jul 22, 5:05 PM, kuer <[email protected]> wrote:
>>>> Hi, all,
>>
>>>> Here is the content of the file that causes the BloomFilter
>>>> assertion failure:
>>
>>>> /hypertable/tables/METADATA/logging/AB2A0D28DE6B77FFDD6C72AF/cs0
>>
>>>> $ hexdump -C cs0
>>>> 00000000  49 64 78 46 69 78 2d 2d  2d 2d 1a 00 ff ff ff ff  |IdxFix----......|
>>>> 00000010  00 00 00 00 00 00 00 00  7d 9f 49 64 78 56 61 72  |........}.IdxVar|
>>>> 00000020  2d 2d 2d 2d 1a 00 ff ff  ff ff 00 00 00 00 00 00  |----............|
>>>> 00000030  00 00 87 97                                       |....|
>>>> 00000034
>>
>>>>  FYI
>>
>>>>    -- kuer
>>
>>>> On Jul 22, 1:03 PM, Sanjit Jhala <[email protected]> wrote:
>>
>>>>>   Recovering ranges from crashed RangeServers is one of the high
>>>>> priority items Doug is working on.
>>
>>>>> -Sanjit