Hi Sanjit,
there is some code in the patch:
diff --git a/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc b/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc
index 4c408e3..6c55cde 100644
--- a/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc
+++ b/src/cc/Hypertable/RangeServer/MaintenanceScheduler.cc
@@ -86,9 +86,16 @@ void MaintenanceScheduler::schedule() {
* Purge commit log fragments
*/
{
-    int64_t revision_root = TIMESTAMP_MAX;
-    int64_t revision_metadata = TIMESTAMP_MAX;
-    int64_t revision_user = TIMESTAMP_MAX;
+    int64_t revision_user;
+    int64_t revision_metadata;
+    int64_t revision_root;
+
+    (Global::user_log != 0) ?
+      revision_user = Global::user_log->get_latest_revision() : TIMESTAMP_MIN;
+    (Global::metadata_log != 0) ?
+      revision_metadata = Global::metadata_log->get_latest_revision() : TIMESTAMP_MIN;
+    (Global::root_log != 0) ?
+      revision_root = Global::root_log->get_latest_revision() : TIMESTAMP_MIN;
I think they are wrong. They should be:
+    revision_user = (Global::user_log != 0) ?
+      Global::user_log->get_latest_revision() : TIMESTAMP_MIN;
+    revision_metadata = (Global::metadata_log != 0) ?
+      Global::metadata_log->get_latest_revision() : TIMESTAMP_MIN;
+    revision_root = (Global::root_log != 0) ?
+      Global::root_log->get_latest_revision() : TIMESTAMP_MIN;
Is that right???
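
To illustrate the difference, here is a minimal standalone sketch (not
Hypertable code; TIMESTAMP_MIN is just a placeholder constant here):

  #include <cstdint>

  const int64_t TIMESTAMP_MIN = 0;  // placeholder for the sketch

  // Broken form from the patch: the assignment sits inside the true
  // branch, so when log == 0 the ternary evaluates TIMESTAMP_MIN,
  // discards the result, and `revision` is left uninitialized.
  int64_t broken(const int64_t *log) {
    int64_t revision;
    (log != 0) ? revision = *log : TIMESTAMP_MIN;
    return revision;  // undefined behavior when log == 0
  }

  // Fixed form: the ternary selects the value, and the result is
  // always assigned, so both paths initialize the variable.
  int64_t fixed(const int64_t *log) {
    return (log != 0) ? *log : TIMESTAMP_MIN;
  }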
-- kuer
On Jul 23, 10:18 PM, Sanjit Jhala <[email protected]> wrote:
> Hi Kuer,
>
> As I mentioned before, the real issue here is why the logging access
> group in the METADATA table is being compacted. That seems to be the
> root of any corruption. Coming to the issue of commit log replay, we
> have a couple of log replay fixes in the soon-to-be-released 0.9.2.5.
> You can apply the attached patches to see if that solves the problem.
>
> -Sanjit
>
> 0001-Fixed-bug-in-CommitLogReader-caused-by-fragment-queu.patch (6K)
>
> 0002-Fixed-bug-in-CommitLog-that-was-causing-some-fragmen.patch (5K)
>
> On Jul 22, 2009, at 6:20 PM, kuer wrote:
>
>
>
> > Hi, Sanjit,
>
> > I have patched the HT_INFOF() in AccessGroup.cc.
>
> > But I modified CellStoreV1.cc to make sure m_trailer.num_filter_items > 0
> > when creating the BloomFilter. I think this value is something like the
> > capacity of the bloom filter, so enlarging it should not cause much trouble.
>
> > After applying the change and relaunching, the RangeServer cannot finish
> > log replay. It seems the modified RangeServer has corrupted something;
> > this time the error is "RANGE SERVER range not found".
>
> > I posted it in another thread:
> > http://groups.google.com/group/hypertable-dev/browse_thread/thread/fd...
>
> > Thanks
>
> > -- kuer
>
> > On Jul 23, 7:38 AM, Sanjit Jhala <[email protected]> wrote:
> >> Hi Kuer,
>
> >> I suspect the BloomFilter code is fine. Taking a look at your logs,
> >> it looks like the METADATA table is going through a split and the
> >> AccessGroup "logging" is undergoing a major compaction; however, since
> >> it is empty, nothing gets inserted into BloomFilterItems and hence the
> >> assert gets hit.
> >> (From the log:
> >> 2009-07-22 09:39:01,415 1351514432 Hypertable.RangeServer [INFO]
> >> (RangeServer/AccessGroup.cc:379) Starting Major Compaction of
> >> METADATA[0:<FF> <FF>..<FF><FF>](logging))
>
> >> This AccessGroup is currently not used by the system, so it should
> >> be empty and should not undergo a compaction. Can you make the change
> >> (below) to AccessGroup.cc so we can get a better idea of why it's
> >> compacting? I suspect it might be a memory corruption issue (that we
> >> have a fix for in the upcoming release).
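>
> >> As a toy illustration of the failure mode (a made-up stand-in, not
> >> the real BloomFilter class):
>
> >>   #include <cassert>
> >>   #include <cstddef>
> >>
> >>   struct ToyBloomFilter {
> >>     // The filter is sized from the number of items collected so far;
> >>     // compacting an empty access group passes num_items == 0,
> >>     // and the assertion fires.
> >>     explicit ToyBloomFilter(size_t num_items) {
> >>       assert(num_items > 0);
> >>     }
> >>   };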
>
> >> It would be great if you ran the RangeServer with valgrind turned on
> >> to see if the valgrind log reveals anything further. To do this, add
> >> the --valgrind-rangeserver option when calling the start-all-servers.sh
> >> script, e.g.:
> >>
> >>   $HYPERTABLE_INSTALL_DIR/bin/start-all-servers.sh --valgrind-rangeserver
>
> >> -Sanjit
>
> >> --- a/src/cc/Hypertable/RangeServer/AccessGroup.cc
> >> +++ b/src/cc/Hypertable/RangeServer/AccessGroup.cc
> >> @@ -375,8 +375,8 @@ void AccessGroup::run_compaction(bool major) {
> >>        if (m_immutable_cache->memory_used()==0 && m_stores.size() <= (size_t)1)
> >>          HT_THROW(Error::OK, "");
> >>        tableidx = 0;
> >> -      HT_INFOF("Starting Major Compaction of %s(%s)",
> >> -               m_range_name.c_str(), m_name.c_str());
> >> +      HT_INFOF("Starting Major Compaction of %s(%s) immutable cache mem=%llu, num cell stores=%d",
> >> +               m_range_name.c_str(), m_name.c_str(), m_immutable_cache->memory_used(), m_stores.size());
> >>      }
> >>      else {
> >>        if (m_stores.size() > (size_t)Global::access_group_max_files) {
>
> >> On Jul 22, 2009, at 4:11 AM, kuer wrote:
>
> >>> Hi, all,
>
> >>> I found something interesting in cc/Hypertable/RangeServer/CellStoreV1.cc:
>
> >>> 168   if (m_bloom_filter_mode != BLOOM_FILTER_DISABLED) {
> >>> 169     m_bloom_filter_items = new BloomFilterItems(); // aproximator items
> >>> 170   }
>
> >>> 367
> >>> 368   // if bloom_items haven't been spilled to create a bloom filter yet, do it
> >>> 369   if (m_bloom_filter_mode != BLOOM_FILTER_DISABLED) {
> >>> 370     if (m_bloom_filter_items) {
> >>>         ^^^^^^^^^^^^^^^^^^^^^^^^^^
> >>>         I think this cannot guarantee m_bloom_filter_items->size() > 0
>
> >>> 371       m_trailer.num_filter_items = m_bloom_filter_items->size();
> >>>           ^^^^^ How about adding the following lines (see also the
> >>>           sketch after this excerpt)?
> >>> +
> >>> +         if (m_trailer.num_filter_items < 1) {
> >>> +           m_trailer.num_filter_items = m_max_entries;
> >>> +         }
> >>> +         if (m_trailer.num_filter_items < 1) {
> >>> +           m_trailer.num_filter_items = 1;
> >>> +         }
> >>> +
> >>> 372       create_bloom_filter();
> >>> 373     }
> >>> 374     assert(!m_bloom_filter_items && m_bloom_filter);
> >>> 375
> >>> 376     m_bloom_filter->serialize(send_buf);
> >>> 377     m_filesys->append(m_fd, send_buf, 0, &m_sync_handler);
> >>> 378
> >>> 379     m_outstanding_appends++;
> >>> 380     m_offset += m_bloom_filter->size();
> >>> 381   }
> >>> 382
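>
> >>> As a standalone sketch of the guard I have in mind (made-up names;
> >>> max_entries stands in for the real CellStoreV1 member m_max_entries):
>
> >>>   #include <cstdint>
> >>>
> >>>   // Clamp the bloom filter item count to at least 1 so that an
> >>>   // empty cell store cannot produce a zero-capacity filter.
> >>>   int64_t clamp_filter_items(int64_t num_items, int64_t max_entries) {
> >>>     if (num_items < 1)
> >>>       num_items = max_entries;  // fall back to the expected capacity
> >>>     if (num_items < 1)
> >>>       num_items = 1;            // last resort: at least one item
> >>>     return num_items;
> >>>   }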
>
> >>> thanks
>
> >>> -- kuer
>
> >>>> On Jul 22, 5:05 PM, kuer <[email protected]> wrote:
> >>>> Hi, all,
>
> >>>> here is the content of the file that causes the BloomFilter
> >>>> assertion failure:
>
> >>>> /hypertable/tables/METADATA/logging/AB2A0D28DE6B77FFDD6C72AF/cs0
>
> >>>> $ hexdump -C cs0
> >>>> 00000000  49 64 78 46 69 78 2d 2d 2d 2d 1a 00 ff ff ff ff  |IdxFix----......|
> >>>> 00000010  00 00 00 00 00 00 00 00 7d 9f 49 64 78 56 61 72  |........}.IdxVar|
> >>>> 00000020  2d 2d 2d 2d 1a 00 ff ff ff ff 00 00 00 00 00 00  |----............|
> >>>> 00000030  00 00 87 97                                      |....|
> >>>> 00000034
>
> >>>> FYI
>
> >>>> -- kuer
>
> >>>> On Jul 22, 1:03 PM, Sanjit Jhala <[email protected]> wrote:
>
> >>>>> Recovering ranges from crashed RangeServers is one of the high
> >>>>> priority items Doug is working on.
>
> >>>>> -Sanjit
> >