I originally thought that merging compaction does not strip out garbage
(such as expired or deleted cells), but after reading the source code of
AccessGroup::run_compaction, it seems that every compaction type except
minor compaction strips out garbage. I think the following code fragment
supports this:
......
{
  if (m_in_memory) {
    mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
                                           MergeScanner::ACCUMULATE_COUNTERS);
    scanner = mscanner;
    m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
    filtered_cache = new CellCache();
  }
  else if (merging) {
    mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
                                           MergeScanner::IS_COMPACTION |
                                           MergeScanner::RETURN_DELETES);
    scanner = mscanner;
    max_num_entries = 0;
    for (size_t i=merge_offset; i<merge_offset+merge_length; i++) {
      HT_ASSERT(m_stores[i].cs);
      mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
      int divisor =
        (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags")) &
         CellStoreTrailerV6::SPLIT) ? 2: 1;
      max_num_entries += (boost::any_cast<int64_t>
        (m_stores[i].cs->get_trailer()->get("total_entries")))/divisor;
    }
  }
  else if (major || gc) {
    mscanner = new MergeScannerAccessGroup(m_table_name, scan_context,
                                           MergeScanner::IS_COMPACTION |
                                           MergeScanner::ACCUMULATE_COUNTERS);
    scanner = mscanner;
    m_cell_cache_manager->add_immutable_scanner(mscanner, scan_context);
    for (size_t i=0; i<m_stores.size(); i++) {
      HT_ASSERT(m_stores[i].cs);
      mscanner->add_scanner(m_stores[i].cs->create_scanner(scan_context));
      int divisor =
        (boost::any_cast<uint32_t>(m_stores[i].cs->get_trailer()->get("flags")) &
         CellStoreTrailerV6::SPLIT) ? 2: 1;
      max_num_entries += (boost::any_cast<int64_t>
        (m_stores[i].cs->get_trailer()->get("total_entries")))/divisor;
    }
  }
  else {
    scanner =
      m_cell_cache_manager->create_immutable_scanner(scan_context);
    HT_ASSERT(scanner);
  }
}
cellstore->create(cs_file.c_str(), max_num_entries, m_cellstore_props,
                  &m_identifier);
while (scanner->get(key, value)) {
  cellstore->add(key, value);
  if (m_in_memory)
    filtered_cache->add(key, value);
  scanner->forward();
}
......
The merging, major and gc branches all appear to do a similar thing: each
one builds a MergeScannerAccessGroup and writes whatever that scanner returns
into the new cellstore file, so I think all three strip out garbage.
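
One difference between the branches may matter for this reading: the merging
branch passes MergeScanner::RETURN_DELETES, while the major/gc branch does
not. The sketch below is only a toy model and not Hypertable's MergeScanner.
ToyCell, toy_merge, toy_order and TOY_RETURN_DELETES are made-up names, and
the behaviour of the flag (emit delete markers instead of dropping them) is
an assumption inferred from the flag's name alone, so it would need to be
checked against the real MergeScanner code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag for this sketch only; the value is not taken from Hypertable.
enum ToyFlags : uint32_t { TOY_RETURN_DELETES = 0x01 };

struct ToyCell {
  std::string key;
  int64_t timestamp;
  bool is_delete;  // true for a delete marker (tombstone)
};

// Ordering assumed by toy_merge: by key, newest first within a key,
// and a delete marker before a regular cell on a timestamp tie.
static bool toy_order(const ToyCell& a, const ToyCell& b) {
  if (a.key != b.key) return a.key < b.key;
  if (a.timestamp != b.timestamp) return a.timestamp > b.timestamp;
  return a.is_delete && !b.is_delete;
}

// Walk sorted cells and apply delete markers. A marker hides every cell with
// the same key and an equal or older timestamp. The marker itself is emitted
// only when TOY_RETURN_DELETES is set (assumed meaning of the flag).
std::vector<ToyCell> toy_merge(const std::vector<ToyCell>& cells, uint32_t flags) {
  assert(std::is_sorted(cells.begin(), cells.end(), toy_order));
  std::vector<ToyCell> out;
  bool have_delete = false;
  std::string deleted_key;
  int64_t deleted_ts = 0;
  for (const ToyCell& c : cells) {
    if (c.is_delete) {
      have_delete = true;
      deleted_key = c.key;
      deleted_ts = c.timestamp;
      if (flags & TOY_RETURN_DELETES)
        out.push_back(c);  // keep the tombstone in the output
      continue;
    }
    if (have_delete && c.key == deleted_key && c.timestamp <= deleted_ts)
      continue;  // cell is hidden by the delete marker
    out.push_back(c);
  }
  return out;
}
```

In this model, calling toy_merge with flags of 0 drops both the tombstone and
the cell it hides, while TOY_RETURN_DELETES still drops the hidden cell but
keeps the tombstone in the output. If the real flag works the same way, a
merging compaction would carry delete markers into the new cellstore file
rather than purging them.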
Maybe I have misunderstood the source code. Any ideas would be appreciated!