Kelson has submitted this change and it was merged.

Change subject: [zimwriterfs] Try to avoid too big cluster.
......................................................................

[zimwriterfs] Try to avoid too big cluster.

We check that cluster will not be too big *before* adding the content.
This way, cluster are always closed before the maximum size and not just
after.

The only way a cluster can be too big is if the content of a sole
article is bigger than the maximum size.

Change-Id: I77a581df46ae87e01a3fe2689570a7c7355d1877
---
M zimlib/src/zimcreator.cpp
1 file changed, 9 insertions(+), 5 deletions(-)

Approvals:
  Kelson: Verified; Looks good to me, approved

diff --git a/zimlib/src/zimcreator.cpp b/zimlib/src/zimcreator.cpp
index 0f0f6d0..1e4a21c 100644
--- a/zimlib/src/zimcreator.cpp
+++ b/zimlib/src/zimcreator.cpp
@@ -222,12 +222,12 @@
         myDirents = &uncompDirents;
         otherDirents = &compDirents;
       }
-      dirents.back().setCluster(clusterOffsets.size(), cluster->count());
-      cluster->addBlob(blob);
-      myDirents->push_back(dirents.size()-1);
 
-      // If cluster is now large enough, write it to disk.
-      if (cluster->size() >= minChunkSize * 1024)
+      // If cluster will be too large, write it to dis, and open a new
+      // one for the content.
+      if ( cluster->count()
+        && cluster->size()+blob.size() >= minChunkSize * 1024
+         )
       {
         log_info("cluster with " << cluster->count() << " articles, " <<
                  cluster->size() << " bytes; current title \"" <<
@@ -249,6 +249,10 @@
         currentSize += (end - start) +
           sizeof(offset_type) /* for cluster pointer entry */;
       }
+
+      dirents.back().setCluster(clusterOffsets.size(), cluster->count());
+      cluster->addBlob(blob);
+      myDirents->push_back(dirents.size()-1);
     }
 
     // When we've seen all articles, write any remaining clusters.

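The check-before-add ordering described in the commit message can be illustrated with a small standalone sketch. This is not the zimlib code: `packBlobs` and `limitBytes` are hypothetical names introduced here, blobs are reduced to their byte sizes, and "writing a cluster to disk" is modelled as recording its final size. `limitBytes` plays the role of `minChunkSize * 1024` in the diff.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the creator's cluster bookkeeping (not zimlib
// API): returns the byte size of every cluster produced for the given
// sequence of blob sizes.
std::vector<std::size_t> packBlobs(const std::vector<std::size_t>& blobSizes,
                                   std::size_t limitBytes)
{
  std::vector<std::size_t> closed;  // sizes of clusters already "written"
  std::size_t count = 0;            // number of blobs in the open cluster
  std::size_t size = 0;             // bytes in the open cluster

  for (std::size_t blob : blobSizes)
  {
    // Check *before* adding: close the open cluster if this blob would
    // bring it to the limit. An empty cluster is never closed, so an
    // oversized blob still ends up in a cluster of its own.
    if (count > 0 && size + blob >= limitBytes)
    {
      closed.push_back(size);
      count = 0;
      size = 0;
    }
    ++count;
    size += blob;
  }

  // Flush whatever is left, like the "write any remaining clusters" step.
  if (count > 0)
    closed.push_back(size);
  return closed;
}
```

With a limit of 1024 bytes and blob sizes 400, 400, 400, 1500, 100, this sketch yields clusters of 800, 400, 1500 and 100 bytes: only the cluster holding the single 1500-byte blob exceeds the limit, which matches the exception stated in the commit message.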
-- 
To view, visit https://gerrit.wikimedia.org/r/315238

Gerrit-MessageType: merged
Gerrit-Change-Id: I77a581df46ae87e01a3fe2689570a7c7355d1877
Gerrit-PatchSet: 1
Gerrit-Project: openzim
Gerrit-Branch: master
Gerrit-Owner: Mgautierfr <mgaut...@kymeria.fr>
Gerrit-Reviewer: Kelson <kel...@kiwix.org>