Package: apt-cacher-ng
X-Debbugs-Cc: aeru...@aerusso.net
Severity: important
Tags: patch upstream

Dear maintainer,

In bug 1026395, I identified behavior where apt-cacher-ng was tagging many
valid, referenced files for deletion.  One cause is mentioned there.  However,
I failed to notice another source of erroneous tagging: SHA256 sums in
the Packages/Sources/etc. files are not being detected.

For example, debrep/dists/bullseye-backports/main/binary-amd64/Packages
contains "^SHA256: " lines that are not being used.

The first of the two patches fixes that behavior for Packages.
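
To illustrate the idea outside of apt-cacher-ng, here is a minimal standalone
sketch of reading one Packages stanza and recording both checksum fields. The
PkgEntry type and the splitKeyVal/readStanza helpers are hypothetical names made
up for this example; they are not the functions used in cacheman.cc.

```cpp
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record type for this sketch; not apt-cacher-ng's tRemoteFileInfo.
struct PkgEntry {
    std::string filename, md5, sha256;
    long long size = -1;
};

// Split "Key: value" into its parts; returns false if there is no usable key.
static bool splitKeyVal(const std::string& line, std::string& key, std::string& val) {
    const auto colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    key = line.substr(0, colon);
    const auto start = line.find_first_not_of(" \t", colon + 1);
    val = (start == std::string::npos) ? std::string() : line.substr(start);
    return true;
}

// Read one stanza (up to a blank line or EOF) and keep every checksum seen,
// so an entry that carries only SHA256 is still recognized.
static bool readStanza(std::istream& in, PkgEntry& out) {
    std::string line, key, val;
    bool any = false;
    while (std::getline(in, line) && !line.empty()) {
        if (!splitKeyVal(line, key, val)) continue;
        any = true;
        if (key == "MD5sum") out.md5 = val;
        else if (key == "SHA256") out.sha256 = val;  // the field the first patch adds
        else if (key == "Size") out.size = std::strtoll(val.c_str(), nullptr, 10);
        else if (key == "Filename") out.filename = val;
    }
    return any;
}
```

The only point of the sketch is the extra "SHA256" branch; everything else
mirrors what a parser that only looks for "MD5sum" would already do.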

During this process, a third source showed up: the list of files in Sources
was getting clobbered because of the behavior of
cacheman.cc:ParseGenericRfc822File
("we don't merge").

The second patch implements streaming handling of Sources a la Packages.  That
patch parses possibly untrusted data, so please give it a close read (also I
haven't done a lot of C++ coding recently, so I apologize in advance).
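
As a rough standalone sketch of what "streaming" means here (not the patch
itself): each indented line is handled as soon as it is read, under whichever
checksum field introduced it, so a later list cannot overwrite an earlier one.
CsKind, SrcFile and streamSources are hypothetical names for this example. The
sketch assumes the MD5 list lives under the "Files" field, as in the
ParseDebianRfc822Index call that the patch replaces.

```cpp
#include <cctype>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

enum class CsKind { None, Md5, Sha256 };  // hypothetical stand-in for CSTYPES

struct SrcFile {
    CsKind kind;
    std::string checksum;
    long long size;
    std::string name;
};

// Stream a Sources index and call emit() once per "checksum size name" line.
static void streamSources(std::istream& in,
                          const std::function<void(const SrcFile&)>& emit) {
    CsKind current = CsKind::None;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {              // blank line: stanza boundary
            current = CsKind::None;
            continue;
        }
        if (std::isspace(static_cast<unsigned char>(line[0]))) {
            if (current == CsKind::None) continue;  // continuation of another field
            std::istringstream fields(line);
            SrcFile f{current, "", -1, ""};
            if (fields >> f.checksum >> f.size >> f.name) emit(f);
            continue;
        }
        // A new field starts: decide whether its continuation lines are a file list.
        const std::string key = line.substr(0, line.find(':'));
        if (key == "Files") current = CsKind::Md5;                  // assumed MD5 field
        else if (key == "Checksums-Sha256") current = CsKind::Sha256;
        else current = CsKind::None;
    }
}
```

Because malformed lines simply fail the three-field extraction and are skipped,
the sketch never indexes past the data it has read; the real patch still
deserves the close read requested above.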

I'm tagging this important because most files for bookworm (and later) will
be automatically purged after a few days, since they are never found to be
referenced.  This has a significant impact on many use cases for this package.

Best,
Antonio Russo
diff --git a/src/cacheman.cc b/src/cacheman.cc
index 43d8f12..940be40 100644
--- a/src/cacheman.cc
+++ b/src/cacheman.cc
@@ -1700,7 +1700,7 @@ bool cacheman::ParseAndProcessMetaFile(std::function<void(const tRemoteFileInfo&
 	{
 	case EIDX_PACKAGES:
 		LOG("filetype: Packages file");
-		static const string sMD5sum("MD5sum"), sFilename("Filename"), sSize("Size");
+		static const string sMD5sum("MD5sum"), sSHA256("SHA256"), sFilename("Filename"), sSize("Size");
 
 		UrlUnescape(sPkgBaseDir);
 
@@ -1728,6 +1728,8 @@ bool cacheman::ParseAndProcessMetaFile(std::function<void(const tRemoteFileInfo&
 				// not looking for data we already have
 				if(key==sMD5sum)
 					info.fpr.SetCs(val, CSTYPE_MD5);
+				else if(key==sSHA256)
+					info.fpr.SetCs(val, CSTYPE_SHA256);
 				else if(key==sSize)
 					info.fpr.size=atoofft(val.c_str());
 				else if(key==sFilename)
commit 5d03dc3da84531a3902536b2e9fed01d5eb54e23
Author: Antonio Russo <aeru...@aerusso.net>
Date:   Thu Dec 22 04:41:14 2022 -0700

    Streaming support for Sources
    
    Signed-off-by: Antonio Russo <aeru...@aerusso.net>

diff --git a/src/cacheman.cc b/src/cacheman.cc
index 940be40..52f3a38 100644
--- a/src/cacheman.cc
+++ b/src/cacheman.cc
@@ -1695,6 +1695,7 @@ bool cacheman::ParseAndProcessMetaFile(std::function<void(const tRemoteFileInfo&
 	unsigned progHint=0;
 #define STEP 2048
 	tDtorEx postNewline([this, &progHint](){if(progHint>=STEP) SendChunk("<br>\n");});
+	CSTYPES current_cstype = CSTYPES::CSTYPE_INVALID;
 
 	switch(idxType)
 	{
@@ -1911,8 +1912,55 @@ bool cacheman::ParseAndProcessMetaFile(std::function<void(const tRemoteFileInfo&
 		return ParseDebianRfc822Index(reader, ret, sBaseDir, sPkgBaseDir,
 				EIDX_DIFFIDX, CSTYPES::CSTYPE_SHA256, "SHA256-Download", byHashMode);
 	case EIDX_SOURCES:
-		return ParseDebianRfc822Index(reader, ret, sBaseDir, sPkgBaseDir,
-				EIDX_SOURCES, CSTYPES::CSTYPE_MD5, "Files", byHashMode);
+		LOG("filetype: Sources file");
+		static const string sSrcMD5("Files"), sSrcSHA256("Checksums-Sha256");
+
+		UrlUnescape(sPkgBaseDir);
+
+		while (reader.GetOneLine(sLine))
+		{
+			string key, val;
+			if(CheckStopSignal())
+				return true;
+
+			trimBack(sLine);
+			if(0 == ((++progHint) & (STEP-1)))
+				SendChunk("<wbr>.");
+
+			if (sLine.empty())
+			{
+				current_cstype = CSTYPES::CSTYPE_INVALID;
+				continue;
+			}
+
+			if (isspace((unsigned char) (sLine[0])))
+			{
+				if(current_cstype == CSTYPES::CSTYPE_INVALID)
+					continue;
+
+				trimBoth(sLine);
+				info.fpr.csType = current_cstype;
+				if(ParseDebianIndexLine(info, sLine)) {
+					info.sDirectory=sPkgBaseDir;
+					ret(info);
+				}
+				info.SetInvalid();
+				continue;
+			}
+
+			current_cstype = CSTYPES::CSTYPE_INVALID;
+
+			if (ParseKeyValLine(sLine, key, val))
+			{
+				if(key==sSrcMD5)
+					current_cstype = CSTYPE_MD5;
+				else if(key==sSrcSHA256)
+					current_cstype = CSTYPE_SHA256;
+				else
+					continue;
+			}
+		}
+		break;
 	case EIDX_TRANSIDX:
 		return ParseDebianRfc822Index(reader, ret, sBaseDir, sPkgBaseDir,
 				EIDX_TRANSIDX, CSTYPES::CSTYPE_SHA1, "SHA1", byHashMode);
