Hello Emil and all,

Thanks for the clear and easily reproducible bug report.

Attached a suggested fix.
Comments very welcomed,

- Assaf

>From 11330058443e7cc92b4a53322d810725d42b4e34 Mon Sep 17 00:00:00 2001
From: Assaf Gordon <assafgor...@gmail.com>
Date: Mon, 16 Aug 2021 15:03:36 -0600
Subject: [PATCH] basenc: fix bug49741: using wrong decoding buffer length

Emil Lundberg <lundberg.e...@gmail.com> reports in
https://bugs.gnu.org/49741 about a 'basenc --base64 -d' decoding bug.
The input buffer was not divisible by 3, resulting in decoding errors.

* NEWS: Mention fix.
* src/basenc.c (DEC_BLOCKSIZE): Change from 1024*5 to 4200 (35*3*5*8)
which is divisible by 3,4,5,8 - satisfying both base32 and base64;
Use compile-time verify() macro to enforce the above.
* tests/misc/basenc.pl: Add test.
---
 NEWS                 | 4 ++++
 src/basenc.c         | 4 +++-
 tests/misc/basenc.pl | 9 +++++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index ddec56bdf..d490ed101 100644
--- a/NEWS
+++ b/NEWS
@@ -60,6 +60,10 @@ GNU coreutils NEWS                                    -*- outline -*-
   invalid combinations of case character classes.
   [bug introduced in coreutils-8.6]
 
+  basenc --base64 --decode no longer silently discard decoded characters
+  on (1024*5) buffer boundaries
+  [bug introduced in coreutils-8.31]
+
 ** Changes in behavior
 
   cp and install now default to copy-on-write (COW) if available.
diff --git a/src/basenc.c b/src/basenc.c
index 5c97a3652..2ffdb2d27 100644
--- a/src/basenc.c
+++ b/src/basenc.c
@@ -213,7 +213,9 @@ verify (DEC_BLOCKSIZE % 12 == 0);  /* So complete encoded blocks are used.  */
 
 /* Note that increasing this may decrease performance if --ignore-garbage
    is used, because of the memmove operation below.  */
-# define DEC_BLOCKSIZE (1024*5)
+# define DEC_BLOCKSIZE (4200)
+verify (DEC_BLOCKSIZE % 40 == 0); /* complete encoded blocks for base32 */
+verify (DEC_BLOCKSIZE % 12 == 0); /* complete encoded blocks for base64 */
 
 static int (*base_length) (int i);
 static bool (*isbase) (char ch);
diff --git a/tests/misc/basenc.pl b/tests/misc/basenc.pl
index 3383aaeef..ac5394731 100755
--- a/tests/misc/basenc.pl
+++ b/tests/misc/basenc.pl
@@ -37,6 +37,13 @@ my $base64url_out_nl = $base64url_out;
 $base64url_out_nl =~ s/(..)/\1\n/g; # add newline every two characters
 
 
+# Bug 49741:
+# The input  is 'abc' in base64, in an 8K buffer (larger than 1024*5,
+# the buffer size which caused the bug).
+my $base64_bug49741_in = "YWJj" x 2000 ;
+my $base64_bug49741_out = "abc" x 2000 ;
+
+
 my $base32_in = "\xfd\xd8\x07\xd1\xa5";
 my $base32_out = "7XMAPUNF";
 my $x = $base32_out;
@@ -111,6 +118,8 @@ my @Tests =
  ['b64u_7', '--base64url -d',  {IN=>$base64_out},
   {EXIT=>1},  {ERR=>"$prog: invalid input\n"}],
 
+ ['b64_bug49741', '--base64 -d',  {IN=>$base64_bug49741_in},
+  {OUT=>$base64_bug49741_out}],
 
 
 
-- 
2.20.1

Reply via email to