Hi Pádraig,

Pádraig Brady <p...@draigbrady.com> writes:

> A 58 character encoding that:
> - avoids visually ambiguous 0OIl characters
> - uses only alphanumeric characters
> Described at:
> - https://tools.ietf.org/html/draft-msporny-base58-03
>
> This implementation uses GMP (or gnulib's gmp fallback).
> Performance is good in comparison to other implementations.
> For example when using libgmp, encoding is 6 times faster,
> and decoding 28 times faster than the implementation
> using arbitrary precision ints in cpython 3.13.
>
> Memory use is proportional to the size of input.
>
> Encoding benchmarks:
>
> $ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
> real 0m1.533s
>
> $ ./configure --without-libgmp && make # gnulib gmp
> $ time yes | head -c65535 | src/basenc --base58 -w0 >file.enc
> real 0m3.587s
>
> # dnf install python3-base58
> $ time yes | head -c65535 | base58 >file.enc # cpython 3.13
> real 0m9.700s
>
> Decoding benchmarks:
>
> $ time src/basenc --base58 -d <file.enc >/dev/null
> real 0m0.299s
>
> $ ./configure --without-libgmp && make # gnulib gmp
> $ time src/basenc --base58 -d <file.enc >/dev/null
> real 0m1.469s
>
> $ time base58 -d <file.enc >/dev/null # cpython 3.13
> real 0m8.302s
>
> * src/basenc.c (base_decode_ctx_finalize, base_encode_ctx_init,
> base_encode_ctx, base_encode_ctx_finalize): New functions to
> provide more general processing functionality.
> (base58_{de,en}code_ctx{_init,,_finalize}): New functions to
> accumulate all input before calling ...
> (base58_{de,en}code): ... the GMP based encoding/decoding routines.
> (do_encode, do_decode): Call the ctx variants if enabled.
> * doc/coreutils.texi (basenc invocation): Describe the new option,
> and indicate the main use case being interactive user use.
> * src/local.mk: Link basenc with GMP.
> * tests/basenc/basenc.pl: Add test cases.
> * NEWS: Mention the new feature.
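
As a point of comparison for the GMP-based approach described in the
commit message above, here is a minimal sketch of base58 encoding
without GMP: the input is treated as one big-endian number and divided
by 58 repeatedly, using byte-wise long division. This is only an
illustration and not code from the patch. The name b58_encode_sketch,
its signature, and the 2 * len + 1 output bound are assumptions made
for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Alphabet from draft-msporny-base58: no 0, O, I or l.  */
static char const b58_alphabet[] =
  "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/* Hypothetical helper: encode LEN bytes of DATA into OUT as a
   NUL-terminated base58 string.  Return 0 on success, -1 on error.
   OUT_SIZE must be at least 2 * LEN + 1, since each input byte needs
   at most two base58 digits (58 * 58 > 256).  */
int
b58_encode_sketch (unsigned char const *data, size_t len,
                   char *out, size_t out_size)
{
  if (out_size < 2 * len + 1)
    return -1;

  /* Leading zero bytes carry no numeric value; each becomes a '1'.  */
  size_t zeros = 0;
  while (zeros < len && data[zeros] == 0)
    zeros++;

  /* Work on a scratch copy, as the division is done in place.  */
  unsigned char *num = malloc (len ? len : 1);
  if (!num)
    return -1;
  if (len)
    memcpy (num, data, len);

  char *end = out + out_size - 1;
  char *ptr = end;              /* Fill digits from the end backwards.  */
  *ptr = '\0';

  size_t start = zeros;         /* Index of first nonzero byte.  */
  while (start < len)
    {
      /* One long division of the remaining number by 58.  */
      unsigned int rem = 0;
      for (size_t i = start; i < len; i++)
        {
          unsigned int cur = rem * 256 + num[i];
          num[i] = cur / 58;    /* cur < 58 * 256, so this fits a byte.  */
          rem = cur % 58;
        }
      *--ptr = b58_alphabet[rem];

      /* Skip bytes of the quotient that have become zero.  */
      while (start < len && num[start] == 0)
        start++;
    }

  for (size_t i = 0; i < zeros; i++)
    *--ptr = '1';

  /* Move the digits, including the NUL, to the start of OUT.  */
  memmove (out, ptr, (size_t) (end - ptr) + 1);
  free (num);
  return 0;
}
```

Each pass over the buffer is linear in the input size, so the whole
conversion is quadratic. That is fine for short inputs but is one
reason to hand the division to a bignum library for larger ones.
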
> ---
>  NEWS                   |   5 +
>  doc/coreutils.texi     |   9 +
>  src/basenc.c           | 361 ++++++++++++++++++++++++++++++++++++++++-
>  src/local.mk           |   1 +
>  tests/basenc/basenc.pl |  42 +++++
>  5 files changed, 413 insertions(+), 5 deletions(-)

Interesting. Is this encoding used anywhere outside of Bitcoin? Just
curious; the encoding seems interesting regardless.

> +static void
> +base58_encode (char const* data, size_t data_len,
> +               char *out, idx_t *outlen)
> +{
> +  affirm (base_length (data_len) <= *outlen);
> +
> +  size_t leading_zeros = 0;
> +  while (leading_zeros < data_len && data[leading_zeros] == 0)
> +    leading_zeros++;
> +
> +  /* Init GMP integer from binary (base 256) data. */
> +  mpz_t num;
> +  mpz_init (num);
> +  mpz_import (num, data_len, 1, 1, 0, 0, data);
> +
> +  char *ptr = out + *outlen; /* Start just beyond end. */
> +
> +  /* Convert to base 58 by repeatedly dividing by 58. */
> +  mpz_t quotient, remainder;
> +  mpz_init (quotient);
> +  mpz_init (remainder);
> +  while (mpz_cmp_ui (num, 0) > 0)
> +    {
> +      mpz_fdiv_qr_ui (quotient, remainder, num, 58);
> +      unsigned long rem_val = mpz_get_ui (remainder);
> +      *(--ptr) = base58_alphabet[rem_val];
> +      mpz_set (num, quotient);
> +    }
> +
> +  /* Account for leading zeros. */
> +  ptr -= leading_zeros;
> +  memset (ptr, '1', leading_zeros);
> +
> +  /* Prepare return. */
> +  *outlen -= (ptr - out);
> +  memmove (out, ptr, *outlen);
> +
> +  mpz_clear (num);
> +  mpz_clear (quotient);
> +  mpz_clear (remainder);
> +
> +  return;
> +}

Did you mean to use 4 spaces for indentation here?

Collin