I note that the specification in question does not deal with arbitrary bit strings but with "entropies" that are 128 to 256 bits long and a multiple of 32 bits. A checksum of 4 to 8 bits (ENT / 32), taken from the front of the entropy's SHA-256 hash, is appended to the end. (So selecting *this* bit field can be done by taking the first byte of a ByteArray.) This makes the sequence 132 to 264 bits long, which is always a multiple of 11. It is then chopped into 11-bit subsequences. These are not arbitrary subsequences and they are not taken in arbitrary order. They are a stream.
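
To make the arithmetic concrete, here is a minimal sketch in Python (not Smalltalk, so that it needs nothing beyond the standard library). It follows the BIP-39 text quoted further down, where the checksum is the first ENT / 32 bits of the SHA-256 hash of the entropy. The name bip39_indices is my own, purely for illustration, and it returns word indices rather than words because the word list is not reproduced here.

```python
import hashlib


def bip39_indices(entropy: bytes) -> list:
    """Return the 11-bit word indices for a BIP-39 entropy value (a sketch)."""
    ent = len(entropy) * 8                    # ENT, in bits
    if ent % 32 != 0 or not 128 <= ent <= 256:
        raise ValueError("entropy must be 128..256 bits, a multiple of 32")
    cs = ent // 32                            # CS, checksum length: 4..8 bits
    digest = hashlib.sha256(entropy).digest()
    checksum = digest[0] >> (8 - cs)          # first CS bits fit in the first byte
    # Entropy followed by checksum, viewed as one big-endian bit sequence.
    bits = (int.from_bytes(entropy, "big") << cs) | checksum
    n = (ent + cs) // 11                      # 12..24 groups, no bits left over
    # Peel off the groups in stream order, most significant group first.
    return [(bits >> (11 * (n - 1 - k))) & 0x7FF for k in range(n)]
```

For 16 zero bytes this gives eleven 0s followed by 3, which in the English word list is "abandon" eleven times and then "about". Using one big integer is only a convenience here; the point that the groups come off the front in order, as a stream, is unchanged.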
My own Smalltalk library includes BitInputStream and BitOutputStream, wrapping byte streams. So we could do something like

    ent := aByteArray size * 8.                "BIP-39 ENT"
    cs  := ent // 32.                          "BIP-39 CS"
    foo := ByteArray new: (ent + cs + 7) // 8. "round up to whole bytes"
    o := BitOutputStream on: foo writeStream.
    i := BitInputStream on: aByteArray readStream.
    1 to: ent do: [:x | o nextPut: i next].
    i close.
    "The checksum is the first cs bits of the SHA-256 hash of the entropy.
     SHA256 hashMessage: is assumed here, as in Pharo."
    i := BitInputStream on: (SHA256 hashMessage: aByteArray) readStream.
    o nextUnsigned: cs put: (i nextUnsigned: cs).
    i close.
    o close.
    i := BitInputStream on: foo readStream.
    ans := (1 to: (ent + cs) // 11) collect: [:x |
        WordList at: 1 + (i nextUnsigned: 11)].

Stare at this for a bit, and you realise that you don't actually need the working byte array foo.

    ByteArray methods for: 'bitcoin'
        mnemonic
            |ent cs n i t|
            ent := self size * 8.
            ((ent between: 128 and: 256) and: [ent \\ 32 = 0])
                ifFalse: [self error: 'wrong size for BIP-39'].
            cs := ent // 32.
            n := (ent + cs) // 11.
            "t is the checksum: the first cs bits of the SHA-256 hash
             (SHA256 hashMessage: assumed, as in Pharo)."
            t := (BitInputStream on: (ReadStream on: (SHA256 hashMessage: self)))
                    nextUnsigned: cs.
            i := BitInputStream on: (ReadStream on: self).
            "Every word but the last is 11 entropy bits; the last is the
             remaining 11 - cs entropy bits followed by the checksum."
            ^(1 to: n) collect: [:index |
                WordList at: 1 + (index = n
                    ifTrue:  [((i nextUnsigned: 11 - cs) bitShift: cs) bitOr: t]
                    ifFalse: [i nextUnsigned: 11])]

My BitInputStream and BitOutputStream classes are, um, not really mature. They aren't *completely* naive, but they could be a lot better, and in particular, BitInputStream>>nextUnsigned: and BitOutputStream>>nextUnsigned:put: are definitely suboptimal. I put this out there just to suggest that there is a completely different way of thinking about the problem. (Actually, this isn't *entirely* unlike using Erlang bit syntax.)

Bit*Streams are useful enough to justify primitive support. (Which my classes don't have yet. I did say they are not mature...)

This reminds me of a lesson I learned many years ago: STRINGS ARE WRONG. (Thank you, designers of Burroughs Extended Algol!) When trees aren't the answer, streams often are.

On 6 March 2018 at 07:21, Esteban A. Maringolo <emaring...@gmail.com> wrote:

> 2018-03-05 14:02 GMT-03:00 Stephane Ducasse <stepharo.s...@gmail.com>:
> > On Sun, Mar 4, 2018 at 9:43 PM, Esteban A.
> > Maringolo <emaring...@gmail.com> wrote:
> >> 2018-03-04 17:15 GMT-03:00 Sven Van Caekenberghe <s...@stfx.eu>:
> >>> Bits are actually numbered from right to left (seen from how they are
> >>> printed).
> >>
> >> I understand bit operations, used it extensively with IP address eons
> >> ago.
> >>
> >> But if a spec says: "Take the first n bits from the hash", it means
> >> the first significant bits.
> >> so in 2r100101111 the first 3 bits are "100" and not "111".
> >
> > naive question: why?
>
> Because it says so.
> "A checksum is generated by taking the first ENT / 32 bits of its
> SHA256 hash. This checksum is appended to the end of the initial
> entropy.
> Next, these concatenated bits are split into groups of 11 bits, each
> encoding a number from 0-2047, serving as an index into a wordlist.
> Finally, we convert these numbers into words and use the joined words
> as a mnemonic sentence." [1].
>
> > To me it looks like a lousy specification.
>
> It might be, I can't really tell.
>
> But such manipulation could be useful if you are knitting different
> parts of a binary packet whose boundaries are not at byte level, but
> bit instead. So you can take "these 5 bits, concatenate with this
> other 7, add 13 zero bits, then 1 followed by the payload". I'm
> assuming a non real case here though, my use case was fulfilled
> already.
>
> Regards!
>
> --
> Esteban A. Maringolo
>
> [1] https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki#generating-the-mnemonic