I was more or less right. The input string, which you encoded to 
"Qm9uam91ciwgam95ZXV4IGxpb24K", contains an encoded newline at the end. 
It's not spurious.

Confirmed by the "echo" pipeline I gave above, or in Go itself:
https://go.dev/play/p/6kSxiCfCTo4

You can also confirm it by multiplying the length of the input by 3/4, since
every 4 base64 characters encode 3 bytes and this input has no padding:

% echo -n "Qm9uam91ciwgam95ZXV4IGxpb24K" | wc -c
      28

28*3/4 = 21 bytes: 20 visible characters plus the trailing newline
B o n j o u r
, _ j o y e u
x _ l i o n \n
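
The same check can be sketched in Go. This is only an illustration: decodeStd is a helper name I made up, and it decodes the whole string in memory, which is fine for a 28-character test input but not what you want for large email parts.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeStd is a hypothetical helper: it decodes s with the padded
// standard alphabet and returns the decoded bytes as a string.
func decodeStd(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	return string(b), err
}

func main() {
	const in = "Qm9uam91ciwgam95ZXV4IGxpb24K"
	out, err := decodeStd(in)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// 28 input characters and no '=' padding give 28*3/4 decoded bytes.
	fmt.Println(len(in), len(in)*3/4, len(out))
	// %q makes the trailing newline visible as \n.
	fmt.Printf("%q\n", out)
}
```

The last decoded byte is the newline, so it comes from the encoded input and not from the decoder.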


On Tuesday, 14 January 2025 at 10:10:22 UTC Brian Candler wrote:

> Sorry, ignore that; I hadn't checked your playground link.
>
> On Tuesday, 14 January 2025 at 10:07:53 UTC Brian Candler wrote:
>
>> > As I wrote earlier, I'm trying to avoid reading the entire email part
>> > into memory to discover if I should use base64.StdEncoding or
>> > base64.RawStdEncoding.
>>
>> As I asked before, why would you ever need to use RawStdEncoding? It just 
>> means the MIME part was invalid, most likely corrupted/truncated.
>>
>> > One odd thing is that I'm getting extraneous newlines (shown by stars
>> > in the output), eg:
>>
>> You are feeding two different inputs which do not differ by truncation 
>> alone.
>>
>> % echo -n "Qm9uam91ciwgam95ZXV4IGxpb24K" | base64 -D | hexdump -c
>> 0000000   B   o   n   j   o   u   r   ,       j   o   y   e   u   x
>> 0000010   l   i   o   n  \n
>> 0000015
>>
>> % echo -n "IkJvbmpvdXIsIGpveWV1eCBsaW9uIg==" | base64 -D | hexdump -c
>> 0000000   "   B   o   n   j   o   u   r   ,       j   o   y   e   u   x
>> 0000010       l   i   o   n   "
>> 0000016
>>
>> The second one has encoded double-quotes before and after the content.
>>
>> On Monday, 13 January 2025 at 22:43:51 UTC Rory Campbell-Lange wrote:
>>
>>> As I wrote earlier, I'm trying to avoid reading the entire email part 
>>> into memory to discover if I should use base64.StdEncoding or 
>>> base64.RawStdEncoding. 
>>>
>>> The following seems to work reasonably well: 
>>>
>>> type B64Translator struct {
>>> 	br *bufio.Reader
>>> }
>>>
>>> func NewB64Translator(r io.Reader) *B64Translator {
>>> 	return &B64Translator{
>>> 		br: bufio.NewReader(r),
>>> 	}
>>> }
>>>
>>> // Read reads off the buffered reader expecting base64.StdEncoding bytes
>>> // with (potentially) 1-2 '=' padding characters at the end.
>>> // RawStdEncoding can be used for both StdEncoded and RawStdEncoded data
>>> // if the padding is removed.
>>> func (b *B64Translator) Read(p []byte) (n int, err error) {
>>> 	// Read directly into p; no scratch buffer is needed.
>>> 	n, err = b.br.Read(p)
>>> 	// to be optimised: drop each '=' and compact the rest in place
>>> 	k := 0
>>> 	for _, c := range p[:n] {
>>> 		if c != '=' {
>>> 			p[k] = c
>>> 			k++
>>> 		}
>>> 	}
>>> 	// Return data together with any error so a final chunk is not lost.
>>> 	return k, err
>>> }
>>>
>>> https://go.dev/play/p/H6ii7Vy-8as 
>>>
>>> One odd thing is that I'm getting extraneous newlines (shown by stars in 
>>> the output), eg: 
>>>
>>> -- 
>>> raw: Bonjour joyeux lion 
>>> Qm9uam91ciwgam95ZXV4IGxpb24K 
>>> ok: false 
>>> decoded: Bonjour, joyeux lion* <-------------------- e.g. here 
>>> -- 
>>> std: "Bonjour, joyeux lion" 
>>> IkJvbmpvdXIsIGpveWV1eCBsaW9uIg== 
>>> ok: true 
>>> decoded: "Bonjour, joyeux lion" 
>>> -- 
>>>
>>> Any thoughts on that would be gratefully received. 
>>>
>>> Rory 
>>>
>>>
>>> On 13/01/25, Rory Campbell-Lange (ro...@campbell-lange.net) wrote: 
>>> > Thanks very much for the playground link and thoughts. 
>>> > 
>>> > The use case is reading base64 email parts, which could be of a very
>>> > large size. It is unclear when processing these parts if they are base64
>>> > padded or not.
>>> >
>>> > I'm trying to avoid reading the entire email part into memory.
>>> > Consequently I think your earlier idea of adding padding (or removing it)
>>> > in a wrapper could work. Perhaps wrapping the reader with another using a
>>> > bufio.Reader to track bytes read and detect EOF. At EOF the wrapper could
>>> > add padding if needed.
>>> > 
>>> > Rory 
>>> > 
>>> > On 13/01/25, Axel Wagner (axel.wa...@googlemail.com) wrote: 
>>> > > Just realized: If you twist the idea around, you get something easy to
>>> > > implement and more correct.
>>> > > Instead of stripping padding if it exists, you can ensure that the body *is*
>>> > > padded to a multiple of 4 bytes: https://go.dev/play/p/SsPRXV9ZfoS
>>> > > You can then feed that to base64.StdEncoding. If the wrapped Reader returns
>>> > > padded Base64, this does nothing. If it returns unpadded Base64, it adds
>>> > > padding. If it returns incorrect Base64, it will create a padded stream
>>> > > that will then get rejected by the Base64 decoder.
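
A minimal sketch of such a padding wrapper, for illustration only; it is not the code behind the playground link. The names padReader and decodeLoose are made up here, and the sketch assumes that CR and LF should not count towards the length, since email bodies are line-wrapped and the Go decoder skips those bytes.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// padReader (hypothetical name) passes bytes through unchanged, counts the
// non-newline ones, and at EOF appends '=' until the count is a multiple of 4.
type padReader struct {
	r   io.Reader
	n   int  // non-newline bytes seen so far, including emitted padding
	eof bool // the wrapped reader has reported io.EOF
}

func (p *padReader) Read(b []byte) (int, error) {
	if len(b) == 0 {
		return 0, nil
	}
	if !p.eof {
		n, err := p.r.Read(b)
		for _, c := range b[:n] {
			if c != '\r' && c != '\n' {
				p.n++
			}
		}
		if err != io.EOF {
			return n, err
		}
		p.eof = true
		if n > 0 {
			return n, nil
		}
	}
	// The wrapped reader is exhausted: emit missing padding, then EOF.
	n := 0
	for p.n%4 != 0 && n < len(b) {
		b[n] = '='
		n++
		p.n++
	}
	if n == 0 {
		return 0, io.EOF
	}
	return n, nil
}

// decodeLoose (hypothetical name) decodes padded or unpadded base64
// by streaming s through padReader into the standard decoder.
func decodeLoose(s string) (string, error) {
	dec := base64.NewDecoder(base64.StdEncoding, &padReader{r: strings.NewReader(s)})
	out, err := io.ReadAll(dec)
	return string(out), err
}

func main() {
	for _, in := range []string{
		"IkJvbmpvdXIsIGpveWV1eCBsaW9uIg==", // padded
		"IkJvbmpvdXIsIGpveWV1eCBsaW9uIg",   // unpadded
		"AAA=garbage",                      // invalid, should be rejected
	} {
		out, err := decodeLoose(in)
		fmt.Printf("%q %v\n", out, err)
	}
}
```

The wrapper never inspects the data beyond counting, so anything malformed is still left for the decoder to reject.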
>>> > > 
>>> > > On Mon, 13 Jan 2025 at 10:31, Axel Wagner <axel.wa...@googlemail.com>
>>> > > wrote:
>>> > >
>>> > > > Hi,
>>> > > >
>>> > > > one way to solve your problem is to wrap the body into an io.Reader that
>>> > > > strips off everything from the first `=` it finds onward. That can then be
>>> > > > fed to base64.RawStdEncoding. This approach requires no extra buffering or
>>> > > > copying and is easy to implement: https://go.dev/play/p/CwcVz7oietI
>>> > > >
>>> > > > The downside is that this will not verify that the body is *either*
>>> > > > correctly padded Base64 *or* unpadded Base64. So, it will not report an
>>> > > > error if fed something like "AAA=garbage".
>>> > > > That can be remedied by buffering up to four bytes and, when encountering
>>> > > > an EOF, checking that there are at most two trailing `=` and that the total
>>> > > > length of the stream is divisible by four. It's more finicky to implement,
>>> > > > but it should also be possible without any extra copies and only requires a
>>> > > > very small extra buffer.
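
A minimal sketch of the simpler, non-validating variant, for illustration only; it is not the code behind the playground link. The names stripReader and decodeStripped are made up here. It drops everything from the first `=` onward and feeds the rest to base64.RawStdEncoding, so it shows the stated downside too: "AAA=garbage" decodes without error.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// stripReader (hypothetical name) passes bytes through until it sees the
// first '=', then reports EOF and drops everything from that byte onward.
type stripReader struct {
	r    io.Reader
	done bool // a '=' has been seen
}

func (s *stripReader) Read(b []byte) (int, error) {
	if s.done {
		return 0, io.EOF
	}
	n, err := s.r.Read(b)
	if i := bytes.IndexByte(b[:n], '='); i >= 0 {
		s.done = true
		return i, io.EOF
	}
	return n, err
}

// decodeStripped (hypothetical name) decodes padded or unpadded base64
// by streaming s through stripReader into the unpadded decoder.
func decodeStripped(s string) (string, error) {
	dec := base64.NewDecoder(base64.RawStdEncoding, &stripReader{r: strings.NewReader(s)})
	out, err := io.ReadAll(dec)
	return string(out), err
}

func main() {
	for _, in := range []string{
		"IkJvbmpvdXIsIGpveWV1eCBsaW9uIg==", // padded
		"IkJvbmpvdXIsIGpveWV1eCBsaW9uIg",   // unpadded
		"AAA=garbage",                      // accepted: the downside noted above
	} {
		out, err := decodeStripped(in)
		fmt.Printf("%q %v\n", out, err)
	}
}
```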
>>> > > > 
>>> > > > On Sun, 12 Jan 2025 at 22:29, Rory Campbell-Lange <ro...@campbell-lange.net>
>>> > > > wrote:
>>> > > >
>>> > > >> Thanks very much for the links, pointers and possible solution.
>>> > > >>
>>> > > >> Trying to read base64 standard (padded) encoded data with
>>> > > >> base64.RawStdEncoding can produce an error such as
>>> > > >>
>>> > > >> illegal base64 data at input byte <n>
>>> > > >>
>>> > > >> Reading base64 raw (unpadded) encoded data produces the EOF error.
>>> > > >>
>>> > > >> I'll go with trying to read the standard encoded data up to maybe 1MB and
>>> > > >> then switch to base64.RawStdEncoding if I hit the "illegal base64 data"
>>> > > >> problem, maybe with reference to bufio.Reader which has most of the methods
>>> > > >> suggested below.
>>> > > >>
>>> > > >> Yes, the use of a "Rewind" method would be crucial. I guess this would
>>> > > >> need to:
>>> > > >> 1. error if more than one buffer of data has been read
>>> > > >> 2. else re-read from byte 0
>>> > > >> 
>>> > > >> Thanks again very much for these suggestions. 
>>> > > >> 
>>> > > >> Rory 
>>> > > >> 
>>> > > >> On 12/01/25, robert engels (ren...@ix.netcom.com) wrote: 
>>> > > >> > Also, see this
>>> > > >> > https://stackoverflow.com/questions/69753478/use-base64-stdencoding-or-base64-rawstdencoding-to-decode-base64-string-in-go
>>> > > >> > as I expected the error should be reported earlier than the end of stream
>>> > > >> > if the chosen format is wrong.
>>> > > >> > 
>>> > > >> > > On Jan 12, 2025, at 2:57 PM, robert engels <ren...@ix.netcom.com> wrote:
>>> > > >> > >
>>> > > >> > > Also, this is what Gemini provided which looks basically correct -
>>> > > >> > > but I think encapsulating it with a Rewind() method would be easier to
>>> > > >> > > understand.
>>> > > >> > >
>>> > > >> > > While Go doesn't have a built-in PushbackReader like some other
>>> > > >> > > languages (e.g., Java), you can implement similar functionality using a
>>> > > >> > > custom struct and a buffer.
>>> > > >> > > 
>>> > > >> > > Here's an example implementation: 
>>> > > >> > > 
>>> > > >> > > package main
>>> > > >> > >
>>> > > >> > > import (
>>> > > >> > > 	"bytes"
>>> > > >> > > 	"errors"
>>> > > >> > > 	"io"
>>> > > >> > > )
>>> > > >> > >
>>> > > >> > > // PushbackReader wraps an io.Reader and lets the caller push bytes
>>> > > >> > > // back so that they are returned again by the next Read.
>>> > > >> > > type PushbackReader struct {
>>> > > >> > > 	reader  io.Reader
>>> > > >> > > 	buffer  *bytes.Buffer // pushed-back bytes, served before reader
>>> > > >> > > 	last    byte          // last byte returned by Read
>>> > > >> > > 	hasLast bool          // whether last is valid
>>> > > >> > > }
>>> > > >> > >
>>> > > >> > > func NewPushbackReader(r io.Reader) *PushbackReader {
>>> > > >> > > 	return &PushbackReader{
>>> > > >> > > 		reader: r,
>>> > > >> > > 		buffer: new(bytes.Buffer),
>>> > > >> > > 	}
>>> > > >> > > }
>>> > > >> > >
>>> > > >> > > func (p *PushbackReader) Read(b []byte) (n int, err error) {
>>> > > >> > > 	if p.buffer.Len() > 0 {
>>> > > >> > > 		n, err = p.buffer.Read(b)
>>> > > >> > > 	} else {
>>> > > >> > > 		n, err = p.reader.Read(b)
>>> > > >> > > 	}
>>> > > >> > > 	if n > 0 {
>>> > > >> > > 		// Remember the last byte so UnreadByte can push it back.
>>> > > >> > > 		p.last, p.hasLast = b[n-1], true
>>> > > >> > > 	}
>>> > > >> > > 	return n, err
>>> > > >> > > }
>>> > > >> > >
>>> > > >> > > // UnreadByte pushes back the last byte returned by Read.
>>> > > >> > > func (p *PushbackReader) UnreadByte() error {
>>> > > >> > > 	if !p.hasLast {
>>> > > >> > > 		return errors.New("pushback: no byte to unread")
>>> > > >> > > 	}
>>> > > >> > > 	p.hasLast = false
>>> > > >> > > 	return p.Unread([]byte{p.last})
>>> > > >> > > }
>>> > > >> > >
>>> > > >> > > // Unread pushes buf back in front of any bytes already pushed back.
>>> > > >> > > func (p *PushbackReader) Unread(buf []byte) error {
>>> > > >> > > 	rest := append([]byte(nil), p.buffer.Bytes()...)
>>> > > >> > > 	p.buffer.Reset()
>>> > > >> > > 	p.buffer.Write(buf)
>>> > > >> > > 	p.buffer.Write(rest)
>>> > > >> > > 	return nil
>>> > > >> > > }
>>> > > >> > >
>>> > > >> > > func main() {
>>> > > >> > > 	// Example usage
>>> > > >> > > 	r := NewPushbackReader(bytes.NewBufferString("Hello, World!"))
>>> > > >> > > 	buf := make([]byte, 5)
>>> > > >> > > 	r.Read(buf)    // reads "Hello"
>>> > > >> > > 	r.UnreadByte() // pushes "o" back
>>> > > >> > > 	r.Read(buf)    // reads "o" again
>>> > > >> > > }
>>> > > >> > > 
>>> > > >> > > Explanation:
>>> > > >> > > PushbackReader struct: This struct holds the underlying io.Reader and
>>> > > >> > > a buffer to store the pushed-back bytes.
>>> > > >> > > NewPushbackReader: This function creates a new PushbackReader from an
>>> > > >> > > existing io.Reader.
>>> > > >> > > Read method: This method reads bytes from either the buffer (if it
>>> > > >> > > contains data) or the underlying reader.
>>> > > >> > > UnreadByte method: This method pushes back a single byte into the
>>> > > >> > > buffer.
>>> > > >> > > Unread method: This method pushes back a slice of bytes into the
>>> > > >> > > buffer.
>>> > > >> > > Important Considerations:
>>> > > >> > > The buffer size is not managed automatically. You may need to adjust
>>> > > >> > > the buffer size based on your use case.
>>> > > >> > > This implementation does not handle pushing back beyond the initially
>>> > > >> > > read data. If you need to support arbitrary pushback, you'll need a more
>>> > > >> > > complex solution.
>>> > > >> > > 
>>> > > >> > > Generative AI is experimental. 
>>> > > >> > > 
>>> > > >> > >> On Jan 12, 2025, at 2:53 PM, Robert Engels <ren...@ix.netcom.com> wrote:
>>> > > >> > >>
>>> > > >> > >> You can see the two pass reader here
>>> > > >> > >> https://stackoverflow.com/questions/20666594/how-can-i-push-bytes-into-a-reader-in-go
>>> > > >> > >>
>>> > > >> > >> But yea, the basic premise is that you buffer the data so you can
>>> > > >> > >> rewind if needed
>>> > > >> > >>
>>> > > >> > >> Are you certain it is reading to the end to return EOF? It may be
>>> > > >> > >> returning eof once the parsing fails.
>>> > > >> > >>
>>> > > >> > >> Otherwise I would expect this is being decoded wrong - eg the mime
>>> > > >> > >> type or encoding type should tell you the correct format before you start
>>> > > >> > >> decoding.
>>> > > >> > >> 
>>> > > >> > >>> On Jan 12, 2025, at 2:46 PM, Rory Campbell-Lange <ro...@campbell-lange.net> wrote:
>>> > > >> > >>>
>>> > > >> > >>> Thanks for the suggestion of a ReadSeeker to wrap an io.Reader.
>>> > > >> > >>>
>>> > > >> > >>> My google fu must be deserting me. I can find PushbackReader
>>> > > >> > >>> implementations in Java, but the only similar thing for Go I could find was
>>> > > >> > >>> https://gitlab.com/osaki-lab/iowrapper. If you have a specific
>>> > > >> > >>> recommendation for a ReadSeeker wrapper to an io.Reader that would be great
>>> > > >> > >>> to know.
>>> > > >> > >>>
>>> > > >> > >>> Since the base64 decoding error I'm looking for is an EOF, I guess
>>> > > >> > >>> the wrapper approach will not work when the EOF byte position is greater
>>> > > >> > >>> than the io.ReadSeeker buffer size.
>>> > > >> > >>> 
>>> > > >> > >>> Rory 
>>> > > >> > >>> 
>>> > > >> > >>> On 12/01/25, robert engels (ren...@ix.netcom.com) wrote: 
>>> > > >> > >>>> create a ReadSeeker that wraps the Reader providing the buffering
>>> > > >> > >>>> (mark & reset) - normally the buffer only needs to be large enough to
>>> > > >> > >>>> detect the format contained in the Reader.
>>> > > >> > >>>>
>>> > > >> > >>>> You can search Google for PushbackReader in Go and you’ll get a
>>> > > >> > >>>> basic implementation.
>>> > > >> > >>>> 
>>> > > >> > >>>>> On Jan 12, 2025, at 12:52 PM, Rory Campbell-Lange <ro...@campbell-lange.net> wrote:
>>> > > >> > >>> ...
>>> > > >> > >>>>> I'm attempting to rationalise the process [of avoiding reading
>>> > > >> > >>>>> email parts into byte slices] by simply wrapping the provided io.Reader
>>> > > >> > >>>>> with the necessary decoders to reduce memory usage and unnecessary
>>> > > >> > >>>>> processing.
>>> > > >> > >>>>>
>>> > > >> > >>>>> The wrapping strategy seems to work ok. However there is a
>>> > > >> > >>>>> particular issue in detecting base64.StdEncoding versus
>>> > > >> > >>>>> base64.RawStdEncoding, which requires draining the io.Reader using
>>> > > >> > >>>>> base64.StdEncoding and (based on the current implementation) switching to
>>> > > >> > >>>>> base64.RawStdEncoding if an io.ErrUnexpectedEOF is found.
>>> > > >> > >>>>> 
>>> > > >> > >> 
>>> > > >> > >> 
>>> > > >> > >> -- 
>>> > > >> > >> You received this message because you are subscribed to the 
>>> Google 
>>> > > >> Groups "golang-nuts" group. 
>>> > > >> > >> To unsubscribe from this group and stop receiving emails 
>>> from it, 
>>> > > >> send an email to golang-nuts...@googlegroups.com <mailto: 
>>> > > >> golang-nuts...@googlegroups.com>. 
>>> > > >> > >> To view this discussion visit 
>>> > > >> 
>>> https://groups.google.com/d/msgid/golang-nuts/DD0C1480-D237-447A-B978-78FC8951FE05%40ix.netcom.com
>>>  
>>> > > >> < 
>>> > > >> 
>>> https://groups.google.com/d/msgid/golang-nuts/DD0C1480-D237-447A-B978-78FC8951FE05%40ix.netcom.com?utm_medium=email&utm_source=footer
>>>  
>>> > > >> >. 
>>> > > >> > > 
>>> > > >> > 
>>> > > >> 
>>> > > >> 
>>> > > > 
>>> > 
>>>
>>>
>>

