You wouldn’t get an EOF if the data is properly encoded. Not sure what the problem is.
You need to be doing something with the Reader - most likely writing to a file, streaming to a database record, etc. I would simplify the code to a single test case that demonstrates the issue you are having with the code.

> On Jan 13, 2025, at 5:34 PM, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
>
> I'm just doing the reverse of that, I think, by removing the padding.
>
> I can't seem to trigger an EOF with this code below:
>
>>>> n, err = b.br.Read(h)
>>>> if err != nil {
>>>>     return n, err
>>>> }
>
> On 13/01/25, robert engels (reng...@ix.netcom.com) wrote:
>> As has been pointed out, you don’t need to read the whole thing into memory, just wrap the data provider with one that adds the padding if it doesn’t exist - and always read with the padded decoder.
>>
>> To add the padding you only need to keep track of the count of characters read before EOF to determine how many padding characters to synthetically add - if the original data is padded this will be 0 (if it was padded correctly).
>>
>>> On Jan 13, 2025, at 4:42 PM, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
>>>
>>> As I wrote earlier, I'm trying to avoid reading the entire email part into memory to discover if I should use base64.StdEncoding or base64.RawStdEncoding.
>>>
>>> The following seems to work reasonably well:
>>>
>>> type B64Translator struct {
>>>     br *bufio.Reader
>>> }
>>>
>>> func NewB64Translator(r io.Reader) *B64Translator {
>>>     return &B64Translator{
>>>         br: bufio.NewReader(r),
>>>     }
>>> }
>>>
>>> // Read reads off the buffered reader expecting base64.StdEncoding bytes
>>> // with (potentially) 1-3 '=' padding characters at the end.
>>> // RawStdEncoding can be used for both StdEncoded and RawStdEncoded data
>>> // if the padding is removed.
>>> func (b *B64Translator) Read(p []byte) (n int, err error) {
>>>     h := make([]byte, len(p))
>>>     n, err = b.br.Read(h)
>>>     if err != nil {
>>>         return n, err
>>>     }
>>>     // to be optimised
>>>     c := bytes.Count(h, []byte("="))
>>>     copy(p, h[:n-c])
>>>     // fmt.Println(string(h), n, string(p), n-c)
>>>     return n - c, nil
>>> }
>>>
>>> https://go.dev/play/p/H6ii7Vy-8as
>>>
>>> One odd thing is that I'm getting extraneous newlines (shown by stars in the output), eg:
>>>
>>> --
>>> raw: Bonjour joyeux lion
>>> Qm9uam91ciwgam95ZXV4IGxpb24K
>>> ok: false
>>> decoded: Bonjour, joyeux lion* <-------------------- e.g. here
>>> --
>>> std: "Bonjour, joyeux lion"
>>> IkJvbmpvdXIsIGpveWV1eCBsaW9uIg==
>>> ok: true
>>> decoded: "Bonjour, joyeux lion"
>>> --
>>>
>>> Any thoughts on that would be gratefully received.
>>>
>>> Rory
>>>
>>> On 13/01/25, Rory Campbell-Lange (r...@campbell-lange.net) wrote:
>>>> Thanks very much for the playground link and thoughts.
>>>>
>>>> The use case is reading base64 email parts, which could be of a very large size. It is unclear when processing these parts if they are base64 padded or not.
>>>>
>>>> I'm trying to avoid reading the entire email part into memory. Consequently I think your earlier idea of adding padding (or removing it) in a wrapper could work. Perhaps wrapping the reader with another using a bufio.Reader to track bytes read and detect EOF. At EOF the wrapper could add padding if needed.
>>>>
>>>> Rory
>>>>
>>>> On 13/01/25, Axel Wagner (axel.wagner...@googlemail.com) wrote:
>>>>> Just realized: If you twist the idea around, you get something easy to implement and more correct.
>>>>> Instead of stripping padding if it exists, you can ensure that the body *is* padded to a multiple of 4 bytes: https://go.dev/play/p/SsPRXV9ZfoS
>>>>> You can then feed that to base64.StdEncoding. If the wrapped Reader returns padded Base64, this does nothing. If it returns unpadded Base64, it adds padding. If it returns incorrect Base64, it will create a padded stream, that will then get rejected by the Base64 decoder.
>>>>>
>>>>> On Mon, 13 Jan 2025 at 10:31, Axel Wagner <axel.wagner...@googlemail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> one way to solve your problem is to wrap the body into an io.Reader that strips off everything after the first `=` it finds. That can then be fed to base64.RawStdEncoding. This approach requires no extra buffering or copying and is easy to implement: https://go.dev/play/p/CwcVz7oietI
>>>>>>
>>>>>> The downside is, that this will not verify that the body is *either* correctly padded Base64 *or* unpadded Base64. So, it will not report an error if fed something like "AAA=garbage".
>>>>>> That can be remedied by buffering up to four bytes and, when encountering an EOF, check that there are at most three trailing `=` and that the total length of the stream is divisible by four. It's more finicky to implement, but it should also be possible without any extra copies and only requires a very small extra buffer.
>>>>>>
>>>>>> On Sun, 12 Jan 2025 at 22:29, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
>>>>>>
>>>>>>> Thanks very much for the links, pointers and possible solution.
>>>>>>>
>>>>>>> Trying to read base64 standard (padded) encoded data with base64.RawStdEncoding can produce an error such as
>>>>>>>
>>>>>>> illegal base64 data at input byte <n>
>>>>>>>
>>>>>>> Reading base64 raw (unpadded) encoded data produces the EOF error.
>>>>>>>
>>>>>>> I'll go with trying to read the standard encoded data up to maybe 1MB and then switch to base64.RawStdEncoding if I hit the "illegal base64 data" problem, maybe with reference to bufio.Reader which has most of the methods suggested below.
>>>>>>>
>>>>>>> Yes, the use of a "Rewind" method would be crucial. I guess this would need to:
>>>>>>> 1. error if more than one buffer of data has been read
>>>>>>> 2. else re-read from byte 0
>>>>>>>
>>>>>>> Thanks again very much for these suggestions.
>>>>>>>
>>>>>>> Rory
>>>>>>>
>>>>>>> On 12/01/25, robert engels (reng...@ix.netcom.com) wrote:
>>>>>>>> Also, see this https://stackoverflow.com/questions/69753478/use-base64-stdencoding-or-base64-rawstdencoding-to-decode-base64-string-in-go as I expected the error should be reported earlier than the end of stream if the chosen format is wrong.
>>>>>>>>
>>>>>>>>> On Jan 12, 2025, at 2:57 PM, robert engels <reng...@ix.netcom.com> wrote:
>>>>>>>>>
>>>>>>>>> Also, this is what Gemini provided which looks basically correct - but I think encapsulating it with a Rewind() method would be easier to understand.
>>>>>>>>>
>>>>>>>>> While Go doesn't have a built-in PushbackReader like some other languages (e.g., Java), you can implement similar functionality using a custom struct and a buffer.
>>>>>>>>>
>>>>>>>>> Here's an example implementation:
>>>>>>>>>
>>>>>>>>> package main
>>>>>>>>>
>>>>>>>>> import (
>>>>>>>>>     "bytes"
>>>>>>>>>     "io"
>>>>>>>>> )
>>>>>>>>>
>>>>>>>>> type PushbackReader struct {
>>>>>>>>>     reader io.Reader
>>>>>>>>>     buffer *bytes.Buffer
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> func NewPushbackReader(r io.Reader) *PushbackReader {
>>>>>>>>>     return &PushbackReader{
>>>>>>>>>         reader: r,
>>>>>>>>>         buffer: new(bytes.Buffer),
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> func (p *PushbackReader) Read(b []byte) (n int, err error) {
>>>>>>>>>     if p.buffer.Len() > 0 {
>>>>>>>>>         return p.buffer.Read(b)
>>>>>>>>>     }
>>>>>>>>>     return p.reader.Read(b)
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> func (p *PushbackReader) UnreadByte() error {
>>>>>>>>>     if p.buffer.Len() == 0 {
>>>>>>>>>         return io.EOF
>>>>>>>>>     }
>>>>>>>>>     lastByte := p.buffer.Bytes()[p.buffer.Len()-1]
>>>>>>>>>     p.buffer.Truncate(p.buffer.Len() - 1)
>>>>>>>>>     p.buffer.WriteByte(lastByte)
>>>>>>>>>     return nil
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> func (p *PushbackReader) Unread(buf []byte) error {
>>>>>>>>>     if p.buffer.Len() == 0 {
>>>>>>>>>         return io.EOF
>>>>>>>>>     }
>>>>>>>>>     p.buffer.Write(buf)
>>>>>>>>>     return nil
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> func main() {
>>>>>>>>>     // Example usage
>>>>>>>>>     r := NewPushbackReader(bytes.NewBufferString("Hello, World!"))
>>>>>>>>>     buf := make([]byte, 5)
>>>>>>>>>     r.Read(buf)
>>>>>>>>>     r.UnreadByte()
>>>>>>>>>     r.Read(buf)
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Explanation:
>>>>>>>>> PushbackReader struct: This struct holds the underlying io.Reader and a buffer to store the pushed-back bytes.
>>>>>>>>> NewPushbackReader: This function creates a new PushbackReader from an existing io.Reader.
>>>>>>>>> Read method: This method reads bytes from either the buffer (if it contains data) or the underlying reader.
>>>>>>>>> UnreadByte method: This method pushes back a single byte into the buffer.
>>>>>>>>> Unread method: This method pushes back a slice of bytes into the buffer.
>>>>>>>>> Important Considerations:
>>>>>>>>> The buffer size is not managed automatically. You may need to adjust the buffer size based on your use case.
>>>>>>>>> This implementation does not handle pushing back beyond the initially read data. If you need to support arbitrary pushback, you'll need a more complex solution.
>>>>>>>>>
>>>>>>>>> Generative AI is experimental.
>>>>>>>>>
>>>>>>>>>> On Jan 12, 2025, at 2:53 PM, Robert Engels <reng...@ix.netcom.com> wrote:
>>>>>>>>>>
>>>>>>>>>> You can see the two pass reader here https://stackoverflow.com/questions/20666594/how-can-i-push-bytes-into-a-reader-in-go
>>>>>>>>>>
>>>>>>>>>> But yea, the basic premise is that you buffer the data so you can rewind if needed
>>>>>>>>>>
>>>>>>>>>> Are you certain it is reading to the end to return EOF? It may be returning eof once the parsing fails.
>>>>>>>>>>
>>>>>>>>>> Otherwise I would expect this is being decoded wrong - eg the mime type or encoding type should tell you the correct format before you start decoding.
>>>>>>>>>>
>>>>>>>>>>> On Jan 12, 2025, at 2:46 PM, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the suggestion of a ReadSeeker to wrap an io.Reader.
>>>>>>>>>>>
>>>>>>>>>>> My google fu must be deserting me. I can find PushbackReader implementations in Java, but the only similar thing for Go I could find was https://gitlab.com/osaki-lab/iowrapper. If you have a specific recommendation for a ReadSeeker wrapper to an io.Reader that would be great to know.
>>>>>>>>>>>
>>>>>>>>>>> Since the base64 decoding error I'm looking for is an EOF, I guess the wrapper approach will not work when the EOF byte position is greater than the io.ReadSeeker buffer size.
>>>>>>>>>>>
>>>>>>>>>>> Rory
>>>>>>>>>>>
>>>>>>>>>>> On 12/01/25, robert engels (reng...@ix.netcom.com) wrote:
>>>>>>>>>>>> create a ReadSeeker that wraps the Reader providing the buffering (mark & reset) - normally the buffer only needs to be large enough to detect the format contained in the Reader.
>>>>>>>>>>>>
>>>>>>>>>>>> You can search Google for PushbackReader in Go and you’ll get a basic implementation.
>>>>>>>>>>>>
>>>>>>>>>>>>> On Jan 12, 2025, at 12:52 PM, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
>>>>>>>>>>> ...
>>>>>>>>>>>>> I'm attempting to rationalise the process [of avoiding reading email parts into byte slices] by simply wrapping the provided io.Reader with the necessary decoders to reduce memory usage and unnecessary processing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The wrapping strategy seems to work ok. However there is a particular issue in detecting base64.StdEncoding versus base64.RawStdEncoding, which requires draining the io.Reader using base64.StdEncoding and (based on the current implementation) switching to base64.RawStdEncoding if an io.ErrUnexpectedEOF is found.
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
>>>>>>>>>> To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/DD0C1480-D237-447A-B978-78FC8951FE05%40ix.netcom.com.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/9F6FCA2F-9641-41F5-AB0F-42055287BB85%40ix.netcom.com.
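
For reference, the "add padding at EOF" idea discussed in this thread could be sketched roughly as below. This is only a minimal sketch, not the code behind either playground link: the padReader type and the decode helper are hypothetical names made up for illustration, and it assumes that CR and LF are the only bytes to ignore when counting (those are the bytes the standard library decoder skips). The wrapper counts the base64 characters it passes through and, once the source reports io.EOF, emits enough '=' characters to bring the count up to a multiple of four, so base64.StdEncoding can be used for both padded and unpadded input.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// padReader is a hypothetical wrapper (not taken from the thread's playground
// links). It passes the source through unchanged and, at EOF, appends '='
// so that the number of non-CR/LF bytes becomes a multiple of four.
type padReader struct {
	r     io.Reader
	count int    // non-CR/LF bytes seen so far
	pad   []byte // padding still to emit once the source is exhausted
	done  bool   // true once the source has returned io.EOF
}

func (p *padReader) Read(b []byte) (int, error) {
	if len(b) == 0 {
		return 0, nil
	}
	if !p.done {
		n, err := p.r.Read(b)
		for _, c := range b[:n] {
			if c != '\r' && c != '\n' {
				p.count++
			}
		}
		if err != io.EOF {
			return n, err
		}
		// Source exhausted: work out how much padding to synthesise.
		p.done = true
		if rem := p.count % 4; rem != 0 {
			p.pad = []byte("===")[:4-rem]
		}
		if n > 0 {
			return n, nil // hand over the final data before any padding
		}
	}
	if len(p.pad) == 0 {
		return 0, io.EOF
	}
	n := copy(b, p.pad)
	p.pad = p.pad[n:]
	return n, nil
}

// decode streams s through padReader into the padded decoder.
func decode(s string) (string, error) {
	dec := base64.NewDecoder(base64.StdEncoding, &padReader{r: strings.NewReader(s)})
	out, err := io.ReadAll(dec)
	return string(out), err
}

func main() {
	for _, s := range []string{
		"IkJvbmpvdXIsIGpveWV1eCBsaW9uIg==", // padded sample from the thread
		"IkJvbmpvdXIsIGpveWV1eCBsaW9uIg",   // same data with the padding removed
		"Qm9uam91ciwgam95ZXV4IGxpb24K",     // raw sample from the thread
		"AAAAA",                            // invalid length, should be rejected
	} {
		out, err := decode(s)
		fmt.Printf("%q -> %q, err=%v\n", s, out, err)
	}
}
```

With this sketch the padded and unpadded forms of the sample decode to the same text, and an input whose length cannot be valid base64 is padded into something the decoder rejects. As a side note, the raw sample Qm9uam91ciwgam95ZXV4IGxpb24K ends in "b24K", which decodes to "on" followed by a newline, so the extra newline reported earlier in the thread appears to be part of the encoded data and not something the reader adds.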