Thanks very much for the playground link and thoughts. The use case is reading base64-encoded email parts, which can be very large. When processing these parts it is not known in advance whether the base64 is padded or not.
I'm trying to avoid reading the entire email part into memory. Consequently
I think your earlier idea of adding padding (or removing it) in a wrapper
could work. Perhaps the reader could be wrapped in another that uses a
bufio.Reader to track bytes read and detect EOF. At EOF the wrapper could
add padding if needed.

Rory

On 13/01/25, Axel Wagner (axel.wagner...@googlemail.com) wrote:
> Just realized: If you twist the idea around, you get something easy to
> implement and more correct.
> Instead of stripping padding if it exists, you can ensure that the body *is*
> padded to a multiple of 4 bytes: https://go.dev/play/p/SsPRXV9ZfoS
> You can then feed that to base64.StdEncoding. If the wrapped Reader returns
> padded Base64, this does nothing. If it returns unpadded Base64, it adds
> padding. If it returns incorrect Base64, it will create a padded stream,
> that will then get rejected by the Base64 decoder.
>
> On Mon, 13 Jan 2025 at 10:31, Axel Wagner <axel.wagner...@googlemail.com>
> wrote:
>
> > Hi,
> >
> > one way to solve your problem is to wrap the body into an io.Reader that
> > strips off everything after the first `=` it finds. That can then be fed to
> > base64.RawStdEncoding. This approach requires no extra buffering or copying
> > and is easy to implement: https://go.dev/play/p/CwcVz7oietI
> >
> > The downside is, that this will not verify that the body is *either*
> > correctly padded Base64 *or* unpadded Base64. So, it will not report an
> > error if fed something like "AAA=garbage".
> > That can be remedied by buffering up to four bytes and, when encountering
> > an EOF, checking that there are at most two trailing `=` and that the total
> > length of the stream is divisible by four. It's more finicky to implement,
> > but it should also be possible without any extra copies and only requires a
> > very small extra buffer.
> >
> > On Sun, 12 Jan 2025 at 22:29, Rory Campbell-Lange <r...@campbell-lange.net>
> > wrote:
> >
> >> Thanks very much for the links, pointers and possible solution.
> >>
> >> Trying to read base64 standard (padded) encoded data with
> >> base64.RawStdEncoding can produce an error such as
> >>
> >>     illegal base64 data at input byte <n>
> >>
> >> Reading base64 raw (unpadded) encoded data with base64.StdEncoding
> >> produces the EOF error (io.ErrUnexpectedEOF).
> >>
> >> I'll go with trying to read the standard encoded data up to maybe 1MB and
> >> then switch to base64.RawStdEncoding if I hit the "illegal base64 data"
> >> problem, maybe with reference to bufio.Reader which has most of the methods
> >> suggested below.
> >>
> >> Yes, the use of a "Rewind" method would be crucial. I guess this would
> >> need to:
> >> 1. error if more than one buffer of data has been read
> >> 2. else re-read from byte 0
> >>
> >> Thanks again very much for these suggestions.
> >>
> >> Rory
> >>
> >> On 12/01/25, robert engels (reng...@ix.netcom.com) wrote:
> >> > Also, see this
> >> > https://stackoverflow.com/questions/69753478/use-base64-stdencoding-or-base64-rawstdencoding-to-decode-base64-string-in-go
> >> > As I expected, the error should be reported earlier than the end of
> >> > stream if the chosen format is wrong.
> >> >
> >> > > On Jan 12, 2025, at 2:57 PM, robert engels <reng...@ix.netcom.com> wrote:
> >> > >
> >> > > Also, this is what Gemini provided which looks basically correct -
> >> > > but I think encapsulating it with a Rewind() method would be easier to
> >> > > understand.
> >> > >
> >> > > While Go doesn't have a built-in PushbackReader like some other
> >> > > languages (e.g., Java), you can implement similar functionality using a
> >> > > custom struct and a buffer.
> >> > >
> >> > > Here's an example implementation:
> >> > >
> >> > > package main
> >> > >
> >> > > import (
> >> > >     "bytes"
> >> > >     "errors"
> >> > >     "fmt"
> >> > >     "io"
> >> > > )
> >> > >
> >> > > // PushbackReader wraps an io.Reader and lets callers push bytes back
> >> > > // so that later reads return them again.
> >> > > type PushbackReader struct {
> >> > >     reader  io.Reader
> >> > >     buffer  *bytes.Buffer // pushed-back bytes, served before reader
> >> > >     last    byte          // last byte returned by Read
> >> > >     hasLast bool          // whether last is valid
> >> > > }
> >> > >
> >> > > func NewPushbackReader(r io.Reader) *PushbackReader {
> >> > >     return &PushbackReader{
> >> > >         reader: r,
> >> > >         buffer: new(bytes.Buffer),
> >> > >     }
> >> > > }
> >> > >
> >> > > // Read serves pushed-back bytes first, then the underlying reader.
> >> > > func (p *PushbackReader) Read(b []byte) (n int, err error) {
> >> > >     if p.buffer.Len() > 0 {
> >> > >         n, err = p.buffer.Read(b)
> >> > >     } else {
> >> > >         n, err = p.reader.Read(b)
> >> > >     }
> >> > >     if n > 0 {
> >> > >         p.last, p.hasLast = b[n-1], true
> >> > >     }
> >> > >     return n, err
> >> > > }
> >> > >
> >> > > // UnreadByte pushes back the last byte returned by Read.
> >> > > func (p *PushbackReader) UnreadByte() error {
> >> > >     if !p.hasLast {
> >> > >         return errors.New("pushback: no byte to unread")
> >> > >     }
> >> > >     p.hasLast = false
> >> > >     return p.Unread([]byte{p.last})
> >> > > }
> >> > >
> >> > > // Unread pushes buf back so it is returned before any other data.
> >> > > func (p *PushbackReader) Unread(buf []byte) error {
> >> > >     // Copy buf and the pending bytes before resetting the buffer.
> >> > >     rest := append(append([]byte{}, buf...), p.buffer.Bytes()...)
> >> > >     p.buffer.Reset()
> >> > >     p.buffer.Write(rest)
> >> > >     return nil
> >> > > }
> >> > >
> >> > > func main() {
> >> > >     // Example usage
> >> > >     r := NewPushbackReader(bytes.NewBufferString("Hello, World!"))
> >> > >     buf := make([]byte, 5)
> >> > >     r.Read(buf)                 // reads "Hello"
> >> > >     r.UnreadByte()              // pushes 'o' back
> >> > >     n, _ := r.Read(buf)         // reads the pushed-back byte
> >> > >     fmt.Printf("%q\n", buf[:n]) // prints "o"
> >> > > }
> >> > >
> >> > > Explanation:
> >> > > PushbackReader struct: This struct holds the underlying io.Reader, a
> >> > > buffer to store the pushed-back bytes, and the last byte returned by Read.
> >> > > NewPushbackReader: This function creates a new PushbackReader from an
> >> > > existing io.Reader.
> >> > > Read method: This method reads bytes from either the buffer (if it
> >> > > contains data) or the underlying reader.
> >> > > UnreadByte method: This method pushes the last byte returned by Read
> >> > > back into the buffer.
> >> > > Unread method: This method pushes a slice of bytes back into the
> >> > > buffer so that they are read again before any remaining data.
> >> > > Important Considerations:
> >> > > The buffer size is not managed automatically. You may need to adjust
> >> > > the buffer size based on your use case.
> >> > > This implementation does not handle pushing back beyond the initially
> >> > > read data. If you need to support arbitrary pushback, you'll need a more
> >> > > complex solution.
> >> > >
> >> > >> On Jan 12, 2025, at 2:53 PM, Robert Engels <reng...@ix.netcom.com> wrote:
> >> > >>
> >> > >> You can see the two pass reader here
> >> > >> https://stackoverflow.com/questions/20666594/how-can-i-push-bytes-into-a-reader-in-go
> >> > >>
> >> > >> But yea, the basic premise is that you buffer the data so you can
> >> > >> rewind if needed
> >> > >>
> >> > >> Are you certain it is reading to the end to return EOF? It may be
> >> > >> returning EOF once the parsing fails.
> >> > >>
> >> > >> Otherwise I would expect this is being decoded wrong, e.g. the MIME
> >> > >> type or encoding type should tell you the correct format before you start
> >> > >> decoding.
> >> > >>
> >> > >>> On Jan 12, 2025, at 2:46 PM, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
> >> > >>>
> >> > >>> Thanks for the suggestion of a ReadSeeker to wrap an io.Reader.
> >> > >>>
> >> > >>> My google fu must be deserting me. I can find PushbackReader
> >> > >>> implementations in Java, but the only similar thing for Go I could find was
> >> > >>> https://gitlab.com/osaki-lab/iowrapper. If you have a specific
> >> > >>> recommendation for a ReadSeeker wrapper to an io.Reader that would be great
> >> > >>> to know.
> >> > >>>
> >> > >>> Since the base64 decoding error I'm looking for is an EOF, I guess
> >> > >>> the wrapper approach will not work when the EOF byte position is greater
> >> > >>> than the io.ReadSeeker buffer size.
> >> > >>>
> >> > >>> Rory
> >> > >>>
> >> > >>> On 12/01/25, robert engels (reng...@ix.netcom.com) wrote:
> >> > >>>> create a ReadSeeker that wraps the Reader providing the buffering
> >> > >>>> (mark & reset) - normally the buffer only needs to be large enough to
> >> > >>>> detect the format contained in the Reader.
> >> > >>>>
> >> > >>>> You can search Google for PushbackReader in Go and you’ll get a
> >> > >>>> basic implementation.
> >> > >>>>
> >> > >>>>> On Jan 12, 2025, at 12:52 PM, Rory Campbell-Lange <r...@campbell-lange.net> wrote:
> >> > >>> ...
> >> > >>>>> I'm attempting to rationalise the process [of avoiding reading
> >> > >>>>> email parts into byte slices] by simply wrapping the provided io.Reader
> >> > >>>>> with the necessary decoders to reduce memory usage and unnecessary
> >> > >>>>> processing.
> >> > >>>>>
> >> > >>>>> The wrapping strategy seems to work ok. However there is a
> >> > >>>>> particular issue in detecting base64.StdEncoding versus
> >> > >>>>> base64.RawStdEncoding, which requires draining the io.Reader using
> >> > >>>>> base64.StdEncoding and (based on the current implementation) switching to
> >> > >>>>> base64.RawStdEncoding if an io.ErrUnexpectedEOF is found.
> >> > >>>>>
> >> > >>
> >> > >> --
> >> > >> You received this message because you are subscribed to the Google
> >> > >> Groups "golang-nuts" group.
> >> > >> To unsubscribe from this group and stop receiving emails from it,
> >> > >> send an email to golang-nuts+unsubscr...@googlegroups.com.
> >> > >> To view this discussion visit
> >> > >> https://groups.google.com/d/msgid/golang-nuts/DD0C1480-D237-447A-B978-78FC8951FE05%40ix.netcom.com