I'm just doing the reverse of that, I think, by removing the padding.

I can't seem to trigger an EOF with this code below:

> >        n, err = b.br.Read(h)
> >        if err != nil {
> >            return n, err
> >        }
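
For what it's worth, here is a minimal sketch of how the padding-stripping Read could treat EOF, assuming the only goal is to drop '=' characters. The names stripPadReader and decodeEither are made up for this example and are not from the playground link. It handles the n bytes before looking at err, which is what the io.Reader documentation asks of callers, and passes any error (io.EOF included) through unchanged. Like the quoted version it does no validation: any '=' anywhere in the stream is dropped.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// stripPadReader is a hypothetical wrapper that removes every '=' from the
// wrapped stream so RawStdEncoding can decode padded and unpadded input.
type stripPadReader struct {
	r io.Reader
}

func (s *stripPadReader) Read(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	for {
		n, err := s.r.Read(p)
		// Compact the n bytes in place, skipping '='.
		k := 0
		for _, c := range p[:n] {
			if c != '=' {
				p[k] = c
				k++
			}
		}
		// Return as soon as there is data or an error (io.EOF included).
		if k > 0 || err != nil {
			return k, err
		}
	}
}

// decodeEither decodes padded or unpadded standard base64 text.
func decodeEither(s string) (string, error) {
	src := &stripPadReader{r: strings.NewReader(s)}
	b, err := io.ReadAll(base64.NewDecoder(base64.RawStdEncoding, src))
	return string(b), err
}

func main() {
	for _, in := range []string{"Qm9uam91cg==", "Qm9uam91cg"} {
		out, err := decodeEither(in)
		fmt.Printf("%q %v\n", out, err)
	}
}
```

Filtering in place avoids the second buffer, and it also copes with a '=' that is followed by a line ending in the same chunk.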


On 13/01/25, robert engels (reng...@ix.netcom.com) wrote:
> As has been pointed out, you don’t need to read the whole thing into memory: 
> just wrap the data provider with one that adds the padding if it doesn’t 
> exist, and always read with the padded decoder.
> 
> To add the padding you only need to keep track of the count of characters 
> read before EOF to determine how many padding characters to synthetically 
> add. If the original data is padded this will be 0 (if it was padded 
> correctly).
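
A minimal sketch of the counting idea described just above, under my own assumptions: padReader and decodePadded are hypothetical names, and CR/LF are left out of the count because the base64 stream decoder skips them. At EOF it emits (4 - count%4) % 4 '=' characters, so correctly padded input gets none.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// padReader is a hypothetical wrapper that appends the missing '=' padding
// once the wrapped reader reports io.EOF.
type padReader struct {
	r     io.Reader
	count int    // base64 characters seen so far, ignoring CR and LF
	pad   []byte // padding still to be handed out after EOF
	eof   bool
}

func (p *padReader) Read(b []byte) (int, error) {
	if !p.eof {
		n, err := p.r.Read(b)
		for _, c := range b[:n] {
			if c != '\r' && c != '\n' {
				p.count++
			}
		}
		if err != io.EOF {
			return n, err
		}
		// End of input: work out how much padding is missing.
		p.eof = true
		p.pad = bytes.Repeat([]byte{'='}, (4-p.count%4)%4)
		if n > 0 {
			return n, nil
		}
	}
	if len(p.pad) == 0 {
		return 0, io.EOF
	}
	n := copy(b, p.pad)
	p.pad = p.pad[n:]
	return n, nil
}

// decodePadded always decodes with the padded decoder.
func decodePadded(s string) (string, error) {
	src := &padReader{r: strings.NewReader(s)}
	b, err := io.ReadAll(base64.NewDecoder(base64.StdEncoding, src))
	return string(b), err
}

func main() {
	for _, in := range []string{"Qm9uam91cg==", "Qm9uam91cg", "Qm9uam91c"} {
		out, err := decodePadded(in)
		fmt.Printf("%q %v\n", out, err)
	}
}
```

The third input has a length that no amount of padding can repair, so the padded decoder should reject it rather than silently accept it.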
> 
> > On Jan 13, 2025, at 4:42 PM, Rory Campbell-Lange <r...@campbell-lange.net> 
> > wrote:
> > 
> > As I wrote earlier, I'm trying to avoid reading the entire email part into 
> > memory to discover if I should use base64.StdEncoding or 
> > base64.RawStdEncoding.
> > 
> > The following seems to work reasonably well:
> > 
> >    type B64Translator struct {
> >        br *bufio.Reader
> >    }
> > 
> >    func NewB64Translator(r io.Reader) *B64Translator {
> >        return &B64Translator{
> >            br: bufio.NewReader(r),
> >        }
> >    }
> > 
> >    // Read reads off the buffered reader expecting base64.StdEncoding bytes
> >    // with (potentially) 1-3 '=' padding characters at the end.
> >    // RawStdEncoding can be used for both StdEncoded and RawStdEncoded data
> >    // if the padding is removed.
> >    func (b *B64Translator) Read(p []byte) (n int, err error) {
> >        h := make([]byte, len(p))
> >        n, err = b.br.Read(h)
> >        if err != nil {
> >            return n, err
> >        }
> >        // to be optimised
> >        c := bytes.Count(h, []byte("="))
> >        copy(p, h[:n-c])
> >        // fmt.Println(string(h), n, string(p), n-c)
> >        return n - c, nil
> >    }
> > 
> > https://go.dev/play/p/H6ii7Vy-8as
> > 
> > One odd thing is that I'm getting extraneous newlines (shown by stars in 
> > the output), eg:
> > 
> >     --
> >                raw: Bonjour joyeux lion
> >                             Qm9uam91ciwgam95ZXV4IGxpb24K
> >                     ok: false
> >        decoded: Bonjour, joyeux lion* <-------------------- e.g. here
> >     --
> >                std: "Bonjour, joyeux lion"
> >                             IkJvbmpvdXIsIGpveWV1eCBsaW9uIg==
> >                     ok: true
> >        decoded: "Bonjour, joyeux lion"
> >     --
> > 
> > Any thoughts on that would be gratefully received. 
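
One possible explanation, offered as a guess: the newline may already be part of the encoded data rather than something the wrapper adds. The final group "b24K" of the raw example decodes to "on" followed by a line feed, which would happen if the test string had a trailing newline when it was encoded (an assumption about how the test data was made). A minimal check, where decodeQuoted is a hypothetical helper that shows the decoded bytes with Go escapes:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeQuoted decodes s and returns the result in Go-quoted form so that
// control characters such as '\n' are visible.
func decodeQuoted(s string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(s)
	return fmt.Sprintf("%q", b), err
}

func main() {
	out, err := decodeQuoted("Qm9uam91ciwgam95ZXV4IGxpb24K")
	fmt.Println(out, err) // "Bonjour, joyeux lion\n" <nil>
}
```

The input is 28 characters, a multiple of four, so it needs no padding and decodes to 21 bytes, the last of which is the newline.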
> > 
> > Rory
> > 
> > 
> > On 13/01/25, Rory Campbell-Lange (r...@campbell-lange.net 
> > <mailto:r...@campbell-lange.net>) wrote:
> >> Thanks very much for the playground link and thoughts.
> >> 
> >> The use case is reading base64 email parts, which could be of a very large 
> >> size. It is unclear when processing these parts if they are base64 padded 
> >> or not.
> >> 
> >> I'm trying to avoid reading the entire email part into memory. 
> >> Consequently I think your earlier idea of adding padding (or removing it) 
> >> in a wrapper could work. Perhaps wrapping the reader with another using a 
> >> bufio.Reader to track bytes read and detect EOF. At EOF the wrapper could 
> >> add padding if needed.
> >> 
> >> Rory
> >> 
> >> On 13/01/25, Axel Wagner (axel.wagner...@googlemail.com 
> >> <mailto:axel.wagner...@googlemail.com>) wrote:
> >>> Just realized: If you twist the idea around, you get something easy to
> >>> implement and more correct.
> >>> Instead of stripping padding if it exists, you can ensure that the body
> >>> *is* padded to a multiple of 4 bytes: https://go.dev/play/p/SsPRXV9ZfoS
> >>> You can then feed that to base64.StdEncoding. If the wrapped Reader
> >>> returns padded Base64, this does nothing. If it returns unpadded Base64,
> >>> it adds padding. If it returns incorrect Base64, it will create a padded
> >>> stream that will then get rejected by the Base64 decoder.
> >>> 
> >>> On Mon, 13 Jan 2025 at 10:31, Axel Wagner <axel.wagner...@googlemail.com 
> >>> <mailto:axel.wagner...@googlemail.com>>
> >>> wrote:
> >>> 
> >>>> Hi,
> >>>> 
> >>>> one way to solve your problem is to wrap the body into an io.Reader that
> >>>> strips off everything after the first `=` it finds. That can then be fed
> >>>> to base64.RawStdEncoding. This approach requires no extra buffering or
> >>>> copying and is easy to implement: https://go.dev/play/p/CwcVz7oietI
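
A rough sketch of what such a wrapper might look like. This is not the code behind the playground link, and cutAtPadReader and decodeUntilPad are names invented for the example. It reports io.EOF as soon as it meets the first '=', so whatever follows never reaches the decoder.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"io"
	"strings"
)

// cutAtPadReader is a hypothetical wrapper that ends the stream at the
// first '=' it sees.
type cutAtPadReader struct {
	r    io.Reader
	done bool
}

func (c *cutAtPadReader) Read(p []byte) (int, error) {
	if c.done {
		return 0, io.EOF
	}
	n, err := c.r.Read(p)
	if i := bytes.IndexByte(p[:n], '='); i >= 0 {
		// Keep the bytes before the '=' and report end of stream.
		c.done = true
		return i, io.EOF
	}
	return n, err
}

// decodeUntilPad decodes everything before the first '=' as unpadded base64.
func decodeUntilPad(s string) (string, error) {
	src := &cutAtPadReader{r: strings.NewReader(s)}
	b, err := io.ReadAll(base64.NewDecoder(base64.RawStdEncoding, src))
	return string(b), err
}

func main() {
	for _, in := range []string{"Qm9uam91cg==", "Qm9uam91cg", "AAA=garbage"} {
		out, err := decodeUntilPad(in)
		fmt.Printf("%q %v\n", out, err)
	}
}
```

No extra buffer is needed because the cut happens inside the caller's slice. The last input decodes without an error, since everything after the first '=' is simply ignored.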
> >>>> 
> >>>> The downside is that this will not verify that the body is *either*
> >>>> correctly padded Base64 *or* unpadded Base64. So, it will not report an
> >>>> error if fed something like "AAA=garbage".
> >>>> That can be remedied by buffering up to four bytes and, when encountering
> >>>> an EOF, checking that there are at most two trailing `=` and that the
> >>>> total length of the stream is divisible by four. It's more finicky to
> >>>> implement, but it should also be possible without any extra copies and
> >>>> only requires a very small extra buffer.
> >>>> 
> >>>> On Sun, 12 Jan 2025 at 22:29, Rory Campbell-Lange 
> >>>> <r...@campbell-lange.net <mailto:r...@campbell-lange.net>>
> >>>> wrote:
> >>>> 
> >>>>> Thanks very much for the links, pointers and possible solution.
> >>>>> 
> >>>>> Trying to read base64 standard (padded) encoded data with
> >>>>> base64.RawStdEncoding can produce an error such as
> >>>>> 
> >>>>>    illegal base64 data at input byte <n>
> >>>>> 
> >>>>> Reading base64 raw (unpadded) encoded data with base64.StdEncoding
> >>>>> produces the EOF error (io.ErrUnexpectedEOF).
> >>>>> 
> >>>>> I'll go with trying to read the standard encoded data up to maybe 1MB
> >>>>> and then switch to base64.RawStdEncoding if I hit the "illegal base64
> >>>>> data" problem, maybe with reference to bufio.Reader which has most of
> >>>>> the methods suggested below.
> >>>>> 
> >>>>> Yes, the use of a "Rewind" method would be crucial. I guess this would
> >>>>> need to:
> >>>>> 1. error if more than one buffer of data has been read
> >>>>> 2. else re-read from byte 0
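
A minimal sketch of the two Rewind rules listed above, with assumptions of my own: rewindReader, newRewindReader and demoRewind are hypothetical names, "one buffer" is taken to mean a fixed byte limit, and only a single rewind is supported. Bytes are remembered while they fit within the limit, and after Rewind they are replayed ahead of the rest of the source.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

// rewindReader is a hypothetical wrapper that remembers up to limit bytes
// so that reading can restart from byte 0.
type rewindReader struct {
	src      io.Reader
	limit    int
	buf      []byte
	overflow bool
	replay   io.Reader // set by Rewind: remembered bytes, then src
}

func newRewindReader(src io.Reader, limit int) *rewindReader {
	return &rewindReader{src: src, limit: limit}
}

func (r *rewindReader) Read(p []byte) (int, error) {
	if r.replay != nil {
		return r.replay.Read(p)
	}
	n, err := r.src.Read(p)
	if !r.overflow {
		if len(r.buf)+n > r.limit {
			r.overflow, r.buf = true, nil // too much read to rewind
		} else {
			r.buf = append(r.buf, p[:n]...)
		}
	}
	return n, err
}

// Rewind fails if more than limit bytes were read, else re-reads from 0.
func (r *rewindReader) Rewind() error {
	if r.replay != nil {
		return errors.New("rewind: already rewound")
	}
	if r.overflow {
		return errors.New("rewind: more than one buffer of data has been read")
	}
	r.replay = io.MultiReader(bytes.NewReader(r.buf), r.src)
	return nil
}

// demoRewind reads peek bytes, rewinds, then reads the whole stream.
func demoRewind(input string, limit, peek int) (string, error) {
	r := newRewindReader(strings.NewReader(input), limit)
	if _, err := io.ReadFull(r, make([]byte, peek)); err != nil {
		return "", err
	}
	if err := r.Rewind(); err != nil {
		return "", err
	}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	fmt.Println(demoRewind("Hello, World!", 16, 5))
	fmt.Println(demoRewind("Hello, World!", 4, 5))
}
```

In the decoding case, the first attempt would read through such a wrapper with one encoding, and a failure within the limit would be followed by Rewind and a second attempt with the other.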
> >>>>> 
> >>>>> Thanks again very much for these suggestions.
> >>>>> 
> >>>>> Rory
> >>>>> 
> >>>>> On 12/01/25, robert engels (reng...@ix.netcom.com 
> >>>>> <mailto:reng...@ix.netcom.com>) wrote:
> >>>>>> Also, see this
> >>>>>> https://stackoverflow.com/questions/69753478/use-base64-stdencoding-or-base64-rawstdencoding-to-decode-base64-string-in-go
> >>>>>> as I expected the error should be reported earlier than the end of
> >>>>>> stream if the chosen format is wrong.
> >>>>>> 
> >>>>>>> On Jan 12, 2025, at 2:57 PM, robert engels <reng...@ix.netcom.com>
> >>>>>>> wrote:
> >>>>>>> 
> >>>>>>> Also, this is what Gemini provided which looks basically correct,
> >>>>>>> but I think encapsulating it with a Rewind() method would be easier
> >>>>>>> to understand.
> >>>>>>> 
> >>>>>>> While Go doesn't have a built-in PushbackReader like some other
> >>>>>>> languages (e.g., Java), you can implement similar functionality using
> >>>>>>> a custom struct and a buffer.
> >>>>>>> 
> >>>>>>> Here's an example implementation:
> >>>>>>> 
> >>>>>>> package main
> >>>>>>> 
> >>>>>>> import (
> >>>>>>>    "bytes"
> >>>>>>>    "errors"
> >>>>>>>    "fmt"
> >>>>>>>    "io"
> >>>>>>> )
> >>>>>>> 
> >>>>>>> type PushbackReader struct {
> >>>>>>>    reader   io.Reader
> >>>>>>>    buffer   *bytes.Buffer // pushed-back bytes, served before reader
> >>>>>>>    last     byte          // last byte returned by Read
> >>>>>>>    haveLast bool          // whether last may be unread
> >>>>>>> }
> >>>>>>> 
> >>>>>>> func NewPushbackReader(r io.Reader) *PushbackReader {
> >>>>>>>    return &PushbackReader{
> >>>>>>>        reader: r,
> >>>>>>>        buffer: new(bytes.Buffer),
> >>>>>>>    }
> >>>>>>> }
> >>>>>>> 
> >>>>>>> // Read serves pushed-back bytes first, then the underlying reader.
> >>>>>>> func (p *PushbackReader) Read(b []byte) (n int, err error) {
> >>>>>>>    if p.buffer.Len() > 0 {
> >>>>>>>        n, err = p.buffer.Read(b)
> >>>>>>>    } else {
> >>>>>>>        n, err = p.reader.Read(b)
> >>>>>>>    }
> >>>>>>>    if n > 0 {
> >>>>>>>        p.last, p.haveLast = b[n-1], true
> >>>>>>>    }
> >>>>>>>    return n, err
> >>>>>>> }
> >>>>>>> 
> >>>>>>> // UnreadByte pushes back the last byte returned by Read.
> >>>>>>> func (p *PushbackReader) UnreadByte() error {
> >>>>>>>    if !p.haveLast {
> >>>>>>>        return errors.New("pushback: no byte to unread")
> >>>>>>>    }
> >>>>>>>    return p.Unread([]byte{p.last})
> >>>>>>> }
> >>>>>>> 
> >>>>>>> // Unread pushes buf back so the next Read returns it first,
> >>>>>>> // ahead of any bytes pushed back earlier.
> >>>>>>> func (p *PushbackReader) Unread(buf []byte) error {
> >>>>>>>    rest := append([]byte(nil), p.buffer.Bytes()...)
> >>>>>>>    p.buffer.Reset()
> >>>>>>>    p.buffer.Write(buf)
> >>>>>>>    p.buffer.Write(rest)
> >>>>>>>    p.haveLast = false
> >>>>>>>    return nil
> >>>>>>> }
> >>>>>>> 
> >>>>>>> func main() {
> >>>>>>>    // Example usage
> >>>>>>>    r := NewPushbackReader(bytes.NewBufferString("Hello, World!"))
> >>>>>>>    buf := make([]byte, 5)
> >>>>>>>    r.Read(buf)         // reads "Hello"
> >>>>>>>    r.UnreadByte()      // pushes 'o' back
> >>>>>>>    n, _ := r.Read(buf) // reads the pushed-back "o"
> >>>>>>>    fmt.Printf("%q\n", buf[:n])
> >>>>>>> }
> >>>>>>> 
> >>>>>>> Explanation:
> >>>>>>> - PushbackReader struct: This struct holds the underlying io.Reader
> >>>>>>>   and a buffer to store the pushed-back bytes.
> >>>>>>> - NewPushbackReader: This function creates a new PushbackReader from
> >>>>>>>   an existing io.Reader.
> >>>>>>> - Read method: This method reads bytes from either the buffer (if it
> >>>>>>>   contains data) or the underlying reader.
> >>>>>>> - UnreadByte method: This method pushes back a single byte into the
> >>>>>>>   buffer.
> >>>>>>> - Unread method: This method pushes back a slice of bytes into the
> >>>>>>>   buffer.
> >>>>>>> 
> >>>>>>> Important Considerations:
> >>>>>>> - The buffer size is not managed automatically. You may need to adjust
> >>>>>>>   the buffer size based on your use case.
> >>>>>>> - This implementation does not handle pushing back beyond the
> >>>>>>>   initially read data. If you need to support arbitrary pushback,
> >>>>>>>   you'll need a more complex solution.
> >>>>>>> 
> >>>>>>> Generative AI is experimental.
> >>>>>>> 
> >>>>>>>> On Jan 12, 2025, at 2:53 PM, Robert Engels <reng...@ix.netcom.com>
> >>>>>>>> wrote:
> >>>>>>>> 
> >>>>>>>> You can see the two pass reader here:
> >>>>>>>> https://stackoverflow.com/questions/20666594/how-can-i-push-bytes-into-a-reader-in-go
> >>>>>>>> 
> >>>>>>>> But yeah, the basic premise is that you buffer the data so you can
> >>>>>>>> rewind if needed.
> >>>>>>>> 
> >>>>>>>> Are you certain it is reading to the end to return EOF? It may be
> >>>>>>>> returning EOF once the parsing fails.
> >>>>>>>> 
> >>>>>>>> Otherwise I would expect this is being decoded wrong - e.g. the mime
> >>>>>>>> type or encoding type should tell you the correct format before you
> >>>>>>>> start decoding.
> >>>>>>>> 
> >>>>>>>>> On Jan 12, 2025, at 2:46 PM, Rory Campbell-Lange
> >>>>>>>>> <r...@campbell-lange.net> wrote:
> >>>>>>>>> 
> >>>>>>>>> Thanks for the suggestion of a ReadSeeker to wrap an io.Reader.
> >>>>>>>>> 
> >>>>>>>>> My google fu must be deserting me. I can find PushbackReader
> >>>>>>>>> implementations in Java, but the only similar thing for Go I could
> >>>>>>>>> find was https://gitlab.com/osaki-lab/iowrapper. If you have a
> >>>>>>>>> specific recommendation for a ReadSeeker wrapper to an io.Reader
> >>>>>>>>> that would be great to know.
> >>>>>>>>> 
> >>>>>>>>> Since the base64 decoding error I'm looking for is an EOF, I guess
> >>>>>>>>> the wrapper approach will not work when the EOF byte position is
> >>>>>>>>> greater than the io.ReadSeeker buffer size.
> >>>>>>>>> 
> >>>>>>>>> Rory
> >>>>>>>>> 
> >>>>>>>>> On 12/01/25, robert engels (reng...@ix.netcom.com) wrote:
> >>>>>>>>>> create a ReadSeeker that wraps the Reader providing the buffering
> >>>>>>>>>> (mark & reset) - normally the buffer only needs to be large enough
> >>>>>>>>>> to detect the format contained in the Reader.
> >>>>>>>>>> 
> >>>>>>>>>> You can search Google for PushbackReader in Go and you’ll get a
> >>>>>>>>>> basic implementation.
> >>>>>>>>>> 
> >>>>>>>>>>> On Jan 12, 2025, at 12:52 PM, Rory Campbell-Lange
> >>>>>>>>>>> <r...@campbell-lange.net> wrote:
> >>>>>>>>> ...
> >>>>>>>>>>> I'm attempting to rationalise the process [of avoiding reading
> >>>>>>>>>>> email parts into byte slices] by simply wrapping the provided
> >>>>>>>>>>> io.Reader with the necessary decoders to reduce memory usage and
> >>>>>>>>>>> unnecessary processing.
> >>>>>>>>>>> 
> >>>>>>>>>>> The wrapping strategy seems to work ok. However there is a
> >>>>>>>>>>> particular issue in detecting base64.StdEncoding versus
> >>>>>>>>>>> base64.RawStdEncoding, which requires draining the io.Reader using
> >>>>>>>>>>> base64.StdEncoding and (based on the current implementation)
> >>>>>>>>>>> switching to base64.RawStdEncoding if an io.ErrUnexpectedEOF is
> >>>>>>>>>>> found.
> >>>>>>>>>>> 
> >>>>>>>> 
> >>>>>>>> 
> >>>>>>>> --
> >>>>>>>> You received this message because you are subscribed to the Google
> >>>>> Groups "golang-nuts" group.
> >>>>>>>> To unsubscribe from this group and stop receiving emails from it,
> >>>>> send an email to golang-nuts+unsubscr...@googlegroups.com 
> >>>>> <mailto:golang-nuts+unsubscr...@googlegroups.com> <mailto:
> >>>>> golang-nuts+unsubscr...@googlegroups.com 
> >>>>> <mailto:golang-nuts+unsubscr...@googlegroups.com>>.
> >>>>>>>> To view this discussion visit
> >>>>> https://groups.google.com/d/msgid/golang-nuts/DD0C1480-D237-447A-B978-78FC8951FE05%40ix.netcom.com
> >>>>> <
> >>>>> https://groups.google.com/d/msgid/golang-nuts/DD0C1480-D237-447A-B978-78FC8951FE05%40ix.netcom.com?utm_medium=email&utm_source=footer
> >>>>>> .
> >>>>>>> 
> >>>>>> 
> >>>>> 
> >>>>> 
> >>>> 
> >> 
> > 
> 

