[go-nuts] Re: files, readers, byte arrays (slices?), byte buffers and http.requests

Sri G Sat, 02 Jul 2016 15:20:55 -0700

Update:

Adding file.Seek(0, 0) does fix the issue in Version 2: the uploaded file is 
the correct size on disk and has the correct md5. Without it, the saved file 
is missing the first 1024 bytes, which makes sense.


There is something wrong with the way the md5 is calculated: it keeps 
giving the same hash. Any ideas?

This version, while most likely not idiomatic, works:

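// Read the first 1024 bytes for MIME sniffing (err should be checked).
mimebuf := make([]byte, 1024)
_, err = file.Read(mimebuf)

mime := mimemagic.Match("", mimebuf)

// Rewind so the hash covers the whole file.
file.Seek(0, 0)

checksum := md5.New()

// Sum is only meaningful after the data has been written to the hash.
io.Copy(checksum, file)

md5hex := hex.EncodeToString(checksum.Sum(nil))
fmt.Println("md5=", md5hex)

// Rewind again and save the whole file to f.
file.Seek(0, 0)
io.Copy(f, file)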

It would be much appreciated if someone who understands the idiomatic way 
to do this could explain it.

On Saturday, July 2, 2016 at 5:48:45 PM UTC-4, Sri G wrote:
>
> Thanks for the pointer. I also found this helpful: "Asynchronously Split an 
> io.Reader in Go (golang)" (Rodaine) 
> <http://rodaine.com/2015/04/async-split-io-reader-in-golang/>, but I'm 
> still missing something.
>
> Version 1: the uploaded file has 1024 extra bytes at the end (too big):
>
> mimebuf := make([]byte, 1024)
> _, err = file.Read(mimebuf)
>
> mime := mimemagic.Match("", mimebuf)
>
> fileReader := io.MultiReader(bytes.NewReader(mimebuf), file)
>
> checksum := md5.New()
>
> b := io.TeeReader(fileReader, checksum)
>
> md5hex := hex.EncodeToString(checksum.Sum(nil))
>
> // Save file
> io.Copy(f, b)
>
> Version 2: the uploaded file is truncated by 1024 bytes (too small). This 
> makes sense, since the first 1024 bytes of the file were consumed:
>
> mimebuf := make([]byte, 1024)
> _, err = file.Read(mimebuf)
>
> mime := mimemagic.Match("", mimebuf)
>
> checksum := md5.New()
>
> // Adding file.Seek(0,0) here does not fix this issue
>
> b := io.TeeReader(file, checksum)
>
> md5hex := hex.EncodeToString(checksum.Sum(nil))
>
> // Save file
> io.Copy(f, b)
>
>
> What is wrong here that is causing this? How do I get the Goldilocks 
> version that's just right?
>
> On Saturday, July 2, 2016 at 3:18:51 AM UTC-4, Tamás Gulácsi wrote:
>>
>>
>> On Saturday, July 2, 2016 at 8:15:19 AM UTC+2, Sri G wrote:
>>>
>>> I'm working on receiving uploads through a form.
>>>
>>> The tricky part is validation.
>>>
>>> I read the first 1024 bytes to check the MIME type of the file and, if 
>>> it is valid, read the rest, hash it, and save it to disk. Reading the 
>>> MIME type is successful and I've gotten it to work by chaining 
>>> TeeReader, but it seems very hackish. What's the idiomatic way to do this?
>>>
>>> I'm trying something like this: 
>>>
>>>
>>> // Parse my multi part form 
>>> ...
>>> // Get file handle
>>> file, err := fh.Open()
>>>
>>> var a bytes.Buffer
>>>
>>> io.CopyN(&a, file, 1024)
>>>
>>> mime := mimemagic.Match("", a.Bytes())
>>> // Check mime type (this works fine)
>>>
>>> I'm seeking on a stream here, so this should be a no-op:
>>> file.Seek(0, 0)
>>>
>>> The file stored on disk is 1 KB larger than the original, so it appears 
>>> to be re-copying the entire file and appending it to the bytes.Buffer:
>>> io.Copy(&a, file)
>>>
>>> checksum := md5.New()
>>> b := io.TeeReader(&a, checksum)
>>>
>>> md5hex := hex.EncodeToString(checksum.Sum(nil))
>>> fmt.Println("md5=", md5hex)
>>>
>>> //Open file f for writing to disk
>>> ...
>>> //Save file
>>> io.Copy(f, b)
>>>
>>>
>>> I checked the md5 of (1 KB of original + original) and of (original - 
>>> first 1 KB); neither matches the md5 of the file being hashed.
>>>
>>> Why can't I append the rest of the stream to the byte buffer to get the 
>>> complete file in memory and why is the byte buffer being "consumed"? 
>>>
>>> I simply need to read the same array of bytes multiple times; I don't 
>>> need to "copy" them. I'm coming from a C background, so I'm wondering 
>>> what is going on behind the scenes as well.
>>>
>>
>> If you know you'll have to read the whole file into memory, then do that, 
>> and use bytes.NewReader to create a reader for that byte slice.
>>
>> If you read partly, to decide whether to go on, then use fh.Read or 
>> io.ReadAtLeast with a byte slice.
>>
>> If you read something and then want to read the whole file from the 
>> beginning, construct a Reader with io.MultiReader(bytes.NewReader(b), fh).
>>
>> You can combine these approaches, but if the whole file is smaller than 
>> a few KiB, I think it is easier, simpler and more performant (!) to read 
>> the whole file into memory, into a bytes.Buffer, and construct the needed 
>> readers with bytes.NewReader(buf.Bytes()).
>>
>
