Re: Multiple FileStorage.save() operations

wgoulet Wed, 01 Aug 2012 06:26:05 -0700

Thanks Ludvig; this is helpful advice. I will take another look at my 
validation functions to see if I can make them work without needing to save 
the files to disk, especially since it sounds like F.save is not designed to 
support saving the same file more than once.


Thanks to all for the advice re: StringIO/BytesIO.
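
A minimal sketch of what validating in memory could look like, assuming 
Python 3 and a zip upload. The helper name validate_zip_stream is 
hypothetical, and io.BytesIO stands in here for the stream behind 
request.files[k]; only the standard library io and zipfile modules are used.

```python
import io
import zipfile


def validate_zip_stream(stream):
    """Return True if `stream` holds a readable zip archive.

    `stream` is any seekable binary file object. The file pointer is
    rewound afterwards so that a later save() copies the whole file.
    """
    try:
        stream.seek(0)
        if not zipfile.is_zipfile(stream):
            return False
        stream.seek(0)
        with zipfile.ZipFile(stream) as archive:
            # testzip() returns the name of the first corrupt member, or None.
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
    finally:
        # Rewind whether or not validation succeeded.
        stream.seek(0)


# Stand-in for an upload: build a small zip archive in memory.
upload = io.BytesIO()
with zipfile.ZipFile(upload, "w") as archive:
    archive.writestr("hello.txt", "hello")

is_valid = validate_zip_stream(upload)
position_after = upload.tell()
not_a_zip = validate_zip_stream(io.BytesIO(b"plain text, not an archive"))
```

Because the helper only reads from the stream and rewinds it at the end, it 
should not matter whether the upload is held in RAM or in a temporary file.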

On Wednesday, August 1, 2012 8:03:02 AM UTC-5, Ludvig Ericson wrote:
>
> *[Sorry if this reply appears twice, I have a non-existent Apps account 
> subscribed to this group.]*
>
> Most people expect a function like *F.save* to advance the file pointer 
> to the end of the file.
>
> I'm fairly confident this won't change. Also, Werkzeug already stores your 
> uploaded file on disk if it's too large to hold in RAM!
>
> As for *StringIO*, it can hold binary data just fine without unnecessary 
> encodings.
>
> My advice is to make the validation function operate on the file object as 
> returned by *request.files[k]*. That way you can remain agnostic as to 
> whether the file has been stored on disk already or exists in RAM.
>
> I realize that might not be possible. If not, then you can either a) save 
> to disk and reopen the saved file, or b) save to disk, rewind the file 
> pointer (seek(0) on the stream), and save again to permanent storage.
>
> Does that help?
>
> -Ludvig
> On Wednesday, August 1, 2012 2:37:31 AM UTC+2, wgoulet wrote:
>>
>> But does that work for binary data? In my application I'm processing 
>> zipfiles and Java keystore files. My read of StringIO is that it works 
>> great for anything that can be directly represented as ASCII or Unicode, 
>> but it seems like a lot of overhead to base64 encode something so I can 
>> store it in memory.
>>
>> On Tuesday, July 31, 2012 6:31:38 PM UTC-5, mr.meker wrote:
>>>
>>> You shouldn't need to save a file twice. You are doing extra disk IO 
>>> when you should keep it in RAM until it needs to hit the disk. This is what 
>>> you should use instead of writing the file to /tmp. 
>>> http://docs.python.org/library/stringio.html
>>>
>>> On Tue, Jul 31, 2012 at 3:27 PM, wgoulet wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm a relatively new Python and brand new Flask user, so please bear 
>>>> with me.
>>>>
>>>> I'm developing a webapp that requires that I permit users to upload 
>>>> files which are validated before I store them in their final location on 
>>>> the web server's local filesystem. To satisfy this requirement, I have 
>>>> defined helper methods that I use to create a temporary copy of a 
>>>> FileStorage object that is processed to determine if it is valid. Once 
>>>> this check passes, I then want to save the uploaded file in a permanent 
>>>> location.
>>>>
>>>> Here's an example subset of my code:
>>>>
>>>> def confirm():
>>>>     amfile = request.files['zipfile']
>>>>     if validate_file(amfile):
>>>>         amfile.save(os.path.join("/www/docs", secure_filename(amfile.filename)))
>>>>
>>>> def validate_file(infile):
>>>>     infile.save(os.path.join("/tmp", secure_filename(infile.filename)))
>>>>     # My validation code goes in here; I read the file in /tmp and
>>>>     # return true or false depending on the results
>>>>
>>>>
>>>> The problem I'm running into is that when I call the save() function on 
>>>> a FileStorage object twice in a row, the second save() call creates an 
>>>> empty copy of the file. In the code above, I have a valid copy of my file 
>>>> stored in /tmp, but the file in /www/docs is zero bytes long. I don't think 
>>>> that copying the file from /tmp to /www/docs is the right solution, because 
>>>> the validation code could potentially destroy or overwrite the temp copy 
>>>> (say, if the file is a zip file, as it is in my case).
>>>>
>>>> Looking at the source of FileStorage.save(), it looks like the reason 
>>>> for this behavior is that the shutil.copyfileobj function advances the 
>>>> filepointer when it copies from the source, but it doesn't move it back to 
>>>> the beginning of the file stream when it's finished (the copyfileobj docs 
>>>> state as much).
>>>>
>>>> As a simple test, I modified the FileStorage.save() function in my 
>>>> local werkzeug install to add a file seek call before the copyfileobj call 
>>>> as follows:
>>>>
>>>>         from shutil import copyfileobj
>>>>         close_dst = False
>>>>         if isinstance(dst, basestring):
>>>>             dst = file(dst, 'wb')
>>>>             close_dst = True
>>>>         try:
>>>>             # Reset file pointer before copying from object
>>>>             self.stream.seek(0)
>>>>             copyfileobj(self.stream, dst, buffer_size)
>>>>         finally:
>>>>             if close_dst:
>>>>                 dst.close()
>>>>
>>>> With this change, I can use the FileStorage.save() multiple times to 
>>>> save multiple copies of the FileStorage file.
>>>>
>>>> Would it make sense to modify FileStorage.save() as I've done here, or 
>>>> is there another, better way to achieve my goal?
>>>>
>>>>

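The behaviour described in the original question can be reproduced without 
Flask. A minimal sketch, assuming Python 3, with io.BytesIO objects standing 
in for both the upload stream and the destination files (the variable names 
are illustrative only):

```python
import io
import shutil

# Stand-in for the uploaded file's stream.
source = io.BytesIO(b"uploaded bytes")
first, second, third = io.BytesIO(), io.BytesIO(), io.BytesIO()

# The first copy reads to the end of the source and leaves the pointer there.
shutil.copyfileobj(source, first)

# The second copy starts at the end of the source, so nothing is copied.
shutil.copyfileobj(source, second)

# Rewinding the source first makes the next copy complete again.
source.seek(0)
shutil.copyfileobj(source, third)
```

Here first and third end up with the full contents while second stays empty, 
which matches the zero-length file seen after the second save().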