Re: [Zope-dev] Large file support

2000-10-25 Thread Toby Dickenson

On Tue, 24 Oct 2000 20:31:52 +0200, [EMAIL PROTECTED] wrote:

   If the Zope object knows how to produce the data themselves, they
   could push producer(s) directly to the channel.  I added a single
   check in ZServer.HTTPResponse(256) where a temporary file is only
   created if the data is larger than the in-memory buffer *and*
   doesn't already look like a producer with 'more' as a method.
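
For illustration only, here is a minimal Python sketch of the kind of duck-typing check described above. The names FileProducer, looks_like_producer and needs_temp_file are hypothetical; they are not taken from ZServer or from the diff, and the real check in ZServer.HTTPResponse may look quite different.

```python
import io


class FileProducer:
    """Hypothetical producer: hands out a file's contents one chunk at a time."""

    def __init__(self, fileobj, chunk_size=65536):
        self.fileobj = fileobj
        self.chunk_size = chunk_size

    def more(self):
        # Producer convention assumed here: return the next chunk, or an
        # empty value once the data is exhausted.
        return self.fileobj.read(self.chunk_size)


def looks_like_producer(data):
    """Duck-type test: anything with a callable 'more' counts as a producer."""
    return callable(getattr(data, "more", None))


def needs_temp_file(data, buffer_limit):
    """Spool to a temporary file only for large data that is not a producer."""
    if looks_like_producer(data):
        return False
    return len(data) > buffer_limit


producer = FileProducer(io.BytesIO(b"abcdef"), chunk_size=4)
producer_spooled = needs_temp_file(producer, buffer_limit=2)    # producer: no temp file
string_spooled = needs_temp_file(b"abcdef", buffer_limit=2)     # large plain data: temp file
```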

Wahay! That's been on my todo list for ages. I'll take a look when I get
some time.


Toby Dickenson
[EMAIL PROTECTED]

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )




Re: [Zope-dev] Large file support

2000-10-25 Thread Chris Withers

How does this differ from Local FS?

cheers,

Chris

[EMAIL PROTECTED] wrote:
 
 I have been building an "ExternalFile" class which stores the body of
 the file in an external file, mirroring the Zope path/hierarchy.  This
 will allow easy integration with servers that can mount the external
 representation of the content and serve it with a consistent namespace.
 
 To make life zimple, I tried to move all file manipulation to Zope,
 including upload/download/copy/cut/paste/delete and permissions.  These
 external files are transaction aware, blah blah..
 
 Working with files > 20MB I noticed some serious performance/scalability
 issues and investigated.  Here are the results.
 
 A diff with my changes against version 2.2.2 is available at
 http://www.superchannel.org/Playground/large_file_zope2.2.2_200010241.diff
 
 Concerns:
 
 Zope objects like File require data as a seekable file or as a
 coherent block, rather than as a stream.  Initializing/updating
 these objects *may* require loading the entire file into memory.
 
 In memory buffering of request or response data could cause
 excessive swapping of the working set.
 
 Multi-service architecture (ZServer-ZPublisher) could limit the
 reuse of stream handles.
 
 Creating temporary files as FIFO buffers between the services
 causes significant swapping.
 
 Modifications:
 
 Using pipes I found that FTPServer.ContentCollector was using a
 StringIO to buffer the uploads from FTP clients.  I changed this
 into a TemporaryFile for a while which revealed the leaked file
 descriptor bug (see below).  This intermediary temp file caused 1
 extra file copy for each request.  The goal is to not have any
 intermediary files at all, and pipeline the content directly into
 the Zope objects.
 
 To remove this FTP upload file buffer, I converted the FTP collector
 again from a TemporaryFile into a pipe with a reader and writer file
 objects.  The FTPRequest receives the reader from which it can
 process the input on the publish thread in processInputs.
 
 Since we are dealing with blocking pipes it is OK to have a reader
 on the publish thread and a writer on the ZServer thread.  The major
 considerations were regarding the proper way to read from a pipe
 through the chain of control, especially in cgi.FieldStorage.
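
 As a rough illustration of the reader/writer split described above, the following Python sketch wires a blocking pipe between two threads. The functions feed and consume are hypothetical stand-ins for the ZServer side and the publish side; none of this is code from the diff.

```python
import os
import threading


def feed(writer, chunks):
    """Stand-in for the ZServer side: write incoming chunks into the pipe."""
    with writer:  # closing the writer signals end-of-file to the reader
        for chunk in chunks:
            writer.write(chunk)


def consume(reader, chunk_size=4):
    """Stand-in for the publish side: read until EOF, never seek() or tell()."""
    received = []
    with reader:
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break
            received.append(chunk)
    return b"".join(received)


# One pipe, wrapped as a reader file object and a writer file object.
read_fd, write_fd = os.pipe()
reader = os.fdopen(read_fd, "rb")
writer = os.fdopen(write_fd, "wb")

writer_thread = threading.Thread(target=feed, args=(writer, [b"STOR ", b"payload"]))
writer_thread.start()
result = consume(reader)  # blocks until the writer closes its end
writer_thread.join()
```

 Because both ends block, the reader simply waits for data and stops at end-of-file; it never needs to know the total length up front.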
 
 Stdin is treated as the reader of the pipe throughout the code.  All
 seek()s and tell()s on sys.stdin type objects (a tty not a seekable
 file) should be considered illegal and removed.
 
 Usage of FieldStorage from FTP (Unknown content-length)
 
 To gain access to the body of a request, one typically calls
 REQUEST['BODY'] or REQUEST['BODYFILE'].  This returns the file
 object the FieldStorage copied from stdin.
 
 To prevent FieldStorage from copying the file from stdin to a
 temporary file, we can set the CONTENT_LENGTH header to '0' in the
 FTP _get_env for a STOR.
 
 In this case, FieldStorage creates a temporary file but doesn't read
 any data from stdin so we can return stdin directly when BODYFILE is
 requested and 'content-length' is '0'.  However, BODYFILE could be a
 pipe which doesn't support 'seek' or 'tell'.  The code used to suck
 the data off the BODYFILE needs to be modified to adapt to the
 possibility of being passed a pipe.
 
 Updating Image.File to play with pipes
 
 The _read_data method of Image.File pulls the data out of the
 BODYFILE and sticks it in the instance as a string, pdata object, or
 a linked list of pdata objects.  The existing code reads and builds
 the list in one clean sweep back-to-front.  I believe this keeps the
 pdata.data chunks out of memory, quickly (sub)committing then
 deactivating (_p_changed = None) them.
 
 Since we can no longer safely assume 'seek' is valid for BODYFILE, I
 tried to read and build the list front-to-back.  This kept the data
 in memory, even though I tried to deactivate the objects quickly.
 
 As a tradeoff, I read the data front-to-back then built the list
 back-to-front taking another pass to reverse the list so it is in
 the correct order.
 
 Memory usage appears to be steady, meaning the whole file is not
 loaded into the working set.  This also prevents unnecessary reading
 into a temporary FieldStorage file during an FTP upload.
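
 One possible reading of the front-to-back read followed by a reversal pass is sketched below in Python. Chunk is a hypothetical stand-in for a pdata object, and read_chunks is not the actual _read_data code; persistence, subtransaction commits and deactivation are left out entirely.

```python
import io


class Chunk:
    """Hypothetical stand-in for a pdata node: a data block plus a link."""

    def __init__(self, data, next_chunk=None):
        self.data = data
        self.next = next_chunk


def read_chunks(stream, chunk_size):
    """Read a non-seekable stream front-to-back and return the list head."""
    # First pass: each new chunk is linked in front of the previous one,
    # so no seek() is needed, but the list comes out in reverse order.
    reversed_head = None
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        reversed_head = Chunk(data, reversed_head)

    # Second pass: reverse the links in place to restore the stream order.
    head = None
    node = reversed_head
    while node is not None:
        following = node.next
        node.next = head
        head = node
        node = following
    return head


def chunk_data(head):
    """Collect the data blocks in list order (for inspection only)."""
    blocks = []
    while head is not None:
        blocks.append(head.data)
        head = head.next
    return blocks


blocks = chunk_data(read_chunks(io.BytesIO(b"abcdefgh"), 3))
```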
 
 Web based uploads...
 
 ...suck.  I do not recommend doing a web based upload for files
 > 1mb.  First, a content-length is known, so we don't get the
 advantage of pipelining the data directly from the socket; a
 temporary file must be created, written and read.  Second, I 

Re: [Zope-dev] Large file support

2000-10-25 Thread seant

There is not much difference between the ExternalFile class I'm working
with and the File objects produced by LocalFS except External Files can
be put anywhere in the Zope hierarchy and LocalFS files need to be under
a LocalFS.  Each approach has its pros and cons.

This proposal mostly deals with the Zope framework, which will affect
both products.

Chris Withers([EMAIL PROTECTED])@Wed, Oct 25, 2000 at 12:35:23PM +0100:
 How does this differ from Local FS?
 
 cheers,
 
 Chris





RE: [Zope-dev] Large file support

2000-10-25 Thread Toby Dickenson

 I should also note that if you create a producer, you will have to
 override the __len__ method to return the entire length of the data.

 This is because RESPONSE.write doesn't allow you to set the length of a
 write, and there is code during output that checks the size of the
 written object.
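
 A minimal Python sketch of a producer that reports its full length is shown here. SizedProducer and drain are hypothetical names, and passing total_length in from the caller is an assumption; how the length is actually obtained depends on the object.

```python
import io


class SizedProducer:
    """Hypothetical producer whose len() is the size of all the data."""

    def __init__(self, fileobj, total_length, chunk_size=65536):
        self.fileobj = fileobj
        self.total_length = total_length
        self.chunk_size = chunk_size

    def __len__(self):
        # Report the entire length, not the size of a single chunk.
        return self.total_length

    def more(self):
        # Return the next chunk, or an empty value when exhausted.
        return self.fileobj.read(self.chunk_size)


def drain(producer):
    """Pull chunks from a producer until it returns an empty value."""
    chunks = []
    while True:
        chunk = producer.more()
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


payload = b"x" * 10
sized = SizedProducer(io.BytesIO(payload), len(payload), chunk_size=4)
declared_length = len(sized)
chunks = drain(sized)
```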

  Wahay! That's been on my todo list for ages. I'll take a look when I get
  some time.


Damn. It's back on my to-do list then ;-)
