[Zope-dev] Re: Are input streams seekable in Zope? (+ patch proposal)

2004-08-12 Thread Ames Andreas (MPA/DF)
Hi,

at first let me say that I'm very sorry for the long delay; it was
just an oversight.

Tres Seaver wrote:

 Is there a reason why the AJP protocol won't allow you to rewind
 to the beginning of the request stream?  I don't think that the
 publisher does any other seek than to the start of the stream.

There is nothing in AJP that supports such a rewind but also nothing
that prevents it.  I think it's very comparable to the HTTP part of
the ZServer.  The problem probably was that I simply didn't expect
'stdin.seek()' to do something else than throwing an error.

I just buffer configurable amounts of bytes of an AJP requests in RAM,
no disk buffers or something for avoiding blocking I/O in ZServer.
When the configured amount is in memory I just stop reading from the
socket, thus I'm relying on TCP's traffic control.  The respective
ZPublisher thread informs the ZServer, when it has removed sth. from
the input buffer, by a trigger event.

 Perhaps you need to derive an AJPRequest from HTTPRequest, and
 arrange not to need the 'stidn' stream during a retry; another
 possibility would be to submit a patch which allowed the retry
 mechanism to work without re-parsing the request stream (basically,
 the patch would need to clone the cgi.FieldRequest set from the
 original request into the one used for the retry).

The first solution you propose is the one that immediately came to my
mind because it sounds similar to what the ZServer does for HTTP,
i.e. buffer big requests on disk smaller ones in StringIO.  The cause
I don't like that too much is that cgi.FileStorage doesn't let me
control if its input is disk buffered once again internally.  Writing
the same request to disk (and reading it in) potentially twice (or
even more often in the retry-case) is a bit too heavy for me.

I thought about trying to come up with a patch against
cgi.FieldStorage that would leave it to the caller to decide if
FieldStorage could do without its internal disk copy and just use the
input.  But even if I could do that it would take a long time to get
effective, so there'd be still the need for an immediate workaround.

So your second proposal is what I actually want to try first.  I've
just finished a patch against ZPublisher.HTTPRequest.py, that I
attached to this mail, that, on retry, doesn't read the whole request
again from the input stream (as during the first attempt) but just
keeps the toplevel cgi.FieldStorage, that has been created by the
first attempt in HTTPRequest.processInputs, in a newly created
instance variable self._fs. This instance variable is then preserved
across retry boundaries.  I've tested the patch preliminarily and it
worked for me.

As I don't know how and where to send a patch against the Zope(2) core
(and also because I'd hope for a little review here :-) I'd just be
glad if you could tell me how the correct procedure is.  I just hope
that I wouldn't have to go through some kind of 'paper war' just to
get three or four lines of code changed.


cheers,

andreas



--- ./orig.lib/python/ZPublisher/HTTPRequest.py	2004-03-12 21:00:48.0 +0100
+++ ./lib/python/ZPublisher/HTTPRequest.py	2004-08-12 17:11:55.0 +0200
@@ -125,10 +125,14 @@
 
 def retry(self):
 self.retry_count=self.retry_count+1
-self.stdin.seek(0)
+if self._fs:
+self._fs.file.seek(0)   # XXX not sure this is really needed
+else:
+raise ValueError, Retrying on partially initialized request is not supported.  Call processInputs().
 r=self.__class__(stdin=self.stdin,
  environ=self._orig_env,
- response=self.response.retry()
+ response=self.response.retry(),
+ fs = self._fs
  )
 r.retry_count=self.retry_count
 return r
@@ -138,6 +142,7 @@
 # removing tempfiles.
 self.stdin = None
 self._file = None
+self._fs = None
 self.form.clear()
 # we want to clear the lazy dict here because BaseRequests don't have
 # one.  Without this, there's the possibility of memory leaking
@@ -237,7 +242,7 @@
 
 return self._client_addr
 
-def __init__(self, stdin, environ, response, clean=0):
+def __init__(self, stdin, environ, response, clean=0, fs=None):
 self._orig_env=environ
 # Avoid the overhead of scrubbing the environment in the
 # case of request cloning for traversal purposes. If the
@@ -261,6 +266,7 @@
 self.steps=[]
 self._steps=[]
 self._lazies={}
+self._fs = fs
 
 
 if environ.has_key('REMOTE_ADDR'):
@@ -382,23 +388,24 @@
 taintedform=self.taintedform
 
 meth=None
-fs=FieldStorage(fp=fp,environ=environ,keep_blank_values=1)
-if not hasattr(fs,'list') or fs.list is None:
+if not self._fs:
+

[Zope-dev] Re: Are input streams seekable in Zope?

2004-07-30 Thread Tres Seaver
Ames Andreas (MPA/DF) wrote:
I've stumbled over some code in Zope 2.7.0 that seems to suggest that
input streams are meant to be seekable.  In an extension I wrote for
ZServer, i.e. AJPServer, I throw an exception when this is tried and
I'm quite sure the call to seek wasn't there in 2.6.x.  I found the
call to seek in ZPublisher.HTTPRequest.py, line 128, in method
retry.
Does this mean that only seekable streams are allowed in ZPublisher
(i.e. not stdin) or is this a bug?
When the object database raises a ConflictError, ZPublisher makes 3 
attmpts (by default) to retry the request (in most cases, the request 
can succeed when retried).

Is there a reason why the AJP protocol won't allow you to rewind to 
the beginning of the request stream?  I don't think that the publisher 
does any other seek than to the start of the stream.

Perhaps you need to derive an AJPRequest from HTTPRequest, and arrange 
not to need the 'stidn' stream during a retry;  another possibility 
would be to submit a patch which allowed the retry mechanism to work 
without re-parsing the request stream (basically, the patch would need 
to clone the cgi.FieldRequest set from the original request into the one 
used for the retry).

Tres.
--
===
Tres Seaver[EMAIL PROTECTED]
Zope Corporation  Zope Dealers   http://www.zope.com
___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )