Re: [pkg-discuss] Code review request for [bugs 1154, 1237, 1845, 1887, 1888]

Shawn Walker Sun, 18 May 2008 18:42:26 -0700

2008/5/16  <[EMAIL PROTECTED]>:
>> I did post a question to the cherrypy users group though, so I'll see
>> if someone knows a better way.
>
> Okay, great.  Let me know if you'd like me to participate in the
> discussion on their list.
>
> I spent some time this afternoon looking at the CherryPy code and some
> of the Tools() that they've implemented.  The code that handles the
> gzipping of a stream seems awfully close to what we want to do, except
> that we actually want to bypass the stack of transforms they employ in
> the before_finalizer hooks.
>
> We may end up having to write some kind of file-like object that holds
> the lines it gets from tarfile and then writes them out to the response
> body.  I'd be interested to see if the cherrypy users group knows of a
> good way to do this.
>
> The response object has an attribute named stream (or streaming, maybe?)
> and if that gets set, the framework omits key content-length headers
> that could cause mischief.


I did post to the list, and the response I got was that what I was
doing now was really the only way. As for a file-like object that holds
the lines, I had already experimented with that using StringIO.
However, due to an idiotic mistake on my part, it never worked quite
right.

So I reworked the filelist operation again to use StringIO initially,
and it worked this time. Going further, I discovered I could get a
little better performance using cStringIO instead.

As such, I believe I've managed to find an acceptable solution with
cStringIO. After each file is written to the tar stream, I stream the
cStringIO contents to the client, truncate the buffer, and then add
the next file to the tar stream.
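
For readers following along, here is a rough sketch of that
write/stream/truncate loop. It is not the depot code: it assumes a
modern Python where io.BytesIO stands in for cStringIO, and the
function name stream_tar and the (name, data) input pairs are made up
for illustration.

```python
import io
import tarfile


def stream_tar(files):
    """Yield a tar archive in pieces, one piece per member file.

    `files` is an iterable of (name, data) pairs; this shape is a
    hypothetical stand-in for however the real file list is produced.
    """
    buf = io.BytesIO()  # assumption: replaces cStringIO on modern Python
    tar = tarfile.open(mode="w", fileobj=buf)
    for name, data in files:
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
        # Hand the bytes written so far to the caller, then empty the
        # buffer so it never holds more than one member at a time.
        yield buf.getvalue()
        buf.seek(0)
        buf.truncate()
    tar.close()  # writes the end-of-archive blocks into the buffer
    yield buf.getvalue()
```

A generator like this could be returned as a streamed response body, so
only one member's worth of tar data sits in the buffer at any moment.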

This isn't as efficient as the old approach, but it does avoid
creating a temporary file.

>> I think, unfortunately, that this will be an issue I'll have to
>> resolve somehow before this gets putback.
>
> We'll need to figure something out before putback.  The streaming is
> necessary for the production servers handling pkg.opensolaris.org.
> It'll be a problem if their memory footprint suddenly balloons.  I'll let
> Stephen comment on this more, if it's appropriate.

The memory footprint is going to be a little bigger than what we
currently have; though acceptably so in my view.

The increase in footprint, from what I can tell, is only about one to
two megabytes for an idle depot process (meaning before and after an
operation).

During an operation (such as pkg install SUNWfirefox into an *empty*
user image), anonymous memory profiling shows a total of about 30MiB
being allocated and *freed* for the new depot code. That happens only
the first time a client does this.

Repeating the same operation (without stopping the depot server) shows
only about 500KiB being freed and allocated.

By comparison, the old depot code allocated and freed a total of about
1MiB *every* time the operation is performed since it starts and kills
a thread for every transaction.

I obtained that information using the anonprofile.d DTrace script that
Brendan Gregg wrote.

>> httplib2 can supposedly handle pipelined requests.
>>
>> The cherrypy guys also have an example of doing it "by hand" using the
>> existing python libraries.
>
> I took a look at httplib2, but didn't see how it handled pipelined
> requests.  It looks like it employs keep-alive, so that you can send
> multiple requests over the same socket; however, I didn't see any way to
> send multiple requests at the same time.  What did I miss?

According to the authors, you should be able to perform multiple get
requests and then receive the data.

However, I have not found any examples.
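
To make the idea concrete, here is a minimal sketch of pipelining done
"by hand" over a plain socket. It is neither httplib2's API nor the
CherryPy test code; the function names are hypothetical, and it assumes
the server supports pipelining and that every response carries a
Content-Length header (no chunked encoding).

```python
import socket


def build_requests(host, paths):
    """Concatenate one HTTP/1.1 GET request per path into one payload."""
    return b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )


def read_responses(reader, count):
    """Read `count` responses in order from a binary file-like object."""
    results = []
    for _ in range(count):
        status = reader.readline().split()
        length = 0
        while True:
            line = reader.readline()
            if line in (b"\r\n", b"\n", b""):  # blank line ends headers
                break
            name, _sep, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.decode("ascii").strip())
        # Keep the status code and exactly `length` bytes of body
        results.append((status[1], reader.read(length)))
    return results


def pipelined_get(host, port, paths):
    """Send every request before reading any reply, on one connection."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rb") as reader:
            sock.sendall(build_requests(host, paths))
            return read_responses(reader, len(paths))
```

The point of the sketch is the ordering: all requests go out in a single
sendall() call, and the replies are then read back in the same order
from the one connection.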

> I'd be interested to see the example that the cherrypy guys gave you, if
> it's handy.

This is the example they pointed me to:
http://www.cherrypy.org/browser/trunk/cherrypy/test/test_conn.py?rev=1956#L282

>> >> One of the things I struggled with while making these changes was
>> >> whether it was more efficient to pass the request and response object
>> >> around (and cleaner) or whether it was better to simply use the
>> >> singleton object to access them.
>> >
>> > My guess is that it might be faster to pass the request and response
>> > object; however, the difference probably isn't enough to be appreciable.
>>
>> I'll take a look back at the code and see how big of a change it would
>> be to do this.
>
> Unless you want to make this change, I wouldn't worry about it.

I have further reduced the imports of cherrypy. Now only depot.py,
server/repository.py and server/transaction.py use it.

I'm going to write up a small summary of the changes I've made since I
first posted the depot code changes for review, and then post a link
to the new webrev, just to make it easier for others to look at.

Cheers,
-- 
Shawn Walker

"To err is human -- and to blame it on a computer is even more so." -
Robert Orben
_______________________________________________
pkg-discuss mailing list
pkg-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
