On Thu, Jan 29, 2009 at 11:44 PM, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
>> I'm running a threaded app using some calls via xmlrpc to pypi. What
>> I'm trying to get is a little more responses in a shorter
>> time, as I see that the bandwidth used by xmlrpc calls is minimal
>> (<kb). The problem I run into is that the connection is reset by peer
>> after about 10min (~500 calls). I use a single connection and a queue
>> of 8 threads to get the data. Would anybody have an example on how to
>> run xmlrpc in a thread? Do I set multiple connections, or is there a
>> setting to keep the connection live or reconnect if disconnected?
>
> Using threads will not at all make it faster to communicate over a
> single connection. For a single connection, all communication must
> be serialized; you cannot issue a new request until the previous
> request has completed. So you might as well just issue the requests
> from a single thread.
>
>> Also, please advise if you think that somehow I am overloading your
>> servers. I've tested some download speeds and I am sure your web
>> browser can accept 100+ requests per second, but what about xmlrpc?
>> Without threads I get <5 requests per second.
>
> I think 5 requests per second is fairly fast.
>

It's more like 2 requests per second.
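For reference, here is a minimal sketch of what "multiple connections, reconnect if disconnected" could look like: each thread owns its own proxy and replaces it when the connection is reset. It is only an illustration, not tested against pypi. It assumes Python 3, where xmlrpclib is named xmlrpc.client. The endpoint URL and the names make_proxy, worker and fetch_releases are my own placeholders; only package_releases comes from the calls discussed in this thread.

```python
import queue
import threading
import xmlrpc.client  # Python 3 name for xmlrpclib

PYPI_URL = "https://pypi.org/pypi"  # assumed endpoint, not given in this thread


def make_proxy():
    # Hypothetical helper: one ServerProxy, and so one connection, per caller.
    return xmlrpc.client.ServerProxy(PYPI_URL)


def worker(tasks, results, proxy_factory, retries=3):
    proxy = proxy_factory()  # each thread owns its own connection
    while True:
        try:
            name = tasks.get_nowait()
        except queue.Empty:
            return  # no work left
        for _ in range(retries):
            try:
                results[name] = proxy.package_releases(name)
                break
            except (ConnectionError, xmlrpc.client.ProtocolError):
                # e.g. "Connection reset by peer": drop the proxy, reconnect
                proxy = proxy_factory()
        else:
            results[name] = None  # every attempt failed


def fetch_releases(names, proxy_factory=make_proxy, threads=2):
    tasks = queue.Queue()
    for name in names:
        tasks.put(name)
    results = {}  # each thread writes distinct keys
    pool = [
        threading.Thread(target=worker, args=(tasks, results, proxy_factory))
        for _ in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return results
```

Calling fetch_releases(['xyz'], threads=2) would return a dict mapping each package name to its release list, or to None if all retries failed. The thread count defaults to 2 only because of the RFC limit mentioned below; whether the server tolerates more is an open question.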
If I set it to 2 threads I can list each package version in about an hour, but I lost the connection when I had reached the packages starting with z. If I use 5-8 threads I can get halfway in about 25min, but then I lose the connection ("Connection reset by peer").

Would you know how I can issue more requests, and/or increase the number of connections?

I know:
"http://www.faqs.org/rfcs/rfc2068.html See section 8.1.4. The RFC says "should limit 2 connections per server" and a lot of http client libraries obey this."
Does the xmlrpc library used by pypi do the same?

Does pypi use http://docs.python.org/library/xmlrpclib.html#multicall-objects ?

This is my last try. I was hoping that I could increase the number of connections to at least 10/second (~20min), but I can't seem to find any performance increase with xmlrpc. Is there another way to get:

    pypi.list_packages()
    pypi.package_releases('xyz')
    pypi.release_data('xyz', '0.7.79dev')

If not, then I guess I will go back to the regular for loop and go through all the records serially. (It's been 1h 15min and I am on packages starting with the letter R.)

A cPickle file with the metadata available in release_data for all packages is coming soon.

Thanks,
Lucas
_______________________________________________
Catalog-SIG mailing list
Catalog-SIG@python.org
http://mail.python.org/mailman/listinfo/catalog-sig