[Fwd: Re: Why POE is slower then Python - Twisted ... quite a lot?]
forwarding for apocalypse
--- Begin Message ---
Hello,
I have done some analysis on this, and used wireshark to track down
the differences like Rocco suggested. Looking at the dumps, I
immediately noticed 2 things, one being more important.
1. The python code doesn't retrieve the first article in the group,
it starts at the last-100 ( or whatever you set the articleCount to in
the script ) This little fix solved it and now we're fetching the same
articles as the perl one.
Change line 17 to: self.articlesToFetch = range(first,
first+self.factory.articleCount)
2. ( the important difference ) Looking at the nntp protocol itself,
I see that the behavior of both code is very different. The perl one
does it "the right way" by using the server-side article pointer and
issues the "next" command to move it to the next article id. The python
one naively increments the article id and tries to fetch it.
By incurring an extra roundtrip to the server, the perl code becomes
2x slower. THIS is the culprit, and any other differences/factors are
insignificant when you realize this. Also, since the python code just
increments the article id, sometimes the server returns "423 no such
article number in group" to the script. This means sometimes the article
isn't downloaded - a slight decrease in the network activity :)
Attached is the patch to the perl code to make it behave in the same
way as the python code. IMO it's not the "right" way but at least this
provides us with a fair comparison, ha! This is the output of both
(fixed) code, with 100 articles. Let me know if you have any other
questions :)
a...@satellite:~/Desktop/nntptest$ time perl nntp_fixed.pl
...
real0m19.627s
user0m0.588s
sys0m0.048s
a...@satellite:~/Desktop/nntptest$ time python nntp_fixed.py
news.microsoft.com microsoft.public.access access.mbox
...
real0m19.589s
user0m0.420s
sys0m0.060s
howard chen wrote:
I have 2 simple scripts which connect to NNTP server to fetch 100
articles (no writing to local disk) in order to test the performance
of two frameworks.
I have found Twisted is faster by at least 2 times which make me surprised.
I know it is not fair to compare two frameworks by this simple test.
But I think 2 times is quite much.
Therefore, I upload the code online to see if any experts can find any problem?
Thanks.
http://howachen.googlepages.com/nntp.py
http://howachen.googlepages.com/nntp.pl
time python nntp.py news.microsoft.com microsoft.public.access access.mbox
18 sec
time perl nntp.pl
37 sec
Any comments?
--- nntp.pl 2009-03-26 14:18:11.0 +0100
+++ nntp_fixed.pl 2009-03-26 14:46:17.0 +0100
@@ -7,6 +7,7 @@
$|=1;
my $count = 0;
+ my $article_id;
my $nntp = POE::Component::Client::NNTP->spawn ( 'NNTP-Client', { NNTPServer => 'news.microsoft.com' } );
POE::Session->create(
@@ -17,7 +18,7 @@
nntp_200 => '_connected',
nntp_201 => '_connected',
},
-'main' => [ qw(_start nntp_211 nntp_220 nntp_223 nntp_registered)
+'main' => [ qw(_start nntp_211 nntp_220 nntp_223 nntp_registered nntp_423)
],
],
);
@@ -47,8 +48,10 @@
sub nntp_211 {
my ($kernel,$heap,$text) = @_[KERNEL,HEAP,ARG0];
+ my @data = split( ' ', $text );
+ $article_id = $data[1];
-$kernel->post( 'NNTP-Client' => 'article' );
+$kernel->post( 'NNTP-Client' => 'article' => $article_id );
undef;
}
@@ -59,7 +62,17 @@
$count++;
die if $count >= 100;
-$kernel->post( 'NNTP-Client' => 'next' );
+$kernel->post( 'NNTP-Client' => 'article' => ++$article_id );
+undef;
+ }
+
+ sub nntp_423 {
+ print "Failed to retrieve $article_id\n";
+
+ $count++;
+die if $count >= 100;
+
+$_[KERNEL]->post( 'NNTP-Client' => 'article' => ++$article_id );
undef;
}
--- End Message ---
Re: Why POE is slower then Python - Twisted ... quite a lot?
It is too soon to conclude that POE is the cause. The difference in CPU time between your Python and Perl tests is 0.0002 seconds per article (user+sys). It's not a significant difference at this scale. It could be caused by any number of low- level variations between POE and Twisted, or Perl and Python. The wall times are a better place to look for inefficiencies. They are three orders of magnitude larger than the total CPU times. The large difference between wall and CPU times indicates a non-CPU bottleneck. In your test case, it's probably the network. What could account for network differences? You are comparing two different NNTP client libraries. They may interact differently with the server, which is a likely place to look. To investigate this, I would scale the tests down to a small number of articles (2 < count <= 10). I would dump the resulting network traffic for each test. I would use the dumps to compare interaction patterns between the two libraries. I would also look at timings: the times between command & response, the times between one command and the next, the length of time to receive an article, and so on. tcpdump and tshark (or wireshark) are very good for this sort of thing. If network interaction is comparable, we should revisit the points I outlined in my last message. They don't seem to have made much of an impression, but they may still be relevant. Thank you. -- Rocco Caputo - [email protected] On Mar 25, 2009, at 13:18, howard chen wrote: But anyway, now I re-run the test, instead of downloading 100 posts, now I download 1K articles. python real3m6.007s user0m0.088s sys 0m0.016s perl real6m13.730s user0m0.268s sys 0m0.024s
Re: Why POE is slower then Python - Twisted ... quite a lot?
Hi, On Wed, Mar 25, 2009 at 11:14 PM, Chris Prather wrote: > > That's a big supposition to make: > Please understand that my original email is comparing 37 vs 18 seconds difference (i.e. 200% speedup), so I don't really care about the startup/shutdown time since the difference is such obvious. But anyway, now I re-run the test, instead of downloading 100 posts, now I download 1K articles. >> python real3m6.007s user0m0.088s sys 0m0.016s >> perl real6m13.730s user0m0.268s sys 0m0.024s
Re: Why POE is slower then Python - Twisted ... quite a lot?
On Wed, Mar 25, 2009 at 11:03 AM, howard chen wrote: > Hello, > > On Wed, Mar 25, 2009 at 4:10 AM, Rocco Caputo wrote: >> >> time(1) includes Perl & Python startup/shutdown time, which may be >> significant. >> > > Suppose the difference is negligible. That's a big supposition to make: [11:12:35] g...@~/ $time python -c'import twisted' real0m1.487s user0m0.282s sys 0m0.103s [11:12:58] g...@~/ $time perl -MPOE -e1 real0m0.092s user0m0.076s sys 0m0.014s Note that I am not a Python person so I don't know if these two commands really are equivalent. -Chris
Re: Why POE is slower then Python - Twisted ... quite a lot?
Hello, On Wed, Mar 25, 2009 at 4:10 AM, Rocco Caputo wrote: > > time(1) includes Perl & Python startup/shutdown time, which may be > significant. > Suppose the difference is negligible. > Your times seem to be wall clock based, which can vary based on system load, > network conditions, and so on. Show the CPU times instead. > > It appears that you've only run one iteration. Benchmarks should be run > enough times to account for statistical variance. > No, I run the test for several times, on an idle server (Intel Quad Core, SMP, 4GB RAM. using both latest POE & Twisted). The difference is roughly 200% differences, so I was surprised. Thank you.
Re: Why POE is slower then Python - Twisted ... quite a lot?
I have a few issues with the way the benchmarks are run, and then some actual useful comments. :) time(1) includes Perl & Python startup/shutdown time, which may be significant. Your times seem to be wall clock based, which can vary based on system load, network conditions, and so on. Show the CPU times instead. It appears that you've only run one iteration. Benchmarks should be run enough times to account for statistical variance. Ok, that said: A deep and accurate answer requires a deep knowledge of all technologies involved. I'm not familiar with Python and Twisted, so I can't make a good comparison by myself. A previous reply on this thread suggested that CORE::select() was taking the most time, which strongly implies that POE spends more time waiting for network activity than Twisted does. If that's the case, it may be that POE may be checking for I/O faster than Twisted, and therefore more often, which could result in more and smaller reads from the socket. The overhead per read will be higher, although the reads will be more timely. Someone with dual Perl/Python and POE/Twisted experience would be best to dig into this to find the actual cause. We can then act on that, rather than on conjecture. Since the cause is likely to be within POE's implementation, we have room for improvement. For example, it's been suggested on irc.perl.org #poe to make POE's I/O events latency tunable. An interactive application might choose better responsiveness, while a bulk transfer application (such as yours?) could be tuned for higher throughput. It's an interesting idea, but I don't have resources to pursue it. If someone else does, I'll be happy to provide advice and guidance, here, in private e-mail, or on IRC. -- Rocco Caputo - [email protected] On Mar 23, 2009, at 07:18, howard chen wrote: I have 2 simple scripts which connect to NNTP server to fetch 100 articles (no writing to local disk) in order to test the performance of two frameworks. I have found Twisted is faster by at least 2 times which make me surprised. I know it is not fair to compare two frameworks by this simple test. But I think 2 times is quite much. Therefore, I upload the code online to see if any experts can find any problem? Thanks. http://howachen.googlepages.com/nntp.py http://howachen.googlepages.com/nntp.pl time python nntp.py news.microsoft.com microsoft.public.access access.mbox 18 sec time perl nntp.pl 37 sec Any comments?
Re: Why POE is slower then Python - Twisted ... quite a lot?
On Tue, 2009-03-24 at 10:05 +0900, Daisuke Maki wrote: > I'm going to take a wild guess here: POE::Component::Pluggable. Nope, that isn't it. I would have found that surprising anyway, since there aren't any plugins being used, so it's just a few checks if there are any plugins. (I checked by disabling the plugin handling) I thought maybe it had something to do with twisted running a more efficient loop by default, so I tried with some alternate loops (EV, POE::Loop::XS::EPoll), but that doesn't make much of a difference either Then I thought maybe it's the fact that Client::NNTP uses Filter::Line, so there's an event dispatched for each line of message instead of just one per message. While hacking up Client::NNTP to use Filter::Stream instead reduced the number of _invoke_state calls from almost 6000 to slightly over a thousand, that still only shaves off a few seconds of runtime. Devel::NYTProf says most of the time is being spent in CORE::select in loop_do_timeslice for the select loop or in EV::loop() for the EV loop (and not much difference between the two). I'm not an expert on POE internals, nor familiar with Twisted, so I can't say if it's just because Twisted does less than POE, or that there's something that could be optimised. Martijn
Re: Why POE is slower then Python - Twisted ... quite a lot?
I'm going to take a wild guess here: POE::Component::Pluggable. In looking at twister.news.nntp.py, it's a straigh nntp implementation, with a bunch of hook points. PoCo::Client::NNTP is built on top of POE::Component::Pluggable, and so does all dispatching via a context-less layer (i.e., PoCo::Pluggable doesn't know anything about NNTP or what have you), so dispatching works in a somewhat inefficient way. I suppose one could make it faster by inlining the dispatch code in PoCo::Plubble or something like that, though. --d howard chen wrote: I have 2 simple scripts which connect to NNTP server to fetch 100 articles (no writing to local disk) in order to test the performance of two frameworks. I have found Twisted is faster by at least 2 times which make me surprised. I know it is not fair to compare two frameworks by this simple test. But I think 2 times is quite much. Therefore, I upload the code online to see if any experts can find any problem? Thanks. http://howachen.googlepages.com/nntp.py http://howachen.googlepages.com/nntp.pl time python nntp.py news.microsoft.com microsoft.public.access access.mbox 18 sec time perl nntp.pl 37 sec Any comments?
