>> > Ideally, zoossh should do the heavy lifting as it's implemented in a
>> > compiled language.
>>
>> This is assuming zoossh is dramatically faster than Stem by virtue of being
>> compiled. I know we've discussed this before but I forget the results - with
>> the latest tip of Stem (ie, with lazy loading) how do they compare? I'd
>> expect time to be mostly bound by disk IO, so little to no difference.
>
> zoossh's test framework says that it takes 36364357 nanoseconds to
> lazily parse a consensus that is cached in memory (to eliminate the I/O
> bottleneck). That amounts to approximately 27 consensuses a second.
>
> I used the following simple Python script to get a similar number for
> Stem:
>
>   with open(file_name) as consensus_file:
>     for router in stem.descriptor.parse_file(consensus_file,
>         'network-status-consensus-3 1.0',
>         document_handler = stem.descriptor.DocumentHandler.ENTRIES):
>       pass
>
> This script manages to parse 24 consensus files in ~13 seconds, which
> amounts to 1.8 consensuses a second. Let me know if there's a more
> efficient way to do this in Stem.

Interesting! My first thought is 'I wonder if zoossh is even reading the
file content'. A couple of quick things to try. The first is...

  with open(file_name) as consensus_file:
    consensus_file.read()

... to see how much of the time is disk IO versus parsing. The second is to
do something practical with the parsed entries (say, count the number of
relays with the exit flag).

Stem does some bytes => unicode normalization, which might account for some
of the difference, but other than that I'm at a loss for what would be
taking the time.

Cheers! -Damian
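
PS. The two checks suggested above could be sketched roughly as follows. This is only a sketch and has not been run against real consensus data. The helper names (time_call, read_only, count_exits) are made up for illustration, and opening the file in 'rb' mode is my assumption rather than what the quoted script does. The parse_file() arguments are the ones from the quoted script.

```python
import time


def time_call(func, *args):
  """Return (result, elapsed_seconds) for a single call of func(*args)."""
  start = time.perf_counter()
  result = func(*args)
  return result, time.perf_counter() - start


def read_only(path):
  """Disk IO baseline: read the whole file without parsing it.

  Returns the number of bytes read.
  """
  with open(path, 'rb') as consensus_file:
    return len(consensus_file.read())


def count_exits(path):
  """Parse a consensus with Stem and count entries carrying the Exit flag."""
  # Imported lazily so the IO baseline can run without Stem installed.
  import stem
  import stem.descriptor

  exits = 0

  # 'rb' is an assumption of this sketch; the quoted script used text mode.
  with open(path, 'rb') as consensus_file:
    for router in stem.descriptor.parse_file(
        consensus_file,
        'network-status-consensus-3 1.0',
        document_handler = stem.descriptor.DocumentHandler.ENTRIES):
      if stem.Flag.EXIT in router.flags:
        exits += 1

  return exits
```

Running time_call(read_only, file_name) and time_call(count_exits, file_name) on the same file gives a read-only time and a read-plus-parse time. The gap between them is roughly what parsing costs. Note that a second pass over the same file will likely be served from the OS page cache, so the first read is the one that reflects actual disk IO.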
