Re: freebsd and multiprocessing
On 3/2/2010 12:59 PM, Tim Arnold wrote:
> I'll write some test programs using multiprocessing and see how they go
> before committing to rewrite my current code. I've also been looking at
> 'parallel python' although it may have the same issues.
> http://www.parallelpython.com/

parallelpython works for me on FreeBSD 6.2.

--
http://mail.python.org/mailman/listinfo/python-list
Re: python vs. grep
Anton Slesarev wrote:
> But I have some problem with writing performance grep analog.

I don't think you can ever catch grep. Searching is its only purpose in life, and it's very good at it. You may be able to come closer; this thread is related:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/2f564523f476840a/d9476da5d7a9e466

That thread is about the speed of re. If you don't need regular expressions, don't use re. If you do need them, an alternate regex library might be useful, but you still aren't going to catch grep.
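A minimal sketch of the "if you don't need regex, don't use re" advice, assuming the search term is a fixed string. The function names and sample lines are made up for illustration; no timings are claimed here.

```python
import re


def grep_plain(lines, needle):
    """Yield lines containing `needle`, using the `in` operator (no re)."""
    for line in lines:
        if needle in line:
            yield line


def grep_regex(lines, pattern):
    """Yield lines matching `pattern`; the pattern is compiled once, outside the loop."""
    search = re.compile(pattern).search
    for line in lines:
        if search(line):
            yield line


# Hypothetical sample data, only to show both helpers in use.
sample = ["alpha error 1", "beta ok", "gamma error 22"]
plain_hits = list(grep_plain(sample, "error"))
regex_hits = list(grep_regex(sample, r"error \d+$"))
```

Only reach for grep_regex when the pattern really needs regex features; for a fixed string, grep_plain avoids the regex engine entirely.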
Re: Efficient way of testing for substring being one of a set?
[EMAIL PROTECTED] wrote:
> Dennis Benzinger:
>> You could use the Aho-Corasick algorithm
>> http://en.wikipedia.org/wiki/Aho-Corasick_algorithm.
>> I don't know if there's a Python implementation yet.
>
> http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/

http://nicolas.lehuen.com/download/pytst/ can do it as well.
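For readers who just want to see the idea, here is a small pure-Python sketch of Aho-Corasick. It is not the API of either package linked above; the names build_automaton, iter_matches and contains_any are invented for this example, and it is written for clarity rather than speed.

```python
from collections import deque


def build_automaton(words):
    """Build trie transitions, failure links and output sets for `words`."""
    goto = [{}]    # goto[state][char] -> next state
    fail = [0]     # failure link for each state
    out = [set()]  # words recognised on reaching each state
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    # Breadth-first pass: depth-1 states keep fail == 0, deeper ones are computed.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out


def iter_matches(text, automaton):
    """Yield (start_index, word) for every occurrence of any word in `text`."""
    goto, fail, out = automaton
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            yield i - len(word) + 1, word


def contains_any(text, automaton):
    """True if `text` contains at least one of the words; stops at the first hit."""
    return next(iter_matches(text, automaton), None) is not None


automaton = build_automaton(["he", "she", "his", "hers"])
matches = sorted(iter_matches("ushers", automaton))
```

The text is scanned once regardless of how many words are in the set, which is the point of the algorithm for the "substring being one of a set" question.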
Re: Regex Speed
[EMAIL PROTECTED] wrote:
> While creating a log parser for fairly large logs, we have run into an
> issue where the time to process was relatively unacceptable (upwards of
> 5 minutes for 1-2 million lines of logs). In contrast, using the Linux
> tool grep would complete the same search in a matter of seconds.

It's very hard to beat grep, depending on the nature of the regex you are searching with. The regex engines in Python/Perl/PHP/Ruby have traded the speed of grep/awk for the ability to do more complex searches.
http://swtch.com/~rsc/regexp/regexp1.html

This might not be your problem, but if it is, you can always popen grep. It would be nice if there were a Thompson NFA re module.
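A rough sketch of the "popen grep" route using the subprocess module. It assumes a grep executable is on PATH and that the pattern is written in grep's extended (-E) syntax rather than Python's re syntax; grep_lines is a made-up helper name.

```python
import subprocess


def grep_lines(pattern, path, grep="grep"):
    """Return the lines of `path` that match `pattern`, as found by external grep."""
    result = subprocess.run(
        [grep, "-E", "-e", pattern, "--", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # grep exits 0 when something matched, 1 when nothing matched, >1 on error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

Passing the arguments as a list avoids shell quoting problems with the pattern. For very large outputs, reading from a subprocess.Popen pipe line by line would avoid holding everything in memory.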
Re: Regex Speed
John Machin wrote:
> Or a Glushkov NFA simulated by bit parallelism re module ... see
> http://citeseer.ist.psu.edu/551772.html
> (which Russ Cox (author of the paper you cited) seems not to have read).

NR-grep looks interesting, I'll read that. Thanks.

> Cox uses a pathological regex (regex = "a?" * 29 + "a" * 29, in Python
> code) to make his point: grep uses a Thompson gadget and takes linear
> time, while Python, Perl and friends use backtracking and go off the
> planet.

It might be pathological, but based on the original poster's timings his situation seems to be related. My main point was that it's quite possible he isn't going to get faster than grep regardless of the language he uses, and if grep wins, use it. I frequently do.

> Getting back to the "It would be nice ..." bit: yes, it would be nice to
> have even more smarts in re, but who's going to do it? It's not a rainy
> Sunday afternoon job :

One of these days. :)
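To make the Thompson-style idea concrete, here is a toy sketch that tracks a set of pattern positions instead of backtracking. It is an assumption-laden illustration, not a regex engine: it only understands literal characters optionally followed by "?", which happens to be enough for the pathological pattern discussed above. All function names are invented for this example.

```python
import re


def parse_simple(pattern):
    """Parse literal characters, each optionally followed by '?', into (char, optional)."""
    items = []
    i = 0
    while i < len(pattern):
        optional = i + 1 < len(pattern) and pattern[i + 1] == "?"
        items.append((pattern[i], optional))
        i += 2 if optional else 1
    return items


def closure(items, positions):
    """Extend `positions` with every position reachable by skipping optional items."""
    result = set()
    for pos in positions:
        # Stop as soon as we hit a position already expanded earlier.
        while pos not in result:
            result.add(pos)
            if pos < len(items) and items[pos][1]:
                pos += 1
            else:
                break
    return result


def fullmatch_nfa(pattern, text):
    """Match the whole of `text` by advancing a set of positions one character at a time."""
    items = parse_simple(pattern)
    states = closure(items, {0})
    for ch in text:
        moved = {p + 1 for p in states if p < len(items) and items[p][0] == ch}
        states = closure(items, moved)
        if not states:
            return False
    return len(items) in states


n = 29
pathological = "a?" * n + "a" * n
nfa_result = fullmatch_nfa(pathological, "a" * n)

# Cross-check against re on a small case, where backtracking is still cheap.
small = "a?" * 10 + "a" * 10
agrees = bool(re.fullmatch(small, "a" * 10)) == fullmatch_nfa(small, "a" * 10)
```

The work per input character is bounded by the number of pattern positions, so the n = 29 case finishes immediately here, whereas a backtracking matcher may explore a number of paths that grows exponentially with n on the same input.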