Re: In-place memory manager, mmap (was: Fastest way to store ints and floats on disk)

2008-08-24 Thread Kris Kennaway
castironpi wrote: Hi, I've got an in-place memory manager that uses a disk-backed memory-mapped buffer. Among its possibilities are: storing variable-length strings and structures for persistence and interprocess communication with mmap. It allocates segments of a generic buffer by length

Re: In-place memory manager, mmap

2008-08-24 Thread Kris Kennaway
castironpi wrote: On Aug 24, 9:52 am, Kris Kennaway [EMAIL PROTECTED] wrote: castironpi wrote: Hi, I've got an in-place memory manager that uses a disk-backed memory-mapped buffer. Among its possibilities are: storing variable-length strings and structures for persistence and interprocess

Re: benchmark

2008-08-11 Thread Kris Kennaway
Peter Otten wrote: [EMAIL PROTECTED] wrote: On Aug 10, 10:10 pm, Kris Kennaway [EMAIL PROTECTED] wrote: jlist wrote: I think what makes more sense is to compare the code one most typically writes. In my case, I always use range() and never use psyco. But I guess for most of my work

Re: SSH utility

2008-08-11 Thread Kris Kennaway
James Brady wrote: Hi all, I'm looking for a python library that lets me execute shell commands on remote machines. I've tried a few SSH utilities so far: paramiko, PySSH and pssh; unfortunately all have been unreliable, and repeated questions on their respective mailing lists haven't been

Re: benchmark

2008-08-11 Thread Kris Kennaway
Peter Otten wrote: Kris Kennaway wrote: Peter Otten wrote: [EMAIL PROTECTED] wrote: On Aug 10, 10:10 pm, Kris Kennaway [EMAIL PROTECTED] wrote: jlist wrote: I think what makes more sense is to compare the code one most typically writes. In my case, I always use range() and never use psyco

Re: benchmark

2008-08-10 Thread Kris Kennaway
Angel Gutierrez wrote: Steven D'Aprano wrote: On Thu, 07 Aug 2008 00:44:14 -0700, alex23 wrote: Steven D'Aprano wrote: In other words, about 20% of the time he measures is the time taken to print junk to the screen. Which makes his claim that all the console outputs have been removed so

Re: benchmark

2008-08-10 Thread Kris Kennaway
jlist wrote: I think what makes more sense is to compare the code one most typically writes. In my case, I always use range() and never use psyco. But I guess for most of my work with Python performance hasn't been an issue. I haven't got to write any large systems with Python yet, where

Re: Constructing MIME message without loading message stream

2008-08-10 Thread Kris Kennaway
Diez B. Roggisch wrote: Kris Kennaway schrieb: I would like to MIME encode a message from a large file without first loading the file into memory. Assume the file has been pre-encoded on disk (actually I am using encode_7or8bit, so the encoding should be null). Is there a way to construct

Constructing MIME message without loading message stream

2008-08-09 Thread Kris Kennaway
I would like to MIME encode a message from a large file without first loading the file into memory. Assume the file has been pre-encoded on disk (actually I am using encode_7or8bit, so the encoding should be null). Is there a way to construct the flattened MIME message such that data is

Re: variable expansion with sqlite

2008-08-08 Thread Kris Kennaway
marc wyburn wrote: Hi and thanks, I was hoping to avoid having to weld qmarks together but I guess that's why people use things like SQL alchemy instead. It's a good lesson anyway. The '?' substitution is there to safely handle untrusted input. You *don't* want to pass in arbitrary user
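
As an illustration of the '?' substitution discussed in this message, here is a minimal sketch using the standard sqlite3 module. The table name `users`, the column `name`, and the helper `find_users` are hypothetical and not from the thread. Only the run of placeholders is assembled as a string; the values themselves are bound by the driver, so untrusted input is never spliced into the SQL text.

```python
import sqlite3

def find_users(conn, names):
    """Return rows whose name is in `names`, binding every value with '?'.

    `users` and `name` are hypothetical identifiers used for illustration.
    """
    if not names:
        return []
    # Only the "?, ?, ?" placeholder list is built dynamically.
    placeholders = ", ".join("?" for _ in names)
    query = "SELECT name FROM users WHERE name IN (%s) ORDER BY name" % placeholders
    # The values are passed separately and bound by sqlite3.
    return conn.execute(query, list(names)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("alice",), ("bob",), ("x'; DROP TABLE users; --",)],
)
rows = find_users(conn, ["alice", "x'; DROP TABLE users; --"])
```

The hostile-looking string is treated as plain data and matched like any other value.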

Re: pyprocessing/multiprocessing for x64?

2008-08-07 Thread Kris Kennaway
Benjamin Kaplan wrote: The only problem I can see is that 32-bit programs can't access 64-bit dlls, so the OP might have to install the 32-bit version of Python for it to work. Anyway, all of this is beside the point, because the multiprocessing module works fine on amd64 systems. Kris --

Re: re.search much slower then grep on some regular expressions

2008-07-10 Thread Kris Kennaway
John Machin wrote: Uh-huh ... try this, then: http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/ You could use this to find the Str cases and the prefixes of the re cases (which seem to be no more complicated than 'foo.*bar.*zot') and use something slower like Python's re to search the

Re: re.search much slower then grep on some regular expressions

2008-07-10 Thread Kris Kennaway
J. Cliff Dyer wrote: On Wed, 2008-07-09 at 12:29 -0700, samwyse wrote: On Jul 8, 11:01 am, Kris Kennaway [EMAIL PROTECTED] wrote: samwyse wrote: You might want to look at Plex. http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/ Another advantage of Plex is that it compiles all

Re: multithreading in python ???

2008-07-10 Thread Kris Kennaway
Laszlo Nagy wrote: Abhishek Asthana wrote: Hi all, I have a large set of data computation and I want to break it into small batches and assign it to different threads. I am implementing it in python only. Kindly help what all libraries should I refer to implement the multithreading in

Re: re.search much slower then grep on some regular expressions

2008-07-09 Thread Kris Kennaway
John Machin wrote: Hmm, unfortunately it's still orders of magnitude slower than grep in my own application that involves matching lots of strings and regexps against large files (I killed it after 400 seconds, compared to 1.5 for grep), and that's leaving aside the much longer compilation time

Re: re.search much slower then grep on some regular expressions

2008-07-09 Thread Kris Kennaway
Jeroen Ruigrok van der Werven wrote: -On [20080709 14:08], Kris Kennaway ([EMAIL PROTECTED]) wrote: It's compiler/build output. Sounds like the FreeBSD ports build cluster. :) Yes indeed! Kris, have you tried a PGO build of Python with your specific usage? I cannot guarantee

Re: re.search much slower then grep on some regular expressions

2008-07-09 Thread Kris Kennaway
samwyse wrote: On Jul 8, 11:01 am, Kris Kennaway [EMAIL PROTECTED] wrote: samwyse wrote: You might want to look at Plex. http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/ Another advantage of Plex is that it compiles all of the regular expressions into a single DFA. Once that's done

Re: re.search much slower then grep on some regular expressions

2008-07-08 Thread Kris Kennaway
samwyse wrote: On Jul 4, 6:43 am, Henning_Thornblad [EMAIL PROTECTED] wrote: What can be the cause of the large difference between re.search and grep? While doing a simple grep: grep '[^ =]*/' input (input contains 156.000 a in one row) doesn't even take a second. Is this a

Re: re.search much slower then grep on some regular expressions

2008-07-08 Thread Kris Kennaway
samwyse wrote: On Jul 4, 6:43 am, Henning_Thornblad [EMAIL PROTECTED] wrote: What can be the cause of the large difference between re.search and grep? While doing a simple grep: grep '[^ =]*/' input (input contains 156.000 a in one row) doesn't even take a second. Is this a

Re: re.search much slower then grep on some regular expressions

2008-07-07 Thread Kris Kennaway
Paddy wrote: On Jul 4, 1:36 pm, Peter Otten [EMAIL PROTECTED] wrote: Henning_Thornblad wrote: What can be the cause of the large difference between re.search and grep? grep uses a smarter algorithm ;) This script takes about 5 min to run on my computer: #!/usr/bin/env python import re

Re: Bit substring search

2008-06-25 Thread Kris Kennaway
Scott David Daniels wrote: Kris Kennaway wrote: Thanks for the pointers, I think a C extension will end up being the way to go, unless someone has beaten me to it and I just haven't found it yet. Depending on the pattern length you are targeting, it may be fastest to increase the out-of-loop

Bit substring search

2008-06-24 Thread Kris Kennaway
I am trying to parse a bit-stream file format (bzip2) that does not have byte-aligned record boundaries, so I need to do efficient matching of bit substrings at arbitrary bit offsets. Is there a package that can do this? This one comes close: http://ilan.schnell-web.net/prog/bitarray/ but
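
To make the problem concrete, here is a deliberately naive pure-Python sketch of matching a bit pattern at arbitrary bit offsets. It is not the approach of any package mentioned in the thread, and the function name `find_bits` is hypothetical. It expands every byte into eight characters, so it is only suitable for small inputs; it shows the semantics, not an efficient implementation.

```python
def find_bits(data, pattern, pattern_bits):
    """Return all bit offsets in `data` (bytes) where `pattern` occurs.

    `pattern` is an int holding `pattern_bits` bits, most significant bit first.
    """
    # Expand the buffer into a string of '0'/'1' characters (8x memory cost).
    haystack = "".join(format(byte, "08b") for byte in data)
    needle = format(pattern, "0%db" % pattern_bits)
    offsets = []
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        # Step by one bit so that overlapping matches are reported too.
        pos = haystack.find(needle, pos + 1)
    return offsets

# The 4-bit pattern 1011 occurs at bit 3 and again, straddling the byte boundary, at bit 6.
matches = find_bits(bytes([0b00010110, 0b11000000]), 0b1011, 4)
```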

Re: Bit substring search

2008-06-24 Thread Kris Kennaway
[EMAIL PROTECTED] wrote: Kris Kennaway: I am trying to parse a bit-stream file format (bzip2) that does not have byte-aligned record boundaries, so I need to do efficient matching of bit substrings at arbitrary bit offsets. Is there a package that can do this? You may take a look at Hachoir

Re: Bit substring search

2008-06-24 Thread Kris Kennaway
[EMAIL PROTECTED] wrote: Kris Kennaway: Unfortunately I didn't find anything else useful here yet :( I see, I'm sorry, I have found hachoir quite nice in the past. Maybe there's no really efficient way to do it with Python, but you can create a compiled extension, so you can see if it's fast

ZFS bindings

2008-06-18 Thread Kris Kennaway
Is anyone aware of python bindings for ZFS? I just want to replicate (or at least wrap) the command line functionality for interacting with snapshots etc. Searches have turned up nothing. Kris -- http://mail.python.org/mailman/listinfo/python-list

Re: Looking for lots of words in lots of files

2008-06-18 Thread Kris Kennaway
Calvin Spealman wrote: Upload, wait, and google them. Seriously tho, aside from using a real indexer, I would build a set of the words I'm looking for, and then loop over each file, looping over the words and doing quick checks for containment in the set. If so, add to a dict of file names
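
The set-based approach described in this message can be sketched roughly as follows. This is one reading of the truncated description, with assumptions filled in: the name `find_words`, whitespace tokenization via `split()`, and a result that maps each file name to the set of wanted words found in it.

```python
from collections import defaultdict

def find_words(paths, wanted):
    """Map each file name to the set of wanted words it contains."""
    wanted = set(wanted)          # set membership tests are O(1) on average
    hits = defaultdict(set)
    for path in paths:
        with open(path) as fh:
            for line in fh:
                # Assumption: words are separated by whitespace.
                for word in line.split():
                    if word in wanted:
                        hits[path].add(word)
    return dict(hits)
```

Files that contain none of the wanted words do not appear in the result.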

Re: Faster I/O in a script

2008-06-04 Thread Kris Kennaway
Gary Herron wrote: [EMAIL PROTECTED] wrote: On Jun 2, 2:08 am, kalakouentin [EMAIL PROTECTED] wrote: Do you know a way to actually load my data in a more batch-like way so I will avoid the constant line by line reading? If your files will fit in memory, you can just do text =

Re: UNIX credential passing

2008-05-30 Thread Kris Kennaway
Sebastian 'lunar' Wiesner wrote: [ Kris Kennaway [EMAIL PROTECTED] ] I want to make use of UNIX credential passing on a local domain socket to verify the identity of a user connecting to a privileged service. However it looks like the socket module doesn't implement sendmsg/recvmsg wrappers

mmap class has slow in operator

2008-05-29 Thread Kris Kennaway
If I do the following: def mmap_search(f, string): fh = file(f) mm = mmap.mmap(fh.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ) return mm.find(string) def mmap_is_in(f, string): fh = file(f) mm = mmap.mmap(fh.fileno(), 0, mmap.MAP_SHARED, mmap.PROT_READ)
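
For readers on current Python, a rough Python 3 sketch of a read-only mmap search follows. It is an adaptation and not the code from the message: it uses `open()` instead of `file()`, the portable `access=mmap.ACCESS_READ` argument instead of the Unix-only flags, and bytes needles. The helper `mmap_contains` is hypothetical and tests containment through `find()` instead of the `in` operator, so it makes no claim about how `in` behaves or performs.

```python
import mmap

def mmap_find(path, needle):
    """Return the byte offset of `needle` (bytes) in the file, or -1."""
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; an empty file raises ValueError.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle)

def mmap_contains(path, needle):
    """Containment test expressed through find(), not the `in` operator."""
    return mmap_find(path, needle) != -1
```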

UNIX credential passing

2008-05-29 Thread Kris Kennaway
I want to make use of UNIX credential passing on a local domain socket to verify the identity of a user connecting to a privileged service. However it looks like the socket module doesn't implement sendmsg/recvmsg wrappers, and I can't find another module that does this either. Is there