How to print SRE_Pattern (regexp object) text for debugging purposes?

2010-06-17 Thread dmtr
I need to print the regexp pattern text (SRE_Pattern object) for debugging purposes. Is there any way to do it gracefully? I've come up with the following hack, but it is rather crude... Is there an official way to get the regexp pattern text? import re, pickle r = re.compile('^abc$', re.I) r

Re: How to print SRE_Pattern (regexp object) text for debugging purposes?

2010-06-17 Thread dmtr
On Jun 17, 3:35 pm, MRAB pyt...@mrabarnett.plus.com wrote: >>> import re >>> r = re.compile('^abc$', re.I) >>> r.pattern '^abc$' >>> r.flags 2 Hey, thanks. It works. Couldn't find it in the reference somehow. And it's not in inspect.getmembers(r). Must be doing something wrong. -- Cheers,
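
A minimal sketch of the two attributes MRAB points to, written for Python 3 (the thread used Python 2.6, where compiled patterns were SRE_Pattern objects). The numeric value of `flags` differs between Python versions, so the sketch tests the flag bit instead of relying on the number. `describe_pattern` is a hypothetical helper name, not something from the thread.

```python
import re


def describe_pattern(compiled):
    """Return a short debug string for a compiled regular expression."""
    ignore_case = bool(compiled.flags & re.IGNORECASE)
    return "pattern=%r ignorecase=%s" % (compiled.pattern, ignore_case)


r = re.compile('^abc$', re.I)
print(r.pattern)             # the original pattern text
print(describe_pattern(r))   # pattern text plus one decoded flag
```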

Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread dmtr
I'm running into some performance / memory bottlenecks on large lists. Is there any easy way to minimize/optimize memory usage? Simple str() and unicode() objects [Python 2.6.4/Linux/x86]: sys.getsizeof('') 24 bytes; sys.getsizeof('0') 25 bytes; sys.getsizeof(u'') 28 bytes
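
The measurement in the question can be repeated with a few lines. This is a sketch for Python 3, so the byte counts will not match the Python 2.6.4 figures quoted above; the sample strings are illustrative choices, not data from the thread.

```python
import sys

samples = ['', '0', 'short string', 'x' * 100]
for s in samples:
    # getsizeof reports the object's own footprint, header included.
    print(repr(s[:12]), len(s), sys.getsizeof(s))

# The size of the empty string is the fixed per-object overhead.
overhead = sys.getsizeof('')
print("fixed overhead:", overhead)
```

The fixed overhead is what dominates when millions of very short strings are stored.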

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread dmtr
Steven, thank you for answering. See my comments inline. Perhaps I should have formulated my question a bit differently: Are there any *compact* high performance containers for unicode()/str() objects in Python? By *compact* I don't mean compression. Just optimized for memory usage, rather than

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread dmtr
Well... 63 bytes per item for very short unicode strings... Is there any way to do better than that? Perhaps some compact unicode objects? There is a certain price you pay for having full-featured Python objects. Are there any *compact* Python objects? Optimized for compactness? What are

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
On Aug 6, 10:56 pm, Michael Torrie torr...@gmail.com wrote: On 08/06/2010 07:56 PM, dmtr wrote: Ultimately a dict that can store ~20,000,000 entries: (u'short string' : (int, int, int, int, int, int, int)). I think you really need a real database engine.  With the proper indexes, MySQL
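
A rough sketch of the database route, using the standard library's sqlite3 module in place of the MySQL server mentioned in the reply (a swapped-in engine, chosen only so the example is self-contained). The table name `entries` and the column names `v0`..`v6` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # a file path would keep the data on disk
conn.execute(
    "CREATE TABLE entries ("
    "key TEXT PRIMARY KEY, "
    "v0 INTEGER, v1 INTEGER, v2 INTEGER, v3 INTEGER, "
    "v4 INTEGER, v5 INTEGER, v6 INTEGER)"
)

# One row per dictionary entry: a short string key and seven integers.
rows = ((str(i), i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6) for i in range(1000))
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
conn.commit()

row = conn.execute(
    "SELECT v0, v1, v2, v3, v4, v5, v6 FROM entries WHERE key = ?", ('42',)
).fetchone()
print(row)
```

The primary key gives indexed lookups by string, at the cost of going through SQL for every access.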

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
On Aug 6, 11:50 pm, Peter Otten __pete...@web.de wrote: I don't know to what extent it still applies but switching off cyclic garbage collection with import gc gc.disable() Haven't tried it on the real dataset. On the synthetic test it (and sys.setcheckinterval(10)) gave ~2% speedup and
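
One way to apply Peter's suggestion is to pause the cyclic collector only while the large container is being built. This is a sketch for Python 3; `build_table` is a hypothetical helper, and `sys.setcheckinterval` is left out because it is a Python 2 era call.

```python
import gc


def build_table(n):
    """Build a dict of n entries with the cyclic collector paused."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        d = {}
        for i in range(n):
            d[str(i)] = (i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
        return d
    finally:
        # Restore the collector only if it was running before.
        if was_enabled:
            gc.enable()


table = build_table(1000)
print(len(table))
```

Whether this helps on a given dataset has to be measured; the thread reports only a small gain on a synthetic test.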

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
Correction. I've copy-pasted it wrong! array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6)) was the best. for i in xrange(0, 100): d[unicode(i)] = (i, i+1, i+2, i+3, i+4, i+5, i+6) 100 keys, ['VmPeak:\t 224704 kB', 'VmSize:\t 224704 kB'], 4.079240 seconds, 245143.698209 keys per
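
The two value layouts being compared can be put side by side as follows. This is a Python 3 sketch (`range` and `str` instead of `xrange` and `unicode`), and the key count of 1000 is an arbitrary choice for illustration, not the size of the benchmark.

```python
import sys
from array import array

d_tuple = {}
d_array = {}
for i in range(1000):
    key = str(i)
    d_tuple[key] = (i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
    # 'i' stores each value as a C signed int instead of a Python int object.
    d_array[key] = array('i', (i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6))

# getsizeof on a tuple counts only its pointers, not the int objects it refers to.
print(sys.getsizeof(d_tuple['0']), sys.getsizeof(d_array['0']))
```

Because the tuple figure excludes the integers it references, process-level numbers such as the VmPeak and VmSize values quoted above are the more honest comparison.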

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
Looking at your benchmark, random.choice(letters) probably has less overhead than letters[random.randint(...)]. You might even try to inline it as Right... random.choice()... I'm a bit new to Python, always something to learn. But anyway in that benchmark (from http://bugs.python.org/issue9520
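
The two ways of picking a random letter can be timed directly. This is a sketch for Python 3; the function names, the key length of 8 and the repeat count are assumptions for illustration, and no particular result is implied.

```python
import random
import string
import timeit

letters = string.ascii_lowercase


def key_with_choice(length):
    return ''.join(random.choice(letters) for _ in range(length))


def key_with_randint(length):
    # randint includes both endpoints, hence len(letters) - 1.
    return ''.join(letters[random.randint(0, len(letters) - 1)] for _ in range(length))


t_choice = timeit.timeit(lambda: key_with_choice(8), number=1000)
t_randint = timeit.timeit(lambda: key_with_randint(8), number=1000)
print("choice: %.4f s, randint: %.4f s" % (t_choice, t_randint))
```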

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
I guess with the actual dataset I'll be able to improve the memory usage a bit, with BioPython::trie. That would probably be enough optimization to continue working with some comfort. On this test code BioPython::trie gives a bit of improvement in terms of memory. Not much though... d = dict()

Ignoring XML Namespaces with cElementTree

2010-04-27 Thread dmtr
Is there any way to configure cElementTree to ignore the XML root namespace? Default cElementTree (Python 2.6.4) appears to add the XML root namespace URI to _every_ single tag. I know that I can strip URIs manually, from every tag, but it is a rather idiotic thing to do (performance-wise).

Re: Ignoring XML Namespaces with cElementTree

2010-04-29 Thread dmtr
I'm referring to xmlns/URI prefixes. Here's a code example: from xml.etree.cElementTree import iterparse from cStringIO import StringIO xml = """<root xmlns="http://www.very_long_url.com"><child/></root>""" for event, elem in iterparse(StringIO(xml)): print event, elem The output is: end Element
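
The behaviour described here can be reproduced on Python 3 with `xml.etree.ElementTree` (cElementTree and cStringIO no longer exist there). The XML string below is a reconstruction of the example, since the archive stripped its markup, so treat its exact form as an assumption.

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

xml = b'<root xmlns="http://www.very_long_url.com"><child/></root>'

tags = []
for event, elem in iterparse(BytesIO(xml)):
    # Only "end" events are reported by default.
    tags.append((event, elem.tag))
    print(event, elem.tag)
```

Every tag comes back in the `{uri}name` form, which is the prefixing the post refers to.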

Re: Ignoring XML Namespaces with cElementTree

2010-04-30 Thread dmtr
I think that's your main mistake: don't remove them. Instead, use the fully qualified names when comparing. Stefan Yes. That's what I'm forced to do. Pre-calculating tags like tagChild = "{%s}child" % uri and using them instead of "child". As a result the code looks ugly and there is extra
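
A short sketch of the pre-calculated tag approach, next to the prefix mapping that `find` and `findall` accept. It is written for Python 3; the two-child document and the prefix `ns` are assumptions made for illustration.

```python
from xml.etree.ElementTree import fromstring

uri = "http://www.very_long_url.com"
tag_child = "{%s}child" % uri   # built once, reused for every comparison

root = fromstring('<root xmlns="%s"><child/><child/></root>' % uri)
children = [elem for elem in root if elem.tag == tag_child]

# find/findall also take a prefix-to-URI mapping, which keeps the paths short.
same_children = root.findall("ns:child", {"ns": uri})
print(len(children), len(same_children))
```

Neither form removes the namespace; both only move the URI out of the comparison sites.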

Re: Ignoring XML Namespaces with cElementTree

2010-04-30 Thread dmtr
Here's a link to the patch exposing this parameter: http://bugs.python.org/issue8583 -- http://mail.python.org/mailman/listinfo/python-list

Re: Ignoring XML Namespaces with cElementTree

2010-05-01 Thread dmtr
Unless you have multiple namespaces or are working with a defined schema or something, it's useless boilerplate. It'd be a nice feature if ElementTree could let users optionally ignore a namespace; unfortunately, it doesn't have it. Yep. Exactly my point. Here's a link to the patch addressing

Re: Parser

2010-05-02 Thread dmtr
On May 2, 12:54 pm, Andreas Löscher andreas.loesc...@s2005.tu-chemnitz.de wrote: Hi, I am looking for an easy to use parser. I want to get an overview of parsing and want to try to get some information out of a C header file. Which parser would you recommend? ANTLR

Re: Parser

2010-05-02 Thread dmtr
ANTLR I don't know if it's that easy to get started with though. The companion for-pay book is *most excellent*, but it seems to have been written to the detriment of the normal online docs. Cheers, Chris --http://blog.rebertia.com IMO ANTLR is much easier to use compared to any other

A python interface to google-sparsehash?

2010-05-04 Thread dmtr
Does anybody know if a Python sparsehash module exists in the wild?

An empty object with dynamic attributes (expando)

2010-06-03 Thread dmtr
How can I create an empty object with dynamic attributes? It should be something like: m = object() m.myattr = 1 But this doesn't work. And I have to resort to: class expando(object): pass m = expando() m.myattr = 1 Is there a one-liner that would do the thing? -- Cheers, Dmitry
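
Two possible one-liners, offered as a sketch, not as the answer given in the thread. `types.SimpleNamespace` exists only on Python 3.3 and later, so it was not available on the Python 2.6 of the time; the `type()` form works on both.

```python
from types import SimpleNamespace

# Python 3.3+: an attribute bag from the standard library.
m = SimpleNamespace()
m.myattr = 1

# Works on Python 2 as well: build a throwaway class with type() and instantiate it.
n = type('expando', (object,), {})()
n.myattr = 1

# A bare object() has no __dict__, which is why the first attempt fails.
try:
    object().myattr = 1
except AttributeError as exc:
    print("object() rejects new attributes:", exc)

print(m.myattr, n.myattr)
```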

Re: getting MemoryError with dicts; suspect memory fragmentation

2010-06-03 Thread dmtr
On Jun 3, 3:43 pm, Emin.shopper Martinian.shopper emin.shop...@gmail.com wrote: Dear Experts, I am getting a MemoryError when creating a dict in a long running process and suspect this is due to memory fragmentation. Any suggestions would be welcome. Full details of the problem are below. I

Re: getting MemoryError with dicts; suspect memory fragmentation

2010-06-03 Thread dmtr
I have a long running process which eventually dies with a MemoryError exception. When it dies, it is using roughly 900 MB on a 4 GB Windows XP machine running Python 2.5.4. If I do import pdb; BTW have you tried the same code with Python 2.6.5? -- Dmitry

Re: getting MemoryError with dicts; suspect memory fragmentation

2010-06-03 Thread dmtr
I'm still unconvinced that it is a memory fragmentation problem. It's very rare. Can you give a more concrete example that one can actually try to execute? Like: python -c "list([list([0]*xxx)+list([1]*xxx)+list([2]*xxx)+list([3]*xxx) for xxx in range(10)])" -- Dmitry

Re: An empty object with dynamic attributes (expando)

2010-06-04 Thread dmtr
Why does it have to be a one-liner? Is the Enter key on your keyboard broken? Nah. I was simply looking for something natural and intuitive, like: m = object(); m.a = 1; Usually Python is pretty good at providing these natural and intuitive solutions. You have a perfectly good solution: define

Re: An empty object with dynamic attributes (expando)

2010-06-05 Thread dmtr
Right. m = lambda:expando m.myattr = 1 print m.myattr 1 -- Cheers, Dmitry

Re: An empty object with dynamic attributes (expando)

2010-06-10 Thread dmtr
On Jun 9, 7:31 pm, a...@pythoncraft.com (Aahz) wrote: dmtr  dchich...@gmail.com wrote: m = lambda:expando m.myattr = 1 print m.myattr 1 That's a *great* technique if your goal is to confuse people. -- Yeah. But it is kinda cute. Let's hope it won't get adapted (adopted ;). -- Dmitry
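
For readers puzzled by the lambda version, here is a sketch of why it works. Function objects have their own `__dict__`, so the attribute is stored on the lambda itself; the `expando` class is never instantiated or modified. The class definition is repeated here only to make the example self-contained.

```python
class expando(object):
    pass


m = lambda: expando   # m is a function object; it is never called here
m.myattr = 1          # the attribute lands in the function's own __dict__

print(m.myattr)
print(m.__dict__)
print(hasattr(expando, 'myattr'))
```

Any function would do in place of the lambda, which is part of what makes the idiom confusing to read.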