Hi Jacob,

First of all, please remember to subscribe to this list before posting.  
Otherwise, your messages risk getting lost among the tons of spam that 
we all (not only me) are unfortunately receiving.

On Sunday 29 March 2009, you wrote:
> Subject: Complex query does not compile
> From: Jacob Quinn Shenker <jqshen...@gmail.com>
> To: pytables-users@lists.sourceforge.net
> Date: 09-03-29 17:58:51
>    
> Hi,
>
> I'm evaluating PyTables Pro at the moment, and in my testing I find
> that it cannot handle very complex queries. Unfortunately, this is
> what my application requires. I have a table sorted on-disk on a CSI
> index ('IPIX'), and I need to select many intervals on it.
> Unfortunately, Numexpr complains:
>
> s_push: parser stack overflow
> Traceback (most recent call last):
>   File "bench_table.py", line 39, in <module>
>     main()
>   File "bench_table.py", line 27, in main
>     coords = table.getWhereList(q, {})
>   File "/u/gl/jshenker/lib/python/tables/table.py", line 1309, in
> getWhereList self._where(condition, condvars, start, stop, step) ]
>   File "/u/gl/jshenker/lib/python/tables/table.py", line 1218, in
> _where self, compiled, condvars, start, stop, step)
>   File "/u/gl/jshenker/lib/python/tables/_table_pro.py", line 202, in
> _table__whereIndexed
>     chunkmap = numexpr.evaluate(strexpr, cmvars)
>   File "/u/gl/jshenker/lib/python/tables/numexpr/compiler.py", line
> 590, in evaluate
>     _names_cache[expr_key] = getExprNames(ex, context)
>   File "/u/gl/jshenker/lib/python/tables/numexpr/compiler.py", line
> 565, in getExprNames
>     ex = stringToExpression(text, {}, context)
>   File "/u/gl/jshenker/lib/python/tables/numexpr/compiler.py", line
> 198, in stringToExpression
>     c = compile(s, '<expr>', 'eval')
> MemoryError

That's curious.  The offending function, `compile()`, is one that 
belongs to the Python core, and it seems to work well on my machine for 
arbitrarily large expressions:

In [43]: l = ["((IPIX >= %d) & (IPIX <= %d)) | " % (i, i+1) for i in range(1000)]
In [44]: s = "".join(l)[:-3]
In [45]: compile(s, '<expr>', 'eval')
Out[45]: <code object <module> at 0x8ac2530, file "<expr>", line 1>

so I suppose that the problem lies somewhere other than Python's 
`compile()` function.  Which version of Python and which operating 
system are you working with?  And how much RAM do you have available?
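
If it helps to narrow things down, here is a minimal sketch of a check you 
could run on your side.  It builds the same kind of expression as in my 
session above for a few sizes, hands each one to `compile()`, and prints 
the Python version and platform.  The helper names `build_query` and 
`try_compile` are just made up for this illustration (they are not part of 
PyTables or Numexpr), and the sketch assumes Python 3, where 
`RecursionError` exists:

```python
import platform
import sys


def build_query(n_intervals, column="IPIX"):
    """Build an OR of n_intervals closed-interval conditions on one column."""
    terms = ["((%s >= %d) & (%s <= %d))" % (column, i, column, i + 1)
             for i in range(n_intervals)]
    return " | ".join(terms)


def try_compile(n_intervals):
    """Return True if compile() accepts an expression with n_intervals terms."""
    try:
        compile(build_query(n_intervals), "<expr>", "eval")
    except (MemoryError, RecursionError, SyntaxError) as exc:
        # Report which kind of failure was hit, if any.
        print("%d intervals: %s" % (n_intervals, type(exc).__name__))
        return False
    return True


if __name__ == "__main__":
    # Version and platform details are useful when reporting the problem.
    print(sys.version)
    print(platform.platform())
    for n in (10, 100, 1000):
        print(n, try_compile(n))
```

If some size fails on your machine but not on mine, that would point to 
the interpreter build or the platform rather than to PyTables itself.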

> I have two questions:
> 1) How do I get this to work?

To answer this, it would help if you could send a small script that 
reproduces the problem.

> 2) Is there a better/faster way to do this selection of many
> intervals on indices with PyTables?

Well, you may want to try doing the 'and' selections one by one, and 
then joining all the results together.  Something like:

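l = []
for l1, l2 in zip(lim_inf, lim_sup):
    # l1 and l2 are picked up from the local namespace as condition variables;
    # each pass appends the list of rows found in one interval
    l.append([r[:] for r in tbl.where("(IPIX >= l1) & (IPIX <= l2)")])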

`lim_inf` and `lim_sup` being sequences like:

lim_inf = [4, 3, ...]
lim_sup = [10, 12, ...]

and in `l` you would have the results you want.  Provided that the 
expression above can be cached, this should be reasonably fast.
Would that work with your setup?
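
In case a flat list of rows is more convenient than one list per interval, 
here is a small sketch of the same idea with the query factored out.  The 
names `select_intervals`, `query_one` and `fake_query` are hypothetical, 
and the stand-in query filters a plain Python list so that the snippet runs 
without PyTables installed:

```python
from itertools import chain


def select_intervals(query_one, lim_inf, lim_sup):
    """Run one range query per (low, high) pair and chain all the results."""
    if len(lim_inf) != len(lim_sup):
        raise ValueError("lim_inf and lim_sup must have the same length")
    return list(chain.from_iterable(
        query_one(lo, hi) for lo, hi in zip(lim_inf, lim_sup)))


# With PyTables, query_one could simply wrap the call from the loop above:
#
#     def query_one(l1, l2):
#         return [r[:] for r in tbl.where("(IPIX >= l1) & (IPIX <= l2)")]

# Stand-in for a table column, used only to make this sketch runnable.
ipix = list(range(20))


def fake_query(lo, hi):
    """Return the values of the stand-in column that fall in [lo, hi]."""
    return [v for v in ipix if lo <= v <= hi]


rows = select_intervals(fake_query, [2, 10], [4, 12])
```

Note that overlapping intervals would return the same rows more than once, 
so you may want to merge such intervals before querying.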

Cheers,

-- 
Francesc Alted

"One would expect people to feel threatened by the 'giant
brains or machines that think'.  In fact, the frightening
computer becomes less frightening if it is used only to
simulate a familiar noncomputer."

-- Edsger W. Dijkstra
   "On the cruelty of really teaching computer science"

------------------------------------------------------------------------------
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
