Thanks for the reply. I will put some investigation of C++ access on my list
of items to look at over the slow holiday season.

For the short term we will store a C++-ready index as a different table object
in the same h5 file. It will work, just at the cost of some wasted disk space.

One follow-up question: why would the performance of
    for row in node.where('stringField == "SomeString"'):
not be noticeably faster than
    for row in node:
        if row['stringField'] == "SomeString":

Specifically, when there is no index. I understand, and see, the speed
improvement only when I have an index. I expected to see some benefit from
numexpr even with no index, so I expected node.where() to be much faster. What
I see is identical performance. Is the numexpr benefit only seen for complex
math like (floatField ** intField > otherFloatField)? I did not see that to be
the case on my first attempt. It seems that I only benefit from an index.
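
One way to check this observation is to time both loops on a synthetic,
unindexed table. The sketch below is not from the thread. It assumes the
PyTables 3.x snake_case API (open_file, create_table), an in-memory HDF5 file,
and a made-up Record description; only stringField, floatField and the shape of
the two loops come from the question. The target string is passed through
condvars as bytes, which is an assumption about running under Python 3.
Timings depend on the machine and the table, so none are shown.

```python
import time

import numpy as np
import tables


class Record(tables.IsDescription):
    # Hypothetical row layout; only the field names echo the question.
    stringField = tables.StringCol(16)
    floatField = tables.Float64Col()


def build_table(h5, nrows=200_000):
    """Create an unindexed table where every 100th row holds b"SomeString"."""
    table = h5.create_table("/", "node", Record)
    data = np.zeros(nrows, dtype=table.dtype)
    is_hit = np.arange(nrows) % 100 == 0
    data["stringField"] = np.where(is_hit, b"SomeString", b"Other")
    data["floatField"] = np.arange(nrows)
    table.append(data)
    table.flush()
    return table


def query_in_kernel(table):
    """Row numbers selected by table.where() (no index has been created)."""
    hits = table.where("stringField == target",
                       condvars={"target": b"SomeString"})
    return [row.nrow for row in hits]


def query_python_loop(table):
    """Row numbers selected by iterating every row and testing in Python."""
    return [row.nrow for row in table if row["stringField"] == b"SomeString"]


def timed(func, table):
    """Return (elapsed seconds, result) for a single call."""
    start = time.perf_counter()
    result = func(table)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    # H5FD_CORE with no backing store keeps the file in memory only.
    h5 = tables.open_file("where_demo.h5", "w", driver="H5FD_CORE",
                          driver_core_backing_store=0)
    table = build_table(h5)
    for func in (query_in_kernel, query_python_loop):
        elapsed, result = timed(func, table)
        print(f"{func.__name__}: {len(result)} hits in {elapsed:.3f} s")
    h5.close()
```

Running the same script against a copy of the real table, with and without an
index on the column, would show whether the result holds outside this toy data.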


From: Anthony Scopatz [mailto:scop...@gmail.com]
Sent: Friday, November 09, 2012 12:24 AM
To: Discussion list for PyTables
Subject: Re: [Pytables-users] pyTable index from c++


On Thu, Nov 8, 2012 at 10:19 PM, Jim Knoll <jim.kn...@spottradingllc.com> wrote:
I love the index function and promote the internal use of PyTables at my
company. The availability of an indexed method to speed the search is the main
reason why.

We are a mixed shop, using C++ to create H5 files (just for the raw speed; we
need to keep up with streaming data). End users start with Python and PyTables
to consume the data, often after we have created indexes from Python with
pytables.col.col1.createIndex().

Sometimes the users come up with something we want to do thousands of times,
and performance is critical. Then we fall back to C++. We can use our own index
method, but we would like to make double use of the PyTables index.

I know the Python table.where() is implemented in C.

Hi Jim,

This is only kind of true. Querying (i.e. all of the where*() methods) is
actually mostly written in Python in the tables.py and expressions.py files.
However, these methods make use of numexpr [1].

Is there a way to access that from C or C++? I don't mind if I need to do
work to get the result; I think in my case the work may be worth it.

PLAN 1: One possibility rests on the fact that these parts of PyTables are
written in Python. We could maybe try (without making any edits to these
files) to compile them with Cython. This has the advantage that for Cython
files, if you write the appropriate C++ header file and link against the
shared library correctly, it is possible to access certain functions from
C/C++, so you would basically get access to these functions acting on tables
from C++. BUT, I am not sure how much of a speed boost you would get out of
this, since you would still be calling out to the Python interpreter to get
these results. You would just be calling Python's virtual machine from C++
rather than calling it from Python (like normal).

PLAN 2: Alternatively, numexpr itself is mostly written in C++ already.  You 
should be able to call core numexpr functions directly.  However, you would 
have to feed it data that you read from the tables yourself.  These could even 
be table indexes.  On a personal note, if you get code working that does this, 
I would be interested in seeing your implementation.  (I have another project 
where I have tables that I want to query from C++)
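
Before moving Plan 2 to C++, the idea can be prototyped from Python: read
blocks of rows yourself and hand the columns to numexpr directly. This is a
minimal sketch under stated assumptions, not PyTables' own query path. A NumPy
structured array stands in for the rows a table read would return, and
matching_rows, make_demo_records, chunk_size and the sample values are all
hypothetical. The expression is the one quoted earlier in this message.

```python
import numexpr as ne
import numpy as np


def make_demo_records():
    """Hypothetical stand-in for rows read from a table."""
    dtype = [("floatField", "f8"), ("intField", "i8"),
             ("otherFloatField", "f8")]
    records = np.zeros(4, dtype=dtype)
    records["floatField"] = [1.5, 2.0, 3.0, 0.5]
    records["intField"] = [2, 3, 2, 4]
    records["otherFloatField"] = [2.0, 10.0, 5.0, 1.0]
    return records


def matching_rows(records, chunk_size=65536):
    """Return indices of rows satisfying the condition, one chunk at a time."""
    hits = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        # Copy each column out of the structured array so numexpr sees
        # plain contiguous 1-D arrays.
        columns = {
            name: np.ascontiguousarray(chunk[name])
            for name in ("floatField", "intField", "otherFloatField")
        }
        mask = ne.evaluate("floatField ** intField > otherFloatField",
                           local_dict=columns)
        # Shift chunk-local positions back to positions in the full array.
        hits.append(np.nonzero(mask)[0] + start)
    if not hits:
        return np.empty(0, dtype=np.int64)
    return np.concatenate(hits)


if __name__ == "__main__":
    print(matching_rows(make_demo_records(), chunk_size=2))
```

Processing in chunks keeps memory bounded for large tables; a C++ version
would follow the same read-evaluate-collect loop against numexpr's core.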

Let us know what route you ultimately end up taking or if you have any further 
questions!

Be Well
Anthony

1. http://code.google.com/p/numexpr/source/browse/#hg%2Fnumexpr


________________________________

    Jim Knoll
     Data Developer

     Spot Trading L.L.C
     440 South LaSalle St., Suite 2800
     Chicago, IL 60605
     Office: 312.362.4550
     Direct: 312-362-4798
     Fax: 312.362.4551
     jim.kn...@spottradingllc.com
     www.spottradingllc.com




------------------------------------------------------------------------------
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics
Download AppDynamics Lite for free today:
http://p.sf.net/sfu/appdyn_d2d_nov
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
