Thanks for the reply. I will put some investigation of C++ access on my list
of items to look at over the slow holiday season.
For the short term we will store a C++-ready index as a different table object
in the same h5 file. It will work... just a bit of a waste of disk space.
One follow-up question:
Why would the performance of
    for row in node.where('stringField == "SomeString"'):
not be noticeably faster than
    for row in node:
        if row['stringField'] == "SomeString":
specifically when there is no index? I understand and see the speed
improvement only when I have an index. I expected to see some benefit from
numexpr even with no index, so I expected node.where() to be much faster. What I
see is identical performance. Is the numexpr benefit only seen for complex math
like (floatField ** intField > otherFloatField)? I did not see that to be the
case on my first attempt... It seems that I only benefit from an index.
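For reference, the comparison described above can be reproduced with a small sketch like the following. The file name, the table name "node", the row count, and the extra float column are made up for illustration, and the sketch assumes the current snake_case PyTables API (open_file, create_table) with byte-string literals in the condition, rather than the camelCase names in use in 2012. No index is created, so both loops scan the whole table.

```python
import os
import tempfile
import time

import tables


class Record(tables.IsDescription):
    stringField = tables.StringCol(16)   # fixed-width byte-string column
    floatField = tables.Float64Col()


def build_table(path, nrows):
    """Write a hypothetical test table with no index on any column."""
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table("/", "node", Record)
        row = table.row
        for i in range(nrows):
            # Every 100th row holds the value we will search for.
            row["stringField"] = b"SomeString" if i % 100 == 0 else b"Other"
            row["floatField"] = float(i)
            row.append()
        table.flush()


def count_with_where(table):
    # In-kernel query: the condition string is evaluated chunk by chunk.
    return sum(1 for _ in table.where('stringField == b"SomeString"'))


def count_with_loop(table):
    # Plain iteration: every row is handed to Python and compared there.
    return sum(1 for row in table if row["stringField"] == b"SomeString")


def compare(nrows=200_000):
    """Return {name: (number of hits, elapsed seconds)} for both approaches."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "where_vs_loop.h5")
        build_table(path, nrows)
        with tables.open_file(path, mode="r") as h5:
            table = h5.root.node
            for name, func in (("where", count_with_where),
                               ("loop", count_with_loop)):
                start = time.perf_counter()
                hits = func(table)
                results[name] = (hits, time.perf_counter() - start)
    return results


if __name__ == "__main__":
    for name, (hits, seconds) in compare().items():
        print(f"{name}: {hits} hits in {seconds:.3f} s")
```

Both functions should report the same number of hits; the timings will depend on the machine, the chunk size, and how many rows match, so no particular ratio should be assumed from this sketch.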
From: Anthony Scopatz [mailto:scop...@gmail.com]
Sent: Friday, November 09, 2012 12:24 AM
To: Discussion list for PyTables
Subject: Re: [Pytables-users] pyTable index from c++
On Thu, Nov 8, 2012 at 10:19 PM, Jim Knoll
<jim.kn...@spottradingllc.com> wrote:
I love the index function and promote the internal use of PyTables at my
company. The availability of an indexed method to speed up searches is the main
reason why.
We are a mixed shop, using C++ to create H5 files (just for the raw speed; we
need to keep up with streaming data). End users start with Python and PyTables
to consume the data (often after we have created indexes from Python with
table.cols.col1.createIndex()).
Sometimes the users come up with something we want to do thousands of times, and
performance is critical, so we fall back to C++. We can use our
own index method but would like to make double use of the PyTables index.
I know the Python table.where() is implemented in C.
Hi Jim,
This is only kind of true. Querying (i.e. all of the where*() methods) is
actually mostly written in Python, in the tables.py and expressions.py files.
However, these methods make use of numexpr [1].
Is there a way to access that from C or C++? I don't mind if I need to do
some work to get the result; I think in my case the work may be worth it.
PLAN 1: One possibility is that parts of PyTables are already written in Cython.
We could maybe try to compile the pure-Python query files with Cython as well
(without making any edits to these files). This has the advantage that for
Cython files, if you write the appropriate C++ header file and link against the
shared library correctly, it is possible to access certain functions from C/C++,
so you would basically get access to these functions acting on tables from C++.
BUT, I am not sure how much of a speed boost you would get out of this, since
you would still be calling out to the Python interpreter to get these results.
You would just be calling Python's virtual machine from C++ rather than calling
it from Python (like normal).
PLAN 2: Alternatively, numexpr itself is mostly written in C++ already. You
should be able to call core numexpr functions directly. However, you would
have to feed it data that you read from the tables yourself. These could even
be table indexes. On a personal note, if you get code working that does this,
I would be interested in seeing your implementation. (I have another project
where I have tables that I want to query from C++)
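The data flow of PLAN 2 (read the rows yourself, then hand the columns to numexpr) can be sketched from Python before attempting it in C++. This is only an illustration under assumptions: the helper names query_chunks and demo are hypothetical, the chunks are built by hand here, and in real use they would be the NumPy structured arrays returned by reading slices of a table. It uses numexpr's Python entry point, not its C++ internals.

```python
import numexpr as ne
import numpy as np


def query_chunks(chunks, condition):
    """Yield, for each structured-array chunk, the rows matching `condition`."""
    for chunk in chunks:
        # Expose every field as a contiguous array under its column name,
        # so the condition string can refer to columns directly.
        columns = {name: np.ascontiguousarray(chunk[name])
                   for name in chunk.dtype.names}
        mask = ne.evaluate(condition, local_dict=columns)
        yield chunk[mask]


def demo():
    """Run the query over a tiny hand-made dataset split into two chunks."""
    dtype = [("floatField", "f8"), ("intField", "i8"),
             ("otherFloatField", "f8")]
    data = np.array([(1.0, 2, 0.5), (2.0, 2, 5.0),
                     (3.0, 2, 8.0), (4.0, 2, 20.0)], dtype=dtype)
    # Stand-in for reading the table slice by slice.
    chunks = (data[i:i + 2] for i in range(0, len(data), 2))
    condition = "floatField ** intField > otherFloatField"
    return np.concatenate(list(query_chunks(chunks, condition)))


if __name__ == "__main__":
    print(demo())
```

Processing chunk by chunk keeps memory bounded, which is the same reason one would read the table in slices when doing this from C++.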
Let us know what route you ultimately end up taking or if you have any further
questions!
Be Well
Anthony
1. http://code.google.com/p/numexpr/source/browse/#hg%2Fnumexpr
________________________________
Jim Knoll
Data Developer
Spot Trading L.L.C
440 South LaSalle St., Suite 2800
Chicago, IL 60605
Office: 312.362.4550
Direct: 312-362-4798
Fax: 312.362.4551
jim.kn...@spottradingllc.com
www.spottradingllc.com
________________________________
------------------------------------------------------------------------------
Everyone hates slow websites. So do we.
Make your web apps faster with AppDynamics
Download AppDynamics Lite for free today:
http://p.sf.net/sfu/appdyn_d2d_nov
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users