Re: [Pytables-users] Speed of in-kernel Full-Table Search

Wagner Sebastian Tue, 25 Jun 2013 00:37:44 -0700

Hi Anthony and Antonio,

Thanks for your fast responses. It's great to hear that all features are now 
free to use, though it took me a week and a half to find that out.


The first reference I read to learn how to use PyTables was Hints for SQL 
Users [1], which states several times, for example in the section 'Creating 
an index':
> Indexing is supported in the commercial version of PyTables (PyTablesPro).
I would suggest updating these texts.
Having read this so often, I was convinced that indexing was only available in 
the Pro version, so I also overlooked the warning on the PyTables Pro page [2]. 
(As I was only interested in the features not available in the free version, I 
just scrolled down immediately, skimming the page.) So my next suggestion is to 
highlight the warning text there in color :)

[1]
http://www.pytables.org/moin/HintsForSQLUsers#Creatinganindex
http://www.pytables.org/moin/HintsForSQLUsers#Selectingdata
[2]
http://www.pytables.org/moin/PyTablesPro

regards,
Sebastian

On Mon, Jun 24, 2013 at 4:25 AM, Wagner Sebastian
<sebastian.wagner...@ait.ac.at> wrote:

> Dear PyTables-Users,
>
> For testing purposes I use a PyTables DB with 4 columns (1x UInt8 and
> 3x Float) and 750k rows; the total file size is about 90 MB. As the free
> version does not support indexing, I thought that a full-table search
> on this database would take at least one or two seconds, because the
> file has to be loaded first (I/O bottleneck), and only then can the
> search over ~20k rows begin. But PyTables took only 0.05 seconds for a
> full-table search (in-kernel, so near C speed, but nevertheless a full
> table scan), while my bisecting algorithm with a precomputed sorted list
> wrapped around PyTables (but saved in there) took about 0.5
> seconds.
>
> So the thing I don't understand: How can PyTables be so fast without
> any indexing?
>

Hi Sebastian,

First, there is no longer a non-free version of PyTables, and v3.0 *does* have 
indexing capabilities.  However, you have to enable them explicitly, so you 
probably weren't using them.
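
For illustration, here is a minimal sketch of what enabling an index looks like with the PyTables 3.x method names. The file name, the `Reading` layout, and the query threshold are made up for the example and are not taken from Sebastian's setup.

```python
import os
import tempfile

import numpy as np
import tables  # PyTables 3.x naming (create_index, read_where)


class Reading(tables.IsDescription):
    # Hypothetical layout: one UInt8 column and three float columns.
    kind = tables.UInt8Col()
    x = tables.Float64Col()
    y = tables.Float64Col()
    z = tables.Float64Col()


path = os.path.join(tempfile.mkdtemp(), "index_demo.h5")
rng = np.random.default_rng(0)

with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "readings", Reading)

    # Fill the table with random floats; 'kind' stays 0.
    rows = np.zeros(10_000, dtype=table.dtype)
    for name in ("x", "y", "z"):
        rows[name] = rng.random(len(rows))
    table.append(rows)
    table.flush()

    indexed_before = table.cols.x.is_indexed
    table.cols.x.create_index()  # indexes are opt-in, one column at a time
    indexed_after = table.cols.x.is_indexed

    # The query string is the same whether or not the column is indexed.
    hits = table.read_where("x > 0.99")
    expected = int((rows["x"] > 0.99).sum())
```

Without the `create_index()` call the same `read_where` still works; it just scans the whole table.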

PyTables is fast because HDF5 is a binary format, it uses pthreads under the 
covers to parallelize some tasks, and it uses numexpr (which is also
parallel) to evaluate many expressions.  All of these things help make PyTables 
great!
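
To make the difference concrete, the sketch below runs the same full-table selection twice: once as an in-kernel query, where the condition string is handed to PyTables and evaluated by numexpr, and once as a plain Python loop over the rows. The table layout, row count, and condition are assumptions chosen for the example; no index is created, so both variants are full scans.

```python
import os
import tempfile
import time

import numpy as np
import tables

# Hypothetical data: one UInt8 column and three float columns.
dtype = np.dtype([("kind", "u1"), ("x", "f8"), ("y", "f8"), ("z", "f8")])
rng = np.random.default_rng(0)
data = np.zeros(50_000, dtype=dtype)
for name in ("x", "y", "z"):
    data[name] = rng.random(len(data))

path = os.path.join(tempfile.mkdtemp(), "scan_demo.h5")

with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "readings", description=dtype)
    table.append(data)
    table.flush()

    # In-kernel: the condition is evaluated in compiled code, chunk by chunk.
    t0 = time.perf_counter()
    in_kernel = table.read_where("(x > 0.9) & (y < 0.1)")
    t_in_kernel = time.perf_counter() - t0

    # Regular: every row is handed to the Python interpreter for the test.
    t0 = time.perf_counter()
    regular = [row["x"] for row in table if row["x"] > 0.9 and row["y"] < 0.1]
    t_regular = time.perf_counter() - t0

print(f"in-kernel: {t_in_kernel:.4f} s, Python loop: {t_regular:.4f} s")
```

Both variants select the same rows. The timings depend on the machine and on caching, so none are quoted here.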

Be Well
Anthony


On 24/06/2013 11:25, Wagner Sebastian wrote:
> Dear PyTables-Users,
> 
> For testing purposes I use a PyTables DB with 4 columns (1x UInt8 and 
> 3x Float) and 750k rows; the total file size is about 90 MB. As the free 
> version does not support indexing, I thought that a full-table search on 
> this database would take at least one or two seconds, because the file has 
> to be loaded first (I/O bottleneck), and only then can the search over ~20k 
> rows begin. But PyTables took only 0.05 seconds for a full-table search 
> (in-kernel, so near C speed, but nevertheless a full table scan), while my 
> bisecting algorithm with a precomputed sorted list wrapped around PyTables 
> (but saved in there) took about 0.5 seconds.
> 
> So the thing I don't understand: How can PyTables be so fast without any 
> indexing?
> 
> I'm using 3.0.0rc2 coming with WinPython
> 
> Regards,
> Sebastian

The indexing features of PyTables Pro have been available in the open source 
version of PyTables since version 2.3 (please see [1]).



[1]
http://pytables.github.io/release-notes/RELEASE_NOTES_v2.3.x.html#changes-from-2-2-1-to-2-3

ciao

--
Antonio Valentino
