Hi List,

After several months of intense hacking, I'm happy to announce the
availability of PyTables 2.1 beta3 as well as PyTables Pro 2.1 beta3.
This will probably be the last beta before a release candidate is
announced later this month. I don't plan to change the API any further
before the final release unless I have a compelling reason to do so.

You can get the software from:

http://www.pytables.org/download/preliminary/tables-2.1b3.tar.gz

For those with a Pro license, I've dropped the tarball:

tables-2.1b3.devpro.tar.gz

in their regular download areas.

Please note that, as this is a beta release, you will have to compile
the beast yourself. Also, bear in mind that this is a beta-quality
release and is not suitable for production use (I've tested it only on
32-bit and 64-bit Linux, not on Windows or Mac OS X).

I'd be very grateful if people could get their hands on this release
and act as beta testers. Finally, I'm taking some holidays for the rest
of the week, so expect response times to match ;-)

And now, what's new in 2.1b3:

=======================================
 Release notes for PyTables 2.1 series
=======================================

:Author: Francesc Alted i Abad
:Contact: [EMAIL PROTECTED]
:Author: Ivan Vilata i Balaguer
:Contact: [EMAIL PROTECTED]


Changes from 2.0.4 to 2.1b3
===========================

Main improvements
-----------------

- Opening a node is now done directly (i.e. without first populating
  all of its parent directories). So, for opening group and leaf
  locations that are known in advance, the new code is substantially
  faster (in fact, the cost of these operations is now O(1)).

- The `EArray.truncate()` method has been generalized and implemented
  as `Leaf.truncate()`. It is now possible to truncate all
  *enlargeable* datasets (i.e. all except `Array` and `CArray`
  objects). Fixes #174.

- Disabling the LRU node cache is now supported by setting
  NODE_MAX_SLOTS (in parameters.py) to 0 (this can also be achieved
  through the `nodeCacheSize` parameter of the openFile() function).
  Disabling this cache may be useful in situations where you suspect
  that maintaining an LRU node cache is actually reducing performance.
  This value can also be negative, meaning that all the touched nodes
  will be kept in an internal dictionary.
  See more info about these features in the updated "Getting the most
  from the node LRU cache" section of chapter 5 of the User's Guide.

Main improvements (Pro edition)
-------------------------------

- New light indexes that can take up to 4x less space than 2.0
  indexes, and more than 15x less space than indexes in traditional
  databases. Four levels of index "lightness", namely ``ultralight``,
  ``light``, ``medium`` and ``full`` (the latter being the kind
  implemented in version 2.0), are available so that the user can
  choose the most appropriate one for her needs.

- The index query code has been completely revamped and is now based
  on the concept of chunkmaps. This allows for a much more effective
  way to retrieve table data in queries that have low selectivity,
  while retaining good performance for high selectivity ones.

- A new query optimizer that can use several indexes simultaneously in
  a broad range of complex queries. For example, in the query::

    (((c_int32 == 3) | (c_bool == True)) & (c_int32 == 5)) & (c_extra > 0)

  if the ``c_int32`` and ``c_bool`` columns are indexed but ``c_extra``
  is not, both the ``c_int32`` and ``c_bool`` indexes will be used.
  That will greatly improve the response times of potentially
  complicated queries.

- An additional optimization in the index creation process makes it
  possible to build completely sorted indexes (CSI). These not only
  give better query performance, but also allow the creation of
  completely sorted tables ordered by a specific field.

API additions from 2.0.4 to 2.1b3
---------------------------------

- The `AttributeSet` class has received the following dictionary-like
  methods: `__getitem__()`, `__setitem__()` and `__delitem__()`, so
  that you can do things like::

    for name in node._v_attrs._f_list():
        print "name: %s, value: %s" % (name, node._v_attrs[name])

- New `File.fileno()` method added. It returns the underlying OS file
  descriptor for the file. This is meant to allow `File` objects to
  interact better with the `fcntl` module.

- A new `chunkshape` argument has been added to `Leaf.copy()` so that
  a chunkshape can be specified for the copy. It can also take the
  special values 'auto' (compute a sensible value) and 'keep' (keep
  the original value, which is the default).

- Added a new '--chunkshape' flag to the `ptrepack` console command
  that corresponds to the new `chunkshape` argument of `Leaf.copy()`.

- `File.copyNode()` can now copy complete hierarchies directly from
  the root. This can be useful when one wants to create a new file by
  merging the contents of others.

API additions from 2.0.4 to 2.1b3 (Pro edition)
-----------------------------------------------

- A new `Table.itersorted()` iterator makes it possible to iterate
  over a table following the order of a certain index. It supports
  iteration over ranges, including negative steps (i.e. reverse sorted
  order).

- New `Table.readSorted()` method that can read a table following the
  order of a certain index. It supports reading ranges, including
  negative steps (i.e. reverse sorted order).

- New `Table.colindexes` property that returns a dictionary with the
  indexes of the indexed columns in the table.

- A new `sortby` argument has been added to `Table.copy()`, allowing a
  table to be sorted during the copy operation.

- Added a new `propindexes` argument to `Table.copy()`. If true, the
  indexes in the source table are propagated (created) to the new
  table. If false (the default), the indexes are not propagated.

- New public `Index.readSorted()` and `Index.readIndices()` methods
  that allow direct access to the index data.

- Added new '--sortby' (sort a table by a column key), '--forceCSI'
  (force the creation of a CSI index) and '--propindexes' (propagate
  the indexes in original tables) flags to the `ptrepack` utility.

Bug fixes
---------

- In order to avoid a long-standing bug, all the possible 64-bit class
  attributes of leaf objects (like `nrows`, `shape` or `nrow`) have
  been converted into a new `SizeType` type (actually an alias for
  `numpy.int64`).
  This change should be backward compatible with existing programs, so
  you should not need to take any action to adapt to it. Fixes #118.

- When no range is specified in `ptrepack`, all the elements of the
  leaves are now copied. Before, only the first row was copied, which
  was clearly wrong.

- The `Atom` default value (`Atom.dflt`) is now honored when creating
  `CArrays`. Fixes #176.


Backward incompatible API changes from 2.0.4 to 2.1b3
=====================================================

- The semantics of `Leaf.copy()` have changed: previously the
  chunkshape of the destination was computed automatically ('auto'),
  whereas now the default is to keep the original value ('keep'). This
  behaviour is believed to follow the principle of least surprise more
  closely.

- The `trMap` argument has been removed from the `tables.openFile()`
  function. The `Node._v_hdf5name` attribute has been removed as well.
  Fixes #117.

- The `sort` parameter of `Table.itersequence()` has been removed
  because it could not sort sequences larger than memory. Moreover, it
  is not clear that the sorting operation would be an advantage in
  every situation.


Backward incompatible API changes from 2.0.4 to 2.1b3 (Pro edition)
===================================================================

- `Column.createIndex()` has received a new parameter named `kind`,
  which is now the first one in the argument list. This is intentional
  and *incompatible* with the previous argument list, so people should
  update their existing `Column.createIndex()` calls.

- Added a new `Column.createCSIndex()` as a handy way to create a
  completely sorted index (CSI).

- The `Table.indexFilters` property has been removed (after a period
  of ``DeprecationWarnings``). If you want to change the filters used
  in indexes, please use the `filters` parameter of the
  `Column.createIndex()` method (and the like).

- `Table.willQueryUseIndexing()` has changed its return value from a
  list to a frozen set of usable indexed columns.

- The 'AUTO_INDEX' system attribute of the `Index` class is now copied
  only if the `copyuserattrs` argument of `Table.copy()` is true (the
  default).

----

  **Enjoy data!**

  -- The PyTables Team

-- 
Francesc Alted
Freelance developer
Tel +34-964-282-249