I know variable length strings are unimplemented, but I'm working with 
non-pytables files that do contain them. I'm currently trying to delete a few 
redundant datasets from these files, then copy/repack to reclaim space. Using 
pytables and a script derived from ptrepack has gotten me very far.

First, is there any ongoing work in this area I should be aware of? Or is 
pytables focused on fixed length data types, as I've read elsewhere, with no 
plans to support variable length strings (which are otherwise very common in 
hdf5 files)?

Either way, I've encountered a few issues, and have a few fixes to suggest.

If a dataset has variable length strings, it comes up as "UnImplemented", a 
warning is issued, and the dataset is ignored.  Until they are supported, this 
is OK.

However, ptrepack chokes, since it queries srcNode._c_classId (at the end of 
copyLeaf), and this isn't set for UnImplemented. A quick patch setting:
UnImplemented._c_classId = "UnImplemented"
works, and allows ptrepack to run successfully.
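
The patch above can be applied from the calling script instead of editing the 
installed package. A minimal sketch, under assumptions: ensure_class_id is a 
hypothetical helper name, and the stand-in UnImplemented class below takes the 
place of tables.UnImplemented so the snippet runs without pytables installed.

```python
def ensure_class_id(cls, class_id):
    """Give cls a _c_classId attribute if it does not already have one.

    Returns True if the attribute was added, False if it was left alone.
    """
    if getattr(cls, "_c_classId", None) is None:
        cls._c_classId = class_id
        return True
    return False


# Stand-in for tables.UnImplemented (hypothetical, for illustration only).
class UnImplemented(object):
    pass


patched = ensure_class_id(UnImplemented, "UnImplemented")
node = UnImplemented()
print(patched, node._c_classId)
```

In real use the call would presumably be 
ensure_class_id(tables.UnImplemented, "UnImplemented"), made before opening 
any files. Checking first means the patch becomes a no-op once a release sets 
the attribute itself.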

If an *attribute* has a variable length string, the error is fatal [full 
exception below]. Since attributes are loaded when the node is retrieved using 
_g_getChild(), I can't even walk a tree (via n._v_children.itervalues()) whose 
nodes carry variable length strings as attributes. In the same spirit as the 
"UnImplemented" class, I would expect these attributes to simply come up as an 
UnImplemented type, so I can walk the nodes and skip them as needed (ideally 
via introspection).

As of now, I have to know in advance which nodes to skip, so I can skip by 
name. Messy.
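
For reference, the skip-by-name workaround looks roughly like this. It is a 
sketch only: walk_skipping, LazyChildren and FakeGroup are hypothetical names, 
and the stand-ins assume that the _v_children mapping loads a child only when 
it is looked up by name, so a skipped name is never loaded.

```python
class LazyChildren(dict):
    """Stand-in for a children mapping that loads a node on lookup."""

    def __getitem__(self, name):
        loader = dict.__getitem__(self, name)
        return loader()  # may raise, like a node with unsupported attributes


class FakeGroup(object):
    """Stand-in for a group node (hypothetical, for illustration only)."""

    def __init__(self, loaders):
        self._v_children = LazyChildren(loaders)


def walk_skipping(group, skip_names, prefix=""):
    """Yield (path, node) for every descendant, never loading skipped names."""
    for name in sorted(group._v_children.keys()):
        if name in skip_names:
            continue  # never looked up, so never loaded
        node = group._v_children[name]
        path = prefix + "/" + name
        yield path, node
        if hasattr(node, "_v_children"):
            for item in walk_skipping(node, skip_names, path):
                yield item


def bad_node():
    # Mimics the fatal error raised while loading the node's attributes.
    raise TypeError("variable length strings are not supported yet")


root = FakeGroup({
    "data": lambda: "leaf:data",
    "vlen_attrs": bad_node,
    "sub": lambda: FakeGroup({"more": lambda: "leaf:more"}),
})

visited = [path for path, node in walk_skipping(root, set(["vlen_attrs"]))]
print(visited)
```

The skip set here matches on bare child names at any depth. The drawback is 
the one described above: the names have to be known before the walk starts.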

Any thoughts or suggestions?

~Jonathan


PS: Test data can be generated with the hdf5 example code at 
http://www.hdfgroup.org/training/other-ex5/attrvstr.c
Result of running sample attrvstr and dumping with h5dump -A:

HDF5 "vlstra.h5" {
GROUP "/" {
   ATTRIBUTE "test_string_array" {
      DATATYPE  H5T_STRING {
            STRSIZE H5T_VARIABLE;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
      DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
      DATA {
      (0): "This is array attribute", "It has two strings"
      }
   }
}
}

And output from 'ptrepack vlstra.h5 foo.h5':

Traceback (most recent call last):
  File "/opt/python-2.5/bin/ptrepack", line 5, in <module>
    pkg_resources.run_script('tables==2.1.1', 'ptrepack')
  File "build/bdist.linux-i686/egg/pkg_resources.py", line 448, in run_script
  File "build/bdist.linux-i686/egg/pkg_resources.py", line 1173, in run_script
  File "/opt/python-2.5/bin/ptrepack", line 3, in <module>
    __requires__ = 'tables==2.1.1'
  File "build/bdist.linux-x86_64/egg/tables/scripts/ptrepack.py", line 458, in main
  File "build/bdist.linux-x86_64/egg/tables/file.py", line 230, in openFile
  File "build/bdist.linux-x86_64/egg/tables/file.py", line 523, in __init__
  File "build/bdist.linux-x86_64/egg/tables/group.py", line 288, in _g_postInitHook
  File "build/bdist.linux-x86_64/egg/tables/utils.py", line 228, in newfget
  File "build/bdist.linux-x86_64/egg/tables/node.py", line 224, in _v_attrs
  File "build/bdist.linux-x86_64/egg/tables/attributeset.py", line 248, in __init__
  File "build/bdist.linux-x86_64/egg/tables/attributeset.py", line 302, in __getattr__
  File "hdf5Extension.pyx", line 518, in tables.hdf5Extension.AttributeSet._g_getAttr
  File "utilsExtension.pyx", line 673, in tables.utilsExtension.HDF5ToNPExtType
TypeError: variable length strings are not supported yet
Closing remaining open files: vlstra.h5... done


_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
