I know variable length strings are unimplemented, but I'm working with non-pytables files that do contain them. I'm currently trying to delete a few redundant datasets from these files, then copy/repack to reclaim space. Using pytables and a script derived from ptrepack has got me very far.
First, is there any ongoing work in this area that I should be aware of? Or is pytables focused on fixed-length data types, as I've read elsewhere, with no plans to support variable length strings (which are otherwise very common in hdf5 files)?

Either way, I've encountered a few issues, and have a few fixes to suggest.

If a dataset has variable length strings, it comes up as "UnImplemented", a warning is issued, and it is ignored. Until variable length strings are supported, this is OK. However, ptrepack chokes, since it queries srcNode._c_classId (at the end of copyLeaf), and this isn't set for UnImplemented. A quick patch setting

    UnImplemented._c_classId = "UnImplemented"

works, and allows ptrepack to run successfully.

If an *attribute* has a variable length string, the error is fatal [full exception below]. I would expect it to also come up UnImplemented. Since attributes are loaded when the node is retrieved using _g_getChild(), I can't even walk a tree (via n._v_children.itervalues()) that contains variable length strings as attributes.

In the same spirit as the "UnImplemented" class, I would expect these attributes to simply come up as an UnImplemented type, so I can walk the nodes and skip them as needed (ideally via introspection). As of now, I have to know in advance which nodes to skip, so that I can skip them by name. Messy.

Any thoughts or suggestions?
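To make the attribute suggestion concrete, here is a minimal sketch of the behaviour I have in mind. It does not use pytables at all: UnImplementedAttr, load_attrs, readable_attrs and fake_reader are hypothetical names made up for illustration, and fake_reader only stands in for the low-level attribute reader that raises the TypeError in the traceback below.

```python
class UnImplementedAttr(object):
    """Hypothetical placeholder for an attribute whose type cannot be read."""

    def __init__(self, name, reason):
        self.name = name
        self.reason = reason

    def __repr__(self):
        return "UnImplementedAttr(%r)" % self.name


def load_attrs(names, read_attr):
    """Read every attribute, substituting a placeholder instead of failing.

    read_attr is a stand-in for the real low-level reader and is assumed
    to raise TypeError for unsupported types.
    """
    attrs = {}
    for name in names:
        try:
            attrs[name] = read_attr(name)
        except TypeError as exc:
            # Keep going: record why the attribute could not be loaded.
            attrs[name] = UnImplementedAttr(name, str(exc))
    return attrs


def readable_attrs(attrs):
    """Skip unsupported attributes by introspection rather than by name."""
    return dict((k, v) for k, v in attrs.items()
                if not isinstance(v, UnImplementedAttr))


def fake_reader(name):
    """Stand-in reader that fails the way the sample file does."""
    if name == "test_string_array":
        raise TypeError("variable length strings are not supported yet")
    return {"TITLE": "", "VERSION": "1.0"}[name]


attrs = load_attrs(["TITLE", "VERSION", "test_string_array"], fake_reader)
print(sorted(readable_attrs(attrs)))
```

With something along these lines, opening the file and walking the tree would succeed, and a script could test isinstance(value, UnImplementedAttr) to decide what to skip.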
~Jonathan

PS: Test data can be generated with the hdf5 example code at http://www.hdfgroup.org/training/other-ex5/attrvstr.c

Result of running sample attrvstr and dumping with h5dump -A:

    HDF5 "vlstra.h5" {
    GROUP "/" {
       ATTRIBUTE "test_string_array" {
          DATATYPE  H5T_STRING {
                STRSIZE H5T_VARIABLE;
                STRPAD H5T_STR_NULLTERM;
                CSET H5T_CSET_ASCII;
                CTYPE H5T_C_S1;
             }
          DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
          DATA {
          (0): "This is array attribute", "It has two strings"
          }
       }
    }
    }

And output from 'ptrepack vlstra.h5 foo.h5':

    Traceback (most recent call last):
      File "/opt/python-2.5/bin/ptrepack", line 5, in <module>
        pkg_resources.run_script('tables==2.1.1', 'ptrepack')
      File "build/bdist.linux-i686/egg/pkg_resources.py", line 448, in run_script
      File "build/bdist.linux-i686/egg/pkg_resources.py", line 1173, in run_script
      File "/opt/python-2.5/bin/ptrepack", line 3, in <module>
        __requires__ = 'tables==2.1.1'
      File "build/bdist.linux-x86_64/egg/tables/scripts/ptrepack.py", line 458, in main
      File "build/bdist.linux-x86_64/egg/tables/file.py", line 230, in openFile
      File "build/bdist.linux-x86_64/egg/tables/file.py", line 523, in __init__
      File "build/bdist.linux-x86_64/egg/tables/group.py", line 288, in _g_postInitHook
      File "build/bdist.linux-x86_64/egg/tables/utils.py", line 228, in newfget
      File "build/bdist.linux-x86_64/egg/tables/node.py", line 224, in _v_attrs
      File "build/bdist.linux-x86_64/egg/tables/attributeset.py", line 248, in __init__
      File "build/bdist.linux-x86_64/egg/tables/attributeset.py", line 302, in __getattr__
      File "hdf5Extension.pyx", line 518, in tables.hdf5Extension.AttributeSet._g_getAttr
      File "utilsExtension.pyx", line 673, in tables.utilsExtension.HDF5ToNPExtType
    TypeError: variable length strings are not supported yet
    Closing remaining open files: vlstra.h5... done
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users