[Pytables-users] Ubuntu 11.10: blosc is not supported?
Forwarding to the list.
~Josh

Begin forwarded message:

From: pytables-users-boun...@lists.sourceforge.net
Date: December 6, 2011 2:20:35 PM GMT+01:00
To: pytables-users-ow...@lists.sourceforge.net
Subject: Auto-discard notification

The attached message has been automatically discarded.

From: Martin Felder martin.fel...@zsw-bw.de
Date: December 6, 2011 2:05:15 PM GMT+01:00
To: pytables-users@lists.sourceforge.net
Subject: Ubuntu 11.10: blosc is not supported?

Hi,

I installed pytables via the Ubuntu package manager (currently version 2.1.2-3.1build1), and we use it a lot for production work. Thanks for this great package!

So far I haven't tried enabling compression, but since it says in the documentation BLOSC comes with it, I created a filter with complib="blosc", only to get:

ValueError: compression library ``blosc`` is not supported; it must be one of: zlib, lzo, bzip2

Do I have to compile a newer version from source to enable BLOSC?

Thanks,
Martin

attachment: martin.felder.vcf

--
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
Re: [Pytables-users] Giant HDF5/PyTables error message
2011/12/5 Francesc Alted fal...@pytables.org

> Regarding the big error, the HDF5 error stack could be converted into a Python error so that it can be caught, if needed. Hmm, I'll file a ticket on this later on.

https://github.com/PyTables/PyTables/issues/120

--
Francesc Alted
Re: [Pytables-users] Ubuntu 11.10: blosc is not supported?
2011/12/6 PyTables Org pytab...@googlemail.com

> Forwarding to the list.
> ~Josh
>
> Begin forwarded message:
>
> From: pytables-users-boun...@lists.sourceforge.net
> Date: December 6, 2011 2:20:35 PM GMT+01:00
> To: pytables-users-ow...@lists.sourceforge.net
> Subject: Auto-discard notification
>
> The attached message has been automatically discarded.
>
> From: Martin Felder martin.fel...@zsw-bw.de
> Date: December 6, 2011 2:05:15 PM GMT+01:00
> To: pytables-users@lists.sourceforge.net
> Subject: Ubuntu 11.10: blosc is not supported?
>
> Hi,
>
> I installed pytables via the Ubuntu package manager (currently version 2.1.2-3.1build1), and we use it a lot for production work. Thanks for this great package!
>
> So far I haven't tried enabling compression, but since it says in the documentation BLOSC comes with it, I created a filter with complib="blosc", only to get:
>
> ValueError: compression library ``blosc`` is not supported; it must be one of: zlib, lzo, bzip2
>
> Do I have to compile a newer version from source to enable BLOSC?

Yes, you need at least PyTables 2.2 for using Blosc. Antonio has recently released PyTables binaries for Debian in:

http://sourceforge.net/projects/pytables/files/pytables/2.3.1/

that might be useful for Ubuntu too.

--
Francesc Alted
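The reply above says Blosc needs at least PyTables 2.2, while the Ubuntu package in question is 2.1.2. A minimal sketch of a version guard follows, in plain Python so it runs without PyTables installed. The helper names (parse_version, supports_blosc, choose_complib) and the zlib fallback are illustrative assumptions, not part of the PyTables API; zlib is picked only because the error message lists it as supported.

```python
import re

# Minimum PyTables version that ships Blosc, per the reply above.
MIN_BLOSC_VERSION = (2, 2)


def parse_version(text):
    """Return the leading dotted-number part of a version string as a tuple.

    Hypothetical helper: '2.1.2-3.1build1' gives (2, 1, 2).
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return tuple(int(part) for part in match.group(0).split("."))


def supports_blosc(version_text):
    """True if the given PyTables version string is at least 2.2."""
    return parse_version(version_text) >= MIN_BLOSC_VERSION


def choose_complib(version_text, fallback="zlib"):
    """Pick 'blosc' when the version allows it, otherwise the fallback."""
    return "blosc" if supports_blosc(version_text) else fallback


packaged = choose_complib("2.1.2-3.1build1")  # the Ubuntu 11.10 package
released = choose_complib("2.3.1")            # the binaries mentioned above
```

With PyTables installed, the result could be passed on as, for example, tables.Filters(complevel=5, complib=choose_complib(tables.__version__)); the complevel value here is arbitrary.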
Re: [Pytables-users] Ubuntu 11.10: blosc is not supported?
Hi Francesc, hi Martin,

On 06/12/2011 14:55, Francesc Alted wrote:
> 2011/12/6 PyTables Org pytab...@googlemail.com
>
>> Forwarding to the list.
>> ~Josh
>>
>> Begin forwarded message:
>>
>> From: pytables-users-boun...@lists.sourceforge.net
>> Date: December 6, 2011 2:20:35 PM GMT+01:00
>> To: pytables-users-ow...@lists.sourceforge.net
>> Subject: Auto-discard notification
>>
>> The attached message has been automatically discarded.
>>
>> From: Martin Felder martin.fel...@zsw-bw.de
>> Date: December 6, 2011 2:05:15 PM GMT+01:00
>> To: pytables-users@lists.sourceforge.net
>> Subject: Ubuntu 11.10: blosc is not supported?
>>
>> Hi,
>>
>> I installed pytables via the Ubuntu package manager (currently version 2.1.2-3.1build1), and we use it a lot for production work. Thanks for this great package!
>>
>> So far I haven't tried enabling compression, but since it says in the documentation BLOSC comes with it, I created a filter with complib="blosc", only to get:
>>
>> ValueError: compression library ``blosc`` is not supported; it must be one of: zlib, lzo, bzip2
>>
>> Do I have to compile a newer version from source to enable BLOSC?
>
> Yes, you need at least PyTables 2.2 for using Blosc. Antonio has recently released PyTables binaries for Debian in:
>
> http://sourceforge.net/projects/pytables/files/pytables/2.3.1/
>
> that might be useful for Ubuntu too.

Yes, it should work, but it is only for amd64. Users of Ubuntu 11.10 can use the following PPA:

https://launchpad.net/~a.valentino/+archive/eotools

I'm trying to push it into the official Debian/Ubuntu archives, but I am having serious trouble contacting the current maintainers.

cheers

--
Antonio Valentino
[Pytables-users] Some experiences with PyTables
Forwarding to the list.
~Josh

Begin forwarded message:

From: pytables-users-boun...@lists.sourceforge.net
Date: December 6, 2011 9:27:27 PM GMT+01:00
To: pytables-users-ow...@lists.sourceforge.net
Subject: Auto-discard notification

The attached message has been automatically discarded.

From: Edward C. Jones edcjo...@comcast.net
Date: December 6, 2011 9:25:08 PM GMT+01:00
To: pytables-users@lists.sourceforge.net
Subject: Some experiences with PyTables

My computer has an up-to-date Debian stable distribution installed. I have the following Debian packages (plus most dev packages):

    python                2.6.6-3+squeeze6
    python-numpy          1:1.4.1-5
    hdf5-tools            1.8.4-patch1-2
    libhdf5-serial-1.8.4  1.8.4-patch1-2

I have compiled and installed PyTables 2.3.1.

1. There seems to be an unpythonic design choice with the start, stop, step convention for PyTables. Anything that is unnatural to a Python programmer should be heavily documented.

2. There may be a bug in itersorted. Here is code for (1) and (2):

    #! /usr/bin/env python

    import random, tables

    h5file = tables.openFile('mytable.h5', mode='w')

    class silly_class(tables.IsDescription):
        num = tables.Int32Col(pos=0)

    mytable = h5file.createTable(h5file.root, 'mytable', silly_class,
                                 'a few ints', expectedrows=4)
    row = mytable.row
    for i in range(10):
        row['num'] = random.randint(0, 99)
        row.append()
    mytable.flush()
    mytable.cols.num.createCSIndex()

    # Python's idiom for start, stop, step:
    print 'Python:', range(9, -1, -1)

    output = mytable.readSorted('num', start=0, stop=10, step=-1)
    print 'readSorted:', 0, 10, -1, output

    # copy supports a negative step. It seems that start and stop are applied
    # _after_ the sort is done. Very unlike Python. Please document thoroughly.
    print h5file.root.mytable[:]
    h5file.root.mytable.copy(h5file.root, 'mytable2', sortby='num',
                             start=0, stop=5, step=-1)
    print h5file.root.mytable2[:]

    # The following raises an OverflowError. The documentation (2.3.1) says
    # negative steps are supported for itersorted. Documentation error or bug
    # in itersorted?
    output = [x['num'] for x in mytable.itersorted('num', start=0, stop=10,
                                                   step=-1)]
    print 'itersorted:', 0, 10, -1, output

3. Null bytes are stripped from the end of strings when they are stored in a table. Since a Python programmer does not expect this, it needs to be explicitly documented in all the relevant places. Here is some code:

    #! /usr/bin/env python

    import tables

    def hash2hex(stringin):
        out = list()
        for c in stringin:
            s = hex(ord(c))[2:]
            if len(s) == 1:
                s = '0' + s
            out.append(s)
        return ''.join(out)

    h5file = tables.openFile('mytable.h5', mode='w')

    class silly_class(tables.IsDescription):
        astring = tables.StringCol(16, pos=0)

    mytable = h5file.createTable(h5file.root, 'mytable', silly_class,
                                 'a few strings', expectedrows=4)

    # Problem when string ends with null bytes:
    nasty = 'abdcef' + '\x00\x00'
    print repr(nasty)
    print hash2hex(nasty)
    row = mytable.row
    row['astring'] = nasty
    row.append()
    mytable.flush()
    print repr(mytable[0][0])
    print hash2hex(mytable[0][0])
    h5file.close()

4. Has the 64K limit for attributes been lifted?

5. The reference manual for numpy contains _many_ small examples. They partially compensate for any lack of precision or excessive precision in the documents. Also, many people learn best from examples.

6. Suppose that the records (key, data1) and (key, data2) are two rows in a table, with (key, data1) being an earlier row than (key, data2). Both records have the same value in the first column. If a CSIndex is created using the first column, will (key, data1) still be before (key, data2) in the index? This property is called stability. Some sorting algorithms guarantee this and others don't. Are the sorts in PyTables stable?

7. The table.append in PyTables behaves like extend in Python. Why?

8. I get a mysterious PerformanceWarning from the PyTables file table.py, line 2742. This message needs to be split into two messages. In my case, after I appended to a table, 'row' in self.__dict__ was True and self.row._getUnsavedNrows() was 1. To resolve the problem, I added a line that flushes the table after every append. Does h5file.mytable.flush() do something that h5file.flush() doesn't? Do I need to flush every table after every append, or are there only certain situations when this is needed? What does "preempted from alive nodes" mean?

9. Does the following code expose a bug in PyTables?

    #! /usr/bin/env python

    import sys
    import numpy, tables

    # No failure if projections and winsize are small enough. In the original
    # program, gauss.shape is (2000, 196, 196).
    projections = 105
    winsize = 2500
    h5 = tables.openFile('mess.h5', mode='w')
    shape = (projections, winsize)
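Items 1 and 2 of this message turn on the order in which sorting and slicing are applied. The sketch below illustrates that distinction with plain Python lists only. It makes no PyTables calls and does not claim to reproduce what readSorted, itersorted or copy actually return; the sample values and function names are made up for illustration.

```python
# Plain-Python illustration; the values and helper names are hypothetical.
values = [42, 7, 93, 15, 68, 3, 81, 29, 56, 10]


def sort_then_slice(seq, start, stop):
    """Sort the whole sequence first, then take [start:stop] of the result."""
    return sorted(seq)[start:stop]


def slice_then_sort(seq, start, stop):
    """Take [start:stop] of the stored order first, then sort that piece."""
    return sorted(seq[start:stop])


# The two orders of operation select different elements.
smallest_five = sort_then_slice(values, 0, 5)
first_five_sorted = slice_then_sort(values, 0, 5)

# Python's own conventions for a negative step:
countdown = list(range(9, -1, -1))            # start above stop, counting down
empty = sorted(values)[0:10:-1]               # start below stop with step -1
largest_five = sorted(values, reverse=True)[0:5]
```

Here smallest_five holds the five smallest values overall, while first_five_sorted holds only the first five stored values in sorted order. In plain Python, start=0, stop=10, step=-1 selects nothing, which is why the same arguments look surprising when a library accepts them.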
Re: [Pytables-users] Some experiences with PyTables
Hello Edward,

I'd like to respond point by point:

On Tue, Dec 6, 2011 at 2:54 PM, PyTables Org pytab...@googlemail.com wrote:

> 1. There seems to be an unpythonic design choice with the start, stop, step convention for PyTables. Anything that is unnatural to a Python programmer should be heavily documented.

Agreed in general. Do you have a specific example we could address?

> 2. There may be a bug in itersorted.

Yes, this looks like a bug. This deserves an issue on github...

> Here is code for (1) and (2):
>
>     #! /usr/bin/env python
>
>     import random, tables
>
>     h5file = tables.openFile('mytable.h5', mode='w')
>
>     class silly_class(tables.IsDescription):
>         num = tables.Int32Col(pos=0)
>
>     mytable = h5file.createTable(h5file.root, 'mytable', silly_class,
>                                  'a few ints', expectedrows=4)
>     row = mytable.row
>     for i in range(10):
>         row['num'] = random.randint(0, 99)
>         row.append()
>     mytable.flush()
>     mytable.cols.num.createCSIndex()
>
>     # Python's idiom for start, stop, step:
>     print 'Python:', range(9, -1, -1)
>
>     output = mytable.readSorted('num', start=0, stop=10, step=-1)
>     print 'readSorted:', 0, 10, -1, output
>
>     # copy supports a negative step. It seems that start and stop are applied
>     # _after_ the sort is done. Very unlike Python. Please document thoroughly.

We could certainly add some text to the docstring of Table.copy(). Still, I guess I am missing how this is 'wrong.' To the best of my knowledge, Python itself has no single function which both sorts and slices. (Please correct me if I am wrong ~_~.) When performing both operations, one needs to be done first. However, you are correct in that this could be better documented.

>     print h5file.root.mytable[:]
>     h5file.root.mytable.copy(h5file.root, 'mytable2', sortby='num',
>                              start=0, stop=5, step=-1)
>     print h5file.root.mytable2[:]
>
>     # The following raises an OverflowError. The documentation (2.3.1) says
>     # negative steps are supported for itersorted. Documentation error or bug
>     # in itersorted?
>     output = [x['num'] for x in mytable.itersorted('num', start=0, stop=10,
>                                                    step=-1)]
>     print 'itersorted:', 0, 10, -1, output
>
> 3. Null bytes are stripped from the end of strings when they are stored in a table. Since a Python does not expect this, it needs to be explicitly documented in all the relevant places. Here is some code:

This is a function of the underlying HDF5 storage mechanism and not explicitly PyTables. When storing fixed-length strings, the array of characters the string is converted to *must* be exactly length N. When serializing a string of length M, HDF5 does the following:

1. M > N: truncate the string at N bytes (chop off the end).
2. M == N: do nothing.
3. M < N: pad the character array with N - M null characters to achieve length N.

Because of this technique, all trailing null characters are dropped when deserializing. This supports the much more common use case of storing shorter strings in a longer buffer but wanting to recover only the shorter version. If you want to keep null bytes at the end of a string, you could always store the Python length (M) in another column.

>     #! /usr/bin/env python
>
>     import tables
>
>     def hash2hex(stringin):
>         out = list()
>         for c in stringin:
>             s = hex(ord(c))[2:]
>             if len(s) == 1:
>                 s = '0' + s
>             out.append(s)
>         return ''.join(out)
>
>     h5file = tables.openFile('mytable.h5', mode='w')
>
>     class silly_class(tables.IsDescription):
>         astring = tables.StringCol(16, pos=0)
>
>     mytable = h5file.createTable(h5file.root, 'mytable', silly_class,
>                                  'a few strings', expectedrows=4)
>
>     # Problem when string ends with null bytes:
>     nasty = 'abdcef' + '\x00\x00'
>     print repr(nasty)
>     print hash2hex(nasty)
>     row = mytable.row
>     row['astring'] = nasty
>     row.append()
>     mytable.flush()
>     print repr(mytable[0][0])
>     print hash2hex(mytable[0][0])
>     h5file.close()
>
> 4. Has the 64K limit for attributes been lifted?

No, unfortunately. Once again, this is a compile-time parameter of HDF5. You could change this value and recompile HDF5, but then any h5 file you create would not be portable to other builds of HDF5. Trust me, you are not the only one who wishes this were a run-time variable. (Still, there are good reasons for it being static, i.e. speed and size.)

> 5. The reference manual for numpy contains _many_ small examples. They partially compensate for any lack of precision or excessive precision in the documents. Also many people learn best from examples.

If you would like to write up some additional examples or contribute to the docs in any way, *please* let me know. We would be ecstatic for your help!

> 6. Suppose that the records (key, data1) and (key, data2) are two rows in a table with (key, data1) being a earlier row than (key, data2). Both records have the same value in the first column. If a CSIndex is created using the first column, will (key, data1) still be before (key, data2) in