Please, stop reporting carray problems here. Let's communicate privately if you want.
Thanks,
Francesc

On 12/7/12 8:22 PM, Alvaro Tejero Cantero wrote:
> Thanks Francesc, that solved it. Having the disk datastructures load
> compressed in memory can be a deal-breaker when you got daily 50Gb+
> datasets to process!
>
> The carray google group (I had not noticed it) seems unreachable at
> the moment. That's why I am going to report a problem here for the
> moment. With the following code
>
> ct0 = ca.ctable((h5f.root.c_000[:],), names=('c_000',),
>                 rootdir=u'/lfpd1/tmp/ctable-1', mode='w', cparams=ca.cparams(5),
>                 dtype='u2', expectedlen=len(h5f.root.c_000))
>
> for k in h5f.root._v_children.keys()[:3]: #just some of the HDF5 datasets
>     try:
>         col = getattr(h5f.root, k)
>         ct0.addcol(col[:], name=k, expectedlen=len(col), dtype='u2')
>     except ValueError:
>         pass #exists
> ct0.flush()
>
> >>> ct0
> ctable((303390000,), [('c_000', '<u2'), ('c_007', '<u2'), ('c_006', '<u2'),
> ('c_005', '<u2')])
>   nbytes: 2.26 GB; cbytes: 1.30 GB; ratio: 1.73
>   cparams := cparams(clevel=5, shuffle=True)
>   rootdir := '/lfpd1/tmp/ctable-1'
> [(312, 37, 65432, 91) (313, 32, 65439, 65) (320, 24, 65433, 66) ...,
>  (283, 597, 677, 647) (276, 600, 649, 635) (298, 607, 635, 620)]
>
> The newly-added datasets/columns exist in memory
>
> >>> ct0['c_007']
> carray((303390000,), uint16)
>   nbytes: 578.67 MB; cbytes: 333.50 MB; ratio: 1.74
>   cparams := cparams(clevel=5, shuffle=True)
> [ 37 32 24 ..., 597 600 607]
>
> but they do not appear in the rootdir, not even after .flush()
>
> /lfpd1/tmp/ctable-1]$ ls
> __attrs__ c_000 __rootdirs__
>
> and something seems amiss with __rootdirs__:
> /lfpd1/tmp/ctable-1]$ cat __rootdirs__
> {"dirs": {"c_007": null, "c_006": null, "c_005": null, "c_000":
> "/lfpd1/tmp/ctable-1/c_000"}, "names": ["c_000", "c_007", "c_006",
> "c_005"]}
>
> >>> ct0.cbytes//1024**2
> 1334
>
> vs
> /lfpd1/tmp]$ du -h ctable-1
> 12K ctable-1/c_000/meta
> 340M ctable-1/c_000/data
> 340M ctable-1/c_000
> 340M ctable-1
>
>
> and, finally, no 'open'
>
> ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r')
>
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
> /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-26-41e1cb01ffe6> in <module>()
> ----> 1 ct0_disk = ca.open(rootdir='/lfpd1/tmp/ctable-1', mode='r')
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/toplevel.pyc in open(rootdir, mode)
>     104         # Not a carray. Now with a ctable
>     105         try:
> --> 106             obj = ca.ctable(rootdir=rootdir, mode=mode)
>     107         except IOError:
>     108             # Not a ctable
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, columns, names, **kwargs)
>     193             _new = True
>     194         else:
> --> 195             self.open_ctable()
>     196             _new = False
>     197
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in open_ctable(self)
>     282
>     283         # Open the ctable by reading the metadata
> --> 284         self.cols.read_meta_and_open()
>     285
>     286         # Get the length out of the first column
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in read_meta_and_open(self)
>      40         # Initialize the cols by instatiating the carrays
>      41         for name, dir_ in data['dirs'].items():
> ---> 42             self._cols[str(name)] = ca.carray(rootdir=dir_, mode=self.mode)
>      43
>      44     def update_meta(self):
>
> /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:8637)()
>
> ValueError: You need at least to pass an array or/and a rootdir
>
> -á.
>
>
>
> On 7 December 2012 17:04, Francesc Alted <fal...@gmail.com> wrote:
>
>     Hmm, perhaps cythonizing by hand is your best bet:
>
>     $ cython carray/carrayExtension.pyx
>
>     If you continue having problems, please write to the carray
>     mailing list.
>
>     Francesc
>
>     On 12/7/12 5:29 PM, Alvaro Tejero Cantero wrote:
>     > I have now similar dependencies as you, except for Numpy 1.7 beta 2.
>     >
>     > I wish I could help with the carray flavor.
>     >
>     > --
>     > Running setup.py install for carray
>     > * Found Cython 0.17.2 package installed.
>     > * Found numpy 1.6.2 package installed.
>     > * Found numexpr 2.0.1 package installed.
>     > building 'carray.carrayExtension' extension
>     > C compiler: gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
>     > -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
>     > --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC
>     > -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>     > -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>     > -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC
>     > compile options: '-Iblosc
>     > -I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include
>     > -I/usr/include/python2.7 -c'
>     > extra options: '-msse2'
>     > gcc: blosc/blosclz.c
>     > gcc: carray/carrayExtension.c
>     > gcc: error: carray/carrayExtension.c: No such file or directory
>     > gcc: fatal error: no input files
>     > compilation terminated.
>     > gcc: error: carray/carrayExtension.c: No such file or directory
>     > gcc: fatal error: no input files
>     > compilation terminated.
>     > error: Command "gcc -pthread -fno-strict-aliasing -O2 -g -pipe
>     > -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
>     > --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC
>     > -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>     > -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>     > -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -Iblosc
>     > -I/home/tejero/Local/Envs/test/lib/python2.7/site-packages/numpy/core/include
>     > -I/usr/include/python2.7 -c carray/carrayExtension.c -o
>     > build/temp.linux-x86_64-2.7/carray/carrayExtension.o -msse2" failed
>     > with exit status 4
>     >
>     >
>     >
>     > -á.
>     >
>     >
>     >
>     > On 7 December 2012 12:47, Francesc Alted <fal...@gmail.com> wrote:
>     >
>     >     On 12/6/12 1:42 PM, Alvaro Tejero Cantero wrote:
>     >     > Thank you for the comprehensive round-up. I have some ideas and
>     >     > reports below.
>     >     >
>     >     > What about ctables? The documentation says that it is specifically
>     >     > column-access optimized, which is what I need in this scenario
>     >     > (sometimes sequential, sometimes random).
>     >
>     >     Yes, ctables is optimized for column access.
>     >
>     >     >
>     >     > Unfortunately I could not get the rootdir parameter for ctables
>     >     > __init__ to work in carray 0.4 and pip-installing 0.5 or 0.5.1 leads
>     >     > to compilation errors.
>     >
>     >     Yep, persistence for carray/ctables objects was added in 0.5.
>     >
>     >     >
>     >     > This is the ctables-to-disk error:
>     >     >
>     >     > ct2 = ca.ctable((np.arange(30000000),), names=('range2',),
>     >     >                 rootdir='/tmp/ctable2.ctable')
>     >     >
>     >     > ---------------------------------------------------------------------------
>     >     > TypeError                                 Traceback (most recent call last)
>     >     > /home/tejero/Dropbox/O/nb/nonridge/<ipython-input-29-255842877a0b> in <module>()
>     >     > ----> 1 ct2 = ca.ctable((np.arange(30000000),), names=('range2',), rootdir='/tmp/ctable2.ctable')
>     >     >
>     >     > /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/ctable.pyc in __init__(self, cols, names, **kwargs)
>     >     >     158                 if column.dtype == np.void:
>     >     >     159                     raise ValueError, "`cols` elements cannot be of type void"
>     >     > --> 160                 column = ca.carray(column, **kwargs)
>     >     >     161             elif ratype:
>     >     >     162                 column = ca.carray(cols[name], **kwargs)
>     >     >
>     >     > /home/tejero/Local/Envs/test/lib/python2.7/site-packages/carray/carrayExtension.so in carray.carrayExtension.carray.__cinit__ (carray/carrayExtension.c:3917)()
>     >     >
>     >     > TypeError: __cinit__() got an unexpected keyword argument 'rootdir'
>     >     >
>     >     >
>     >     > And this is cut from the pip output when trying to upgrade carray.
>     >     >
>     >     > gcc: carray/carrayExtension.c
>     >     >
>     >     > gcc: error: carray/carrayExtension.c: No such file or directory
>     >
>     >     Hmm, that's strange, because the carrayExtension should have been
>     >     cythonized automatically. Here it is part of my install process
>     >     with pip:
>     >
>     >     Running setup.py install for carray
>     >     * Found Cython 0.17.1 package installed.
>     >     * Found numpy 1.7.0b2 package installed.
>     >     * Found numexpr 2.0.1 package installed.
>     >     cythoning carray/carrayExtension.pyx to carray/carrayExtension.c
>     >     building 'carray.carrayExtension' extension
>     >     C compiler: gcc -fno-strict-aliasing
>     >     -I/Users/faltet/anaconda/include -arch x86_64 -DNDEBUG -g -fwrapv -O3
>     >     -Wall -Wstrict-prototypes
>     >
>     >     Hmm, perhaps you need a newer version of Cython?
>     >
>     >     >
>     >     > Two more notes:
>     >     >
>     >     > * a way was added to check in-disk (compressed) vs in-memory
>     >     > (uncompressed) node sizes. I was unable to find the way to use it
>     >     > either from the 2.4.0 release notes or from the git issue
>     >     > https://github.com/PyTables/PyTables/issues/141#issuecomment-5018763
>     >
>     >     You already found the answer.
>     >
>     >     >
>     >     > * is/will it be possible to load PyTables carrays as in-memory carrays
>     >     > without decompression?
>     >
>     >     Actually, that has been my idea from the very beginning. The concept of
>     >     'flavor' for the returned objects when reading is already there, so it
>     >     should be relatively easy to add a new 'carray' flavor. Maybe you can
>     >     contribute this?
>     >
>     >     --
>     >     Francesc Alted
>     >
>     >     ------------------------------------------------------------------------------
>     >     LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
>     >     Remotely access PCs and mobile devices and provide instant support
>     >     Improve your efficiency, and focus on delivering more value-add services
>     >     Discover what IT Professionals Know. Rescue delivers
>     >     http://p.sf.net/sfu/logmein_12329d2d
>     >     _______________________________________________
>     >     Pytables-users mailing list
>     >     Pytables-users@lists.sourceforge.net
>     >     https://lists.sourceforge.net/lists/listinfo/pytables-users
>
>     --
>     Francesc Alted

--
Francesc Alted
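
The symptom reported in the thread (columns listed in __rootdirs__ with a null directory) can be checked without carray at all. The following is only a minimal sketch, written for Python 3 and assuming the __rootdirs__ layout quoted above: a JSON object with a "names" list and a "dirs" mapping from column name to directory or null. The helper name find_unpersisted_columns is hypothetical and is not part of carray.

```python
import json
import os


def find_unpersisted_columns(rootdir):
    """Return the names of ctable columns that have no directory on disk.

    Assumes the __rootdirs__ layout quoted in the thread: a JSON object
    with a "names" list and a "dirs" mapping of column name to directory
    (or null). Hypothetical helper, not part of the carray API.
    """
    with open(os.path.join(rootdir, "__rootdirs__")) as fh:
        meta = json.load(fh)

    missing = []
    for name in meta["names"]:
        coldir = meta["dirs"].get(name)
        # A null entry, or a path that is not an existing directory,
        # means the column cannot be reopened from disk.
        if coldir is None or not os.path.isdir(coldir):
            missing.append(name)
    return missing
```

Calling find_unpersisted_columns('/lfpd1/tmp/ctable-1') on a directory whose metadata matches the one quoted above would be expected to list c_007, c_006 and c_005, the three columns added with addcol. The du output points the same way: 340M on disk is roughly the cbytes of a single column (333.50 MB), not the 1334 MB reported for the whole table.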