[Pytables-users] pytables 2.2 build error on amd64

2010-08-13 Thread Jeff Reback
Hi,

I tried to build PyTables 2.2 on an x86-64 (amd64) Debian machine,
but received the build errors below.

I then built 2.1.2 successfully with no issues (it passes all tests).

Jeff



* Found numpy 1.5.0b1 package installed.
* Found numexpr 1.4 package installed.
* Found HDF5 headers at ``/usr/include``, library at ``/usr/lib64``.
* Found LZO 2 headers at ``/usr/include``, library at ``/usr/lib64``.
* Skipping detection of LZO 1 since LZO 2 has already been found.
* Could not find bzip2 headers and library; disabling support for it.
* Found pthreads headers at ``/usr/include``, library at ``/usr/lib64``.
running build_ext
cythoning tables/hdf5Extension.pyx to tables/hdf5Extension.c

Error converting Pyrex file to C:

...
self.disk_type_id = AtomToHDF5Type(atom, self.byteorder)

# Allocate space for the dimension axis info and fill it
dims = numpy.array(shape, dtype=numpy.intp)
self.rank = len(shape)
self.dims = npy_malloc_dims(self.rank, <npy_intp *>(dims.data))
  ^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:836:31: Cannot convert 
'definitions.hsize_t *' to Python object

Error converting Pyrex file to C:

...
# Save the array
complib = PyString_AsString(self.filters.complib or '')
version = PyString_AsString(self._v_version)
class_ = PyString_AsString(self._c_classId)
self.dataset_id = H5ARRAYmake(self.parent_id, self.name, version,
  self.rank, self.dims,
^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:844:49: Cannot convert 
Python object to 'definitions.hsize_t *'

Error converting Pyrex file to C:

...
atom = self.atom
itemsize = atom.itemsize
self.disk_type_id = AtomToHDF5Type(atom, self.byteorder)

self.rank = len(self.shape)
self.dims = malloc_dims(self.shape)
  ^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:882:27: Cannot convert 
'definitions.hsize_t *' to Python object

Error converting Pyrex file to C:

...
self.disk_type_id = AtomToHDF5Type(atom, self.byteorder)

self.rank = len(self.shape)
self.dims = malloc_dims(self.shape)
if self.chunkshape:
  self.dims_chunk = malloc_dims(self.chunkshape)
  ^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:884:35: Cannot convert 
'definitions.hsize_t *' to Python object

Error converting Pyrex file to C:

...
  atom.dflt = dflts

# Create the CArray/EArray
self.dataset_id = H5ARRAYmake(
  self.parent_id, self.name, version, self.rank,
  self.dims, self.extdim, self.disk_type_id, self.dims_chunk,
 ^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:907:10: Cannot convert 
Python object to 'definitions.hsize_t *'

Error converting Pyrex file to C:

...
  atom.dflt = dflts

# Create the CArray/EArray
self.dataset_id = H5ARRAYmake(
  self.parent_id, self.name, version, self.rank,
  self.dims, self.extdim, self.disk_type_id, self.dims_chunk,
^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:907:53: Cannot convert 
Python object to 'definitions.hsize_t *'

Error converting Pyrex file to C:

...
self.disk_type_id, self.type_id = self._get_type_ids()
# Get the atom for this type
atom = AtomFromHDF5Type(self.disk_type_id)

# Get the rank for this array object
if H5ARRAYget_ndims(self.dataset_id, &self.rank) < 0:
^


/data/arb/distros/tables-2.2/tables/hdf5Extension.pyx:954:41: Cannot take 
address of Python variable

Error converting Pyrex file to C:

...

# Get the rank for this array object
if H5ARRAYget_ndims(self.dataset_id, &self.rank) < 0:
  raise HDF5ExtError("Problems getting ndims!")
# Allocate space for the dimension axis info
self.dims = <hsize_t *>malloc(self.rank * sizeof(hsize_t))
   ^



[Pytables-users] pytables 2.3.1 indexing issue

2012-01-19 Thread Jeff Reback
Hi,
 
Using the configuration:

pytables 2.3.1
numexpr 1.4.1
python 2.7.1 (on amd64 debian and win64)

When using readWhere to select rows from a table, a selector with multiple
operands on the indexed column (e.g. (column > value) & (column < value2))
doesn't seem to work, though it works fine with a single operand and on a
non-indexed table.

In the test output, for the 4th case (indexed, with start and stop operands), I
don't receive any selection (contrast with the 2nd case, which shows the
non-indexed behavior).

Is this behavior expected?
 
thanks,
 
Jeff
 
--- test script ---
 
#!/usr/local/bin/python

import tables
import numpy as np
import datetime, time

# create table
def create(name, add_index = False):
    test_file = "%s.hdf" % name
    handle = tables.openFile(test_file, "w")

    table = handle.createTable(handle.root, 'table', dict(
            index = tables.Time64Col(),
            column = tables.StringCol(25),
            values  = tables.FloatCol(shape=(3)),
            ))

    # add data
    date = datetime.datetime(2011,1,1,8,0,0)

    r = table.row
    for i in xrange(100):
        r['index'] = time.mktime((date + datetime.timedelta(days=i)).timetuple())
        r['column'] = ("str-%d" % (i % 5))
        r['values'] = np.arange(3)
        r.append()
    table.flush()

    if add_index:
        col = table.cols._f_col('index')
        col.createIndex(filters = None)

    handle.close()
    return test_file

def select(name, start, stop = None):
    """ start and stop are dates """

    test_file = "%s.hdf" % name
    handle = tables.openFile(test_file,"r")
    selectors = []

    # index selector
    selectors.append("(index >= %s)" % time.mktime(start.timetuple()))
    if stop is not None:
        selectors.append("(index <= %s)" % time.mktime(stop.timetuple()))

    # column selector
    selectors.append("((column == 'str-0') | (column == 'str-1'))")

    selector = ' & '.join(selectors)

    print "selector - [f-%s,start-%s,stop-%s] -- %s" % (test_file,start,stop,selector)
    ans = getattr(handle.root,'table').readWhere(selector)
    print "ans      - %s" % ans
    handle.close()


# no indexing
create('no_index',add_index = False)
select('no_index', start = datetime.datetime(2011,2,1,0,0,0))
select('no_index', start = datetime.datetime(2011,2,1,0,0,0), stop = datetime.datetime(2011,3,1,0,0,0))

# with indexing
create('with_index',add_index = True)
select('with_index', start = datetime.datetime(2011,2,1,0,0,0))
select('with_index', start = datetime.datetime(2011,2,1,0,0,0), stop = datetime.datetime(2011,3,1,0,0,0))
 
--ptdump -v on the no_index.hdf
[cow-jreback-/tmp] ptdump -v no_index.hdf
/ (RootGroup) ''
/table (Table(100,)) ''
  description := {
  column: StringCol(itemsize=25, shape=(), dflt='', pos=0),
  index: Time64Col(shape=(), dflt=0.0, pos=1),
  values: Float64Col(shape=(3,), dflt=0.0, pos=2)}
  byteorder := 'little'
  chunkshape := (1149,)

-- ptdump -v on the with_index.hdf 
 
[cow-jreback-/tmp] ptdump -v with_index.hdf 
/ (RootGroup) ''
/table (Table(100,)) ''
  description := {
  column: StringCol(itemsize=25, shape=(), dflt='', pos=0),
  index: Time64Col(shape=(), dflt=0.0, pos=1),
  values: Float64Col(shape=(3,), dflt=0.0, pos=2)}
  byteorder := 'little'
  chunkshape := (1149,)
  autoIndex := True
  colindexes := {
    index: Index(6, medium, shuffle, zlib(1)).is_CSI=False}

test output ---
selector - [f-no_index.hdf,start-2011-02-01 00:00:00,stop-None] -- (index >= 1296536400.0) & ((column == 'str-0') | (column == 'str-1'))
ans      - [('str-1', 1296565200.0, [0.0, 1.0, 2.0])
('str-0', 1296910800.0, [0.0, 1.0, 2.0])
('str-1', 1296997200.0, [0.0, 1.0, 2.0])
('str-0', 1297342800.0, [0.0, 1.0, 2.0])
('str-1', 1297429200.0, [0.0, 1.0, 2.0])
('str-0', 1297774800.0, [0.0, 1.0, 2.0])
('str-1', 1297861200.0, [0.0, 1.0, 2.0])
('str-0', 1298206800.0, [0.0, 1.0, 2.0])
('str-1', 1298293200.0, [0.0, 1.0, 2.0])
('str-0', 1298638800.0, [0.0, 1.0, 2.0])
('str-1', 1298725200.0, [0.0, 1.0, 2.0])
('str-0', 1299070800.0, [0.0, 1.0, 2.0])
('str-1', 1299157200.0, [0.0, 1.0, 2.0])
('str-0', 1299502800.0, [0.0, 1.0, 2.0])
('str-1', 1299589200.0, [0.0, 1.0, 2.0])
('str-0', 1299934800.0, [0.0, 1.0, 2.0])
('str-1', 1300017600.0, [0.0, 1.0, 2.0])
('str-0', 1300363200.0, [0.0, 1.0, 2.0])
('str-1', 1300449600.0, [0.0, 1.0, 2.0])
('str-0', 1300795200.0, [0.0, 1.0, 2.0])
('str-1', 1300881600.0, [0.0, 1.0, 2.0])
('str-0', 1301227200.0, [0.0, 1.0, 2.0])
('str-1', 1301313600.0, [0.0, 1.0, 2.0])
('str-0', 1301659200.0, [0.0, 1.0, 2.0])
('str-1', 1301745600.0, [0.0, 1.0, 2.0])
('str-0', 1302091200.0, [0.0, 1.0, 2.0])
('str-1', 1302177600.0, [0.0, 1.0, 2.0])]
selector - [f-no_index.hdf,start-2011-02-01 00:00:00,stop-2011-03-01 00:00:00] -- (index >= 1296536400.0) & (index <= 1298955600.0) & ((column == 'str-0') | (column == 'str-1'))
ans      - [('str-1', 1296565200.0, [0.0, 1.0, 2.0])
('str-0', 1296910800.0, [0.0, 1.0, 2.0])
('str-1', 1296997200.0, [0.0, 1.0, 2.0])

Re: [Pytables-users] variable length strings in tables?

2012-12-03 Thread Jeff Reback
thanks

created 

https://github.com/PyTables/PyTables/issues/198


 



 From: Anthony Scopatz scop...@gmail.com
To: Jeff Reback j...@reback.net; Discussion list for PyTables 
pytables-users@lists.sourceforge.net 
Sent: Monday, December 3, 2012 11:15 AM
Subject: Re: [Pytables-users] variable length strings in tables?
 

On Sun, Dec 2, 2012 at 2:49 PM, Jeff Reback jreb...@yahoo.com wrote:

Hi,
 
Pandas uses PyTables as a storage backend, and it has worked out quite well.
FYI: http://pandas.pydata.org/pandas-docs/dev/io.html#hdf5-pytables
 
I have a particular use case where I build a table, then later append to it.
Fixed types are no problem. However, I often index these tables by StringCols,
which I pre-allocate to the largest size I think I'll need. So I wanted to
think about supporting variable-length string columns in the table.
 
any thoughts on these strategies:
1) any way to directly support a variable-length string in a particular 
column? (e.g. VLStringCol doesn't exist but a stand-alone VLStringAtom does)

This is possible, as the underlying HDF5 library will support it.  However, no 
one has had the time to write it.  Please open an issue (or possibly a pull 
request) related to this.
 
2) As an alternative, I could store, along with the table, a VLArray with the
same number of rows as the table and keep the string data there
   -- of course I would have to keep the two synchronized (and this doesn't 
help with an 'indexing' column, just with 'data' columns)

This is what I do in PyTables and HDF5 itself.  It works out quite well for me. 
 This has the advantage that the VLString data get compressed separately from 
the numeric data (if using compression).  Yes, it is one more thing to manage, 
but the file sizes are significantly smaller.

Be Well
Anthony
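
Strategy 2 above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name append_synced is hypothetical, and it assumes that `table` is a PyTables Table and `strings` is a VLArray (for example one created with a VLStringAtom) stored next to it in the same file. It relies only on the `nrows` attribute and the `append` method that both node types provide.

```python
def append_synced(table, strings, row, text):
    """Append one fixed-width row and its variable-length string together.

    `table` is assumed to be a tables.Table and `strings` a tables.VLArray;
    row i of the table is paired with entry i of the VLArray.
    Returns the shared row number of the new record.
    """
    # Refuse to write if the two containers have already drifted apart.
    if table.nrows != strings.nrows:
        raise ValueError("table has %d rows but VLArray has %d"
                         % (table.nrows, strings.nrows))
    position = table.nrows
    table.append([row])      # Table.append takes a sequence of rows
    strings.append(text)     # VLArray.append takes one variable-length entry
    return position
```

Routing every write through one helper like this keeps the synchronization concern in a single place; reading record i then means reading row i of the table and entry i of the VLArray.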
 
 
thanks,
 
Jeff
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users



[Pytables-users] readWhere, number of selectors issue

2013-01-23 Thread Jeff Reback
It seems there is a limit to the condition syntax when using readWhere.
 
I get various exceptions when passing an increasing number of terms.
 
Is this some kind of hard-coded limit? 
 
Is there a way to pre-compile the condition and test for it (e.g. when I am
actually creating the condition)? 
- My alternative is simply to drop that part of the condition and filter
afterwards.
 
thanks,
 
Jeff
 
ans  - [n-2   ,len_selector-58  ] -- (399,)
ans  - [n-10  ,len_selector-234 ] -- (999,)
ans  - [n-100 ,len_selector-2304    ] -- (999,)
ans  - [n-200 ,len_selector-4704    ] -- (999,)
ans  - [n-254 ,len_selector-6000    ] -- chr() arg not in range(256)
ans  - [n-255 ,len_selector-6024    ] -- chr() arg not in range(256)
ans  - [n-300 ,len_selector-7104    ] -- chr() arg not in range(256)
ans  - [n-400 ,len_selector-9504    ] -- maximum recursion depth exceeded while calling a Python object
ans  - [n-500 ,len_selector-11904   ] -- maximum recursion depth exceeded while calling a Python object

 script to reproduce 
#!/usr/local/bin/python
import tables
import numpy as np
import datetime, time
test_file = 'test_select.h5'
handle = tables.openFile(test_file, "w")
node   = handle.createGroup(handle.root, 'test')
table  = handle.createTable(node, 'table', dict(
    index   = tables.Int64Col(),
    column  = tables.StringCol(25),
    values  = tables.FloatCol(shape=(3)),
    ))
    
# add data
r = table.row
for i in xrange(1000):
    r['index'] = i
    r['column'] = ("str-%d" % (i % 5))
    r['values'] = np.arange(3)
    r.append()
table.flush()
handle.close()
 
def read_for(n):
    handle = tables.openFile(test_file,"r")
    selector = "(index >= 1) & %s" % '(' + ' | '.join([ "(column == 'str-%s')" % v for v in range(n) ]) + ')'
    #print "selector - [%s] -- %s" % (n,selector)
    try:
        ans = handle.root.test.table.readWhere(selector)
        print "ans  - [n-%-20.20s,len_selector-%-20.20s] -- %s" % (n,len(selector),ans.shape)
    except (Exception), detail:
        print "ans  - [n-%-20.20s,len_selector-%-20.20s] -- %s" % (n,len(selector),str(detail))
    handle.close()
 
for n in [ 2, 10, 100, 200, 254, 255, 300, 400, 500 ]:
    read_for(n)
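
One way to stay below whatever the limit is, while still querying in-kernel, would be to split the long OR chain into several shorter conditions and run one query per piece. The sketch below only builds the condition strings. The name chunked_conditions and the default of 100 terms per condition are assumptions made for illustration: the output above shows 200 terms working and 254 failing, but the exact limit is not established in this message.

```python
def chunked_conditions(values, prefix="(index >= 1)", column="column",
                       max_terms=100):
    """Yield condition strings with at most `max_terms` OR-ed terms each."""
    values = list(values)
    for start in range(0, len(values), max_terms):
        chunk = values[start:start + max_terms]
        terms = " | ".join("(%s == '%s')" % (column, v) for v in chunk)
        yield "%s & (%s)" % (prefix, terms)


conditions = list(chunked_conditions(["str-%d" % i for i in range(254)]))
print(len(conditions))        # number of separate queries to run
print(len(conditions[0]))     # length of the first condition string
```

Each string could then be passed to readWhere in turn and the results concatenated (for example with numpy.concatenate). Provided the values are distinct, the chunks test disjoint values of the same column, so no row should come back twice.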


[Pytables-users] pytable 3 - with encoding

2013-06-04 Thread Jeff Reback
anthony,



[Pytables-users] pytable 30 - encoding

2013-06-04 Thread Jeff Reback
anthony,
where am I going wrong here? 
#!/usr/local/bin/python3
import tables
import numpy as np
import datetime, time
encoding = 'UTF-8'
test_file = 'test_select.h5'
handle = tables.openFile(test_file, "w")
node   = handle.createGroup(handle.root, 'test')
table  = handle.createTable(node, 'table', dict(
    index   = tables.Int64Col(),
    column  = tables.StringCol(25),
    values  = tables.FloatCol(shape=(3)),
    ))

# add data
r = table.row
for i in range(10):
    r['index'] = i
    r['column'] = ("str-%d" % (i % 5)).encode(encoding)
    r['values'] = np.arange(3)
    r.append()
table.flush()
handle.close()
# read
handle = tables.openFile(test_file,"r")
result = handle.root.test.table.read()
print("table data\n")
print(result)
# where
print("\nselector\n")
selector = "(column == 'str-2')".encode(encoding)
print(selector)
result = handle.root.test.table.readWhere(selector)
print(result)

and the following out:

[sheep-jreback-/code/arb/test] python3 pytables-3.py
table data
[(b'str-0', 0, [0.0, 1.0, 2.0]) (b'str-1', 1, [0.0, 1.0, 2.0])
(b'str-2', 2, [0.0, 1.0, 2.0]) (b'str-3', 3, [0.0, 1.0, 2.0])
(b'str-4', 4, [0.0, 1.0, 2.0]) (b'str-0', 5, [0.0, 1.0, 2.0])
(b'str-1', 6, [0.0, 1.0, 2.0]) (b'str-2', 7, [0.0, 1.0, 2.0])
(b'str-3', 8, [0.0, 1.0, 2.0]) (b'str-4', 9, [0.0, 1.0, 2.0])]
selector
b"(column == 'str-2')"
Traceback (most recent call last):
  File "pytables-3.py", line 37, in <module>
    result = handle.root.test.table.readWhere(selector)
  File "/usr/local/lib/python3.3/site-packages/tables-3.0.0-py3.3-linux-x86_64.egg/tables/_past.py", line 35, in oldfunc
    return obj(*args, **kwargs)
  File "/usr/local/lib/python3.3/site-packages/tables-3.0.0-py3.3-linux-x86_64.egg/tables/table.py", line 1522, in read_where
    self._where(condition, condvars, start, stop, step)]
  File "/usr/local/lib/python3.3/site-packages/tables-3.0.0-py3.3-linux-x86_64.egg/tables/table.py", line 1484, in _where
    compiled = self._compile_condition(condition, condvars)
  File "/usr/local/lib/python3.3/site-packages/tables-3.0.0-py3.3-linux-x86_64.egg/tables/table.py", line 1358, in _compile_condition
    compiled = compile_condition(condition, typemap, indexedcols)
  File "/usr/local/lib/python3.3/site-packages/tables-3.0.0-py3.3-linux-x86_64.egg/tables/conditions.py", line 419, in compile_condition
    func = NumExpr(expr, signature)
  File "/usr/local/lib/python3.3/site-packages/numexpr-2.1-py3.3-linux-x86_64.egg/numexpr/necompiler.py", line 559, in NumExpr
    precompile(ex, signature, context)
  File "/usr/local/lib/python3.3/site-packages/numexpr-2.1-py3.3-linux-x86_64.egg/numexpr/necompiler.py", line 511, in precompile
    constants_order, constants = getConstants(ast)
  File "/usr/local/lib/python3.3/site-packages/numexpr-2.1-py3.3-linux-x86_64.egg/numexpr/necompiler.py", line 294, in getConstants
    for a in constants_order]
  File "/usr/local/lib/python3.3/site-packages/numexpr-2.1-py3.3-linux-x86_64.egg/numexpr/necompiler.py", line 294, in <listcomp>
    for a in constants_order]
  File "/usr/local/lib/python3.3/site-packages/numexpr-2.1-py3.3-linux-x86_64.egg/numexpr/necompiler.py", line 284, in convertConstantToKind
    return kind_to_type[kind](x)
TypeError: string argument without an encoding
Closing remaining open files: test_select.h5... done
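
The final TypeError is the message Python 3 itself produces when bytes() is called on a str without naming an encoding. The snippet below reproduces that message in isolation. That numexpr's kind_to_type maps string constants to bytes is an assumption inferred from the traceback, not something the traceback shows, and to_bytes_constant is a hypothetical helper illustrating the kind of conversion that avoids the error.

```python
def to_bytes_constant(value, encoding="utf-8"):
    """Convert a str or bytes constant to bytes, encoding only when needed."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


try:
    bytes("str-2")                      # str with no encoding argument
except TypeError as exc:
    message = str(exc)
    print(message)

print(to_bytes_constant("str-2"))
print(to_bytes_constant(b"str-2"))
```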


Re: [Pytables-users] pytable 30 - encoding

2013-06-04 Thread Jeff Reback
Anthony,
 
I am using numexpr 2.1 (latest)
 
This is puzzling; it doesn't matter what I pass (bytes or str), I get the same result:
 
(column == 'str-2')
> /mnt/code/arb/test/pytables-3.py(38)<module>()
-> result = handle.root.test.table.readWhere(selector)
(Pdb) handle.root.test.table.readWhere(selector)
*** TypeError: string argument without an encoding
(Pdb) handle.root.test.table.readWhere(selector.encode(encoding))
*** TypeError: string argument without an encoding
(Pdb) 

  


 From: Anthony Scopatz scop...@gmail.com
To: Jeff Reback j...@reback.net; Discussion list for PyTables 
pytables-users@lists.sourceforge.net 
Sent: Tuesday, June 4, 2013 12:25 PM
Subject: Re: [Pytables-users] pytable 30 - encoding
  


Hi Jeff, 

Have you also updated numexpr to the most recent version?  The error is coming 
from numexpr not compiling the expression correctly. Also, you might try making 
selector a str, rather than bytes: 

selector = "(column == 'str-2')"


rather than

selector = "(column == 'str-2')".encode(encoding)


Be Well
Anthony




Re: [Pytables-users] pytable 30 - encoding

2013-06-04 Thread Jeff Reback
Anthony,
 
I created an issue with more info.
 
I am not sure if this is a bug, or just the way both numexpr and PyTables treat
strings that need to touch an encoded value.
 
I found a workaround by specifying the condvars to readWhere. Any more thoughts 
on this?
 
Thanks, Jeff
 
 
https://github.com/PyTables/PyTables/issues/265
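
The condvars workaround mentioned above presumably looks something like the following. The exact form is not quoted in this message, so treat this as a guess. read_matching is a hypothetical helper; the idea is to keep the string literal out of the condition text and hand it over as an already-encoded bytes value through the condvars mapping that readWhere (read_where in the 3.0 naming) accepts. Whether this avoids the error may depend on the PyTables and numexpr versions in use.

```python
def read_matching(table, column_name, text, encoding="utf-8"):
    """Select rows whose string column equals `text`, passing it via condvars."""
    # The condition refers to a variable instead of embedding a quoted literal.
    condition = "%s == target" % column_name
    # StringCol data comes back as bytes, so compare against bytes.
    condvars = {"target": text.encode(encoding)}
    return table.read_where(condition, condvars=condvars)
```

With the table from the script in this thread, the call would be read_matching(handle.root.test.table, 'column', 'str-2').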

 


 From: Anthony Scopatz scop...@gmail.com
To: Jeff Reback j...@reback.net 
Cc: Discussion list for PyTables pytables-users@lists.sourceforge.net 
Sent: Tuesday, June 4, 2013 6:39 PM
Subject: Re: [Pytables-users] pytable 30 - encoding
  


Hi Jeff,
Hmmm, could you try doing the same thing on just an in-memory numpy array using 
numexpr?  If this succeeds, it tells us that the problem is in PyTables, not 
numexpr.

Be Well
Anthony




Re: [Pytables-users] dates and space

2013-08-05 Thread Jeff Reback
Here is a pandas solution for doing just this (which uses PyTables under the 
hood):

# create a frame
In [45]: df = DataFrame(randn(1000,2),index=date_range('20000101',periods=1000))

In [53]: df
Out[53]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1000 entries, 2000-01-01 00:00:00 to 2002-09-26 00:00:00
Freq: D
Data columns (total 2 columns):
0    1000  non-null values
1    1000  non-null values
dtypes: float64(2)

# store it as a table
In [46]: store = pd.HDFStore('test.h5',mode='w')

In [47]: store.append('df',df)

# select out the index (a datetimeindex in this case)
In [48]: c = store.select_column('df','index')

# get the coordinates of matching index
In [49]: coords = c[pd.DatetimeIndex(c).month==5]

# select those rows
In [51]: from pandas.io.pytables import Coordinates

In [50]: store.select('df',where=Coordinates(coords.index,None,None))
Out[50]: 
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 93 entries, 2000-05-01 00:00:00 to 2002-05-31 00:00:00
Data columns (total 2 columns):
0    93  non-null values
1    93  non-null values
dtypes: float64(2)




 From: Anthony Scopatz scop...@gmail.com
To: Discussion list for PyTables pytables-users@lists.sourceforge.net 
Sent: Monday, August 5, 2013 2:54 PM
Subject: Re: [Pytables-users] dates and space
 


On Mon, Aug 5, 2013 at 1:38 PM, Oleksandr Huziy guziy.sa...@gmail.com wrote:

Hi Pytables users and developers:


I have a few questions to which I could not find the answer in the 
documentation. Thank you in advance for any help.


1. If I store dates in Pytables, does it mean I could write queries like 
table.where('date.month == 5')? Is there a common way to pass from python's 
datetime to pytable's datetime and inversely?

Hello Sasha, 

PyTables times are actually based on C time, not Python's datetimes.  
This is because they use the HDF5 time types.  So unfortunately you can't write 
queries like the one above.  (You'd need to talk to numexpr about getting that 
kind of query implemented ~_~.)

Instead I would suggest that you store your times as Float64Atoms and 
Float64Cols and then use arithmetic to figure out the query:

table.where("(x / 3600 / 24) % 12 == 5")

This is not perfect...
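
Since a month test cannot be expressed directly in the query, one alternative (a sketch, not something proposed in this thread) is to precompute the timestamp range of the wanted month for each year of interest and OR those ranges together. The helper name month_condition, the column name x, and the assumption that the column holds seconds since the epoch in UTC are all illustrative.

```python
import calendar
import datetime


def month_condition(column, month, first_year, last_year):
    """Build a condition selecting `month` in every year of the given span.

    Timestamps are assumed to be seconds since the epoch, in UTC.
    """
    parts = []
    for year in range(first_year, last_year + 1):
        start = datetime.datetime(year, month, 1)
        # First day of the following month (rolling over after December).
        if month == 12:
            end = datetime.datetime(year + 1, 1, 1)
        else:
            end = datetime.datetime(year, month + 1, 1)
        lo = calendar.timegm(start.timetuple())
        hi = calendar.timegm(end.timetuple())
        parts.append("((%s >= %d) & (%s < %d))" % (column, lo, column, hi))
    return " | ".join(parts)


print(month_condition("x", 5, 2000, 2002))
```

The resulting string could be passed to table.where(). For long year spans the number of terms grows, so it may be worth issuing one query per year instead.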
 
2. I have several variables stored in the same file, in a separate table for 
each variable. I use separate columns (year, month, day, hour, minute, 
second) to mark the time for a record (the records are not necessarily 
ordered in time), and this is for each variable. I was thinking of putting all 
the variables in the same table and using missing values for the variables which 
do not have outputs for a given time step. Is it possible to put None as a 
default value into a table (so I could easily filter dummy rows)?


It is not possible to use None since that is a Python object of a different 
type than the other integers you are trying to stick in the column.  I would 
suggest that you use values with no actual meaning.  If you are using normal 
ints you can use -1 to represent missing values.  If you are using unsigned 
ints you have to pick other values, like 13 for month on the Julian calendar.
 
But then again the data comes in chunks, does this mean I would have to check 
if a row with the same date already exist for a different variable?

No, you wouldn't; you can store the same data multiple times in different rows.
 
I don't really like the ideas in 2, which are intended to save space, but maybe 
all I need is a good compression level? Can somebody advise me on this?


Compression would definitely help here since the date numbers are all fairly 
similar.  Probably even a compression level of 1 would work.  Keep in mind that 
sometimes using compression actually speeds things up (see the starving CPU 
problem).  You might just need to experiment with a few different compression 
levels to see how things go. 0, 1, 5, 9 gives you a good spread.
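
A quick way to get a feel for how well date columns compress, before touching the real file, is to run zlib from the standard library over some synthetic records at the levels suggested above. This is only a rough stand-in: PyTables applies its filters (set through tables.Filters with complevel and complib) chunk by chunk, so real numbers will differ. The record layout below, six 32-bit integers for year, month, day, hour, minute and second, is an assumption.

```python
import datetime
import struct
import zlib


def packed_dates(n, start=datetime.datetime(2000, 1, 1)):
    """Pack n hourly timestamps as (year, month, day, hour, minute, second)."""
    out = []
    for i in range(n):
        t = start + datetime.timedelta(hours=i)
        out.append(struct.pack("<6i", t.year, t.month, t.day,
                               t.hour, t.minute, t.second))
    return b"".join(out)


raw = packed_dates(10000)
sizes = {level: len(zlib.compress(raw, level)) for level in (0, 1, 5, 9)}
for level in sorted(sizes):
    print("level %d: %d bytes (raw %d)" % (level, sizes[level], len(raw)))
```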

Be Well
Anthony
 






Cheers
--
Oleksandr (Sasha) Huziy   