[Pytables-users] PyTables in-kernel query using Time64Col returns wrong results

2013-04-15 Thread Charles de Villiers

I'm using PyTables 2.4.0 and Python 2.7. I've got a database that 
contains the following typical table:
/anc/asc_wind_speed (Table(87591,), shuffle, blosc(3)) 'Wind speed'
  description := {
    value_seconds: Time64Col(shape=(), dflt=0.0, pos=0),
    update_seconds: Time64Col(shape=(), dflt=0.0, pos=1),
    status: UInt8Col(shape=(), dflt=0, pos=2),
    value: Float64Col(shape=(), dflt=0.0, pos=3)}
  byteorder := 'little'
  chunkshape := (2621,)
  autoIndex := True
  colindexes := {
    update_seconds: Index(9, full, shuffle, zlib(1)).is_CSI=True,
    value: Index(9, full, shuffle, zlib(1)).is_CSI=True}
I populate the timestamp columns using float seconds.
The data looks OK in my IPython session:
array([(1343779432.2160001, 1343779431.852, 0, 5.29750003),
       (1343779433.2190001, 1343779432.9430001, 0, 5.74749996),
       (1343779434.217, 1343779433.980, 0, 5.8603),
       ...,
       (1343866301.934, 1343866301.513, 0, 3.84249998),
       (1343866302.934, 1343866302.579, 0, 4.0596),
       (1343866303.934, 1343866303.642, 0, 3.78250002)],
      dtype=[('value_seconds', 'f8'), ('update_seconds', 'f8'), ('status', '|u1'), ('value', 'f8')])
... but when I try to do an in-kernel search using the indexed column 
'update_seconds', everything goes pear-shaped:
len(wstable.readWhere('(update_seconds <= 1343866303.642)'))
0
i.e. I get 0 rows returned when I was expecting all 87591 of them. Occasionally 
I do manage to get some rows with a '=' query, but the timestamp columns are 
then returned as huge floats (~10^79). It seems that some implicit type 
conversion is causing the Time64Col values to be misinterpreted. 
Can someone spot my mistake, or should I forget about Time64Cols and convert 
them all to Float64 (and how do I do this)? 
___
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users


Re: [Pytables-users] PyTables in-kernel query using Time64Col returns wrong results

2013-04-15 Thread Charles de Villiers
Hi Anthony,

Thanks for your response.

I had come across that discussion, but I don't think floating-point 
precision really explains my results, because I'm querying for intervals, 
not instants.
If I have a table containing, say, one-second samples between 500.0 and 1500.0, 
and I use a where clause like this:
'(update_seconds >= 1000.0) & (update_seconds <= 1060.0)'
then I expect to get at least 58 samples, even with floating-point 'fuzziness', 
but in fact I get none. 
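
As a sanity check on that arithmetic, here is a pure-Python sketch (no PyTables involved) that builds one-second samples from 500.0 to 1500.0, perturbs them slightly to mimic floating-point 'fuzziness', and counts how many fall in the closed interval [1000.0, 1060.0]. The jitter magnitude, the seed and the helper name count_in_interval are assumptions made for illustration; they do not come from the thread.

```python
import random


def count_in_interval(samples, lo, hi):
    """Count samples t with lo <= t <= hi (a closed interval, like the query)."""
    return sum(1 for t in samples if lo <= t <= hi)


random.seed(0)  # fixed seed so the run is repeatable (assumed value)

# One-second samples from 500.0 to 1500.0 inclusive.
exact = [float(s) for s in range(500, 1501)]

# The same samples with up to +/-0.4 s of jitter (assumed magnitude).
jittered = [t + random.uniform(-0.4, 0.4) for t in exact]

n_exact = count_in_interval(exact, 1000.0, 1060.0)
n_jittered = count_in_interval(jittered, 1000.0, 1060.0)

print(n_exact)
print(n_jittered)
```

With this much jitter only the two endpoint samples can drop out of the interval, so a result of zero rows cannot be explained by rounding alone.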
However, I have now tried storing my epoch seconds in Float64Cols instead, 
and that seems to be working just fine. 
The question I'm left with is: just what does a Time64Col represent? Since 
there's no standard Python Time class with a float representation, I just 
guessed I could assign it float seconds a la time.time(), but Float64 works 
just as well for that (and, as it turns out, better). How could you use a 
Time64Col in practice?
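
For context, the PyTables documentation describes the time64 type as an 8-byte float holding seconds since the Unix epoch, which is the same kind of number time.time() returns. The sketch below uses only the standard library to show what such a value means and that it survives a round trip through datetime; it does not touch PyTables. It assumes Python 3 (the thread uses Python 2.7), and the helper names epoch_to_utc and utc_to_epoch are made up for illustration.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Interpret float epoch seconds as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def utc_to_epoch(dt):
    """Convert a timezone-aware datetime back to float epoch seconds."""
    return dt.timestamp()


# Last update_seconds value shown in the array dump in the original post.
stamp = 1343866303.642

dt = epoch_to_utc(stamp)
round_trip = utc_to_epoch(dt)

print(dt.isoformat())
print(abs(round_trip - stamp) < 1e-6)
```

Whether the column is declared as Time64Col or Float64Col, the value handed to the table is this same float; per the report above, only the Float64Col variant queried correctly on PyTables 2.4.0.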

Thanks again, 
 

Charles de Villiers

They have computers, and they may have other weapons of mass destruction.
(Janet Reno) 




From: Anthony Scopatz scop...@gmail.com
To: Charles de Villiers chas...@yahoo.com; Discussion list for PyTables pytables-users@lists.sourceforge.net
Sent: Monday, April 15, 2013 5:13 PM
Subject: Re: [Pytables-users] PyTables in-kernel query using Time64Col returns wrong results
 


Hi Charles, 

We just discussed this last week and I am too lazy to retype it all so here is 
a link to the archive post [1].

Be Well
Anthony

1. http://sourceforge.net/mailarchive/message.php?msg_id=30708089


