Hello Derek and devs,

After playing around with your data, I am able to reproduce this error on
my system.  I am not sure exactly where the problem is, but I do know how
to work around it!

It turns out that either the indexes are not properly in sync with the
original table, or the start and stop values are not being propagated
properly down to the indexes.  Reindexing by calling table.reIndex() did
not fix the issue, which makes me think the problem is in propagating
start, stop, and step all the way through correctly.  I'll go ahead and
make a ticket reflecting this.

That said, the way to work around this in the short term is to do one of
the following:

1) Only use start=0 and step=1 (I bet that other stop values work).
2) Don't use indexes.  When I removed the indexes from the file using
   "ptrepack analysis.h5 analysis2.h5", everything worked fine.
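
If you still need the effect of start/stop/step while keeping the indexes,
one possible stopgap for option 1 is to run the query over the whole table
and apply the range to the matching row numbers afterwards.  Below is a
minimal sketch of that idea.  The helper name restrict_to_range is made up
for illustration, it assumes non-negative start/stop, and the commented
usage (getWhereList / readCoordinates from the PyTables 2.x API) is
untested against your file:

```python
def restrict_to_range(coords, start=0, stop=None, step=1):
    """Keep only the row numbers that fall in range(start, stop, step).

    `coords` is any iterable of integer row numbers, for example the
    result of a query run over the whole table.  `stop=None` means
    "no upper bound".  Assumes start and stop are non-negative.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    kept = []
    for c in coords:
        c = int(c)
        if c < start:
            continue  # before the requested range
        if stop is not None and c >= stop:
            continue  # at or past the requested end
        if (c - start) % step:
            continue  # not on the requested stride
        kept.append(c)
    return kept


# Hypothetical usage with the table from this thread (PyTables 2.x names):
#
#   coords = table.getWhereList("id == 'ceec536a-394e-4dd7-a182-eea557f3bb93'")
#   coords = restrict_to_range(coords, start=3257, stop=table.nrows)
#   rows = table.readCoordinates(coords)
```

The query itself runs with the default start, stop, and step, which is the
case that behaved correctly here, and the range is then applied in plain
Python.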

Thanks a ton for reporting this!
Be Well
Anthony

On Tue, Sep 25, 2012 at 12:30 PM, Derek Shockey <derek.shoc...@gmail.com> wrote:

> Hi Anthony,
>
> It doesn't happen if I set start=0 or seemingly any number below 3257
> (though I didn't try them *all*). I am new to PyTables and hdf5, so
> I'm not sure about the chunksize or if I'm at a boundary. I did
> however notice that the table's chunkshape is 203, and this happens
> for exactly 203 sequential records, so I doubt that's a coincidence.
> The table description is below.
>
> Thanks,
> Derek
>
> /events (Table(5988,)) ''
>   description := {
>   "client_id": StringCol(itemsize=24, shape=(), dflt='', pos=0),
>   "data_01": StringCol(itemsize=36, shape=(), dflt='', pos=1),
>   "data_02": StringCol(itemsize=36, shape=(), dflt='', pos=2),
>   "data_03": StringCol(itemsize=36, shape=(), dflt='', pos=3),
>   "data_04": StringCol(itemsize=36, shape=(), dflt='', pos=4),
>   "data_05": StringCol(itemsize=36, shape=(), dflt='', pos=5),
>   "device_id": StringCol(itemsize=36, shape=(), dflt='', pos=6),
>   "id": StringCol(itemsize=36, shape=(), dflt='', pos=7),
>   "timestamp": Time64Col(shape=(), dflt=0.0, pos=8),
>   "type": UInt16Col(shape=(), dflt=0, pos=9),
>   "user_id": StringCol(itemsize=36, shape=(), dflt='', pos=10)}
>   byteorder := 'little'
>   chunkshape := (203,)
>   autoIndex := True
>   colindexes := {
>     "timestamp": Index(9, full, shuffle, zlib(1)).is_CSI=True,
>     "type": Index(9, full, shuffle, zlib(1)).is_CSI=True,
>     "id": Index(9, full, shuffle, zlib(1)).is_CSI=True,
>     "user_id": Index(9, full, shuffle, zlib(1)).is_CSI=True}
>
> On Tue, Sep 25, 2012 at 9:32 AM, Anthony Scopatz <scop...@gmail.com>
> wrote:
> > Hi Derek,
> >
> > OK, that is very strange.  I cannot reproduce this on any of my data.  A
> > couple of quick extra questions:
> >
> > 1) Does this still happen when you set start=0?
> > 2) What is the chunksize of this data set (are you at a boundary)?
> > 3) Could you send us the full table information, i.e. repr(table)?
> >
> > Be Well
> > Anthony
> >
> >
> > On Tue, Sep 25, 2012 at 12:42 AM, Derek Shockey <derek.shoc...@gmail.com>
> > wrote:
> >>
> >> I ran the tests. All 4988 passed. The information it output is:
> >>
> >> PyTables version:  2.4.0
> >> HDF5 version:      1.8.9
> >> NumPy version:     1.6.2
> >> Numexpr version:   2.0.1 (not using Intel's VML/MKL)
> >> Zlib version:      1.2.5 (in Python interpreter)
> >> LZO version:       2.06 (Aug 12 2011)
> >> BZIP2 version:     1.0.6 (6-Sept-2010)
> >> Blosc version:     1.1.3 (2010-11-16)
> >> Cython version:    0.16
> >> Python version:    2.7.3 (default, Jul  6 2012, 00:17:51)
> >> [GCC 4.2.1 Compatible Apple Clang 3.1 (tags/Apple/clang-318.0.58)]
> >> Platform:          darwin-x86_64
> >> Byte-ordering:     little
> >> Detected cores:    4
> >>
> >> -Derek
> >>
> >> On Mon, Sep 24, 2012 at 9:09 PM, Anthony Scopatz <scop...@gmail.com>
> >> wrote:
> >> > Hi Derek,
> >> >
> >> > Can you please run the following command and report back what you see?
> >> >
> >> > python -c "import tables; tables.test()"
> >> >
> >> > Be Well
> >> > Anthony
> >> >
> >> > On Mon, Sep 24, 2012 at 10:56 PM, Derek Shockey
> >> > <derek.shoc...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Hello,
> >> >>
> >> >> I'm hoping someone can help me. When I specify start and stop values
> >> >> for calls to where() and readWhere(), it is returning blatantly
> >> >> incorrect results:
> >> >>
> >> >> >>> table.readWhere("id == 'ceec536a-394e-4dd7-a182-eea557f3bb93'",
> >> >> ...                 start=3257, stop=table.nrows)[0]['id']
> >> >> '7f589d3e-a0e1-4882-b69b-0223a7de3801'
> >> >>
> >> >> >>> table.where("id == 'ceec536a-394e-4dd7-a182-eea557f3bb93'",
> >> >> ...             start=3257, stop=table.nrows).next()['id']
> >> >> '7f589d3e-a0e1-4882-b69b-0223a7de3801'
> >> >>
> >> >> This happens with a sequential block of about 150 rows of data, and
> >> >> each time it seems to be 8 rows off (i.e. the row it returns is 8 rows
> >> >> ahead of the row it should be returning). If I remove the start and
> >> >> stop args, it behaves correctly. This seems to be a bug, unless I am
> >> >> misunderstanding something. I'm using Python 2.7.3, PyTables 2.4.0,
> >> >> and hdf5 1.8.9 on OS X 10.8.2.
> >> >>
> >> >> Any ideas?
> >> >>
> >> >> Thanks,
> >> >> Derek
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> ------------------------------------------------------------------------------
> >> >> Live Security Virtual Conference
> >> >> Exclusive live event will cover all the ways today's security and
> >> >> threat landscape has changed and how IT managers can respond. Discussions
> >> >> will include endpoint security, mobile security and the latest in malware
> >> >> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> >> >> _______________________________________________
> >> >> Pytables-users mailing list
> >> >> Pytables-users@lists.sourceforge.net
> >> >> https://lists.sourceforge.net/lists/listinfo/pytables-users
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >
> >
> >
> >
> >
>
>
>
