I have to run, but here's what you requested (I won't be back on this
computer until Monday).

>>> cvals = np.fromiter([x['val'] for x in wctab02.where('val>1')],
...                     dtype=np.int16)
>>> cvals
array([], dtype=int16)
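
Since the list version comes back empty as well, the generator is probably
not the culprit: np.fromiter takes any iterable. A minimal sketch, with
plain dicts standing in for table rows (the `rows` list is made up for
illustration and does not touch PyTables or the index on wctab02):

```python
import numpy as np

# Hypothetical stand-in for the rows a table query would yield; not PyTables.
rows = [{"val": v} for v in (0, 2, 5, 1, 7)]

# Same filter, fed to np.fromiter once as a generator and once as a list.
from_gen = np.fromiter((r["val"] for r in rows if r["val"] > 1), dtype=np.int16)
from_list = np.fromiter([r["val"] for r in rows if r["val"] > 1], dtype=np.int16)

print(from_gen, from_list)
```

Both calls build the same int16 array here, so the empty result above more
likely comes from the where() query on the compressed, indexed table.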

>>> timeit big=np.argwhere(np.greater(wa02[:], 1))
1 loops, best of 3: 15.3 s per loop

This gives me a mask that I can also get with

>>> big2 = wa02[:]>1
>>> np.alltrue(big == big2)
True

and in far less time:
>>> timeit big2 = wa02[:]>1
1 loops, best of 3: 348 ms per loop
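
One caveat on this comparison: as far as I know, np.argwhere returns index
positions while the plain comparison returns a boolean mask, so the two
timings do not cover exactly the same work. A small sketch on a made-up
array (`a` is only a stand-in for wa02[:]):

```python
import numpy as np

# Small made-up array standing in for wa02[:] (the real one has 312000000 items).
a = np.array([0, 3, 1, 5, -2, 2], dtype=np.int16)

idx = np.argwhere(np.greater(a, 1))  # index positions, shape (n_hits, 1)
mask = a > 1                         # boolean mask, same shape as a

# flatnonzero converts the mask into the flat indices that argwhere reports.
assert np.array_equal(idx.ravel(), np.flatnonzero(mask))
print(idx.ravel(), mask)
```

If the indexes are what is wanted, timing np.flatnonzero(wa02[:] > 1) would
probably be the closer match to the argwhere call.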




-á.

/raw/t0/wa02 (Array(312000000,)) ''
  atom := Int16Atom(shape=(), dflt=0)
  maindim := 0
  flavor := 'numpy'
  byteorder := 'little'
  chunkshape := None


On Thu, Apr 19, 2012 at 15:33, Anthony Scopatz <scop...@gmail.com> wrote:
> I was interested in how long it takes to iterate, since this is arguably
> where the
> majority of the time is spent.
>
> On Thu, Apr 19, 2012 at 8:43 AM, Alvaro Tejero Cantero <alv...@minin.es>
> wrote:
>>
>> Some complementary info (I copy the details of the tables below)
>>
>> timeit vals = numpy.fromiter((x['val'] for x in
>> my.root.raw.t0.wtab02.where('val>1')),dtype=np.int16)
>> 1 loops, best of 3: 30.4 s per loop
>>
>>
>> Using the compressed and indexed version, it mysteriously does not
>> work (output is empty list)
>> >>> cvals = np.fromiter((x['val'] for x in wctab02.where('val>1')),
>> ...                     dtype=np.int16)
>> >>> cvals
>> array([], dtype=int16)
>
>
> This doesn't work because numpy doesn't accept generators.  The following
> should work:
> >>> cvals = np.fromiter([x['val'] for x in wctab02.where('val>1')],
> ...                     dtype=np.int16)
>
> Also, I am a little concerned that np.nonzero() doesn't really compare to
> Table.getWhereList('val>1').  Testing for all zero bits should be a lot
> faster
> than a numeric comparison.  Could you instead try the same actual operation
> in numpy as whereList():
>
> >>> timeit big=np.argwhere(np.greater(wa02[:], 1))
>
> Thanks!
> Anthony
>
>>
>>
>> But it does if we skip using where (I don't print cvals, but it is
>> correct)
>> >>> timeit cvals = np.fromiter((x['val'] for x in wctab02 if x['val']>1),
>> ...                            dtype=np.int16)
>> 1 loops, best of 3: 54.8 s per loop
>>
>> (the version with longer chunklen works fine and times to 30.7s).
>>
>>
>> -á.
>>
>> wtab02: not compressed, not indexed, small chunklen:
>> /raw/t0/wtab02 (Table(312000000,)) ''
>>  description := {
>>  "val": Int16Col(shape=(), dflt=0, pos=0)}
>>  byteorder := 'little'
>>  chunkshape := (32768,)
>>
>> larger chunklen (as calculated from expectedrows=312000000)
>> /raw/t0/wcetab02 (Table(312000000,)) 'test'
>>  description := {
>>  "val": Int16Col(shape=(), dflt=0, pos=0)}
>>  byteorder := 'little'
>>  chunkshape := (131072,)
>>
>> wctab02: compressed, with CSI index
>> /raw/t0/wctab02 (Table(312000000,), shuffle, blosc(9)) 'test'
>>  description := {
>>  "val": Int16Col(shape=(), dflt=0, pos=0)}
>>  byteorder := 'little'
>>  chunkshape := (32768,)
>>  autoIndex := True
>>  colindexes := {
>>    "val": Index(9, full, shuffle, zlib(1)).is_CSI=True}
>>
>>
>>
>> On Thu, Apr 19, 2012 at 12:46, Alvaro Tejero Cantero <alv...@minin.es>
>> wrote:
>> > where will give me an iterator over the /values/; in this case I
>> > wanted the indexes. Plus, it will give me an iterator, so it will be
>> > trivially fast.
>> >
>> > Are you interested in the timings of where + building a list? or where
>> > + building an array?
>> >
>> >
>> > -á.
>> >
>> >
>> >
>> > On Wed, Apr 18, 2012 at 19:02, Anthony Scopatz <scop...@gmail.com>
>> > wrote:
>> >>
>>
>>
>> ------------------------------------------------------------------------------
>> For Developers, A Lot Can Happen In A Second.
>> Boundary is the first to Know...and Tell You.
>> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
>> http://p.sf.net/sfu/Boundary-d2dvs2
>>
>> _______________________________________________
>> Pytables-users mailing list
>> Pytables-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/pytables-users
>
>
>
>
