Re: [Numpy-discussion] numpy ufuncs and COREPY - any info?

2009-05-22 Thread Gregor Thalhammer
dmitrey wrote:
 hi all,
 has anyone already tried to compare using an ordinary numpy ufunc vs
 the one from CorePy? First of all I mean the project
 http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t124024628235

 It would be interesting to know what the speedup is for (e.g.) vec ** 0.5 or
 (if it's possible - it isn't a pure ufunc) numpy.dot(Matrix, vec), or
 any other example.
   
I have no experience with the mentioned CorePy, but recently I was 
playing around with accelerated ufuncs using Intel's Math Kernel Library 
(MKL). These improvements are now part of the numexpr package:
http://code.google.com/p/numexpr/
Some remarks on possible speed improvements on recent Intel x86 processors:
1) basic arithmetic ufuncs (add, sub, mul, ...) in standard numpy are 
fast (SSE is used) and speed is limited by memory bandwidth.
2) the speed of many transcendental functions (exp, sin, cos, pow, ...) 
can be improved by _roughly_ a factor of five (single core) by using the 
MKL. Most of the improvement stems from using faster algorithms with a 
vectorized implementation. Note: the speed improvement depends on a 
_lot_ of other circumstances.
3) Improving performance by using multiple cores is much more difficult. 
Only for sufficiently large arrays (~1e5 elements) is a significant 
speedup possible. Where a speed gain is possible, the MKL uses several 
cores. Some experimentation showed that by adding a few OpenMP constructs 
you could get a similar speedup with numpy.
4) numpy.dot uses optimized implementations.
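
Points 1) and 2) are easy to see with a rough timing sketch in plain numpy (no MKL; the array size and repeat count below are arbitrary choices for illustration):

```python
import timeit

import numpy as np

a = np.random.rand(1_000_000)

# point 1: add is limited by memory bandwidth, so it is cheap per element
t_add = timeit.timeit(lambda: a + a, number=20)

# point 2: a transcendental function costs many more cycles per element,
# which is where MKL's vectorized algorithms can pay off
t_exp = timeit.timeit(lambda: np.exp(a), number=20)

print(t_exp / t_add)  # typically well above 1 on stock numpy
```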

Gregor
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] where are the benefits of ldexp and/or array times 2?

2009-05-22 Thread Gregor Thalhammer
dmitrey schrieb:
 Hi all,
 I expected to have some speedup via using ldexp or multiplying an
 array by a power of 2 (doesn't it have to perform a simple shift of
 mantissa?), but I don't see the one.

 # Let me also note -
 # 1) using b = 2 * ones(N) or b = zeros(N) doesn't yield any speedup
 vs b = rand()
 # 2) using A * 2.0 (or mere 2) instead of 2.1 doesn't yield any
 speedup, despite it is exact integer power of 2.
   
On recent processors multiplication is very fast and takes about 1.5 clock 
cycles (float or double precision), independent of the values. There is 
very little to gain by using bit-shift operators.
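
A quick sanity check (a sketch; exact timings vary by machine) that np.ldexp and a plain multiply by a power of two agree bit-for-bit, which is part of why there is nothing to win:

```python
import numpy as np

a = np.random.rand(100_000)

# ldexp(a, 1) computes a * 2**1 by adjusting the floating-point exponent
shifted = np.ldexp(a, 1)

# multiplying by an exact power of two is lossless, so results are identical
multiplied = a * 2.0

assert np.array_equal(shifted, multiplied)
```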

Gregor


Re: [Numpy-discussion] numpy ufuncs and COREPY - any info?

2009-05-22 Thread Francesc Alted
On Friday 22 May 2009 11:42:56 Gregor Thalhammer wrote:
 dmitrey schrieb:
  hi all,
  has anyone already tried to compare using an ordinary numpy ufunc vs
  that one from corepy, first of all I mean the project
  http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t1
 24024628235
 
  It would be interesting to know what is speedup for (eg) vec ** 0.5 or
  (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or
  any another example.

 I have no experience with the mentioned CorePy, but recently I was
 playing around with accelerated ufuncs using Intels Math Kernel Library
 (MKL). These improvements are now part of the numexpr package
 http://code.google.com/p/numexpr/
 Some remarks on possible speed improvements on recent Intel x86 processors.
 1) basic arithmetic ufuncs (add, sub, mul, ...) in standard numpy are
 fast (SSE is used) and speed is limited by memory bandwidth.
 2) the speed of many transcendental functions (exp, sin, cos, pow, ...)
 can be improved by _roughly_ a factor of five (single core) by using the
 MKL. Most of the improvements stem from using faster algorithms with a
 vectorized implementation. Note: the speed improvement depends on a
 _lot_ of other circumstances.
 3) Improving performance by using multi cores is much more difficult.
 Only for sufficiently large (1e5) arrays a significant speedup is
 possible. Where a speed gain is possible, the MKL uses several cores.
 Some experimentation showed that adding a few OpenMP constructs you
 could get a similar speedup with numpy.
 4) numpy.dot uses optimized implementations.

Good points Gregor.  However, I wouldn't say that improving performance by 
using multiple cores is *that* difficult, but rather that multiple cores can 
only be used efficiently *when* memory bandwidth is not a limitation.  An 
example of this is the computation of transcendental functions, where, even 
with vectorized implementations, the computation is still CPU-bound in many 
cases.  And you have yourself observed very good speed-ups for these cases 
with your numexpr/MKL implementation :)

Cheers,

-- 
Francesc Alted


Re: [Numpy-discussion] where are the benefits of ldexp and/or array times 2?

2009-05-22 Thread Francesc Alted
On Friday 22 May 2009 11:55:31 Gregor Thalhammer wrote:
 dmitrey schrieb:
  Hi all,
  I expected to have some speedup via using ldexp or multiplying an
  array by a power of 2 (doesn't it have to perform a simple shift of
  mantissa?), but I don't see the one.
 
  # Let me also note -
  # 1) using b = 2 * ones(N) or b = zeros(N) doesn't yield any speedup
  vs b = rand()
  # 2) using A * 2.0 (or mere 2) instead of 2.1 doesn't yield any
  speedup, despite it is exact integer power of 2.

 On recent processors multiplication is very fast and takes 1.5 clock
 cycles (float, double precision), independent of the values. There is
 very little gain by using bit shift operators.

...unless you use the vectorization capabilities of modern Intel-compatible 
processors and shift data in bunches of up to 4 elements (i.e. the number of 
single-precision floats that fit in a 128-bit SSE2 register), in which case 
you can reach speeds of up to 0.25 cycles/element.  That does require dealing 
with SSE2 instructions in your code, but with the latest GCC, ICC or MSVC 
compilers this is not that difficult.

Cheers,

-- 
Francesc Alted


Re: [Numpy-discussion] Home for pyhdf5io?

2009-05-22 Thread Pauli Virtanen
Fri, 22 May 2009 10:00:56 +0200, Francesc Alted wrote:
[clip: pyhdf5io]
 I've been having a look at your module and it seems pretty cute. 
 Incidentally, there is another module that does similar things:
 
 http://www.elisanet.fi/ptvirtan/software/hdf5pickle/index.html
 
 However, I do like your package better in the sense that it adds more
 'magic' to the load/save routines.  But maybe you want to have a look at
 the above: it can give you more ideas, like for example, using CArrays
 and compression for very large arrays, or Tables for structured arrays.

I don't think these two are really comparable. The significant difference 
appears to be that pyhdf5io is a thin wrapper around File.createArray, so 
when it encounters non-array objects, it will pickle them to strings, and 
save the strings to the HDF5 file.

Hdf5pickle, OTOH, implements the pickle protocol, and will unwrap non-array 
objects so that all their attributes etc. are exposed in the HDF5 file and 
can be read by non-Python applications.
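
The difference can be illustrated without HDF5 at all (the Config class below is a made-up example object, not an API of either package):

```python
import pickle


class Config:
    """A made-up non-array object to store."""
    def __init__(self):
        self.rate = 0.5
        self.name = "run1"


cfg = Config()

# pyhdf5io-style: the whole object becomes one opaque pickled byte string,
# unreadable to non-Python applications
blob = pickle.dumps(cfg)

# hdf5pickle-style: each attribute would become its own node in the file,
# so other tools can read the individual fields
exposed = vars(cfg)
```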

-- 
Pauli Virtanen



Re: [Numpy-discussion] numpy ufuncs and COREPY - any info?

2009-05-22 Thread Andrew Friedley
(sending again)

Hi,

I'm the student doing the project.  I have a blog here, which contains 
some initial performance numbers for a couple of test ufuncs I did:

http://numcorepy.blogspot.com

It's really too early to give definitive results though; GSoC 
officially starts in two days :)  What I'm finding is that the existing 
ufuncs are already pretty fast; right now it appears that the main 
limitation is memory bandwidth.  If that's really the case, the 
performance gains I'll get will come from cache tricks (non-temporal 
loads/stores), reducing memory accesses, and using multiple cores to get 
more bandwidth.

Another alternative we've talked about, and one I will more and more likely 
look into, is composing multiple operations together into a single ufunc. 
Again, the main idea is that memory accesses can be reduced or eliminated.
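
The idea can be sketched with plain numpy's out= arguments (not CorePy; just an illustration of why fewer passes over memory help):

```python
import numpy as np

a, b, c = (np.random.rand(1_000_000) for _ in range(3))

# naive: a * b allocates a temporary array, then + c makes a second full pass
naive = a * b + c

# "composed" style: one preallocated buffer, no extra temporary
out = np.empty_like(a)
np.multiply(a, b, out=out)
np.add(out, c, out=out)

assert np.allclose(naive, out)
```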

Andrew

dmitrey wrote:
 hi all,
 has anyone already tried to compare using an ordinary numpy ufunc vs
 that one from corepy, first of all I mean the project
 http://socghop.appspot.com/student_project/show/google/gsoc2009/python/t124024628235
 
 It would be interesting to know what is speedup for (eg) vec ** 0.5 or
 (if it's possible - it isn't pure ufunc) numpy.dot(Matrix, vec). Or
 any another example.


Re: [Numpy-discussion] numpy ufuncs and COREPY - any info?

2009-05-22 Thread Andrew Friedley


Francesc Alted wrote:
 A Friday 22 May 2009 11:42:56 Gregor Thalhammer escrigué:
 dmitrey schrieb:
 3) Improving performance by using multi cores is much more difficult.
 Only for sufficiently large (1e5) arrays a significant speedup is
 possible. Where a speed gain is possible, the MKL uses several cores.
 Some experimentation showed that adding a few OpenMP constructs you
 could get a similar speedup with numpy.
 4) numpy.dot uses optimized implementations.
 
 Good points Gregor.  However, I wouldn't say that improving performance by 
 using multi cores is *that* difficult, but rather that multi cores can only 
 be 
 used efficiently *whenever* the memory bandwith is not a limitation.  An 
 example of this is the computation of transcendental functions, where, even 
 using vectorized implementations, the computation speed is still CPU-bounded 
 in many cases.  And you have experimented yourself very good speed-ups for 
 these cases with your implementation of numexpr/MKL :)

Using multiple cores is pretty easy for element-wise ufuncs; no 
communication needs to occur and the work partitioning is trivial.  And 
actually I've found with some initial testing that multiple cores do 
still help when you are memory-bound.  I don't fully understand why yet, 
though I have some ideas.  One reason is multiple memory controllers due 
to multiple sockets (i.e. Opteron).  Another is that each thread is 
pulling memory from a different bank, utilizing more bandwidth than a 
single sequential thread could.  However, if that's the case, we could 
possibly come up with code for a single thread that achieves (nearly) 
the same additional throughput.
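
A minimal sketch of that trivial partitioning, using Python threads (numpy ufuncs release the GIL while crunching, so chunks can run concurrently; parallel_ufunc is a name made up here):

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def parallel_ufunc(ufunc, x, nthreads=2):
    """Apply a unary ufunc to x, one contiguous chunk per thread."""
    out = np.empty_like(x)
    # chunk boundaries: nthreads contiguous slices covering the array
    bounds = np.linspace(0, len(x), nthreads + 1).astype(int)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # writing through the out= slice fills the shared result in place
        ufunc(x[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(nthreads) as ex:
        list(ex.map(work, range(nthreads)))  # list() surfaces any exceptions
    return out


x = np.random.rand(1_000_000)
assert np.allclose(parallel_ufunc(np.exp, x), np.exp(x))
```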

Andrew


Re: [Numpy-discussion] numpy ufuncs and COREPY - any info?

2009-05-22 Thread Francesc Alted
On Friday 22 May 2009 13:59:17 Andrew Friedley wrote:
 Using multiple cores is pretty easy for element-wise ufuncs; no
 communication needs to occur and the work partitioning is trivial.  And
 actually I've found with some initial testing that multiple cores does
 still help when you are memory bound.  I don't fully understand why yet,
 though I have some ideas.  One reason is multiple memory controllers due
 to multiple sockets (ie opteron).

Yeah.  I think that is most likely the reason.  If, as in your case, you have 
several independent paths from different processors to your data, then you 
can achieve speed-ups even if you are memory-bound in a one-processor 
scenario.

 Another is that each thread is
 pulling memory from a different bank, utilizing more bandwidth than a
 single sequential thread could.  However if that's the case, we could
 possibly come up with code for a single thread that achieves (nearly)
 the same additional throughput..

Well, I don't think you can achieve important speed-ups in this case, but 
experimenting never hurts :)

Good luck!

-- 
Francesc Alted


Re: [Numpy-discussion] numpy ufuncs and COREPY - any info?

2009-05-22 Thread Francesc Alted
On Friday 22 May 2009 13:52:46 Andrew Friedley wrote:
 (sending again)

 Hi,

 I'm the student doing the project.  I have a blog here, which contains
 some initial performance numbers for a couple test ufuncs I did:

 http://numcorepy.blogspot.com

 It's really too early yet to give definitive results though; GSoC
 officially starts in two days :)  What I'm finding is that the existing
 ufuncs are already pretty fast; it appears right now that the main
 limitation is memory bandwidth.  If that's really the case, the
 performance gains I'll get will be through cache tricks (non-temporal
 loads/stores), reducing memory accesses and using multiple cores to get
 more bandwidth.

 Another alternative we've talked about, and I (more and more likely) may
 look into is composing multiple operations together into a single ufunc.
   Again the main idea being that memory accesses can be reduced/eliminated.

IMHO, composing multiple operations together is the most promising avenue for 
leveraging current multicore systems.

Another interesting approach is to implement costly operations (costly from 
the point of view of CPU resources: transcendental functions like sin, cos or 
tan, but also others like sqrt or pow) in a parallel way.  If, in addition, 
you can combine this with vectorized versions of them (by using the 
widespread SSE2 instruction set; see [1] for an example), then you would be 
able to achieve really good results for sure (at least Intel did with its VML 
library ;)

[1] http://gruntthepeon.free.fr/ssemath/

Cheers,

-- 
Francesc Alted


[Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread Andrea Gavana
Hi All,

this should be a very easy question but I am trying to make a
script run as fast as possible, so please bear with me if the solution
is easy and I just overlooked it.

I have a list of integers, like this one:

indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004]

From this list, I would like to find out which values are consecutive
and store them in another list of tuples (begin_consecutive,
end_consecutive) or a simple list: as an example, the previous list
will become:

new_list = [(1, 9), (255, 258), (10001, 10004)]

I can do it with for loops, but I am trying to speed up a Fortran-based
routine which I wrap with f2py (ideally I would like to do this step
in Fortran too, so a suggestion on how to do it in Fortran would be
more than welcome). Do you have any suggestions?

Thank you for your time.

Andrea.

Imagination Is The Only Weapon In The War Against Reality.
http://xoomer.alice.it/infinity77/
http://thedoomedcity.blogspot.com/


Re: [Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread josef . pktd
On Fri, May 22, 2009 at 12:31 PM, Andrea Gavana andrea.gav...@gmail.com wrote:
 Hi All,

    this should be a very easy question but I am trying to make a
 script run as fast as possible, so please bear with me if the solution
 is easy and I just overlooked it.

 I have a list of integers, like this one:

 indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004]

 From this list, I would like to find out which values are consecutive
 and store them in another list of tuples (begin_consecutive,
 end_consecutive) or a simple list: as an example, the previous list
 will become:

 new_list = [(1, 9), (255, 258), (10001, 10004)]

 I can do it with for loops, but I am trying to speed up a fotran-based
 routine which I wrap with f2py (ideally I would like to do this step
 in Fortran too, so if you have a suggestion on how to do it also in
 Fortran it would be more than welcome). Do you have any suggestions?

 Thank you for your time.


something along the lines of:

>>> indices = np.array([1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004])
>>> idx = (np.diff(indices) != 1).nonzero()[0]
>>> idx
array([ 8, 12])
>>> idxf = np.hstack((-1, idx, len(indices)-1))
>>> vmin = indices[idxf[:-1]+1]
>>> vmax = indices[idxf[1:]]
>>> zip(vmin, vmax)
[(1, 9), (255, 258), (10001, 10004)]


Josef


Re: [Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread Christopher Barker
Andrea Gavana wrote:
 I have a list of integers, like this one:
 
 indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004]
 
From this list, I would like to find out which values are consecutive
 and store them in another list of tuples (begin_consecutive,
 end_consecutive) or a simple list: as an example, the previous list
 will become:
 
 new_list = [(1, 9), (255, 258), (10001, 10004)]

Is this faster?

In [102]: indices = 
np.array([1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004,sys.maxint])

In [103]: breaks = np.diff(indices) != 1

In [104]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])
Out[104]: [(1, 9), (255, 258), (10001, 10004)]



-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/ORR(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov


Re: [Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread David Warde-Farley
On 22-May-09, at 1:03 PM, Christopher Barker wrote:

 In [104]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])



I don't think this is very general:

In [53]: indices
Out[53]:
array([   -3, 1, 2, 3, 4, 5, 6, 7, 8,
9,   255,   256,   257,   258, 10001, 10002, 10003, 10004])

In [54]: breaks = diff(indices) != 1

In [55]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])
Out[55]: [(-3, -3), (1, 9), (255, 258)]




Re: [Numpy-discussion] Join us for Scientific Computing with Python Webinar

2009-05-22 Thread Francesc Alted
On Wednesday 20 May 2009 16:45:12 Travis Oliphant wrote:
 Hello all Python users:

 I am pleased to announce the beginning of a free Webinar series that
 discusses using Python for scientific computing.   Enthought will host
 this free series  which will take place once a month for 30-45
 minutes.   The schedule and length may change based on participation
 feedback, but for now it is scheduled for the fourth Friday of every
 month. This free webinar should not be confused with the EPD
 webinar on the first Friday of each month which is open only to
 subscribers to the Enthought Python Distribution.

Mmh, I'm trying to connect, but it seems that Linux is not supported for this 
sort of webinar:

To join the Webinar, please use one of the following supported operating 
systems:
•   Windows® 2000, XP Pro, XP Home, 2003 Server, Vista
•   Mac OS® X, Panther® 10.3.9, Tiger™ 10.4.5 or higher

Too bad :-/

-- 
Francesc Alted


Re: [Numpy-discussion] Join us for Scientific Computing with Python Webinar

2009-05-22 Thread Eric Firing
Francesc Alted wrote:
 A Wednesday 20 May 2009 16:45:12 Travis Oliphant escrigué:
 Hello all Python users:

 I am pleased to announce the beginning of a free Webinar series that
 discusses using Python for scientific computing.   Enthought will host
 this free series  which will take place once a month for 30-45
 minutes.   The schedule and length may change based on participation
 feedback, but for now it is scheduled for the fourth Friday of every
 month. This free webinar should not be confused with the EPD
 webinar on the first Friday of each month which is open only to
 subscribers to the Enthought Python Distribution.
 
 Mmh, I'm trying to connect, but it seems that Linux is not supported for this 
 sort of webinars:
 
 To join the Webinar, please use one of the following supported operating 
 systems:
 • Windows® 2000, XP Pro, XP Home, 2003 Server, Vista
 • Mac OS® X, Panther® 10.3.9, Tiger™ 10.4.5 or higher
 
 Too bad :-/
 

See comments 3, 4, and 5 in the blog:

http://blog.enthought.com/?p=116

Eric


Re: [Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread Pierre GM

On May 22, 2009, at 12:31 PM, Andrea Gavana wrote:

 Hi All,

this should be a very easy question but I am trying to make a
 script run as fast as possible, so please bear with me if the solution
 is easy and I just overlooked it.

 I have a list of integers, like this one:

 indices = [1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004]

 From this list, I would like to find out which values are consecutive
 and store them in another list of tuples (begin_consecutive,
 end_consecutive) or a simple list: as an example, the previous list
 will become:

 new_list = [(1, 9), (255, 258), (10001, 10004)]


Josef's and Chris's solutions are pretty neat in this case. I've been 
recently working on a more generic case where integers are grouped 
depending on some condition (equal, differing by 1 or 2...). A pure 
Python/numpy version, the `Cluster` class, is available in 
scikits.hydroclimpy.core.tools (hydroclimpy.sourceforge.net). 
Here's a Cython version of the same class.
Let me know if it works. And I'm not ultra happy with the name, so if 
you have any suggestions...


cdef class Brackets:
    """
    Groups consecutive data from an array according to a clustering condition.

    A cluster is defined as a group of consecutive values differing by at most
    the increment value.

    Missing values are **not** handled: the input sequence must therefore be
    free of missing values.

    Parameters
    ----------
    darray : ndarray
        Input data array to clusterize.
    increment : {float}, optional
        Increment between two consecutive values to group.
        By default, use a value of 1.
    operator : {function}, optional
        Comparison operator for the definition of clusters.
        By default, use :func:`numpy.less_equal`.

    Attributes
    ----------
    inishape
        Shape of the argument array (stored for resizing).
    inisize
        Size of the argument array.
    uniques : sequence
        List of unique cluster values, as they appear in chronological order.
    slices : sequence
        List of the slices corresponding to each cluster of data.
    starts : ndarray
        List of the indices at which the clusters start.
    ends : ndarray
        List of the indices at which the clusters end.
    clustered : list
        List of clustered data.

    Examples
    --------
    >>> A = [0, 0, 1, 2, 2, 2, 3, 4, 3, 4, 4, 4]
    >>> klust = cluster(A, 0)
    >>> [list(_) for _ in klust.clustered]
    [[0, 0], [1], [2, 2, 2], [3], [4], [3], [4, 4, 4]]
    >>> klust.uniques
    array([0, 1, 2, 3, 4, 3, 4])

    >>> x = [ 1.8, 1.3, 2.4, 1.2, 2.5, 3.9, 1. , 3.8, 4.2, 3.3,
    ...       1.2, 0.2, 0.9, 2.7, 2.4, 2.8, 2.7, 4.7, 4.2, 0.4]
    >>> Brackets(x, 1).starts
    array([ 0,  2,  3,  4,  5,  6,  7, 10, 11, 13, 17, 19])
    >>> Brackets(x, 1.5).starts
    array([ 0,  6,  7, 10, 13, 17, 19])
    >>> Brackets(x, 2.5).starts
    array([ 0,  6,  7, 19])
    >>> Brackets(x, 2.5, greater).starts
    array([ 0,  1,  2,  3,  4,  5,  8,  9, 10,
    ...    11, 12, 13, 14, 15, 16, 17, 18])
    >>> y = [ 0, -1, 0, 0, 0, 1, 1, -1, -1, -1, 1, 1, 0, 0, 0, 0, 1,
    ... 1, 0, 0]
    >>> Brackets(y, 1).starts
    array([ 0,  1,  2,  5,  7, 10, 12, 16, 18])

    """

    cdef readonly double increment
    cdef readonly np.ndarray data
    cdef readonly list _starts
    cdef readonly list _ends

    def __init__(Brackets self, object data, double increment=1,
                 object operator=np.less_equal):
        cdef int i, n, ifirst, ilast, test
        cdef double last
        cdef list starts, ends
        #
        self.increment = increment
        self.data = np.asanyarray(data)
        data = np.asarray(data)
        #
        n = len(data)
        starts = []
        ends = []
        #
        last = data[0]
        ifirst = 0
        ilast = 0
        for 1 <= i < n:
            test = operator(abs(data[i] - last), increment)
            ilast = i
            if not test:
                starts.append(ifirst)
                ends.append(ilast - 1)
                ifirst = i
            last = data[i]
        starts.append(ifirst)
        ends.append(n - 1)
        self._starts = starts
        self._ends = ends

    def __len__(self):
        return len(self.starts)

    property starts:
        #
        def __get__(Brackets self):
            return np.asarray(self._starts)

    property ends:
        #
        def __get__(Brackets self):
            return np.asarray(self._ends)

    property sizes:
        #
        def __get__(Brackets self):
            # ends are inclusive, hence the +1
            return np.asarray(self._ends) - np.asarray(self._starts) + 1

    property slices:
        #
        def __get__(Brackets self):
            cdef int i
            cdef list starts = self._starts, ends = self._ends
            cdef list slices = []
            for 0 <= i < len(starts):
                slices.append(slice(starts[i], ends[i] + 1))
            return slices

Re: [Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread josef . pktd
On Fri, May 22, 2009 at 3:59 PM, David Warde-Farley d...@cs.toronto.edu wrote:
 On 22-May-09, at 1:03 PM, Christopher Barker wrote:

 In [104]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])



 I don't think this is very general:

 In [53]: indices
 Out[53]:
 array([   -3,     1,     2,     3,     4,     5,     6,     7,     8,
            9,   255,   256,   257,   258, 10001, 10002, 10003, 10004])

 In [54]: breaks = diff(indices) != 1

 In [55]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])
 Out[55]: [(-3, -3), (1, 9), (255, 258)]



this still works:

>>> indices = np.array([-5,-4,-3,1,1,2,3,4,5,6,7,8,9,255,256,257,258,10001,10002,10003,10004])
>>> idx = (np.diff(indices) != 1).nonzero()[0]
>>> idxf = np.hstack((-1, idx, len(indices)-1))
>>> vmin = indices[idxf[:-1]+1]
>>> vmax = indices[idxf[1:]]
>>> zip(vmin, vmax)
[(-5, -3), (1, 1), (1, 9), (255, 258), (10001, 10004)]

Josef


Re: [Numpy-discussion] List/location of consecutive integers

2009-05-22 Thread Christopher Barker
David Warde-Farley wrote:
 I don't think this is very general:
 
 In [53]: indices
 Out[53]:
 array([   -3, 1, 2, 3, 4, 5, 6, 7, 8,
 9,   255,   256,   257,   258, 10001, 10002, 10003, 10004])
 
 In [54]: breaks = diff(indices) != 1
 
 In [55]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])
 Out[55]: [(-3, -3), (1, 9), (255, 258)]

that's why I put a sys.maxint at the end of the series...


In [13]: indices = np.array([-3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 256, 257,
    ...:                     258, 10001, 10002, 10003, 10004, sys.maxint])

In [15]: breaks = np.diff(indices) != 1

In [16]: zip(indices[np.r_[True, breaks[:-1]]], indices[breaks])
Out[16]: [(-3, -3), (1, 9), (255, 258), (10001, 10004)]


Though that's probably not very robust!
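
For the record, here is a sketch of a version that needs no sentinel and handles the negative-value case above (consecutive_ranges is simply a name chosen here):

```python
import numpy as np


def consecutive_ranges(indices):
    """Group a sorted integer sequence into (first, last) runs of consecutive values."""
    indices = np.asarray(indices)
    if indices.size == 0:
        return []
    # positions where the next value is NOT the current value + 1
    breaks = np.flatnonzero(np.diff(indices) != 1)
    starts = np.r_[0, breaks + 1]            # first index of each run
    ends = np.r_[breaks, indices.size - 1]   # last index of each run (inclusive)
    return list(zip(indices[starts].tolist(), indices[ends].tolist()))


print(consecutive_ranges([-3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 255, 256, 257, 258]))
# [(-3, -3), (1, 9), (255, 258)]
```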

-Chris


