Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-05 Thread Ivan Vilata i Balaguer

Tim Hochberg wrote::

 Ivan Vilata i Balaguer wrote:
 It seemed that discontiguous arrays worked OK in Numexpr since r1977 or
 so, but I have come across some alignment or striding problems which can
 be seen with the following code::
 I looked at this just a little bit and clearly this bit from interp_body 
 cannot work in the presence of record arrays:
 
 //
 intp sf1 = sb1 / sizeof(double);\
 //...
 #define f1 ((double *)x1)[j*sf1]
 
 There are clearly some assumptions that sb1 is evenly divisible by 
 sizeof(double). [...]

I noticed something strange in those statements when implementing
support for strings, and I must confess that I didn't grasp their
meaning, so I implemented it a little differently for strings::

#define s1 ((char   *)x1 + j*params.memsteps[arg1])

That seemed to work, but it might not be right (though I tested a bit),
and certainly it may not be efficient enough.  Here you have my previous
patches if you want to have a look at how I (try to) do it:

1.http://www.mail-archive.com/numpy-discussion%40lists.sourceforge.net/msg01551.html
2.http://www.mail-archive.com/numpy-discussion%40lists.sourceforge.net/msg02261.html
3.http://www.mail-archive.com/numpy-discussion%40lists.sourceforge.net/msg02644.html
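
To illustrate the byte-offset addressing used in the ``s1`` macro above, here is a small Python sketch. It is not the Numexpr code: the record layout (an int32 followed by a 3-byte string), the ``string_at`` helper and all variable names are assumptions made up for the example. It only shows that stepping through memory in bytes (``j * memstep``) works for any element separation:

```python
import struct

# Hypothetical record: int32 followed by a 3-byte string, packed to 7 bytes.
RECORD = struct.Struct("<i3s")
rows = [(1, b"abc"), (2, b"def"), (3, b"ghi")]
buf = b"".join(RECORD.pack(*row) for row in rows)

base = 4               # byte offset of the string field inside a record
memstep = RECORD.size  # distance in bytes between consecutive elements
itemsize = 3           # bytes per string element


def string_at(j):
    """Python analogue of (char *)x1 + j*memstep: address element j in bytes."""
    start = base + j * memstep
    return buf[start:start + itemsize]


strings = [string_at(j) for j in range(len(rows))]
print(strings)
```

Because the index is never divided by the item size, a 7-byte step is handled just as well as an aligned one.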

::

Ivan Vilata i Balaguer   qo   http://www.carabos.com/
   Cárabos Coop. V.  V  V   Enjoy Data
  



Numpy-discussion mailing list
Numpy-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/numpy-discussion

Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-05 Thread Tim Hochberg

Tim Hochberg wrote:
 David M. Cooke wrote:
 On Wed, 04 Oct 2006 10:19:08 -0700
 Tim Hochberg [EMAIL PROTECTED] wrote:

  
 Ivan Vilata i Balaguer wrote:

 It seemed that discontiguous arrays worked OK in Numexpr since 
 r1977 or
 so, but I have come across some alignment or striding problems 
 which can
 be seen with the following code::
   
 I looked at this just a little bit and clearly this bit from 
 interp_body cannot work in the presence of record arrays:

 //
 intp sf1 = sb1 / sizeof(double);\
 //...
 #define f1 ((double *)x1)[j*sf1]


 There are clearly some assumptions that sb1 is evenly divisible by 
 sizeof(double). Blech!. This is likely my fault, and I expect it 
 won't be too horrible to fix, but I don't know that I'll have time 
 immediately.
 

 My thinking is that this should be handled by a copy, so that the 
 opcodes
 always work on contiguous data. The copy can be another opcode. One 
 advantage
 of operating on contiguous data is that it's easier to use the 
 processor's
 vector instructions, if applicable.
   

 That would be easy to do. Right now the opcodes should work correctly 
 on data that is spaced in multiples of the itemsize on the last axis. 
 Other arrays are copied (no opcode required, it's embedded at the top 
 of interp_body lines 64-80). The record array case apparently slips 
 through the cracks when we're checking whether an array is suitable to 
 be used correctly (interpreter.c 1086-1103). It would certainly not be 
 any harder to only allow contiguous arrays than to correctly deal with 
 record arrays. The only question I have is whether the extra copy will 
 overwhelm the savings that operating on contiguous data gives.  The 
 thing to do is probably try it and see what happens.

OK, I've checked in a fix for this that makes a copy when the array is 
not strided in an even multiple of the itemsize. I first tried copying 
for all discontiguous arrays, but this resulted in a large speed hit for 
vanilla strided arrays (a=arange(10)[::2], etc.), so I was more frugal 
with my copying. I'm not entirely certain that I caught all of the 
problematic cases, so let me know if you run into any more issues like this.
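
The copy policy described above might be sketched as follows. This is a stdlib-only Python illustration under stated assumptions, not the checked-in C fix: ``prepare_int32`` and ``read_int32`` are hypothetical helpers, and raw bytes stand in for array memory. Data whose byte stride is a multiple of the item size is used in place; anything else is gathered into a packed buffer first:

```python
import struct

ITEMSIZE = 4  # bytes in an int32


def prepare_int32(buf, stride, count):
    """Return (data, element_stride) that an int32-indexed loop can use."""
    if stride % ITEMSIZE == 0:
        # Vanilla strided data: no copy, just express the stride in elements.
        return memoryview(buf), stride // ITEMSIZE
    # Awkward stride (e.g. a record field): gather into a contiguous buffer.
    packed = b"".join(buf[j * stride:j * stride + ITEMSIZE] for j in range(count))
    return packed, 1


def read_int32(data, elem_stride, count):
    """Index the data the way the opcodes do: in whole elements."""
    return [struct.unpack_from("<i", data, j * elem_stride * ITEMSIZE)[0]
            for j in range(count)]


# Analogue of arange(10)[::2]: byte stride 8, a multiple of 4, so no copy.
plain = struct.pack("<10i", *range(10))
data, step = prepare_int32(plain, 8, 5)
strided = read_int32(data, step, 5)

# int32 field of a packed 6-byte record: stride 6, so a copy is made.
records = b"".join(struct.pack("<iH", i, 0) for i in range(5))
data, step = prepare_int32(records, 6, 5)
from_records = read_int32(data, step, 5)

print(strided, from_records)
```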

-tim



Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-05 Thread Travis Oliphant

Tim Hochberg wrote:

  
  

That would be easy to do. Right now the opcodes should work correctly 
on data that is spaced in multiples of the itemsize on the last axis. 
Other arrays are copied (no opcode required, it's embedded at the top 
of interp_body lines 64-80). The record array case apparently slips 
through the cracks when we're checking whether an array is suitable to 
be used correctly (interpreter.c 1086-1103). It would certainly not be 
any harder to only allow contiguous arrays than to correctly deal with 
record arrays. The only question I have is whether the extra copy will 
overwhelm the savings that operating on contiguous data gives.  The 
thing to do is probably try it and see what happens.



OK, I've checked in a fix for this that makes a copy when the array is 
not strided in an even multiple of the itemsize. I first tried copying 
for all discontiguous arrays, but this resulted in a large speed hit for 
vanilla strided arrays (a=arange(10)[::2], etc.), so I was more frugal 
with my copying. I'm not entirely certain that I caught all of the 
problematic cases, so let me know if you run into any more issues like this.

  

There is an ElementStrides check and similar requirement flag you can 
use to make sure that you have an array whose strides are multiples of 
its itemsize.
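
What such a check tests can be written in a few lines of Python. This is only an analogue for illustration, not the NumPy implementation; ``element_strides_ok`` is a hypothetical name and the stride tuples are made-up examples:

```python
def element_strides_ok(strides, itemsize):
    """True when every byte stride is a whole number of elements."""
    return all(stride % itemsize == 0 for stride in strides)


# C-contiguous 2-D float64 array with 10 columns: strides (80, 8).
contiguous = element_strides_ok((80, 8), 8)

# Every other row and column of the same array: strides (160, 16).
sliced = element_strides_ok((160, 16), 8)

# int32 field inside a packed 6-byte record: stride 6, itemsize 4.
record_field = element_strides_ok((6,), 4)

print(contiguous, sliced, record_field)
```

Only the last case would need a copy before element-wise indexing is safe.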

-Travis



Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-04 Thread Sebastian Haase

Quick question hopefully somewhat related to this:
Does numexpr fully support float32 arrays ?
-Sebastian


On Wednesday 04 October 2006 09:32, Tim Hochberg wrote:
 Ivan Vilata i Balaguer wrote:
  It seemed that discontiguous arrays worked OK in Numexpr since r1977 or
  so, but I have come across some alignment or striding problems which can
  be seen with the following code::
 
  import numpy
  import numexpr
 
  array_length = 10
  array_descr = [('c1', numpy.int32), ('c2', numpy.uint16)]
 
  array = numpy.empty((array_length,), dtype=array_descr)
  for i in xrange(array_length):
      array['c1'][i] = i
      array['c2'][i] = 0x
 
  print numexpr.evaluate('c1', {'c1': array['c1']})
  print numexpr.evaluate('c1', {'c1': array['c1'].copy()})
 
  On my computer, a Pentium IV with NumPy 1.0rc1 and Numexpr r2239
  (unmodified) this gives the following result::
 
  [  0  109226 -1431699456   2  240298 -1431699456
 4  371370   8  633514]
  [0 1 2 3 4 5 6 7 8 9]
 
  The test works right when ``evaluate()`` is used with 'c2' instead of
  'c1', and also when 'c2' also measures 32 bits and fields are aligned.
  Maybe the ``memsteps`` value is not getting used somewhere.  Any ideas
  on this?

 I suspect that there are some assumptions that the element separation
 is an integral multiple of the element size. I certainly didn't have
 record arrays in mind when I was working on the striding stuff, so it
 wouldn't surprise me. This should be fixed: preferably to do the right
 thing and at a minimum to cleanly raise an exception rather than
 spitting out garbage. I don't know that I'll have time to mess with it
 soon though.
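
The suspected assumption can be reproduced without NumPy or Numexpr. The sketch below uses only the standard library. It assumes that the truncated fill value in the report was 0xaaaa (a guess, chosen because it reproduces the leading garbage values) and that the machine is little-endian, as the Pentium IV was. ``elem_stride`` plays the role of an element stride obtained by integer division:

```python
import struct

# Assumption: the truncated literal in the report was 0xaaaa.
C2_FILL = 0xAAAA
RECORD = struct.Struct("<iH")  # packed record: int32 'c1' + uint16 'c2'

n = 10
buf = b"".join(RECORD.pack(i, C2_FILL) for i in range(n))

stride_bytes = RECORD.size               # 6 bytes between consecutive 'c1' values
itemsize = struct.calcsize("<i")         # 4 bytes per int32
elem_stride = stride_bytes // itemsize   # 1: the 2-byte remainder is lost

# Buggy view: index an int32 pointer with the truncated element stride.
buggy = [struct.unpack_from("<i", buf, j * elem_stride * itemsize)[0]
         for j in range(4)]

# Correct view: step through the buffer in bytes.
correct = [struct.unpack_from("<i", buf, j * stride_bytes)[0]
           for j in range(4)]

print(buggy)
print(correct)
```

The first list mixes bytes from ``c1`` and ``c2`` of neighbouring records, which is where values such as 109226 come from.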

 -tim



Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-04 Thread Tim Hochberg

Sebastian Haase wrote:
 Quick question hopefully somewhat related to this:
 Does numexpr fully support float32 arrays ?
   
I don't recall. At one point there was a tentative plan to support 
float32 by casting them a block at a time to float64, operating on them 
and then casting them back. That's to limit the number of bytecodes that 
we need to support and keep the switch statement at a manageable size. 
However, it doesn't look like that ever got implemented, so the answer 
is probably no.
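
For illustration, the tentative plan (cast a block to float64, operate, cast back) could look like this in plain Python. It is a sketch of the idea only, since the plan was apparently never implemented; ``BLOCK``, ``evaluate_float32`` and the kernel passed in are hypothetical:

```python
from array import array

BLOCK = 4  # hypothetical block size; a real interpreter would use a larger one


def evaluate_float32(values, kernel):
    """Apply a float64-only kernel to float32 data one block at a time."""
    out = array("f")
    for start in range(0, len(values), BLOCK):
        # Cast one block up to float64.
        block64 = array("d", (float(x) for x in values[start:start + BLOCK]))
        # Run the kernel, which only needs to exist for float64.
        result64 = array("d", (kernel(x) for x in block64))
        # Cast the results back down to float32.
        out.extend(result64.tolist())
    return out


data = array("f", [0.5, 1.5, 2.5, 3.5, 4.5, 5.5])
doubled = evaluate_float32(data, lambda x: 2.0 * x)
print(list(doubled))
```

Only one set of kernels is needed this way, at the cost of two conversions per block.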

-tim





Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-04 Thread David M. Cooke

On Wed, 04 Oct 2006 10:19:08 -0700
Tim Hochberg [EMAIL PROTECTED] wrote:

 Ivan Vilata i Balaguer wrote:
  It seemed that discontiguous arrays worked OK in Numexpr since r1977 or
  so, but I have come across some alignment or striding problems which can
  be seen with the following code::
 I looked at this just a little bit and clearly this bit from interp_body 
 cannot work in the presence of record arrays:
 
 //
 intp sf1 = sb1 / sizeof(double);\
 //...
 #define f1 ((double *)x1)[j*sf1]
 
 
 There are clearly some assumptions that sb1 is evenly divisible by 
 sizeof(double). Blech!. This is likely my fault, and I expect it won't 
 be too horrible to fix, but I don't know that I'll have time immediately.

My thinking is that this should be handled by a copy, so that the opcodes
always work on contiguous data. The copy can be another opcode. One advantage
of operating on contiguous data is that it's easier to use the processor's
vector instructions, if applicable.

-- 
||\/|
/--\
|David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]


Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-04 Thread Tim Hochberg

David M. Cooke wrote:
 On Wed, 04 Oct 2006 10:19:08 -0700
 Tim Hochberg [EMAIL PROTECTED] wrote:

   
 Ivan Vilata i Balaguer wrote:
 
 It seemed that discontiguous arrays worked OK in Numexpr since r1977 or
 so, but I have come across some alignment or striding problems which can
 be seen with the following code::
   
 I looked at this just a little bit and clearly this bit from interp_body 
 cannot work in the presence of record arrays:

 //
 intp sf1 = sb1 / sizeof(double);\
 //...
 #define f1 ((double *)x1)[j*sf1]


 There are clearly some assumptions that sb1 is evenly divisible by 
 sizeof(double). Blech!. This is likely my fault, and I expect it won't 
 be too horrible to fix, but I don't know that I'll have time immediately.
 

 My thinking is that this should be handled by a copy, so that the opcodes
 always work on contiguous data. The copy can be another opcode. One advantage
 of operating on contiguous data is that it's easier to use the processor's
 vector instructions, if applicable.
   

That would be easy to do. Right now the opcodes should work correctly on 
data that is spaced in multiples of the itemsize on the last axis. Other 
arrays are copied (no opcode required, it's embedded at the top of 
interp_body lines 64-80). The record array case apparently slips through 
the cracks when we're checking whether an array is suitable to be used 
correctly (interpreter.c 1086-1103). It would certainly not be any 
harder to only allow contiguous arrays than to correctly deal with 
record arrays. The only question I have is whether the extra copy will 
overwhelm the savings that operating on contiguous data gives.  The 
thing to do is probably try it and see what happens.

-tim




Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-04 Thread David M. Cooke

On Wed, 4 Oct 2006 10:23:25 -0700
Sebastian Haase [EMAIL PROTECTED] wrote:

 On Wednesday 04 October 2006 10:13, Tim Hochberg wrote:
  Sebastian Haase wrote:
   Quick question hopefully somewhat related to this:
   Does numexpr fully support float32 arrays ?
 
  I don't recall. At one point there was a tentative plan to support
  float32 by casting them a block at a time to float64, operating on them
  and then casting them back. That's to limit the number of bytecodes that
  we need to support and keep the switch statement at a manageable size.
  However, it doesn't look like that ever got implemented, so the answer
  is probably no.
 
  -tim
 
 Does that mean it's considered impractical to ever add native float32 
 support?   Is the switch statement you mention written by hand or is it 
 automatically generated?
 -Sebastian
 

Currently by hand. I've got a rewrite lying around that generates the C code
for it using a description in Python, but I haven't finished it yet. It
should make it much easier to add different types, along with different
methods of calculating and switching (switch vs. gcc's label pointers, for
instance).
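
As a rough idea of what generating the switch from a Python description could look like, here is a minimal sketch. It is not the rewrite mentioned above: the opcode names, the table format and the emitted C variable names (``op``, ``out``, ``in1``, ``in2``, ``n``, ``j``) are all made up for the example:

```python
# Hypothetical opcode table: (opcode name, C type, C expression over a and b).
OPCODES = [
    ("add_ddd", "double", "a + b"),
    ("mul_ddd", "double", "a * b"),
    ("add_fff", "float", "a + b"),
]

CASE_TEMPLATE = """\
case OP_{name}: {{
    {ctype} *dest = ({ctype} *)out, *x = ({ctype} *)in1, *y = ({ctype} *)in2;
    for (j = 0; j < n; j++) {{ {ctype} a = x[j], b = y[j]; dest[j] = {expr}; }}
    break;
}}"""


def generate_switch(opcodes):
    """Emit the C text of a switch statement with one case per table entry."""
    cases = [CASE_TEMPLATE.format(name=name.upper(), ctype=ctype, expr=expr)
             for name, ctype, expr in opcodes]
    return "switch (op) {\n" + "\n".join(cases) + "\ndefault:\n    return -1;\n}"


source = generate_switch(OPCODES)
print(source)
```

Adding a new type then means adding rows to the table rather than hand-writing another case.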

Probably, if float32 is added, there will be two (internal) implementations:
one that uses float64 mainly, and coerces float32 to float64, and one that
does the reverse (this would be invisible to the user, of course). The same
would handle int32 and int64.

-- 
||\/|
/--\
|David M. Cooke  http://arbutus.physics.mcmaster.ca/dmc/
|[EMAIL PROTECTED]


Re: [Numpy-discussion] Problems with Numexpr and discontiguous arrays

2006-10-04 Thread Travis Oliphant

Tim Hochberg wrote:

David M. Cooke wrote:
  

On Wed, 04 Oct 2006 10:19:08 -0700
Tim Hochberg [EMAIL PROTECTED] wrote:

  


Ivan Vilata i Balaguer wrote:

  

It seemed that discontiguous arrays worked OK in Numexpr since r1977 or
so, but I have come across some alignment or striding problems which can
be seen with the following code::
  


I looked at this just a little bit and clearly this bit from interp_body 
cannot work in the presence of record arrays:

//
intp sf1 = sb1 / sizeof(double);\
//...
#define f1 ((double *)x1)[j*sf1]


There are clearly some assumptions that sb1 is evenly divisible by 
sizeof(double). Blech!. This is likely my fault, and I expect it won't 
be too horrible to fix, but I don't know that I'll have time immediately.

  

My thinking is that this should be handled by a copy, so that the opcodes
always work on contiguous data. The copy can be another opcode. One advantage
of operating on contiguous data is that it's easier to use the processor's
vector instructions, if applicable.
  



That would be easy to do. Right now the opcodes should work correctly on 
data that is spaced in multiples of the itemsize on the last axis. Other 
arrays are copied (no opcode required, it's embedded at the top of 
interp_body lines 64-80). The record array case apparently slips through 
the cracks when we're checking whether an array is suitable to be used 
correctly (interpreter.c 1086-1103). It would certainly not be any 
harder to only allow contiguous arrays than to correctly deal with 
record arrays. The only question I have is whether the extra copy will 
overwhelm the savings that operating on contiguous data gives.  


With record arrays you have to worry about alignment issues.   The most 
complicated part of the ufunc code is to handle that. 

The usual approach is to copy (and possibly byte-swap at least the axis 
you are working on) over to a buffer (the copyswapn functions will do 
that using a pretty fast approach for each data-type).  This is 
ultimately how the ufuncs work (though the buffer-size is fixed so the 
data is copied and operated on in chunks).
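
The chunked copy-and-swap approach can be imitated in plain Python. The sketch below is an illustration only and does not use the copyswapn functions: ``BUFSIZE``, ``process_in_chunks`` and the record layout are hypothetical, and it assumes that ``array('i')`` holds 4-byte integers, as it does on common platforms:

```python
import struct
import sys
from array import array

BUFSIZE = 4   # elements per chunk; hypothetical, real buffers are larger
ITEMSIZE = 4  # bytes in an int32


def process_in_chunks(raw, stride, count, kernel):
    """Gather big-endian int32 values chunk by chunk, swap, then operate."""
    results = []
    for start in range(0, count, BUFSIZE):
        n = min(BUFSIZE, count - start)
        # Copy the strided, possibly misaligned items into a packed buffer.
        gathered = b"".join(
            raw[(start + j) * stride:(start + j) * stride + ITEMSIZE]
            for j in range(n))
        chunk = array("i")
        chunk.frombytes(gathered)
        if sys.byteorder == "little":
            chunk.byteswap()  # source data is big-endian
        # The kernel only ever sees aligned, native-order values.
        results.extend(kernel(x) for x in chunk)
    return results


# Big-endian int32 field inside packed 6-byte records.
raw = b"".join(struct.pack(">iH", i, 0xFFFF) for i in range(10))
squares = process_in_chunks(raw, 6, 10, lambda x: x * x)
print(squares)
```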

-Travis

