Re: [Numpy-discussion] www.numpy.org url issues

2010-10-13 Thread Robert Ferrell

I forwarded this msg to John, in case he isn't watching this list.

I recall that around that time (Y2K) John grabbed a few domains of  
public projects and donated them as soon as the project was ready for  
it.  (To keep the squatters at bay I guess.)


-robert

On Oct 13, 2010, at 8:03 AM, Benjamin Root wrote:




On Wed, Oct 13, 2010 at 12:11 AM, David Warde-Farley warde...@iro.umontreal.ca 
 wrote:
I'm not sure who registered/owns numpy.org, but it looks like a  
frame sitting on top of numpy.scipy.org.



whois says that it is a John Turner of Technical Computing  
Solutions in Tennessee.  Looks like he has owned it since 2000.


Ben Root

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion




Re: [Numpy-discussion] Question on timeseries, for financial application

2009-12-13 Thread Robert Ferrell

On Dec 13, 2009, at 1:31 AM, Pierre GM wrote:

 On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
 Have you considered creating a TimeSeries for each data series, and
 then putting them all together in a dict, keyed by symbol?

 That's an idea

 One disadvantage of one big monster numpy array for all the series is
 that not all series may have a full set of 1800 data points.  So the
 array isn't really nicely rectangular.

 Bah, there's adjust_endpoints to take care of that.

Maybe this will work for the OP.  In my work, if a series is missing  
data the desirable thing is to use the data I have.  I don't want to  
truncate existing series to fit the short ones, nor pad to fit the  
long ones.

Really depends on the analysis the OP is trying to do.



Re: [Numpy-discussion] Question on timeseries, for financial application

2009-12-13 Thread Robert Ferrell

On Dec 13, 2009, at 7:07 AM, josef.p...@gmail.com wrote:

 On Sun, Dec 13, 2009 at 3:31 AM, Pierre GM pgmdevl...@gmail.com  
 wrote:
 On Dec 13, 2009, at 12:11 AM, Robert Ferrell wrote:
 Have you considered creating a TimeSeries for each data series, and
 then putting them all together in a dict, keyed by symbol?

 That's an idea

 As far as I understand, that's what pandas.DataFrame does.
 pandas.DataMatrix used 2d array to store data


 One disadvantage of one big monster numpy array for all the series is
 that not all series may have a full set of 1800 data points.  So the
 array isn't really nicely rectangular.

 Bah, there's adjust_endpoints to take care of that.


 Not sure exactly what kind of analysis you want to do, but grabbing a
 series from a dict is quite fast.

 Thomas, as Robert F. pointed out, everything depends on the kind of
 analysis you want. If you want to normalize your series, having all of
 them in a big array is the best (plain array, not structured, so that
 you can apply .mean and .std directly without having to loop on the
 series). If you need to apply the same function over all the series,
 here again having a big ndarray is easiest. Give us an example of what
 you wanna do.

 Or a structured array with homogeneous type that allows fast creation
 of views for data analysis.

These kinds of financial series don't have that much data (speaking  
from the early 21st century point of view).  The OP says 1000 series,  
1800 observations per series.  Maybe 5 data items per observation, 4  
bytes each.  That's well under 50MB.  I've found it satisfactory to  
keep the data someplace that's handy to get at, and easy to use.  When  
I want to do analysis I pull it into whatever format is best for that  
analysis.  Depending on the needs, it may not be necessary to try to  
arrange the data so you can get a view for analysis - the time for a  
copy can be negligible if the analysis takes a while.
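
The size estimate and the "copy when needed" approach above can be sketched as follows. This is only an illustration: the symbol names, the random data, and the assumption that every series has the full 1800 observations are made up for the example, and plain NumPy arrays stand in for whatever container actually holds the data.

```python
import numpy as np

# Back-of-the-envelope size from the figures in the message above.
n_series, n_obs, n_items, bytes_per_item = 1000, 1800, 5, 4
total_bytes = n_series * n_obs * n_items * bytes_per_item
total_mb = total_bytes / 1e6  # 36.0, i.e. well under 50MB

# Hypothetical "handy" storage: one float32 array per symbol.
rng = np.random.default_rng(0)
store = {f"S{i:04d}": rng.normal(size=n_obs).astype("f4") for i in range(n_series)}

# When an analysis wants a rectangular block, copy into one.
symbols = sorted(store)
block = np.vstack([store[s] for s in symbols])  # shape (1000, 1800), a copy

# Normalize every series at once, without looping over the series.
normalized = (block - block.mean(axis=1, keepdims=True)) / block.std(axis=1, keepdims=True)
```

The copy made by `np.vstack` is a few megabytes here, so its cost is small next to most analyses that would follow it.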

-r


Re: [Numpy-discussion] Question on timeseries, for financial application

2009-12-12 Thread Robert Ferrell
Have you considered creating a TimeSeries for each data series, and  
then putting them all together in a dict, keyed by symbol?

One disadvantage of one big monster numpy array for all the series is  
that not all series may have a full set of 1800 data points.  So the  
array isn't really nicely rectangular.

Not sure exactly what kind of analysis you want to do, but grabbing a  
series from a dict is quite fast.
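
A minimal sketch of the dict-keyed-by-symbol idea, with the caveat that it uses plain NumPy arrays as stand-ins for TimeSeries objects and that the symbols, lengths, and data are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical symbols; note that "CCC" has fewer observations than the others.
lengths = {"AAA": 1800, "BBB": 1800, "CCC": 1200}
series_by_symbol = {
    sym: 100.0 + np.cumsum(rng.normal(size=n)) for sym, n in lengths.items()
}

# Grabbing one series by name is a single dict lookup.
ccc = series_by_symbol["CCC"]

# Per-series statistics work even though the lengths differ.
means = {sym: float(s.mean()) for sym, s in series_by_symbol.items()}

# Adding another series later is just another key.
series_by_symbol["DDD"] = 100.0 + np.cumsum(rng.normal(size=900))
```

Nothing here needs the series to be the same length, which is the point of keeping them separate rather than forcing them into one rectangular array.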

-r

On Dec 12, 2009, at 6:08 PM, THOMAS BROWNE wrote:

 Hello all,

 Quite new to numpy / timeseries module, please forgive the  
 elementary question.

 I wish to do a bunch of multivariate analysis on 1000  
 different financial markets series, each holding about 1800 data  
 points (5 years of daily data).

 What's the best way to put this into a TimeSeries object? Should I  
 use a structured data type (in which case I can reference each  
 series by name), or should I put it into one big numpy array object  
 (in which case I guess I'll have to keep track of the series name in  
 an internal structure)? What are the advantages and disadvantages of  
 each?

 Ideally I'd have liked the ease and simplicity of being able to  
 reference each series by name, while maintaining the fast speed and  
 clean structure of one big numpy array. Any way of getting both?

 Once I have a multivariate TimeSeries, how do I add another series  
 to it?

 Thanks for the help.

 Thomas.




Re: [Numpy-discussion] Objected-oriented SIMD API for Numpy

2009-10-22 Thread Robert Ferrell

On Oct 22, 2009, at 1:35 AM, Sturla Molden wrote:

 Robert Kern skrev:
 No, I think you're right. Using SIMD to refer to numpy-like
 operations is an abuse of the term not supported by any outside
 community that I am aware of. Everyone else uses SIMD to describe
 hardware instructions, not the application of a single syntactical
 element of a high level language to a non-trivial data structure
 containing lots of atomic data elements.

 Then you should pick up a book on parallel computing.

 It is common to differentiate between four classes of computers: SISD,
 MISD, SIMD, and MIMD machines.

 A SISD system is the classical von Neumann machine. A MISD system is a
 pipelined von Neumann machine, for example the x86 processor.

 A SIMD system is one that has one CPU dedicated to control, and a large
 collection of subordinate ALUs for computation. Each ALU has a small
 amount of private memory. The IBM Cell processor is the typical SIMD
 machine.

 A special class of SIMD machines are the so-called vector machines, of
 which the most famous is the Cray C90. The MMX and SSE instructions in
 Intel Pentium processors are an example of vector instructions. Some
 computer scientists regard vector machines as a subtype of MISD systems,
 orthogonal to pipelines, because there are no subordinate ALUs with
 private memory.

 MIMD systems have multiple independent CPUs. MIMD systems come in two
 categories: shared-memory processors (SMP) and distributed-memory
 machines (also called cluster computers). The dual- and quad-core x86
 processors are shared-memory MIMD machines.

 Many people associate the word SIMD with SSE due to Intel marketing. But
 to the extent that vector machines are MISD orthogonal to pipelined von
 Neumann machines, SSE cannot be called SIMD.

 NumPy is a software simulated vector machine, usually executed on MISD
 hardware. To the extent that vector machines (such as SSE and C90) are
 SIMD, we must call NumPy an object-oriented SIMD library.

This is not the terminology I am familiar with.  Calling NumPy an   
object-oriented SIMD library is very confusing for me.  I worked in  
the parallel computer world for a while (back in the dark ages) and  
this terminology would have been confusing to everyone I dealt with.   
I've also read many parallel computing books.  In my experience SIMD  
refers to hardware, not software.  There is no reason that NumPy can't  
be written to run great (get good speed-ups) on an 8-core shared  
memory system.  That would be a MIMD system, and there's nothing about  
it that doesn't fit with the NumPy abstraction.  And, although SIMD  
can be a subset of MIMD, there are things that can be done in NumPy  
that can be parallelized on MIMD machines but not on SIMD machines (e.g.  
the NumPy vector type is flexible enough it can store a list of tasks,  
and the operations on that vector can be parallelized easily on a  
shared memory MIMD machine - task parallelism - but not on a SIMD  
machine).
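
As a rough illustration of the task-parallel case described above, here is a sketch in which a NumPy object array holds independent tasks that are handed to a thread pool. The task bodies, the helper name `make_task`, and the choice of `concurrent.futures` are assumptions made for the example, not anything proposed in the thread.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def make_task(n):
    """Return a hypothetical task: sum the integers 0..n-1 as floats."""
    def task():
        return float(np.sum(np.arange(n, dtype=np.float64)))
    return task


# An object array can hold arbitrary Python objects, including callables.
tasks = np.empty(4, dtype=object)
for i, n in enumerate([10, 100, 1000, 10000]):
    tasks[i] = make_task(n)

# Each task is independent, so they can be dispatched to separate workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda t: t(), tasks))
```

Each worker runs a different piece of code on its own data, which is the MIMD pattern; there is no single instruction applied in lockstep across the elements. Whether threads give a real speed-up in CPython depends on how much of each task releases the GIL.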

If we say that  NumPy is a software simulated vector machine or an   
object-oriented SIMD library we are pigeonholing NumPy in a way which  
is too limiting and isn't accurate.  As a user it feels to me that  
NumPy is built around various algebra abstractions, many of which map  
well onto vector machine operations.  That means that many of the  
operations are amenable to efficient implementation on SIMD hardware.   
But, IMO, one of the nice features of NumPy is it is built around high- 
level operations, and I would hate to see the project go down a path  
which insists that everything in NumPy be efficient on all SIMD  
hardware.

Of course, I would also love to see implementations which take as much  
advantage of available HW as possible (e.g. exploit SIMD HW if  
available).

That's my $0.02, worth only a couple cents less than that.

-robert



Re: [Numpy-discussion] Scientific Computing with Python, September 18, 2009

2009-09-11 Thread Robert Ferrell


On Sep 11, 2009, at 5:07 PM, Neal Becker wrote:

 I'd love to participate in these webinars.  Problem is, AFAICT,
 gotomeeting only supports windows.

I'm not certain that is correct.  I've participated in some of these,  
and I'm running OS X (10.5).  I think those were gotomeeting, although  
I don't honestly recall.  Assuming nothing's changed, though, it worked  
great on OS X.





[Numpy-discussion] Slices of structured arrays

2009-06-01 Thread Robert Ferrell
Is there a way to get slices of a structured array and keep the field  
names?  For instance, I've got dtype=[('x','f4'),('y','f4'), 
('z','f4')] and I want to get just the x and y slices into a new array  
with dtype=[('x','f4'),('y','f4')].

I can just make a new dtype, and extract what I need, but I'm  
wondering if there's some simple way to do this that I haven't found.

Here's what I know works:

# Make a len 10 array with 3 fields, 'x', 'y', 'z'
In [647]: xyz = np.array(zip(*np.random.random_integers(low=10,  
size=(3,10))), dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

# Get just the 'x' and 'y' fields
In [648]: xy = np.array( zip(xyz['x'], xyz['y'] ), dtype=[('x','f4'),  
('y', 'f4')])

In [649]: xyz['x']
Out[649]: array([ 4.,  1.,  1.,  5.,  1.,  2.,  9.,  8.,  1.,  9.],  
dtype=float32)

In [650]: xy['x']
Out[650]: array([ 4.,  1.,  1.,  5.,  1.,  2.,  9.,  8.,  1.,  9.],  
dtype=float32)

That works, but just feels like there's probably an elegant solution I  
don't know.  I couldn't find anything in the docs, but I may not have  
been using the right search words.

thanks,
-robert


Re: [Numpy-discussion] Slices of structured arrays

2009-06-01 Thread Robert Ferrell

On Jun 1, 2009, at 4:41 PM, Robert Kern wrote:

 On Mon, Jun 1, 2009 at 17:32, Robert Ferrell  
 ferr...@diablotech.com wrote:
 Is there a way to get slices of a structured array and keep the field
 names?  For instance, I've got dtype=[('x','f4'),('y','f4'),
 ('z','f4')] and I want to get just the x and y slices into a new array
 with dtype=[('x','f4'),('y','f4')].

 I can just make a new dtype, and extract what I need, but I'm
 wondering if there's some simple way to do this that I haven't found.

 Here's what I know works:

 # Make a len 10 array with 3 fields, 'x', 'y', 'z'
 In [647]: xyz = np.array(zip(*np.random.random_integers(low=10,
 size=(3,10))), dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])

 # Get just the 'x' and 'y' fields
 In [648]: xy = np.array( zip(xyz['x'], xyz['y'] ), dtype=[('x','f4'),
 ('y', 'f4')])

 In [649]: xyz['x']
 Out[649]: array([ 4.,  1.,  1.,  5.,  1.,  2.,  9.,  8.,  1.,  9.],
 dtype=float32)

 In [650]: xy['x']
 Out[650]: array([ 4.,  1.,  1.,  5.,  1.,  2.,  9.,  8.,  1.,  9.],
 dtype=float32)

 That works, but just feels like there's probably an elegant solution I
 don't know.  I couldn't find anything in the docs, but I may not have
 been using the right search words.

 In numpy 1.4, there will be a function that does this,
 numpy.lib.recfunctions.drop_fields(). In the meantime, you can
 copy-and-paste it into your own code:

  http://svn.scipy.org/svn/numpy/trunk/numpy/lib/recfunctions.py

 Or use it from its original source,
 matplotlib.mlab.rec_drop_fields(), if you have matplotlib.

That's perfect.  I've got matplotlib, so I'll use that for now.
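
For reference, a sketch of the drop_fields approach mentioned above, written against a current NumPy rather than the 2009 versions discussed in the thread (an assumption). The array contents and the seeded generator are illustrative only. The multi-field indexing alternative at the end is not from the thread; it is included because recent NumPy versions support it directly.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Build a length-10 structured array with fields 'x', 'y', 'z'.
rng = np.random.default_rng(0)
xyz = np.zeros(10, dtype=[("x", "f4"), ("y", "f4"), ("z", "f4")])
for name in xyz.dtype.names:
    xyz[name] = rng.integers(1, 11, size=10)

# Drop 'z' and keep the remaining field names; usemask=False returns a
# plain ndarray instead of a masked array.
xy = rfn.drop_fields(xyz, "z", usemask=False)

# Alternative: index with a list of field names, then repack so the
# result no longer carries the layout of the dropped field.
xy_view = xyz[["x", "y"]]
xy_packed = rfn.repack_fields(xy_view)
```

Both routes keep the field names, so `xy['x']` and `xy_packed['x']` hold the same values as `xyz['x']`.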

thanks,
-robert




Re: [Numpy-discussion] add xirr to numpy financial functions?

2009-05-26 Thread Robert Ferrell

On May 25, 2009, at 10:59 PM, Joe Harrington wrote:

 Let's keep this thread focussed on the original issue:

 just add a floating array of times to irr or a new xirr
 continuous interest
 no more

 Anyone can use the timeseries package to produce a floating array of
 times from normal dates, if those are the dates they want.  If they
 want some specialized financial date, they may want a different
 conversion, however.  All we should provide in NumPy would be the
 simplest tool.  Specialized dates and date-time conversion belong
 elsewhere.

 If we're *not* skipping dates, there is no need for xirr, just use
 irr, which exists.

 scikits.financial seems like a great idea, and then knock yourselves
 out for date conversions and definitions of compounding.  Just think
 big and design it first.  But let's keep this thread on the simple
 question for NumPy.

My vote is against adding xirr to NumPy.  In my experience, if you  
want internal rate of return, then you also want time weighted return,  
for instance, and all of a sudden it becomes surprising that NumPy  
tantalizes with some of the needed capability but not all of it.

I read in an old thread that irr was included partly because OLPC was  
including NumPy and it was great that kids would have a tool to help  
them understand the present value of money.  In my opinion, cumprod()  
is an even better teaching tool for that.  I'm not advocating reducing  
functionality in NumPy, but I prefer the idea of keeping NumPy as an  
array core, and having higher-level capability available as add-ons  
(scipy, scikit, etc...)
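
To make the cumprod() point concrete, here is a small sketch of compounding with a constant rate. The 5% rate, the 10-year horizon, and the 100-unit amount are arbitrary values chosen for the example.

```python
import numpy as np

rate = 0.05   # hypothetical annual interest rate
years = 10

# Cumulative growth factors: (1+r)^1, (1+r)^2, ..., (1+r)^years
growth = np.cumprod(np.full(years, 1.0 + rate))

# What 100 invested today is worth at the end of each year.
future_value = 100.0 * growth

# What 100 received at the end of each year is worth today.
present_value = 100.0 / growth
```

Printing `future_value` and `present_value` side by side shows year by year how the same nominal amount is worth more the sooner it is received.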

-r



Re: [Numpy-discussion] add xirr to numpy financial functions?

2009-05-25 Thread Robert Ferrell
I haven't read all the messages in detail, and I'm a consumer not a  
producer, but I'll comment anyways.

I'd love to see additional financial functionality, but I'd like to  
see them in a scikit, not in numpy.  I think to be useful they are too  
complicated to go into numpy.  A couple of my many reasons:

1. Doing a precise, bang-up job with dates is paramount to any  
interesting implementation of many financial functions.  I've found  
timeseries to be a great package - there are some things I'd like to  
see, but overall it is at the foundation of all of my financial  
analysis.  Any moderately interesting extension of the current  
capabilities would rapidly end up trying to duplicate much of the  
timeseries functionality, IMO.  Rather than partially re-implement the  
wheel in numpy, as a consumer I'd like to see financial stuff built on  
a common basis, and timeseries would be a great start.

2. I've read enough of this discussion to hear a requirement for both  
good date handling and capable solvers - just for xirr.  To do a  
really interesting job on an interesting amount of capability requires  
even more dependencies, I think.
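
To show why even a bare-bones xirr needs a root finder, here is a minimal sketch under stated assumptions: times are already given as a floating array in years (so no date handling at all), interest is continuously compounded, and a plain bisection stands in for a capable solver. The function name `xirr_continuous` and its parameters are made up for this example and are not a NumPy API.

```python
import numpy as np


def xirr_continuous(cashflows, times, lo=-0.99, hi=10.0, tol=1e-12, max_iter=200):
    """Rate r solving sum(cashflows * exp(-r * times)) == 0, by bisection."""
    cashflows = np.asarray(cashflows, dtype=float)
    times = np.asarray(times, dtype=float)

    def npv(rate):
        # Net present value under continuous compounding.
        return float(np.sum(cashflows * np.exp(-rate * times)))

    f_lo, f_hi = npv(lo), npv(hi)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = npv(mid)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi, f_hi = mid, f_mid   # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid   # root lies in [mid, hi]
    return 0.5 * (lo + hi)


# Pay 100 now, receive 110 one year later.
rate = xirr_continuous([-100.0, 110.0], [0.0, 1.0])
```

For this two-flow case the answer is ln(1.1). Cash flows with several sign changes can have more than one root, which bisection on a fixed bracket does not handle, and that is part of the robustness concern raised above.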

Although it might be tempting to include a few more lightweight  
financial functions in numpy, I doubt they will be that useful.  Most  
of the lightweight ones are easy enough to whip up when you need  
them.  Also, an approximation that's good today isn't the right one  
tomorrow - only the really robust stuff seems to survive the test of  
time, in my limited experience.  A start on a really solid scikits  
financial package would be awesome, though.

A few months ago, when the open source software for pricing CDS's was  
released (http://www.cdsmodel.com/information/cds-model) I took a look  
and noticed that it had a ton of code for dealing with dates.  (I also  
didn't see any tests in the code.  I wonder what that means.  Scary  
for anybody that might want to modify it.)  I thought if I had an  
extra 100 hours in every day it would be fun to re-write that code in  
numpy/scipy and release it.

-r




Re: [Numpy-discussion] add xirr to numpy financial functions?

2009-05-25 Thread Robert Ferrell

On May 25, 2009, at 9:15 PM, Matt Knox wrote:

 josef.pktd at gmail.com writes:

 So, while python won't get any industrial strength finance package,
 a more modest designer package would be feasible, if there were any
 interest in it (which I haven't seen).

 ...

 The even more modest question is whether we would want to match open
 office in its finance part.

 These are pretty different use cases from those use cases where you
 have quantlib all set up and running.


 As you have hinted, the scope of what will/should be covered with numpy
 financial functions needs to be defined better before putting more such
 functions into numpy. If that scope turns out to be something comparable
 to what excel or openoffice offers, that's fine, but I think a maturation
 period outside the numpy core (in the form of a scikit or otherwise) would
 still be a good idea to avoid getting stuck with a poorly thought out API.

+1 for a maturation period outside the numpy core.



 As for my personal feelings on how much financial functionality
 numpy/scipy should offer... I would agree that QuantLib-like functionality
 is far beyond what numpy can/should try to achieve. More basic
 functionality like OpenOffice or Excel probably seems about right.
 Although maybe it is more appropriate for scipy than numpy.

+1 for something outside numpy.  Even OpenOffice or Excel financial  
capability might, perhaps, go into scipy, but why not have it optional?

-r


[Numpy-discussion] Masked array usage

2008-11-27 Thread Robert Ferrell
I have a question about assigning to masked arrays.  a is a len ==3  
masked array, with 2 unmasked elements.  b is a len == 2 array.  I  
want to put the elements of b into the unmasked elements of a.  How do  
I do that?

In [598]: a
Out[598]:
masked_array(data = [1 -- 3],
   mask = [False  True False],
   fill_value=99)


In [599]: b
Out[599]: array([7, 8])

I'd like an operation that gives me:

masked_array(data = [7 -- 8],
   mask = [False  True False],
   fill_value=99)

Seems like it shouldn't be that hard, but I can't figure it out.  Any  
suggestions?

thanks,
-robert


Re: [Numpy-discussion] Masked array usage

2008-11-27 Thread Robert Ferrell
Sweet.  So simple.  That works great.

thanks,
-robert

On Nov 27, 2008, at 8:41 AM, Angus McMorland wrote:

 2008/11/27 Robert Ferrell [EMAIL PROTECTED]:
 I have a question about assigning to masked arrays.  a is a len ==3
 masked array, with 2 unmasked elements.  b is a len == 2 array.  I
 want to put the elements of b into the unmasked elements of a.  How do
 I do that?

 In [598]: a
 Out[598]:
 masked_array(data = [1 -- 3],
  mask = [False  True False],
  fill_value=99)


 In [599]: b
 Out[599]: array([7, 8])

 I'd like an operation that gives me:

 masked_array(data = [7 -- 8],
  mask = [False  True False],
  fill_value=99)

 Seems like it shouldn't be that hard, but I can't figure it out.  Any
 suggestions?

 How about:

 c = a.copy()
 c[~a.mask] = b
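
A self-contained version of the suggestion above, as a sketch. The arrays are rebuilt from the values shown in the question; using `np.ma.getmaskarray` instead of `a.mask` is an added precaution (an assumption, not part of the original reply) so that the code also works when the array has no mask set.

```python
import numpy as np

# The arrays from the question.
a = np.ma.array([1, 2, 3], mask=[False, True, False], fill_value=99)
b = np.array([7, 8])

# Copy a, then assign b into the positions that are not masked.
c = a.copy()
unmasked = ~np.ma.getmaskarray(a)  # boolean array, True where a is valid
c[unmasked] = b
```

The boolean index selects exactly as many positions as there are unmasked elements, so `b` must have that length; the masked slot and the fill value are left as they were.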

 Angus.
 -- 
 AJC McMorland
 Post-doctoral research fellow
 Neurobiology, University of Pittsburgh