On Fri, Apr 4, 2008 at 3:31 PM, Anne Archibald <[EMAIL PROTECTED]>
wrote:

> On 04/04/2008, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> > On Fri, 4 Apr 2008, Gael Varoquaux apparently wrote:
> >  > I really think numpy should be as thin as possible, so
> >  > that you can really say that it is only an array
> >  > manipulation package. This will also make it easier to
> >  > sell as a core package for developers who do not care
> >  > about "calculator" features.
> >
> >
> > I'm a user rather than a developer, but I wonder:
> >  is this true?
> >
> >  1. Even as a user, I agree that what I really want from
> >  NumPy is a core array manipulation package (including
> >  matrices).  BUT as long as this is the core of NumPy,
> >  will a developer care if other features are available?
> >
> >  2. Even if the answer to 1. is yes, could the
> >  build/installation process include an option not to
> >  build/install anything but the core array functionality?
> >
> >  3. It seems to me that pushing things out into SciPy remains
> >  a problem: a basic NumPy is easy to build on any platform,
> >  but SciPy still seems to generate many questions.
> >
> >  4. One reason this keeps coming up is that the NumPy/SciPy
> >  split is rather too crude.  If the split were instead
> >  something like NumPy/SciPyBasic/SciPyFull/SciPyFull+Kits
> >  where SciPyBasic contained only pure Python code (no
> >  extensions), perhaps the desired location would be more
> >  obvious and some of this recurrent discussion would go away.
>
> It seems to me that there are two separate issues people are talking
> about when they talk about packaging:
>
> * How should functions be arranged in the namespaces? numpy.foo(),
> scipy.foo(), numpy.lib.financial.foo(), scikits.foo(),
> numkitfull.foo()?
>
> * Which code should be distributed together? Should scipy require
> separate downloading and compilation from numpy?
>
> The two questions are not completely independent - it would be
> horribly confusing to have the set of functions available in a given
> namespace depend on which packages you had installed - but for the
> most part it's not a problem to have several toplevel namespaces in
> one package (python's library is the biggest example of this I know
> of).
>
> For the first question, there's definitely a question about how much
> should be done with namespaces and how much with documentation. The
> second is a different story.
>
> Personally, I would prefer if numpy and scipy were distributed
> together, preferably with matplotlib. Then anybody who used numpy
> would have available all the scipy tools and all the plotting tools; I
> think it would cut down on wheel reinvention and make application
> development easier. Teachers would not need to restrict themselves to
> using only functions built into numpy for fear that their students
> might not have scipy installed - how many students have learned to
> save their arrays in unportable binary formats because their teacher
> didn't want them to have to install scipy?
>
> I realize that this poses technical problems. For me installing scipy
> is just a matter of clicking on a checkbox and installing a 30 MB
> package, but I realize that some platforms make this much more
> difficult. If we can't just bundle the two, fine. But I think it is
> mad to consider subdividing further if we don't have to.


If these were tightly tied together, for instance in one big DLL, this
would be unpleasant for me. I still have people downloading stuff over 56k
modems, and adding an extra 30 MB to the already somewhat bloated numpy
distribution would make their lives more tedious than they already are.

> I think python's success is due in part to its "batteries included"
> library. The fact that you can just write a short python script with
> no extra dependencies that can download files from the Web, parse XML,
> manage subprocesses, and save persistent objects makes development
> much faster than if you had to forever decide between adding
> dependencies and reinventing the wheel. I think numpy and scipy should
> follow the same principle, of coming "batteries included".


One thing they try to do in Python proper is think a lot more before adding
stuff to the standard library. Generally packages need to exist separately
for some period of time to prove their general utility and to stabilize
before they get accepted.  Particularly in the core, but in the library as
well, they make an effort to choose a compact set of primitive operations
without a lot of duplication (the old "There should be one-- and preferably
only one --obvious way to do it."). The numpy community has, particularly of
late, been rather quick to add things that seem like they *might* be useful.

One of the advantages of having multiple namespaces would have been to
enforce a certain amount of discipline on what went into numpy, since it
would've been easier to look at and evaluate a few dozen functions that
might have comprised some subpackage rather than, let's say, five hundred or
so.

I suspect it's too late now; numpy has chosen the path of matlab and the
other array packages and is slowly accumulating nearly everything in one big
flat namespace. I don't like it, but it seems pointless to fight it at this
point.


> So in this specific case, yes, I think the financial functions should
> absolutely be included; whether they should be included in scipy or
> numpy is less important to me because I think everyone should install
> both packages.
>

Personally I think it's a bad idea to add stuff that, as far as I can tell,
no one has even asked for yet. Put them in the sandbox. Advertise them. If
people use them, figure out what needs to be changed. Then add them to SciPy
once they've stabilized, if they actually get used.




-- 
. __
. |-\
.
. [EMAIL PROTECTED]
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion