Re: C is it always faster than nump?

2022-02-28 Thread Edmondo Giovannozzi
On Saturday, 26 February 2022 at 19:41:37 UTC+1, Dennis Lee Bieber wrote:
> On Fri, 25 Feb 2022 21:44:14 -0800, Dan Stromberg  
> declaimed the following:
> >Fortran, (still last I heard) did not support pointers, which gives Fortran 
> >compilers the chance to exploit a very nice class of optimizations you 
> >can't use nearly as well in languages with pointers. 
> >
> Haven't looked much at Fortran-90/95 then... 
> 
> Variable declaration gained a POINTER qualifier, and there is an 
> ALLOCATE intrinsic to obtain memory. 
> 
> And with difficulty one could get the result in DEC/VMS FORTRAN-77 
> since DEC implemented (across all their language compilers) intrinsics 
> controlling how arguments are passed -- overriding the language native 
> passing: 
> CALL XYZ(%val(M)) 
> would actually pass the value of M, not Fortran default address-of, with 
> the result that XYZ would use that value /as/ the address of the actual 
> argument. (Others were %ref() and %descr() -- descriptor being a small 
> structure with the address reference along with, say, upper/lower bounds; 
> often used for strings).
> -- 
> Wulfraed Dennis Lee Bieber AF6VN 
> wlf...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

The latest Fortran revision is Fortran 2018.
A variable can also have the VALUE attribute, even though nowhere in the 
standard is it written that this means passing the data by value. It just means 
that if the variable is changed in a procedure, the changes don't propagate back 
to the caller.
With iso_c_binding one can directly call a C function or make a Fortran 
procedure appear as a C function. There is also C_LOC, which gives the C address 
of a variable if needed. Of course, since Fortran 2003 it is fully object oriented.
The claim that it is faster than C is mostly related to the aliasing of 
procedure arguments, which Fortran forbids. C introduced the "restrict" 
qualifier for the same reason.
In Fortran you also have array operations like you have in numpy.
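
To make the last two points concrete on the numpy side, here is a minimal 
sketch (the variable names are only for illustration). It shows a whole-array 
expression with no explicit loop, and how a slice aliases the array it came 
from, which is the kind of overlap a compiler has to worry about when aliasing 
is allowed.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

# Whole-array arithmetic with no explicit loop, much like c = a + 2.0*b in Fortran.
c = a + 2.0 * b

# A slice is a view onto the same buffer, so it aliases the original array.
view = a[1:]
aliased = np.shares_memory(a, view)
independent = np.shares_memory(a, b)

print(c, aliased, independent)
```

np.shares_memory is only a way to inspect overlap after the fact; it says 
nothing about what optimizations any particular compiler applies.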
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: C is it always faster than nump?

2022-02-26 Thread Dennis Lee Bieber
On Fri, 25 Feb 2022 21:44:14 -0800, Dan Stromberg 
declaimed the following:

>Fortran, (still last I heard) did not support pointers, which gives Fortran
>compilers the chance to exploit a very nice class of optimizations you
>can't use nearly as well in languages with pointers.
>
Haven't looked much at Fortran-90/95 then... 

Variable declaration gained a POINTER qualifier, and there is an
ALLOCATE intrinsic to obtain memory.

And with difficulty one could get the result in DEC/VMS FORTRAN-77
since DEC implemented (across all their language compilers) intrinsics
controlling how arguments are passed -- overriding the language native
passing: 
CALL XYZ(%val(M))
would actually pass the value of M, not Fortran default address-of, with
the result that XYZ would use that value /as/ the address of the actual
argument. (Others were %ref() and %descr() -- descriptor being a small
structure with the address reference along with, say, upper/lower bounds;
often used for strings).



-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/


Re: C is it always faster than nump?

2022-02-26 Thread Neil
Dan Stromberg  wrote:
> On Fri, Feb 25, 2022 at 8:12 AM BELAHCENE Abdelkader <
> abdelkader.belahc...@enst.dz> wrote:
> 
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>>
> 
> This is all "last time I heard".
> 
> numpy is written, in significant part, in Fortran.
> 
> Fortran, especially for matrix math with variable dimensions, can be faster
> than C.
> 
> Fortran, (still last I heard) did not support pointers, which gives Fortran
> compilers the chance to exploit a very nice class of optimizations you
> can't use nearly as well in languages with pointers.
> 
> I used to code C to be built with the "noalias" optimization, to get much
> of the speed of Fortran in C.  But it required using an error prone subset
> of C without good error detection.

Pointers were introduced in Fortran 90.

Neil.


Re: C is it always faster than nump?

2022-02-25 Thread BELAHCENE Abdelkader
Thanks, everybody.
I want to close the subject, but just one naive question:
Does numpy use vectorization for arrays?
I mean, when I add 2 arrays (or use the sum function), how is it done?
In other words, is
b = np.arange(100)
t = np.sum(b)
equivalent or not to
s = 0
for i in range(100): s += b[i]
Thanks a lot.
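
The two snippets in the question can be run side by side. This is only a 
minimal sketch that checks the results agree; it does not show how np.sum is 
implemented internally.

```python
import numpy as np

b = np.arange(100)

# One call: the loop over the elements happens inside numpy's compiled code.
t = np.sum(b)

# Explicit loop: every iteration goes through the Python interpreter.
s = 0
for i in range(100):
    s += b[i]

print(t, s, t == s)
```

Both give the same value, so they are equivalent in result, but not in how the 
work gets done or in how long it takes on large arrays.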


On Sat, 26 Feb 2022 at 06:44, Dan Stromberg wrote:

>
> On Fri, Feb 25, 2022 at 8:12 AM BELAHCENE Abdelkader <
> abdelkader.belahc...@enst.dz> wrote:
>
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>>
>
> This is all "last time I heard".
>
> numpy is written, in significant part, in Fortran.
>
> Fortran, especially for matrix math with variable dimensions, can be
> faster than C.
>
> Fortran, (still last I heard) did not support pointers, which gives
> Fortran compilers the chance to exploit a very nice class of optimizations
> you can't use nearly as well in languages with pointers.
>
> I used to code C to be built with the "noalias" optimization, to get much
> of the speed of Fortran in C.  But it required using an error prone subset
> of C without good error detection.
>
>
>


Re: C is it always faster than nump?

2022-02-25 Thread Dan Stromberg
On Fri, Feb 25, 2022 at 8:12 AM BELAHCENE Abdelkader <
abdelkader.belahc...@enst.dz> wrote:

> Hi,
> a lot of people think that C (or C++) is faster than python, yes I agree,
> but I think that's not the case with numpy, I believe numpy is faster than
> C, at least in some cases.
>

This is all "last time I heard".

numpy is written, in significant part, in Fortran.

Fortran, especially for matrix math with variable dimensions, can be faster
than C.

Fortran, (still last I heard) did not support pointers, which gives Fortran
compilers the chance to exploit a very nice class of optimizations you
can't use nearly as well in languages with pointers.

I used to code C to be built with the "noalias" optimization, to get much
of the speed of Fortran in C.  But it required using an error prone subset
of C without good error detection.


Re: C is it always faster than nump?

2022-02-25 Thread Dan Stromberg
On Fri, Feb 25, 2022 at 9:03 PM Chris Angelico  wrote:

> On Sat, 26 Feb 2022 at 15:39, Avi Gross via Python-list
>  wrote:
> > Take interpreted languages including Python and R that specify all kinds
> of functions that may be written within the language at first. Someone may
> implement a function like sum() (just an example) that looks like the sum
> of a long list of items is the first item added to a slightly longer sum of
> the remaining items. It stops when the final recursive sum is about to be
> called with no remaining arguments. Clearly this implementation may be a
> tad slow. But does Python require this version of sum() or will it allow
> any version that can be called the same way and returns the same results
> every time?
> >
>
> That's also true of C and pretty much every language I know of. They
> define semantics, not implementation.
>

This comes back to something we've discussed before.

A language that is described primarily by a reference implementation rather
than a standard, runs the risk of being defined by that implementation.


Re: C is it always faster than nump?

2022-02-25 Thread Chris Angelico
On Sat, 26 Feb 2022 at 15:39, Avi Gross via Python-list
 wrote:
> Take interpreted languages including Python and R that specify all kinds of 
> functions that may be written within the language at first. Someone may 
> implement a function like sum() (just an example) that looks like the sum of 
> a long list of items is the first item added to a slightly longer sum of the 
> remaining items. It stops when the final recursive sum is about to be called 
> with no remaining arguments. Clearly this implementation may be a tad slow. 
> But does Python require this version of sum() or will it allow any version 
> that can be called the same way and returns the same results every time?
>

That's also true of C and pretty much every language I know of. They
define semantics, not implementation.

ChrisA


Re: C is it always faster than nump?

2022-02-25 Thread Avi Gross via Python-list
Agreed, Chris. There are many ways to get something done. I often use the 
Anaconda distribution because it tends to bundle many of the modules I need and 
more.

Not that it is such a big deal to load the ones you need, but if you share your 
program, others trying to use it may have some problems.


-----Original Message-----
From: Chris Angelico 
To: python-list@python.org 
Sent: Fri, Feb 25, 2022 11:16 pm
Subject: Re: C is it always faster than nump?


On Sat, 26 Feb 2022 at 14:35, Avi Gross via Python-list
 wrote:
> But with numpy and more available anyway, it may not be necessary to reinvent 
> much of that. I was just wondering if it ever made sense to simply include it 
> in the base python, perhaps as a second executable with a name like pythonn 
> to signify that it is more numeric. So if you run that, you know you do not 
> need to add an assortment of modules. I keep seeing programs that just 
> automatically add numpy and pandas and various graphic modules and other 
> scientific and machine learning modules. Of course not everyone needs or even 
> wants this. Many simply use base Python techniques even if they are slow for 
> larger amounts of data.
>

How would that be different from getting one of the numeric/scientific
distributions of Python? Why should it be a different Python
executable?!?

ChrisA



Re: C is it always faster than nump?

2022-02-25 Thread Avi Gross via Python-list
Yes, Chris, C is real as a somewhat abstract concept. There are a whole slew of 
different variations each time it is released anew with changes and then some 
people at various times built actual compilers that implement a varying subset 
of what is possible, and not necessarily in quite the same way.

As you gathered, I am saying that comparing languages is not so effective as 
comparing implementations and even better specific programs on specific data. 
And yet, you can still get odd results if you cherry pick what to test. 
Consider a sorting algorithm that rapidly checks if the data is already sorted, 
and if so, does not bother sorting it. It will quite possibly be the fastest 
one in a comparison if the data is chosen to be already in order! But on many 
other sets of data it will have wasted some time checking if it is in order 
while other algorithms have started sorting it!
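
A minimal sketch of that kind of sort, with hypothetical function names, just 
to show where the extra pass goes:

```python
def is_sorted(values):
    """Return True if values is already in non-decreasing order."""
    return all(a <= b for a, b in zip(values, values[1:]))


def checked_sort(values):
    """Skip the sort when the input is already ordered (hypothetical helper)."""
    if is_sorted(values):
        # Cheap win on ordered input, wasted pass on everything else.
        return list(values)
    return sorted(values)


already = [1, 2, 3, 4, 5]
shuffled = [4, 1, 5, 2, 3]
print(checked_sort(already), checked_sort(shuffled))
```

A benchmark fed only data like "already" would flatter it, and one fed only 
data like "shuffled" would penalize it for the extra check.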

Bad example, maybe, but there are better ones. Consider an algorithm that does 
no checking for one of many errors that can happen. It does not see if the 
arguments it gets are within expected ranges of types or values. It does not 
intercept attempts to divide by zero and much more. Another algorithm is quite 
bulletproof and thus has lots more code and maybe runs much slower. Is it 
shocking if it tests slower? But the other code may end up failing faster in 
the field and need a rewrite.

A really fair comparison is often really hard. Languages are abstract and 
sometimes a new implementation makes a huge change.

Take interpreted languages including Python and R that specify all kinds of 
functions that may be written within the language at first. Someone may 
implement a function like sum() (just an example) that looks like the sum of a 
long list of items is the first item added to a slightly longer sum of the 
remaining items. It stops when the final recursive sum is about to be called 
with no remaining arguments. Clearly this implementation may be a tad slow. But 
does Python require this version of sum() or will it allow any version that can 
be called the same way and returns the same results every time? Does it even 
matter if the function is written in C or C++ or FORTRAN or even assembler of 
some kind, as long as it is placed in an accessible library and there is some 
interface that allows you to make the call in python notation and it is fed to 
the function in the way it requires, and similarly deals with returned values? 
A wrapper, sort of.
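
As a rough illustration (recursive_sum is a made-up name, not anything from the 
standard library), the naive recursive version and the built-in can be swapped 
for each other because they are called the same way and return the same 
result:

```python
def recursive_sum(items):
    """Sum a list as: first item plus the sum of the remaining items."""
    if not items:
        return 0
    return items[0] + recursive_sum(items[1:])


data = list(range(100))

slow = recursive_sum(data)   # pure Python, one call and one list copy per element
fast = sum(data)             # the built-in, implemented in C in CPython

print(slow, fast, slow == fast)
```

The caller cannot tell the difference except by timing it, or by passing a 
list long enough to hit the recursion limit in the naive version.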

The use of such a shortcut is not against the spirit of the language. You can 
still specify you want the sum() function from some module, or write your own. 
This is true most places. I remember way back when how early UNIX shells did 
silly things like call /bin/echo to do trivial things, or call an external 
program to do something as trivial as i=i+1 and then they started building in 
such functionality and your shell scripts suddenly really sped up. A 
non-programmer I once worked for wrote some truly humongous shell scripts that 
brought machines it was run on remotely in places like Japan during their 
day-time to their knees. Collecting billing data from all over by running a 
pipeline with 9 processes per line/row was a bit much. 

At first I sped it up quite a bit by using newer built-in features like I 
described, or doing more with fewer elements in pipelines. But I saw how much 
was caused by using the wrong tools for the job and there were programs 
designed to analyze data in various ways.

I replaced almost all of it with an AWK script that sped things up by many 
orders of magnitude. And, yes, AWK was not as fast as C but more trivial to 
program in for this need as it had so many needed aspects built-in or 
happening automagically.

Would we do the entire project differently today? Definitely. All the billing 
records would not be sitting in an assortment of flat files all over the place 
but rather be fed into some database that made retrieval of all kinds of 
reports straightforward without needing to write much code at all.

How many modules or "packages" were once written largely using the language and 
then gradually "improved" by replacing parts, especially slower parts, with 
external content as we have been discussing? In a sense, some Python 
applications written for older versions of Python may now be running faster, as 
newer versions have improved some of the "same" code, while to the user they 
are still running on the same language, Python.

-----Original Message-----
From: Chris Angelico 
To: python-list@python.org 
Sent: Fri, Feb 25, 2022 2:58 pm
Subject: Re: C is it always faster than nump?


On Sat, 26 Feb 2022 at 06:44, Avi Gross via Python-list
 wrote:
>
> I agree with Richard.
>
> Some people may be confused and think c is the speed of light and 
> relativistically speaking, nothing can be faster. (OK, just joking. The uses 
> of the same letter of the alphabet are not at all related. 

Re: C is it always faster than nump?

2022-02-25 Thread Chris Angelico
On Sat, 26 Feb 2022 at 14:35, Avi Gross via Python-list
 wrote:
> But with numpy and more available anyway, it may not be necessary to reinvent 
> much of that. I was just wondering if it ever made sense to simply include it 
> in the base python, perhaps as a second executable with a name like pythonn 
> to signify that it is more numeric. So if you run that, you know you do not 
> need to add an assortment of modules. I keep seeing programs that just 
> automatically add numpy and pandas and various graphic modules and other 
> scientific and machine learning modules. Of course not everyone needs or even 
> wants this. Many simply use base Python techniques even if they are slow for 
> larger amounts of data.
>

How would that be different from getting one of the numeric/scientific
distributions of Python? Why should it be a different Python
executable?!?

ChrisA


Re: C is it always faster than nump?

2022-02-25 Thread Avi Gross via Python-list
Dennis,

What you describe may be a start but is it anything I might not have easily 
created myself? https://docs.python.org/3/library/array.html

I can see creating my own object and adding those methods and attributes while 
gaining very little, except perhaps storage. 

Can I add or multiply two such items efficiently if it contains a numeric 
value? Can I offer them as an argument to all kinds of functions which can now 
handle it well? How does it work if a second operand is a scalar or an array of 
another data type. Can two be compared and result in an array of boolean (not 
seen in the list of types). Numpy does quite a bit of that kind of thing but 
perhaps better is a language like R where all that and more are built in. 
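
To make those questions concrete, here is a small sketch comparing the stdlib 
array type with a NumPy array (the variable names are just for illustration). 
The stdlib array keeps ordinary sequence semantics, while NumPy works element 
by element:

```python
from array import array

import numpy as np

x = array("i", [1, 2, 3])
y = array("i", [4, 5, 6])

# Stdlib array behaves like a sequence: + concatenates and * repeats.
concatenated = x + y
repeated = x * 2

nx = np.array([1, 2, 3])
ny = np.array([4, 5, 6])

added = nx + ny                           # elementwise addition
scaled = nx * 2                           # scalar applied to every element
mask = nx > 2                             # comparison gives a boolean array
mixed = nx + np.array([0.5, 0.5, 0.5])    # int + float promotes to float

print(list(concatenated), list(repeated))
print(added, scaled, mask, mixed)
```

So the stdlib array mostly saves storage, and the arithmetic, comparison and 
mixed-type behaviour asked about here is what NumPy adds on top.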

But with numpy and more available anyway, it may not be necessary to reinvent 
much of that. I was just wondering if it ever made sense to simply include it 
in the base python, perhaps as a second executable with a name like pythonn to 
signify that it is more numeric. So if you run that, you know you do not need 
to add an assortment of modules. I keep seeing programs that just automatically 
add numpy and pandas and various graphic modules and other scientific and 
machine learning modules. Of course not everyone needs or even wants this. Many 
simply use base Python techniques even if they are slow for larger amounts of 
data.





Re: C is it always faster than nump?

2022-02-25 Thread Oscar Benjamin
On Sat, 26 Feb 2022 at 03:10, Dennis Lee Bieber  wrote:
>
> On Fri, 25 Feb 2022 23:06:57 +0000 (UTC), Avi Gross 
> declaimed the following:
>
> >I do have to wonder if anyone ever considered adding back enough 
> >functionality into base Python to make some additions less needed. Is there 
> >any reason the kind of structures used by so many languages cannot be made 
> >part of python such as a vector/array that holds exactly one kind of data 
> >structure and not force use of things like a list when that is more than is 
> >needed?
> >
> https://docs.python.org/3/library/array.html
>
> seems to fit the criteria...

The stdlib array module is basically unused in comparison to NumPy.
The capabilities of the array module do not meet the needs for most
users who want to do anything useful with arrays.

The intention in creating NumPy  (in the NumPy/SciPy split) was that
it might be possible that NumPy could be merged into core Python.
Unfortunately that didn't come to be.

--
Oscar


Re: C is it always faster than nump?

2022-02-25 Thread Dennis Lee Bieber
On Fri, 25 Feb 2022 23:06:57 +0000 (UTC), Avi Gross 
declaimed the following:

>I do have to wonder if anyone ever considered adding back enough functionality 
>into base Python to make some additions less needed. Is there any reason the 
>kind of structures used by so many languages cannot be made part of python 
>such as a vector/array that holds exactly one kind of data structure and not 
>force use of things like a list when that is more than is needed?
>
https://docs.python.org/3/library/array.html

seems to fit the criteria...


-- 
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/


Re: C is it always faster than nump?

2022-02-25 Thread Oscar Benjamin
On Fri, 25 Feb 2022 at 23:13, Barry  wrote:
>
> > On 25 Feb 2022, at 23:00, Richard Damon  wrote:
> >
> > On 2/25/22 2:47 PM, Chris Angelico wrote:
> >>> On Sat, 26 Feb 2022 at 05:49, Richard Damon  
> >>> wrote:
> >>> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> >>>> Hi,
> >>>> a lot of people think that C (or C++) is faster than python, yes I agree,
> >>>> but I think that's not the case with numpy, I believe numpy is faster than
> >>>> C, at least in some cases.
> >>>>
> >>> My understanding is that numpy is written in C, so for it to be faster
> >>> than C, you are saying that C is faster that C.
> >> Fortran actually, but ultimately, they both get compiled to machine code.
> >
> > Looking at the Github repo I see:
> >
> > Languages:
> > Python.  62.5%
> > C   35.3%
> > C++.   1.0%
> > Cython.   0.9%
> > Shell.   0.2%
> > Fortran.   0.1%
>
> I assume that this is just for numpy and not for all its dependencies.
> That will add a lot of Fortran and C++, I expect.

NumPy links with BLAS/LAPACK that will do the heavy lifting for common
linear algebra operations. Multiple different BLAS libraries can be
used with NumPy and those libraries might be written in Fortran and
might also involve some hand-crafted assembly for particular
architectures. Some explanation of NumPy's BLAS/LAPACK support is
here:
https://numpy.org/devdocs/user/building.html

By default if you install NumPy from conda (or at least if you install
the Anaconda distribution) then I think that NumPy will use the Intel
MKL library:
https://en.wikipedia.org/wiki/Math_Kernel_Library
As I understand it the core of MKL is Fortran and is compiled with
Intel's ifort compiler but some parts might be C/C++. MKL is also
the same library that is used by Matlab for its own heavy lifting.
It's not sufficiently free to be used in NumPy's PyPI wheels though.

If you install the precompiled NumPy wheels with pip from PyPI then I
think those are statically linked with OpenBLAS:
https://github.com/xianyi/OpenBLAS
Again I think the core of OpenBLAS is Fortran but there's some C in there.
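
If you want to see which BLAS/LAPACK a particular install is actually linked 
against, rather than guessing from how it was installed, NumPy can report its 
own build configuration. A minimal sketch (the output format differs between 
NumPy versions, so I make no claim about what it prints on your machine):

```python
import numpy as np

# Print the build/link information NumPy recorded when it was compiled,
# including whichever BLAS/LAPACK libraries were found.
np.show_config()

# A matrix product like this is the sort of call that is normally
# handed off to the linked BLAS library.
a = np.ones((3, 3))
product = a @ a
print(product)
```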

In the pre-wheel days the situation was that NumPy provided installer
files for Windows that would give binaries linked with ATLAS (also
Fortran):
https://en.wikipedia.org/wiki/Automatically_Tuned_Linear_Algebra_Software

I think at some point NumPy used to use the OSX Accelerate library but
the page I linked above says that's now deprecated. I don't know
anything about Accelerate but I wouldn't be surprised to hear that it
was a bunch of old Fortran code!

If you build NumPy from source without having any BLAS/LAPACK
libraries then I think it uses its own backup version of these that is
written in C but not as well optimised. This used to be the default
for a pip install on Linux in pre-wheel times.

Many operations in NumPy don't actually use BLAS/LAPACK and for those
parts the heavy lifting is all done in NumPy's own C code.

Lastly SciPy which is very often used together with NumPy does have a
load of Fortran code. As I understand it at some point NumPy and SciPy
were divided from the previous numerical Python libraries and there
was a deliberate split so that the Fortran code all ended up in SciPy
rather than NumPy.

--
Oscar


Re: C is it always faster than nump?

2022-02-25 Thread Barry


> On 25 Feb 2022, at 23:00, Richard Damon  wrote:
> 
> On 2/25/22 2:47 PM, Chris Angelico wrote:
>>> On Sat, 26 Feb 2022 at 05:49, Richard Damon  
>>> wrote:
>>> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>>>> Hi,
>>>> a lot of people think that C (or C++) is faster than python, yes I agree,
>>>> but I think that's not the case with numpy, I believe numpy is faster than
>>>> C, at least in some cases.
>>>>
>>> My understanding is that numpy is written in C, so for it to be faster
>>> than C, you are saying that C is faster that C.
>> Fortran actually, but ultimately, they both get compiled to machine code.
> 
> Looking at the Github repo I see:
> 
> Languages:
> Python.  62.5%
> C   35.3%
> C++.   1.0%
> Cython.   0.9%
> Shell.   0.2%
> Fortran.   0.1%

I assume that this is just for numpy and not for all its dependencies.
That will add a lot of Fortran and C++, I expect.

> 
> So there is a bit of Fortran in there, but it looks like most of the heavy 
> lifting is in C.
> 
> My guess is the Fortran is likely some hooks to add Fortran modules into the 
> program with numpy.
> 
> ...
>>> The key point is that numpy was written by skilled programmers who
>>> carefully optimized their code to be as fast as possible for the major
>>> cases. Thus it is quite possible for the numpy code to be faster in C
>>> than code written by a person without that level of care and effort.
>> This is clearly true.
>> 
>> ChrisA
> 
> 
> -- 
> Richard Damon
> 


Re: C is it always faster than nump?

2022-02-25 Thread Avi Gross via Python-list
Is that fair, Grant?

I go back far enough that in my earliest years I was submitting FORTRAN 
programs written on punched cards and often getting them back the next day. The 
compiling was not the major factor in how long it took.

For many cases, a compiled language only needs to be compiled once and can be 
run many times. Other than during development timeframes, the major concept of 
speed is how well it runs not how long it takes to compile, especially if the 
compiler is busy doing all kinds of optimizations to your code or checking for 
possible obscure errors or making sure the code does not try something funny 
like access memory not allowed and so on.

An interpreted language often has to do the same things every time, albeit some 
have come up with ways to partially compile modules and only redo if they have 
been changed. Some of what they do every time is in some sense wasted effort 
but as a tradeoff, they can do many things in a dynamic way that compiled 
programs may not do easily or at all. 

Another thing I find unfair is some comparisons involving what I consider 
"mixed" effort. If you have a program that calls various numpy routines using 
native python in between, then clearly a decent amount of the time spent is not 
in numpy. It may for example suddenly import a module and that takes time spent 
doing nothing about the problem other than loading what is needed. Another 
example, it is common to place the numbers or other values you get from numpy 
operations into dictionaries, lists and so on, or to make graphs. If you had 
done the same work in a purely C (or FORTRAN or whatever environment) and had 
access to similar other functionality, the latter two would all be in C or some 
compiled library. 

With exceptions aplenty, speed is not always a primary consideration. Often 
programmer time and productivity matter more. I have seen many projects though 
that were first implemented in a language like Python and when they had figured 
out a decent way to do the task reliably, they handed it over to people to redo 
much or all of it using languages like C++, compiles and all, as a reasonable 
translation from a working application may take much less time to implement and 
once done, may work better. At least it may until you want to change it! 
Prototyping often works better in some languages. Python has a compromise that 
modules can use to speed up important parts of the process by substituting a 
compiled function in C/C++/whatever for a local Python function but not 
necessarily switching entirely to C and suffering under the negatives of that 
environment.
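
A minimal sketch of that compromise, assuming a compiled extension module 
called _fastsum that provides fast_sum (both names are hypothetical and made 
up for this example). The pure-Python function stays as the fallback and the 
compiled one replaces it only if it can be imported:

```python
def py_sum(values):
    """Pure-Python fallback: always available, easy to read and change."""
    total = 0
    for v in values:
        total += v
    return total


try:
    # Hypothetical compiled extension; used only if it happens to be installed.
    from _fastsum import fast_sum as total_of
except ImportError:
    total_of = py_sum

print(total_of([1, 2, 3]))
```

Callers only ever use total_of, so the prototype keeps working unchanged 
whether or not the compiled version exists.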

I do wonder if we are already on a path where a language that handles concepts 
like parallelism (or even weirder, quantum computations) well may be a better 
candidate for doing some projects well as even though it may not run at top 
speed, it can make use of many processors or a network of machines in the 
cloud, to handle things in a more flexible way that may even get the job done 
faster.

Next discussion is whether pandas is faster than C, followed by SciPy ...

I do have to wonder if anyone ever considered adding back enough functionality 
into base Python to make some additions less needed. Is there any reason the 
kind of structures used by so many languages cannot be made part of python such 
as a vector/array that holds exactly one kind of data structure and not force 
use of things like a list when that is more than is needed?


-----Original Message-----
From: Grant Edwards 
To: python-list@python.org
Sent: Fri, Feb 25, 2022 4:12 pm
Subject: Re: C is it always faster than nump?


On 2022-02-25, Richard Damon  wrote:
> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>
> My understanding is that numpy is written in C,

Sort of. IIRC a lot of the heavy lifting is done by libraries like
BLAS and LAPACK that were written in FORTRAN the last time I checked.

Has that changed?

Or am I conflating numpy with some other scientific-python stuff. Back
when I did a lot of numerical stuff in Python, I remember spending a
lot of time watching FORTRAN compiles. Admittedly, that's getting to
be 10+ years ago...

--
Grant



Re: C is it always faster than nump?

2022-02-25 Thread Chris Angelico
On Sat, 26 Feb 2022 at 09:58, Richard Damon  wrote:
>
> On 2/25/22 2:47 PM, Chris Angelico wrote:
> > On Sat, 26 Feb 2022 at 05:49, Richard Damon  
> > wrote:
> >> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> >>> Hi,
> >>> a lot of people think that C (or C++) is faster than python, yes I agree,
> >>> but I think that's not the case with numpy, I believe numpy is faster than
> >>> C, at least in some cases.
> >>>
> >> My understanding is that numpy is written in C, so for it to be faster
> >> than C, you are saying that C is faster that C.
> > Fortran actually, but ultimately, they both get compiled to machine code.
>
> Looking at the Github repo I see:
>
> Languages:
> Python.  62.5%
> C   35.3%
> C++.   1.0%
> Cython.   0.9%
> Shell.   0.2%
> Fortran.   0.1%
>
> So there is a bit of Fortran in there, but it looks like most of the
> heavy lifting is in C.
>
> My guess is the Fortran is likely some hooks to add Fortran modules into
> the program with numpy.
>

GitHub's analysis isn't always very meaningful. I think it's pretty
clear that Numpy isn't implemented in Python :) In any case, point is
that the implementation language is mostly irrelevant.

ChrisA


Re: C is it always faster than nump?

2022-02-25 Thread Richard Damon

On 2/25/22 2:47 PM, Chris Angelico wrote:
> On Sat, 26 Feb 2022 at 05:49, Richard Damon  wrote:
>> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>>> Hi,
>>> a lot of people think that C (or C++) is faster than python, yes I agree,
>>> but I think that's not the case with numpy, I believe numpy is faster than
>>> C, at least in some cases.
>>>
>> My understanding is that numpy is written in C, so for it to be faster
>> than C, you are saying that C is faster than C.
> Fortran actually, but ultimately, they both get compiled to machine code.

Looking at the GitHub repo I see:

Languages:
Python   62.5%
C        35.3%
C++       1.0%
Cython    0.9%
Shell     0.2%
Fortran   0.1%

So there is a bit of Fortran in there, but it looks like most of the
heavy lifting is in C.

My guess is the Fortran is likely some hooks to add Fortran modules into
the program with numpy.

...

>> The key point is that numpy was written by skilled programmers who
>> carefully optimized their code to be as fast as possible for the major
>> cases. Thus it is quite possible for the numpy code to be faster in C
>> than code written by a person without that level of care and effort.
> This is clearly true.
>
> ChrisA


--
Richard Damon



Re: C is it always faster than nump?

2022-02-25 Thread Grant Edwards
On 2022-02-25, Richard Damon  wrote:
> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>
> My understanding is that numpy is written in C,

Sort of. IIRC a lot of the heavy lifting is done by libraries like
BLAS and LAPACK that were written in FORTRAN the last time I checked.

Has that changed?

Or am I conflating numpy with some other scientific-python stuff? Back
when I did a lot of numerical stuff in Python, I remember spending a
lot of time watching FORTRAN compiles. Admittedly, that's getting to
be 10+ years ago...
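
One way to check this on a given installation, as a minimal sketch assuming NumPy is installed: `numpy.show_config()` prints the build configuration, which typically names the BLAS and LAPACK libraries the build was linked against. The helper name `numpy_build_info` is made up for illustration, and the report does not by itself say which language those libraries were written in.

```python
import contextlib
import io


def numpy_build_info():
    """Return NumPy's build configuration as text, or a note if NumPy is missing."""
    try:
        import numpy
    except ImportError:
        return "numpy is not installed"
    buffer = io.StringIO()
    # show_config() prints its report, so capture stdout to get it as a string.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


print(numpy_build_info())
```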

--
Grant




Re: C is it always faster than nump?

2022-02-25 Thread Chris Angelico
On Sat, 26 Feb 2022 at 06:44, Avi Gross via Python-list
 wrote:
>
> I agree with Richard.
>
> Some people may be confused and think c is the speed of light and 
> relativistically speaking, nothing can be faster. (OK, just joking. The uses 
> of the same letter of the alphabet are not at all related. One is named for 
> the language that came after the one named B, while the other may be short 
> for celeritas meaning speed.)
>
> There is no such thing as C. C does nothing. It is a combination of a 
> language specification and some pieces of software called compilers that 
> implement it well or less well.
>

Uhh, that's taking it a little bit TOO far. I agree with your
point, but saying that there's no such thing as C is slightly unfair
:)

> There is such a thing as a PROGRAM. A program completely written in C is a 
> thing. It can run fast or slow based on a combination of how it was written 
> and on what data it operates on, which hardware and OS and so on. AND some of 
> it may likely be running code from libraries written in other languages like 
> FORTRAN that get linked into it in some way at compile time or runtime, and 
> hooks into the local OS and so on.
>
> So your program written supposedly in pure C, may run faster or slower. If 
> you program a "sort" algorithm in C, it may matter if it is an implementation 
> of a merge sort or a bubble sort or ...
>

More specifically: You're benchmarking a particular *implementation*
of a particular *algorithm*. Depending on what you're trying to
demonstrate, either could be significant.
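
To make the algorithm-versus-implementation point concrete, here is a rough sketch in pure Python: two sort algorithms written in the same language, plus the built-in `sorted` as a different implementation. The function names, the sample size of 300 and the repeat count of 10 are arbitrary choices for illustration; absolute timings will vary by machine.

```python
import random
import timeit


def bubble_sort(items):
    """O(n^2) sort: repeatedly swap adjacent out-of-order pairs."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return data


def merge_sort(items):
    """O(n log n) sort: split in half, sort each half, merge."""
    data = list(items)
    if len(data) <= 1:
        return data
    mid = len(data) // 2
    left, right = merge_sort(data[:mid]), merge_sort(data[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


random.seed(0)
sample = [random.random() for _ in range(300)]

for fn in (bubble_sort, merge_sort, sorted):
    seconds = timeit.timeit(lambda: fn(sample), number=10)
    print(f"{fn.__name__:12s} {seconds / 10 * 1e6:10.1f} usec per call")
```

All three produce the same result; only the algorithm and the implementation differ, and either one can dominate the measurement.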

Performance testing between two things written in C is a huge job.
Performance testing across languages has a strong tendency to be
meaningless (like benchmarking Python's integers against JavaScript's
numbers).

> As noted, numpy is largely written in C. It may well be optimized in some 
> places but there are constraints that may well make it hard to optimize 
> compared to some other implementation without those constraints. In 
> particular, it interfaces with standard Python data structures at times such 
> as when initializing from a Python List, or List of Lists, or needing to hold 
> on to various attributes so it can be converted back, or things I am not even 
> aware of.
>

(Fortran)

In theory, summing a Numpy array should be incredibly fast, but in
practice, there's a lot of variation, and it can be quite surprising.
For instance, integers are faster than floats, everyone knows that.
And it's definitely faster to sum smaller integers than larger ones.

rosuav@sikorsky:~$ python3 -m timeit -s 'import numpy; x = numpy.array(range(100), dtype=numpy.float64)' 'numpy.sum(x)'
1000 loops, best of 5: 325 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'import numpy; x = numpy.array(range(100), dtype=numpy.int64)' 'numpy.sum(x)'
500 loops, best of 5: 551 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'import numpy; x = numpy.array(range(100), dtype=numpy.int32)' 'numpy.sum(x)'
500 loops, best of 5: 680 usec per loop

... Or not.

Summing arrays isn't necessarily the best test of numpy anyway, but as
you can see, testing is an incredibly difficult thing to get right.
The easiest thing to prove is that you have no idea how to prove
anything usefully, and most of us achieve that every time :)
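
For anyone who wants to repeat this kind of measurement from a script rather than the command line, a rough sketch using `timeit.repeat` and reporting the best run: `best_usec_per_loop` is a hypothetical helper, and the NumPy part only runs if NumPy can be imported. Expect different numbers on every machine and version.

```python
import timeit


def best_usec_per_loop(stmt, setup="pass", number=1000, repeat=5):
    """Best per-loop time in microseconds over `repeat` runs of `number` loops."""
    runs = timeit.repeat(stmt=stmt, setup=setup, number=number, repeat=repeat)
    return min(runs) / number * 1e6


if __name__ == "__main__":
    try:
        import numpy  # noqa: F401  (only checking availability here)
    except ImportError:
        print("numpy not installed; timing the built-in sum instead")
        usec = best_usec_per_loop("sum(range(100))")
        print(f"sum(range(100)): {usec:.2f} usec per loop")
    else:
        for dtype in ("float64", "int64", "int32"):
            setup = f"import numpy; x = numpy.array(range(100), dtype=numpy.{dtype})"
            usec = best_usec_per_loop("numpy.sum(x)", setup=setup)
            print(f"{dtype}: {usec:.2f} usec per loop")
```

Taking the minimum of several runs reduces noise from other processes, but it does not remove effects such as CPU caching or frequency scaling.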

ChrisA


> So, I suspect it may well be possible to make a pure C library similar to 
> numpy in many ways but that can only be used within a C program that only 
> uses native C data structures. It also is possible to write such a program 
> that is horribly slow. And it is possible to write a less complex version of 
> numpy that does not support some current numpy functionality and overall runs 
> much faster on what it does support.
>
> I do wonder at the reason numpy and pandas and lots of other modules have to 
> exist. Other languages like R made design choices that built in ideas of 
> vectorization from the start. Python has lots of object-oriented 
> extensibility that can allow you to create interpreted code that may easily 
> extend it in areas to have some similar features. You can create an 
> array-like data structure that holds only one object type and is extended so 
> adding two together (or multiplying) ends up doing it componentwise. But 
> attempts to do some such things often run into problems as they tend to be 
> slow. So numpy was not written in python, mostly, albeit it could have been 
> even more impressive if it took advantage of more pythonic abilities, at a 
> cost.
>
> But now that numpy is in C, pretty much, it is somewhat locked in when and if 
> other things in Python change.
>
> The reality is that many paradigms carried too far end up falling short.
>
>
> -Original Message-
> From: Richard Damon 
> To: python-list@pyt

Re: C is it always faster than nump?

2022-02-25 Thread Chris Angelico
On Sat, 26 Feb 2022 at 05:49, Richard Damon  wrote:
>
> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> > Hi,
> > a lot of people think that C (or C++) is faster than python, yes I agree,
> > but I think that's not the case with numpy, I believe numpy is faster than
> > C, at least in some cases.
> >
> My understanding is that numpy is written in C, so for it to be faster
> than C, you are saying that C is faster than C.

Fortran actually, but ultimately, they both get compiled to machine code.

Really, what the OP has demonstrated is that good, well-written code
called from bad code produces meaningless numbers that are not the
same as bad code written in C. I can't even use that to prove that
good code is faster than bad code, since the measurements aren't
comparable; but if that were what was being measured, it wouldn't be
very surprising.

To do a fair comparison of C and Numpy, you'd have to:

1) Use the same type of timer
2) Use the same algorithm (unless you're benchmarking "naive C code"
against "well-written Fortran code")
3) Have comparable levels of compile-time optimization
4) Cope with the vagaries of CPU caching
5) Ensure that the same data types are being used everywhere
6) Probably several other things that I didn't think of.

The precise way you run the test could easily skew it by orders of
magnitude in either direction.
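
As an illustration of points 2 and 3, a small sketch in pure Python: the same result computed by a loop and by the closed form n(n+1)/2, timed with a repeat count that does not depend on n. The value N = 50_000 is an assumption, chosen because it reproduces the sum 1250025000 reported in the original post; the function names and `REPEATS` are made up for the example.

```python
import timeit


def sum_loop(n):
    """Add 1..n one term at a time, like func1 in the posted C program."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_formula(n):
    """Same value from the closed form n(n+1)/2, with no loop at all."""
    return n * (n + 1) // 2


N = 50_000     # assumption: inferred from the reported sum
REPEATS = 50   # fixed, independent of N

assert sum_loop(N) == sum_formula(N)

for fn in (sum_loop, sum_formula):
    seconds = timeit.timeit(lambda: fn(N), number=REPEATS)
    print(f"{fn.__name__:12s} {seconds / REPEATS * 1e6:10.2f} usec per call")
```

An optimizing C compiler may be able to perform a similar rewrite of the loop on its own, which is one reason the optimization level has to be comparable before the numbers mean anything.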

> The key point is that numpy was written by skilled programmers who
> carefully optimized their code to be as fast as possible for the major
> cases. Thus it is quite possible for the numpy code to be faster in C
> than code written by a person without that level of care and effort.

This is clearly true.

ChrisA


Re: C is it always faster than nump?

2022-02-25 Thread Avi Gross via Python-list
I agree with Richard.

Some people may be confused and think c is the speed of light and 
relativistically speaking, nothing can be faster. (OK, just joking. The uses of 
the same letter of the alphabet are not at all related. One is named for the 
language that came after the one named B, while the other may be short for 
celeritas meaning speed.)

There is no such thing as C. C does nothing. It is a combination of a language 
specification and some pieces of software called compilers that implement it 
well or less well.

There is such a thing as a PROGRAM. A program completely written in C is a 
thing. It can run fast or slow based on a combination of how it was written and 
on what data it operates on, which hardware and OS and so on. AND some of it 
may likely be running code from libraries written in other languages like 
FORTRAN that get linked into it in some way at compile time or runtime, and 
hooks into the local OS and so on.

So your program written supposedly in pure C, may run faster or slower. If you 
program a "sort" algorithm in C, it may matter if it is an implementation of a 
merge sort or a bubble sort or ...

As noted, numpy is largely written in C. It may well be optimized in some 
places but there are constraints that may well make it hard to optimize 
compared to some other implementation without those constraints. In particular, 
it interfaces with standard Python data structures at times such as when 
initializing from a Python List, or List of Lists, or needing to hold on to 
various attributes so it can be converted back, or things I am not even aware 
of.

So, I suspect it may well be possible to make a pure C library similar to numpy 
in many ways but that can only be used within a C program that only uses native 
C data structures. It also is possible to write such a program that is horribly 
slow. And it is possible to write a less complex version of numpy that does not 
support some current numpy functionality and overall runs much faster on what 
it does support.

I do wonder at the reason numpy and pandas and lots of other modules have to 
exist. Other languages like R made design choices that built in ideas of 
vectorization from the start. Python has lots of object-oriented extensibility 
that can allow you to create interpreted code that may easily extend it in 
areas to have some similar features. You can create an array-like data 
structure that holds only one object type and is extended so adding two 
together (or multiplying) ends up doing it componentwise. But attempts to do 
some such things often run into problems as they tend to be slow. So numpy was 
not written in python, mostly, albeit it could have been even more impressive 
if it took advantage of more pythonic abilities, at a cost.
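
A toy sketch of that idea, to show what such an array-like type could look like in pure Python. `TypedVector` is a made-up name, not anything from numpy, and this is only an illustration of componentwise operator overloading, with the per-element work done by the interpreter.

```python
class TypedVector:
    """Toy array-like container: one element type, componentwise + and *."""

    def __init__(self, values, kind=float):
        self.kind = kind
        self.values = [kind(v) for v in values]  # coerce everything to one type

    def _combine(self, other, op):
        if not isinstance(other, TypedVector):
            return NotImplemented
        if len(other.values) != len(self.values):
            raise ValueError("length mismatch")
        # Each element pair goes through interpreted Python; this is the slow part.
        pairs = zip(self.values, other.values)
        return TypedVector((op(a, b) for a, b in pairs), self.kind)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    def __mul__(self, other):
        return self._combine(other, lambda a, b: a * b)

    def __repr__(self):
        return f"TypedVector({self.values!r})"


u = TypedVector([1, 2, 3])
v = TypedVector([10, 20, 30])
print(u + v)
print(u * v)
```

Every element still passes through a Python-level function call, which is where the slowness mentioned above comes from.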

But now that numpy is in C, pretty much, it is somewhat locked in when and if 
other things in Python change. 

The reality is that many paradigms carried too far end up falling short.


-----Original Message-----
From: Richard Damon 
To: python-list@python.org
Sent: Fri, Feb 25, 2022 1:48 pm
Subject: Re: C is it always faster than nump?


On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> Hi,
> a lot of people think that C (or C++) is faster than python, yes I agree,
> but I think that's not the case with numpy, I believe numpy is faster than
> C, at least in some cases.
>
My understanding is that numpy is written in C, so for it to be faster 
than C, you are saying that C is faster than C.

The key point is that numpy was written by skilled programmers who 
carefully optimized their code to be as fast as possible for the major 
cases. Thus it is quite possible for the numpy code to be faster in C 
than code written by a person without that level of care and effort.

There are similar packages available for many languages, including C/C++ 
to let mere mortals get efficient numerical processing.

-- 
Richard Damon




Re: C is it always faster than nump?

2022-02-25 Thread Chris Angelico
On Sat, 26 Feb 2022 at 03:13, BELAHCENE Abdelkader
 wrote:
> This is the Python3 program:
> import timeit as it
> import numpy as np
> import sys
> try:
>     n=eval(sys.argv[1])
> except:
>     print ("needs integer as argument") ; exit()
> a=range(1,n+1)
> b=np.array(a)
> def func1(): return sum(a)
> def func2(): return np.sum(b)
> print(f"sum with Python: {func1()} and NumPy {func2()} ")
> tm1=it.timeit(stmt=func1, number=n)
> print(f"time used Python Sum: {round(tm1,2)} sec")
> tm2=it.timeit(stmt=func2, number=n)
> print(f"time used  Numpy Sum: {round(tm2,2)} sec")

This is terrible code. Even aside from the messed-up formatting (for
which your mail client is probably to blame), using eval() and a bare
"except:" clause is not a good way to use Python. And then you get
timeit to do as many iterations as the length of the array, which is
hardly indicative or meaningful for small values.
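
For comparison, one possible way to read the argument without `eval()` or a bare `except:`, as a minimal sketch. The helper name `parse_n` and the second error message are made up for the example.

```python
import sys


def parse_n(argv):
    """Return the positive integer in argv[1], or exit with a usage message."""
    try:
        n = int(argv[1])
    except (IndexError, ValueError):
        # IndexError: no argument given; ValueError: argument is not an integer.
        sys.exit("needs integer as argument")
    if n < 1:
        sys.exit("argument must be a positive integer")
    return n


if __name__ == "__main__" and len(sys.argv) > 1:
    print(parse_n(sys.argv))
```

Catching only `IndexError` and `ValueError` means that unrelated problems, such as a `KeyboardInterrupt`, are no longer silently reported as a bad argument.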

> and here the C program:
> #include <stdio.h>
> #include <stdlib.h>
> #include <time.h>
> long func1(int n){
>     long r=0;
>     for (int i=1; i<=n; i++) r+= i;
>     return r;
> }
> int main(int argc, char* argv[]){
>     clock_t c0, c1;
>     long v,count;
>     int n;
>     if ( argc < 2) {
>         printf("Please give an argument");
>         return -1;
>     }
>     n=atoi(argv[1]);
>     c0 = clock();
>     for (int j=0;j < n;j++) v=func1(n);
>     c1 = clock();
>     printf ("\tCPU  time :%.2f sec", (float)(c1 - c0)/CLOCKS_PER_SEC);
>     printf("\n\tThe value : %ld\n",  v);
> }

At least you're consistent, using an iteration count equal to the
length of the array again. But that just means that it's equally
meaningless.

Did you know that Python's timeit and C's clock don't even measure the
same thing?
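
A short sketch of the difference, using only the standard library. By default `timeit` uses `time.perf_counter()`, which measures elapsed wall-clock time, while C's `clock()` is specified to return processor time, which is closer to Python's `time.process_time()`. The helper name `measure` is made up, and the exact behaviour of `clock()` depends on the platform.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent running fn()."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    fn()
    wall1, cpu1 = time.perf_counter(), time.process_time()
    return wall1 - wall0, cpu1 - cpu0


# Sleeping takes wall-clock time but almost no CPU time.
wall, cpu = measure(lambda: time.sleep(0.2))
print(f"sleeping:  wall {wall:.3f} s, cpu {cpu:.3f} s")

# A busy computation uses both.
wall, cpu = measure(lambda: sum(range(2_000_000)))
print(f"computing: wall {wall:.3f} s, cpu {cpu:.3f} s")
```

On a loaded machine, or with any waiting involved, the two clocks can disagree substantially, so numbers taken with one cannot be set directly against numbers taken with the other.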

ChrisA


Re: C is it always faster than nump?

2022-02-25 Thread Richard Damon

On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> Hi,
> a lot of people think that C (or C++) is faster than python, yes I agree,
> but I think that's not the case with numpy, I believe numpy is faster than
> C, at least in some cases.
>
My understanding is that numpy is written in C, so for it to be faster 
than C, you are saying that C is faster than C.

The key point is that numpy was written by skilled programmers who 
carefully optimized their code to be as fast as possible for the major 
cases. Thus it is quite possible for the numpy code to be faster in C 
than code written by a person without that level of care and effort.

There are similar packages available for many languages, including C/C++ 
to let mere mortals get efficient numerical processing.


--
Richard Damon



C is it always faster than nump?

2022-02-25 Thread BELAHCENE Abdelkader
Hi,
a lot of people think that C (or C++) is faster than python, yes I agree,
but I think that's not the case with numpy, I believe numpy is faster than
C, at least in some cases.


Is there another explanation? Or where can I find a doc speaking about the
subject? Thanks a lot
Regards
Numpy implements vectorization for arrays, or am I wrong? Anyway, here is an
example. Let's look at the following case:
Here is the result on my laptop i3:

Labs$ python3 tempsExe.py  5
  sum with Python: 1250025000 and NumPy 1250025000
  time used Python Sum:  37.28 sec
  time used  Numpy Sum:  1.85 sec

Labs$ ./tt5

   CPU  time :7.521730

   The value : 1250025000

This is the Python3 program:
import timeit as it
import numpy as np
import sys
try:
    n=eval(sys.argv[1])
except:
    print ("needs integer as argument") ; exit()
a=range(1,n+1)
b=np.array(a)
def func1(): return sum(a)
def func2(): return np.sum(b)
print(f"sum with Python: {func1()} and NumPy {func2()} ")
tm1=it.timeit(stmt=func1, number=n)
print(f"time used Python Sum: {round(tm1,2)} sec")
tm2=it.timeit(stmt=func2, number=n)
print(f"time used  Numpy Sum: {round(tm2,2)} sec")

and here the C program:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
long func1(int n){
    long r=0;
    for (int i=1; i<=n; i++) r+= i;
    return r;
}
int main(int argc, char* argv[]){
    clock_t c0, c1;
    long v,count;
    int n;
    if ( argc < 2) {
        printf("Please give an argument");
        return -1;
    }
    n=atoi(argv[1]);
    c0 = clock();
    for (int j=0;j < n;j++) v=func1(n);
    c1 = clock();
    printf ("\tCPU  time :%.2f sec", (float)(c1 - c0)/CLOCKS_PER_SEC);
    printf("\n\tThe value : %ld\n",  v);
}