Re: C is it always faster than nump?
On Saturday, 26 February 2022 at 19:41:37 UTC+1, Dennis Lee Bieber wrote:
> On Fri, 25 Feb 2022 21:44:14 -0800, Dan Stromberg
> declaimed the following:
>
> >Fortran, (still last I heard) did not support pointers, which gives Fortran
> >compilers the chance to exploit a very nice class of optimizations you
> >can't use nearly as well in languages with pointers.
> >
> Haven't looked much at Fortran-90/95 then...
>
> Variable declaration gained a POINTER qualifier, and there is an
> ALLOCATE intrinsic to obtain memory.
>
> And with difficulty one could get the result in DEC/VMS FORTRAN-77
> since DEC implemented (across all their language compilers) intrinsics
> controlling how arguments are passed -- overriding the language native
> passing:
>     CALL XYZ(%val(M))
> would actually pass the value of M, not Fortran default address-of, with
> the result that XYZ would use that value /as/ the address of the actual
> argument. (Others were %ref() and %descr() -- descriptor being a small
> structure with the address reference along with, say, upper/lower bounds;
> often used for strings).
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlf...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

The latest Fortran revision is Fortran 2018. A variable can also have the VALUE attribute, although nowhere in the standard is it written that this means passing the data by value. It just means that if the variable is changed inside a procedure, the changes don't propagate back to the caller.

With iso_c_binding one can directly call a C function or make a Fortran procedure appear as a C function. There is also C_LOC, which gives the C address of a variable if needed. And since Fortran 2003 the language is fully object oriented.

The claim that Fortran was faster than C is mostly related to aliasing, which is forbidden in Fortran. C introduced the "restrict" qualifier for the same reason. In Fortran you also have array operations like you have in numpy.
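To make the aliasing point concrete, here is a minimal sketch in plain Python. The function names add_first and add_first_hoisted are made up for illustration and come from nobody in this thread. It shows why a compiler that cannot rule out aliasing must reload src[0] on every iteration: if dst and src are the same object, hoisting the load out of the loop changes the result.

```python
def add_first(dst, src):
    """Add src[0] to every element of dst, re-reading src[0] each time."""
    for i in range(len(dst)):
        dst[i] += src[0]
    return dst


def add_first_hoisted(dst, src):
    """Same loop, but with the load of src[0] hoisted out of the loop.

    This is the optimization a compiler may only apply when it knows
    that dst and src do not alias.
    """
    first = src[0]
    for i in range(len(dst)):
        dst[i] += first
    return dst


# No aliasing: both versions agree.
separate = add_first([1, 2, 3], [1, 2, 3])
separate_hoisted = add_first_hoisted([1, 2, 3], [1, 2, 3])

# Aliasing: dst and src are the same list, so src[0] changes mid-loop
# and the two versions give different answers.
shared = [1, 2, 3]
aliased = add_first(shared, shared)
shared = [1, 2, 3]
aliased_hoisted = add_first_hoisted(shared, shared)

print(separate, separate_hoisted)
print(aliased, aliased_hoisted)
```

Fortran's rule against aliased dummy arguments, and C's "restrict", both amount to a promise that the hoisted version is safe.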
-- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
On Fri, 25 Feb 2022 21:44:14 -0800, Dan Stromberg declaimed the following:

>Fortran, (still last I heard) did not support pointers, which gives Fortran
>compilers the chance to exploit a very nice class of optimizations you
>can't use nearly as well in languages with pointers.
>
Haven't looked much at Fortran-90/95 then...

Variable declaration gained a POINTER qualifier, and there is an ALLOCATE intrinsic to obtain memory.

And with difficulty one could get the result in DEC/VMS FORTRAN-77 since DEC implemented (across all their language compilers) intrinsics controlling how arguments are passed -- overriding the language native passing:
    CALL XYZ(%val(M))
would actually pass the value of M, not Fortran default address-of, with the result that XYZ would use that value /as/ the address of the actual argument. (Others were %ref() and %descr() -- descriptor being a small structure with the address reference along with, say, upper/lower bounds; often used for strings).

--
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/
Re: C is it always faster than nump?
Dan Stromberg wrote: > On Fri, Feb 25, 2022 at 8:12 AM BELAHCENE Abdelkader < > abdelkader.belahc...@enst.dz> wrote: > >> Hi, >> a lot of people think that C (or C++) is faster than python, yes I agree, >> but I think that's not the case with numpy, I believe numpy is faster than >> C, at least in some cases. >> > > This is all "last time I heard". > > numpy is written, in significant part, in Fortran. > > Fortran, especially for matrix math with variable dimensions, can be faster > than C. > > Fortran, (still last I heard) did not support pointers, which gives Fortran > compilers the chance to exploit a very nice class of optimizations you > can't use nearly as well in languages with pointers. > > I used to code C to be built with the "noalias" optimization, to get much > of the speed of Fortran in C. But it required using an error prone subset > of C without good error detection. Pointers were introduced in Fortran 90. Neil. -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
Thanks everybody. I want to close the subject, but just one naive question: does numpy use *vectorization* for arrays? I mean, when I add two arrays (or call the sum function), how is it done? In other words, is

    b = np.arange(100)
    t = np.sum(b)

equivalent or not to

    s = 0
    for i in range(100):
        s += b[i]

Thanks a lot.

On Sat, 26 Feb 2022 at 06:44, Dan Stromberg wrote:
>
> On Fri, Feb 25, 2022 at 8:12 AM BELAHCENE Abdelkader <
> abdelkader.belahc...@enst.dz> wrote:
>
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>>
>
> This is all "last time I heard".
>
> numpy is written, in significant part, in Fortran.
>
> Fortran, especially for matrix math with variable dimensions, can be
> faster than C.
>
> Fortran, (still last I heard) did not support pointers, which gives
> Fortran compilers the chance to exploit a very nice class of optimizations
> you can't use nearly as well in languages with pointers.
>
> I used to code C to be built with the "noalias" optimization, to get much
> of the speed of Fortran in C. But it required using an error prone subset
> of C without good error detection.
>
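As a rough sketch of the comparison being asked about (runnable as-is, assuming NumPy is installed): both forms produce the same value for this integer array. The difference is where the loop runs. np.sum iterates inside NumPy's compiled code, while the explicit loop goes through the Python interpreter for every element. Whether the compiled loop also uses SIMD instructions depends on the build, dtype and hardware, so that part is not shown here. For floating-point data the two can also differ in the last few bits, because the order of additions need not be the same.

```python
import numpy as np

b = np.arange(100)

# Reduction performed inside NumPy's compiled code.
t = np.sum(b)

# The same reduction, one element at a time in the interpreter.
s = 0
for i in range(100):
    s += b[i]

print(t, s, t == s)
```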
Re: C is it always faster than nump?
On Fri, Feb 25, 2022 at 8:12 AM BELAHCENE Abdelkader < abdelkader.belahc...@enst.dz> wrote: > Hi, > a lot of people think that C (or C++) is faster than python, yes I agree, > but I think that's not the case with numpy, I believe numpy is faster than > C, at least in some cases. > This is all "last time I heard". numpy is written, in significant part, in Fortran. Fortran, especially for matrix math with variable dimensions, can be faster than C. Fortran, (still last I heard) did not support pointers, which gives Fortran compilers the chance to exploit a very nice class of optimizations you can't use nearly as well in languages with pointers. I used to code C to be built with the "noalias" optimization, to get much of the speed of Fortran in C. But it required using an error prone subset of C without good error detection. -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
On Fri, Feb 25, 2022 at 9:03 PM Chris Angelico wrote: > On Sat, 26 Feb 2022 at 15:39, Avi Gross via Python-list > wrote: > > Take interpreted languages including Python and R that specify all kinds > of functions that may be written within the language at first. Someone may > implement a function like sum() (just an example) that looks like the sum > of a long list of items is the first item added to a slightly longer sum of > the remaining items. It stops when the final recursive sum is about to be > called with no remaining arguments. Clearly this implementation may be a > tad slow. But does Python require this version of sum() or will it allow > any version that can be called the same way and returns the same results > every time? > > > > That's also true of C and pretty much every language I know of. They > define semantics, not implementation. > This comes back to something we've discussed before. A language that is described primarily by a reference implementation rather than a standard, runs the risk of being defined by that implementation. -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 15:39, Avi Gross via Python-list wrote: > Take interpreted languages including Python and R that specify all kinds of > functions that may be written within the language at first. Someone may > implement a function like sum() (just an example) that looks like the sum of > a long list of items is the first item added to a slightly longer sum of the > remaining items. It stops when the final recursive sum is about to be called > with no remaining arguments. Clearly this implementation may be a tad slow. > But does Python require this version of sum() or will it allow any version > that can be called the same way and returns the same results every time? > That's also true of C and pretty much every language I know of. They define semantics, not implementation. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
Agreed, Chris. There are many ways to get something done. I often use the Anaconda distribution because it tends to bundle many of the modules I need and more. Not that it is such a big deal to load the ones you need, but if you share your program, others trying to use it may have some problems. -Original Message- From: Chris Angelico To: python-list@python.org Sent: Fri, Feb 25, 2022 11:16 pm Subject: Re: C is it always faster than nump? On Sat, 26 Feb 2022 at 14:35, Avi Gross via Python-list wrote: > But with numpy and more available anyway, it may not be necessary to reinvent > much of that. I was just wondering if it ever made sense to simply include it > in the base python, perhaps as a second executable with a name like pythonn > to signify that it is more numeric. So if you run that, you know you do not > need to add an assortment of modules. I keep seeing programs that just > automatically add numpy and pandas and various graphic modules and other > scientific and machine learning modules. Of course not everyone needs or even > wants this. Many simply use base Python techniques even if they are low for > larger amounts of data. > How would that be different from getting one of the numeric/scientific distributions of Python? Why should it be a different Python executable?!? ChrisA -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
Yes, Chris, C is real as a somewhat abstract concept. There are a whole slew of different variations each time it is released anew with changes and then some people at various times built actual compilers that implement a varying subset of what is possible, and not necessarily in quite the same way. As you gathered, I am saying that comparing languages is not so effective as comparing implementations and even better specific programs on specific data. And yet, you can still get odd results if you cherry pick what to test. Consider a sorting algorithm that rapidly checks if the data is already sorted, and if so, does not bother sorting it. It will quite possibly be the fastest one in a comparison if the data is chosen to be already in order! But on many other sets of data it will have wasted some time checking if it is in order while other algorithms have started sorting it! Bad example, maybe, but there are better ones. Consider an algorithm that does no checking for one of many errors that can happen. It does not see if the arguments it gets are within expected ranges of types or values. It does not intercept attempts to divide by zero and much more. Another algorithm is quite bulletproof and thus has lots more code and maybe runs much slower. Is it shocking if it tests slower . But the other code may end up failing faster in the field and need a rewrite. A really fair comparison is often really hard. Languages are abstract and sometimes a new implementation makes a huge change. Take interpreted languages including Python and R that specify all kinds of functions that may be written within the language at first. Someone may implement a function like sum() (just an example) that looks like the sum of a long list of items is the first item added to a slightly longer sum of the remaining items. It stops when the final recursive sum is about to be called with no remaining arguments. Clearly this implementation may be a tad slow. 
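The recursive definition of sum() described above could be sketched like this. It is an illustrative toy, not how any Python implementation defines sum(), and the name recursive_sum is made up here. It gives the same answer as the built-in, which is the only thing a caller can observe apart from speed and recursion depth.

```python
def recursive_sum(items):
    """Sum a list as: first item + sum of the remaining items."""
    if not items:
        # No remaining arguments: stop the recursion.
        return 0
    return items[0] + recursive_sum(items[1:])


data = list(range(100))

# Same call shape, same result; only the implementation differs.
print(recursive_sum(data), sum(data))
```

Note that this version copies the tail of the list at every step and would exceed the default recursion limit on long lists, which is part of why it is "a tad slow".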
But does Python require this version of sum() or will it allow any version that can be called the same way and returns the same results every time? Does it even matter if the function is written in C or C++ or FORTRAN or even assembler of some kind, as long as it is placed in an accessible library and there is some interface that allows you to make the call in python notation and it is fed to the function in the way it requires, and similarly deals with returned values? A wrapper, sort of. The use of such a shortcut is not against the spirit of the language. You can still specify you want the sum() function from some module, or write your own. This is true most places. I remember way back when how early UNIX shells did silly things like call /bin/echo to do trivial things, or call an external program to do something as trivial as i=i+1 and then they started building in such functionality and your shell scripts suddenly really speeded up. A non-programmer I once worked for wrote some truly humongous shell scripts that brought machines it was run on remotely in places like Japan during their day-time to their knees. Collecting billing data from all over by running a pipeline with 9 processes per line/row was a bit much. At first I sped it up quite a bit by using newer built-in features like I described, or doing more with fewer elements in pipelines. But I saw how much was caused by using the wrong tools for the job and there were programs designed to analyze data in various ways. I replaced almost all of it with an AWK script that speeded things up many orders of magnitude. And, yes, AWK was not as fast as C but more trivial to program in for this need as it had so many needed aspects built-in or happening automagically. Would we do the entire project differently today? Definitely. 
All the billing records would not be sitting in an assortment of flat files all over the place but rather be fed into some database that made retrieval of all kinds of reports straightforward without needing to write much code at all. How many modules or "packages" were once written largely using the language and then gradually "improved" by replacing parts, especially slower parts, with external content as we have been discussing? In a sense, some Python applications run on older versions of Python may be running faster as newer versions have improved some of the "same" code while to the user, they see it running on the same language, Python? -Original Message- From: Chris Angelico To: python-list@python.org Sent: Fri, Feb 25, 2022 2:58 pm Subject: Re: C is it always faster than nump? On Sat, 26 Feb 2022 at 06:44, Avi Gross via Python-list wrote: > > I agree with Richard. > > Some people may be confused and think c is the speed of light and > relativistically speaking, nothing can be faster. (OK, just joking. The uses > of the same letter of the alphabet are not at all related.
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 14:35, Avi Gross via Python-list wrote: > But with numpy and more available anyway, it may not be necessary to reinvent > much of that. I was just wondering if it ever made sense to simply include it > in the base python, perhaps as a second executable with a name like pythonn > to signify that it is more numeric. So if you run that, you know you do not > need to add an assortment of modules. I keep seeing programs that just > automatically add numpy and pandas and various graphic modules and other > scientific and machine learning modules. Of course not everyone needs or even > wants this. Many simply use base Python techniques even if they are low for > larger amounts of data. > How would that be different from getting one of the numeric/scientific distributions of Python? Why should it be a different Python executable?!? ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
Dennis,

What you describe may be a start, but is it anything I might not have easily created myself?

https://docs.python.org/3/library/array.html

I can see creating my own object and adding those methods and attributes while gaining very little, except perhaps storage. Can I add or multiply two such items efficiently if they contain numeric values? Can I offer them as an argument to all kinds of functions which can now handle them well? How does it work if a second operand is a scalar or an array of another data type? Can two be compared and result in an array of booleans (not seen in the list of types)?

Numpy does quite a bit of that kind of thing, but perhaps better is a language like R where all that and more are built in.

But with numpy and more available anyway, it may not be necessary to reinvent much of that. I was just wondering if it ever made sense to simply include it in the base python, perhaps as a second executable with a name like pythonn to signify that it is more numeric. So if you run that, you know you do not need to add an assortment of modules. I keep seeing programs that just automatically add numpy and pandas and various graphic modules and other scientific and machine learning modules. Of course not everyone needs or even wants this. Many simply use base Python techniques even if they are slow for larger amounts of data.
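A short sketch of the difference being asked about, assuming NumPy is installed. With the stdlib array type, + and * are sequence operations (concatenate and repeat) and a comparison yields a single bool, whereas NumPy arrays apply the operation element by element.

```python
from array import array

import numpy as np

a = array("i", [1, 2, 3])
b = array("i", [4, 5, 6])

# stdlib array: sequence semantics, like a list.
concatenated = a + b   # six elements, not three sums
repeated = a * 2       # the contents twice over
single_bool = a < b    # one bool from an element-by-element ordering

# NumPy: elementwise arithmetic and comparison.
na = np.array(a.tolist())
nb = np.array(b.tolist())
elementwise_sum = na + nb
scaled = na * 2
mask = na < nb         # boolean array, one entry per element

print(concatenated, repeated, single_bool)
print(elementwise_sum, scaled, mask)
```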
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 03:10, Dennis Lee Bieber wrote: > > On Fri, 25 Feb 2022 23:06:57 + (UTC), Avi Gross > declaimed the following: > > >I do have to wonder if anyone ever considered adding back enough > >functionality into base Python to make some additions less needed. Is there > >any reason the kind of structures used by so many languages cannot be made > >part of python such as a vector/array that holds exactly one kind of data > >structure and not force use of things like a list when that is more than is > >needed? > > > https://docs.python.org/3/library/array.html > > seems to fit the criteria... The stdlib array module is basically unused in comparison to NumPy. The capabilities of the array module do not meet the needs for most users who want to do anything useful with arrays. The intention in creating NumPy (in the NumPy/SciPy split) was that it might be possible that NumPy could be merged into core Python. Unfortunately that didn't come to be. -- Oscar -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
On Fri, 25 Feb 2022 23:06:57 +0000 (UTC), Avi Gross declaimed the following:

>I do have to wonder if anyone ever considered adding back enough functionality
>into base Python to make some additions less needed. Is there any reason the
>kind of structures used by so many languages cannot be made part of python
>such as a vector/array that holds exactly one kind of data structure and not
>force use of things like a list when that is more than is needed?
>
https://docs.python.org/3/library/array.html

seems to fit the criteria...

--
Wulfraed Dennis Lee Bieber AF6VN
wlfr...@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/
Re: C is it always faster than nump?
On Fri, 25 Feb 2022 at 23:13, Barry wrote:
>
> > On 25 Feb 2022, at 23:00, Richard Damon wrote:
> >
> > On 2/25/22 2:47 PM, Chris Angelico wrote:
> >> On Sat, 26 Feb 2022 at 05:49, Richard Damon wrote:
> >>> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> >>>> Hi,
> >>>> a lot of people think that C (or C++) is faster than python, yes I agree,
> >>>> but I think that's not the case with numpy, I believe numpy is faster than
> >>>> C, at least in some cases.
> >>>>
> >>> My understanding is that numpy is written in C, so for it to be faster
> >>> than C, you are saying that C is faster than C.
> >> Fortran actually, but ultimately, they both get compiled to machine code.
> >
> > Looking at the Github repo I see:
> >
> > Languages:
> >     Python  62.5%
> >     C       35.3%
> >     C++      1.0%
> >     Cython   0.9%
> >     Shell    0.2%
> >     Fortran  0.1%
>
> I assume that this is just for numpy and not for all its dependencies.
> That will add a lot of Fortran and C++ I expect.

NumPy links with BLAS/LAPACK, which do the heavy lifting for common linear algebra operations. Multiple different BLAS libraries can be used with NumPy, and those libraries might be written in Fortran and might also involve some hand-crafted assembly for particular architectures. Some explanation of NumPy's BLAS/LAPACK support is here:
https://numpy.org/devdocs/user/building.html

By default if you install NumPy from conda (or at least if you install the Anaconda distribution) then I think that NumPy will use the Intel MKL library:
https://en.wikipedia.org/wiki/Math_Kernel_Library
As I understand it the core of MKL is Fortran and is compiled with Intel's ifort compiler, but some parts might be C/C++. MKL is also the same library that is used by Matlab for its own heavy lifting. It's not sufficiently free to be used in NumPy's PyPI wheels though.

If you install the precompiled NumPy wheels with pip from PyPI then I think those are statically linked with OpenBLAS:
https://github.com/xianyi/OpenBLAS
Again I think the core of OpenBLAS is Fortran but there's some C in there.

In the pre-wheel days the situation was that NumPy provided installer files for Windows that would give binaries linked with ATLAS (also Fortran):
https://en.wikipedia.org/wiki/Automatically_Tuned_Linear_Algebra_Software

I think at some point NumPy used to use the OSX Accelerate library, but the page I linked above says that's now deprecated. I don't know anything about Accelerate but I wouldn't be surprised to hear that it was a bunch of old Fortran code!

If you build NumPy from source without having any BLAS/LAPACK libraries then I think it uses its own backup version of these that is written in C but not as well optimised. This used to be the default for a pip install on Linux in pre-wheel times.

Many operations in NumPy don't actually use BLAS/LAPACK, and for those parts the heavy lifting is all done in NumPy's own C code.

Lastly SciPy, which is very often used together with NumPy, does have a load of Fortran code. As I understand it, at some point NumPy and SciPy were divided from the previous numerical Python libraries, and there was a deliberate split so that the Fortran code all ended up in SciPy rather than NumPy.

--
Oscar
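For anyone who wants to check which BLAS/LAPACK their own NumPy build was linked against, a minimal sketch follows. numpy.show_config() prints the build information; its layout varies between NumPy versions, so no output is shown here. The matrix product is the kind of operation that is typically handed to the linked BLAS for floating-point arrays, and the slow triple loop is included only to confirm that both routes agree numerically.

```python
import numpy as np

# Print build details, including the BLAS/LAPACK libraries found at build time.
np.show_config()

rng = np.random.default_rng(0)
a = rng.random((50, 40))
b = rng.random((40, 30))

# Matrix product; for float arrays this is usually delegated to BLAS.
product = a @ b

# Reference result computed one element at a time in pure Python.
manual = np.zeros((50, 30))
for i in range(50):
    for j in range(30):
        manual[i, j] = sum(a[i, k] * b[k, j] for k in range(40))

print(np.allclose(product, manual))
```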
Re: C is it always faster than nump?
> On 25 Feb 2022, at 23:00, Richard Damon wrote:
>
> On 2/25/22 2:47 PM, Chris Angelico wrote:
>> On Sat, 26 Feb 2022 at 05:49, Richard Damon wrote:
>>> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>>>> Hi,
>>>> a lot of people think that C (or C++) is faster than python, yes I agree,
>>>> but I think that's not the case with numpy, I believe numpy is faster than
>>>> C, at least in some cases.
>>>>
>>> My understanding is that numpy is written in C, so for it to be faster
>>> than C, you are saying that C is faster than C.
>> Fortran actually, but ultimately, they both get compiled to machine code.
>
> Looking at the Github repo I see:
>
> Languages:
>     Python  62.5%
>     C       35.3%
>     C++      1.0%
>     Cython   0.9%
>     Shell    0.2%
>     Fortran  0.1%

I assume that this is just for numpy and not for all its dependencies. Those will add a lot of Fortran and C++, I expect.

>
> So there is a bit of Fortran in there, but it looks like most of the heavy
> lifting is in C.
>
> My guess is the Fortran is likely some hooks to add Fortran modules into the
> program with numpy.
>
> ...
>>> The key point is that numpy was written by skilled programmers who
>>> carefully optimized their code to be as fast as possible for the major
>>> cases. Thus it is quite possible for the numpy code to be faster in C
>>> than code written by a person without that level of care and effort.
>> This is clearly true.
>>
>> ChrisA
>
>
> --
> Richard Damon
Re: C is it always faster than nump?
Is that fair, Grant? I go back far enough that in my earliest years I was submitting FORTRAN programs written on punched cards and often getting them back the next day. The compiling was not the major factor in how long it took. For many cases, a compiled language only needs to be compiled once and can be run many times. Other than during development timeframes, the major concept of speed is how well it runs not how long it takes to compile, especially if the compiler is busy doing all kinds of optimizations to your code or checking for possible obscure errors or making sure the code does not try something funny like access memory not allowed and so on. An interpreted language often has to do the same things every time, albeit some have come up with ways to partially compile modules and only redo if they have been changed. Some of what they do every time is in some sense wasted effort but as a tradeoff, they can do many things in a dynamic way that compiled programs may not do easily or at all. Another argument I find is unfair, is some comparisons with what I consider "mixed" effort. If you have a program that calls various numpy routines using native python in between, then clearly a decent amount of the time spent is not in numpy. It may for example suddenly import a module and that takes time spent doing nothing about the problem other than loading what is needed. Another example, it is common to place the numbers or other values you get from numpy operations into dictionaries, lists and so on, or to make graphs. If you had done the same work in a purely C (or FORTRAN or whatever environment) and had access to similar other functionality, the latter two would all be in C or some compiled library. With exceptions aplenty, speed is not always a primary consideration. Often programmer time and productivity matter more. 
I have seen many projects, though, that were first implemented in a language like Python and, once a decent way to do the task reliably had been figured out, were handed over to people to redo much or all of it in a language like C++, compiles and all. A reasonable translation from a working application may take much less time to implement and, once done, may work better. At least it may until you want to change it! Prototyping often works better in some languages.

Python has a compromise that modules can use to speed up important parts of the process: substituting a compiled function in C/C++/whatever for a local Python function, without necessarily switching entirely to C and suffering under the negatives of that environment.

I do wonder if we are already on a path where a language that handles concepts like parallelism (or even weirder, quantum computations) well may be a better candidate for doing some projects, as even though it may not run at top speed, it can make use of many processors or a network of machines in the cloud to handle things in a more flexible way that may even get the job done faster.

Next discussion is whether pandas is faster than C, followed by SciPy ...

I do have to wonder if anyone ever considered adding back enough functionality into base Python to make some additions less needed. Is there any reason the kind of structures used by so many languages cannot be made part of python, such as a vector/array that holds exactly one kind of data structure, rather than forcing the use of things like a list when that is more than is needed?

-----Original Message-----
From: Grant Edwards
To: python-list@python.org
Sent: Fri, Feb 25, 2022 4:12 pm
Subject: Re: C is it always faster than nump?

On 2022-02-25, Richard Damon wrote:
> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>
> My understanding is that numpy is written in C,

Sort of. IIRC a lot of the heavy lifting is done by libraries like BLAS and LAPACK that were written in FORTRAN the last time I checked. Has that changed?

Or am I conflating numpy with some other scientific-python stuff? Back when I did a lot of numerical stuff in Python, I remember spending a lot of time watching FORTRAN compiles.

Admittedly, that's getting to be 10+ years ago...

--
Grant
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 09:58, Richard Damon wrote: > > On 2/25/22 2:47 PM, Chris Angelico wrote: > > On Sat, 26 Feb 2022 at 05:49, Richard Damon > > wrote: > >> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote: > >>> Hi, > >>> a lot of people think that C (or C++) is faster than python, yes I agree, > >>> but I think that's not the case with numpy, I believe numpy is faster than > >>> C, at least in some cases. > >>> > >> My understanding is that numpy is written in C, so for it to be faster > >> than C, you are saying that C is faster that C. > > Fortran actually, but ultimately, they both get compiled to machine code. > > Looking at the Github repo I see: > > Languages: > Python. 62.5% > C 35.3% > C++. 1.0% > Cython. 0.9% > Shell. 0.2% > Fortran. 0.1% > > So there is a bit of Fortan in there, but it looks like most of the > heavy lifting is in C. > > My guess is the Fortran is likely some hooks to add Fortran modules into > the program with numpy. > GitHub's analysis isn't always very meaningful. I think it's pretty clear that Numpy isn't implemented in Python :) In any case, point is that the implementation language is mostly irrelevant. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: C is it always faster than nump?
On 2/25/22 2:47 PM, Chris Angelico wrote:
> On Sat, 26 Feb 2022 at 05:49, Richard Damon wrote:
>> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>>> Hi,
>>> a lot of people think that C (or C++) is faster than python, yes I agree,
>>> but I think that's not the case with numpy, I believe numpy is faster than
>>> C, at least in some cases.
>>>
>> My understanding is that numpy is written in C, so for it to be faster
>> than C, you are saying that C is faster than C.
> Fortran actually, but ultimately, they both get compiled to machine code.

Looking at the Github repo I see:

Languages:
    Python  62.5%
    C       35.3%
    C++      1.0%
    Cython   0.9%
    Shell    0.2%
    Fortran  0.1%

So there is a bit of Fortran in there, but it looks like most of the heavy lifting is in C.

My guess is the Fortran is likely some hooks to add Fortran modules into the program with numpy.

...
>> The key point is that numpy was written by skilled programmers who
>> carefully optimized their code to be as fast as possible for the major
>> cases. Thus it is quite possible for the numpy code to be faster in C
>> than code written by a person without that level of care and effort.
> This is clearly true.
>
> ChrisA

--
Richard Damon
Re: C is it always faster than nump?
On 2022-02-25, Richard Damon wrote:
> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
>> Hi,
>> a lot of people think that C (or C++) is faster than python, yes I agree,
>> but I think that's not the case with numpy, I believe numpy is faster than
>> C, at least in some cases.
>
> My understanding is that numpy is written in C,

Sort of. IIRC a lot of the heavy lifting is done by libraries like BLAS and LAPACK that were written in FORTRAN the last time I checked. Has that changed?

Or am I conflating numpy with some other scientific-python stuff? Back when I did a lot of numerical stuff in Python, I remember spending a lot of time watching FORTRAN compiles.

Admittedly, that's getting to be 10+ years ago...

--
Grant
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 06:44, Avi Gross via Python-list wrote: > > I agree with Richard. > > Some people may be confused and think c is the speed of light and > relativistically speaking, nothing can be faster. (OK, just joking. The uses > of the same letter of the alphabet are not at all related. One is named for > the language that came after the one named B, while the other may be short > for celeritas meaning speed.) > > There is no such thing as C. C does nothing. It is a combination of a > language specification and some pieces of software called compilers that > implement it well or less well. > Uhh, that's taking it a little bit TOO far I agree with your point, but saying that there's no such thing as C is slightly unfair :) > There is such a thing as a PROGRAM. A program completely written in C is a > thing. It can run fast or slow based on a combination of how it was written > and on what data it operates on, which hardware and OS and so on. AND some of > it may likely be running code from libraries written in other languages like > FORTRAN that get linked into it in some way at compile time or runtime, and > hooks into the local OS and so on. > > So your program written supposedly in pure C, may run faster or slower. If > you program a "sort" algorithm in C, it may matter if it is an implementation > of a merge sort or at bubble sort or ... > More specifically: You're benchmarking a particular *implementation* of a particular *algorithm*. Depending on what you're trying to demonstrate, either could be significant. Performance testing between two things written in C is a huge job. Performance testing across languages has a strong tendency to be meaningless (like benchmarking Python's integers against JavaScript's numbers). > As noted, numpy is largely written in C. It may well be optimized in some > places but there are constraints that may well make it hard to optimize > compared to some other implementation without those constraints. 
> In particular, it interfaces with standard Python data structures at times
> such as when initializing from a Python List, or List of Lists, or needing
> to hold on to various attributes so it can be converted back, or things I am
> not even aware of.
>

(Fortran)

In theory, summing a Numpy array should be incredibly fast, but in
practice, there's a lot of variation, and it can be quite surprising.
For instance, integers are faster than floats, everyone knows that.
And it's definitely faster to sum smaller integers than larger ones.

rosuav@sikorsky:~$ python3 -m timeit -s 'import numpy; x = numpy.array(range(100), dtype=numpy.float64)' 'numpy.sum(x)'
1000 loops, best of 5: 325 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'import numpy; x = numpy.array(range(100), dtype=numpy.int64)' 'numpy.sum(x)'
500 loops, best of 5: 551 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s 'import numpy; x = numpy.array(range(100), dtype=numpy.int32)' 'numpy.sum(x)'
500 loops, best of 5: 680 usec per loop

... Or not. Summing arrays isn't necessarily the best test of numpy
anyway, but as you can see, testing is an incredibly difficult thing
to get right. The easiest thing to prove is that you have no idea how
to prove anything usefully, and most of us achieve that every time :)

ChrisA

> So, I suspect it may well be possible to make a pure C library similar to
> numpy in many ways but that can only be used within a C program that only
> uses native C data structures. It also is possible to write such a program
> that is horribly slow. And it is possible to write a less complex version of
> numpy that does not support some current numpy functionality and overall runs
> much faster on what it does support.
>
> I do wonder at the reason numpy and pandas and lots of other modules have to
> exist. Other languages like R made design choices that built in ideas of
> vectorization from the start.
> Python has lots of object-oriented extensibility that can allow you to
> create interpreted code that may easily extend it in areas to have some
> similar features. You can create an array-like data structure that holds
> only one object type and is extended so adding two together (or multiplying)
> ends up doing it componentwise. But attempts to do some such things often
> run into problems as they tend to be slow. So numpy was not written in
> python, mostly, albeit it could have been even more impressive if it took
> advantage of more pythonic abilities, at a cost.
>
> But now that numpy is in C, pretty much, it is somewhat locked in when and if
> other things in Python change.
>
> The reality is that many paradigms carried too far end up falling short.
>
>
> -Original Message-
> From: Richard Damon
> To: python-list@pyt
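
The dtype timings in this message show how much a single number can mislead. What follows is only a minimal sketch, using nothing beyond the standard library, of one way to report the spread across several runs instead of one figure. The helper name `bench` and its defaults are made up for illustration. Any callable can be passed in, so a `numpy.sum` call on arrays of different dtypes could be measured the same way.

```python
import statistics
import timeit


def bench(func, number=1000, repeat=5):
    """Call func() `number` times per run, for `repeat` runs.

    Returns the per-call time in seconds for the best, median and worst run,
    so the variation between runs is visible next to the headline figure.
    """
    runs = timeit.repeat(func, number=number, repeat=repeat)
    per_call = [total / number for total in runs]
    return {
        "best": min(per_call),
        "median": statistics.median(per_call),
        "worst": max(per_call),
    }


if __name__ == "__main__":
    data = list(range(100))
    result = bench(lambda: sum(data))
    for name, seconds in result.items():
        print(f"{name}: {seconds * 1e6:.2f} usec per call")
```

If the best and worst runs differ by a large factor, the measurement is probably dominated by noise (caching, other processes, CPU frequency changes) rather than by the code under test.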
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 05:49, Richard Damon wrote:
>
> On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> > Hi,
> > a lot of people think that C (or C++) is faster than python, yes I agree,
> > but I think that's not the case with numpy, I believe numpy is faster than
> > C, at least in some cases.
> >
> My understanding is that numpy is written in C, so for it to be faster
> than C, you are saying that C is faster than C.

Fortran actually, but ultimately, they both get compiled to machine code.

Really, what the OP has demonstrated is that good, well-written code called
from bad code produces meaningless numbers that are not the same as bad code
written in C. I can't even use that to prove that good code is faster than
bad code, since the measurements aren't comparable; but if that were what
was being measured, it wouldn't be very surprising.

To do a fair comparison of C and Numpy, you'd have to:

1) Use the same type of timer
2) Use the same algorithm (unless you're benchmarking "naive C code"
   against "well-written Fortran code")
3) Have comparable levels of compile-time optimization
4) Cope with the vagaries of CPU caching
5) Ensure that the same data types are being used everywhere
6) Probably several other things that I didn't think of.

The precise way you run the test could easily skew it by orders of
magnitude in either direction.

> The key point is that numpy was written by skilled programmers who
> carefully optimized their code to be as fast as possible for the major
> cases. Thus it is quite possible for the numpy code to be faster in C
> than code written by a person without that level of care and effort.

This is clearly true.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
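
Points 1 and 2 of the checklist can be illustrated with a small sketch. It assumes three made-up functions that all compute the sum 1..n, each timed with the same timer and the same repetition count. The function names and default sizes are illustrative only and come from nowhere in the thread.

```python
import time
import timeit


def sum_loop(n):
    """Explicit loop, the same shape as the C function in the original post."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_builtin(n):
    """Same result, but the loop runs inside the interpreter's built-in sum()."""
    return sum(range(1, n + 1))


def sum_closed_form(n):
    """Different algorithm: constant time, no loop at all."""
    return n * (n + 1) // 2


def compare(n=10_000, number=200, timer=time.perf_counter):
    """Time every variant with one timer and one repetition count."""
    results = {}
    for func in (sum_loop, sum_builtin, sum_closed_form):
        # Check that all variants agree before timing them.
        assert func(n) == sum_closed_form(n)
        total = timeit.timeit(lambda: func(n), number=number, timer=timer)
        results[func.__name__] = total / number
    return results


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1e6:.2f} usec per call")
```

The closed form is included because an optimizing C compiler may rewrite a simple summation loop into something similar. If it does, a "C versus Python" timing would be comparing algorithms rather than languages.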
Re: C is it always faster than nump?
I agree with Richard.

Some people may be confused and think c is the speed of light and
relativistically speaking, nothing can be faster. (OK, just joking. The uses
of the same letter of the alphabet are not at all related. One is named for
the language that came after the one named B, while the other may be short
for celeritas meaning speed.)

There is no such thing as C. C does nothing. It is a combination of a
language specification and some pieces of software called compilers that
implement it well or less well.

There is such a thing as a PROGRAM. A program completely written in C is a
thing. It can run fast or slow based on a combination of how it was written
and on what data it operates on, which hardware and OS and so on. AND some of
it may likely be running code from libraries written in other languages like
FORTRAN that get linked into it in some way at compile time or runtime, and
hooks into the local OS and so on.

So your program, written supposedly in pure C, may run faster or slower. If
you program a "sort" algorithm in C, it may matter if it is an implementation
of a merge sort or a bubble sort or ...

As noted, numpy is largely written in C. It may well be optimized in some
places but there are constraints that may well make it hard to optimize
compared to some other implementation without those constraints. In
particular, it interfaces with standard Python data structures at times such
as when initializing from a Python List, or List of Lists, or needing to hold
on to various attributes so it can be converted back, or things I am not even
aware of.

So, I suspect it may well be possible to make a pure C library similar to
numpy in many ways but that can only be used within a C program that only
uses native C data structures. It also is possible to write such a program
that is horribly slow.
And it is possible to write a less complex version of numpy that does not
support some current numpy functionality and overall runs much faster on
what it does support.

I do wonder at the reason numpy and pandas and lots of other modules have to
exist. Other languages like R made design choices that built in ideas of
vectorization from the start. Python has lots of object-oriented
extensibility that can allow you to create interpreted code that may easily
extend it in areas to have some similar features. You can create an
array-like data structure that holds only one object type and is extended so
adding two together (or multiplying) ends up doing it componentwise. But
attempts to do some such things often run into problems as they tend to be
slow. So numpy was not written in python, mostly, albeit it could have been
even more impressive if it took advantage of more pythonic abilities, at a
cost.

But now that numpy is in C, pretty much, it is somewhat locked in when and if
other things in Python change.

The reality is that many paradigms carried too far end up falling short.


-Original Message-
From: Richard Damon
To: python-list@python.org
Sent: Fri, Feb 25, 2022 1:48 pm
Subject: Re: C is it always faster than nump?

On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> Hi,
> a lot of people think that C (or C++) is faster than python, yes I agree,
> but I think that's not the case with numpy, I believe numpy is faster than
> C, at least in some cases.
>
My understanding is that numpy is written in C, so for it to be faster than
C, you are saying that C is faster than C.

The key point is that numpy was written by skilled programmers who carefully
optimized their code to be as fast as possible for the major cases. Thus it
is quite possible for the numpy code to be faster in C than code written by
a person without that level of care and effort.

There are similar packages available for many languages, including C/C++, to
let mere mortals get efficient numerical processing.
--
Richard Damon
--
https://mail.python.org/mailman/listinfo/python-list
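
The "array-like data structure that holds only one object type" described in this message can be sketched in pure Python. The class name `TypedVector` and its methods are hypothetical, made up for illustration; the storage is the standard library's `array` module. This is not how numpy is implemented. It is only meant to show where the interpreted per-element loop ends up.

```python
import operator
from array import array


class TypedVector:
    """Hypothetical array-like container that stores one numeric type."""

    def __init__(self, values, typecode="d"):
        # array.array enforces a single element type ("d" is a C double).
        self._data = array(typecode, values)

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)

    def __eq__(self, other):
        if not isinstance(other, TypedVector):
            return NotImplemented
        return self._data == other._data

    def __repr__(self):
        return f"TypedVector({list(self._data)!r}, {self._data.typecode!r})"

    def _elementwise(self, other, op):
        if not isinstance(other, TypedVector):
            return NotImplemented
        if len(other) != len(self):
            raise ValueError("vectors must have the same length")
        # This loop runs in the interpreter, one element at a time,
        # which is the slow part that a compiled library avoids.
        combined = (op(x, y) for x, y in zip(self._data, other._data))
        return TypedVector(combined, self._data.typecode)

    def __add__(self, other):
        return self._elementwise(other, operator.add)

    def __mul__(self, other):
        return self._elementwise(other, operator.mul)


if __name__ == "__main__":
    a = TypedVector([1, 2, 3])
    b = TypedVector([4, 5, 6])
    print(a + b)
    print(a * b)
```

The operators behave componentwise, as the message describes, but every element still passes through the interpreter. That is the slowness the message refers to.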
Re: C is it always faster than nump?
On Sat, 26 Feb 2022 at 03:13, BELAHCENE Abdelkader wrote:
> This is the Python3 program:
> import timeit as it
> import numpy as np
> import sys
> try:
>     n=eval(sys.argv[1])
> except:
>     print ("needs integer as argument") ; exit()
> a=range(1,n+1)
> b=np.array(a)
> def func1(): return sum(a)
> def func2(): return np.sum(b)
> print(f"sum with Python: {func1()} and NumPy {func2()} ")
> tm1=it.timeit(stmt=func1, number=n)
> print(f"time used Python Sum: {round(tm1,2)} sec")
> tm2=it.timeit(stmt=func2, number=n)
> print(f"time used Numpy Sum: {round(tm2,2)} sec")

This is terrible code. Even aside from the messed-up formatting (for
which your mail client is probably to blame), using eval() and a bare
"except:" clause is not a good way to use Python. And then you get
timeit to do as many iterations as the length of the array, which is
hardly indicative or meaningful for small values.

> and here the C program:
> #include <stdio.h>
> #include <stdlib.h>
> #include <time.h>
> long func1(int n){
>     long r=0;
>     for (int i=1; i<= n;i++) r+= i;
>     return r;
> }
> int main(int argc, char* argv[]){
>     clock_t c0, c1;
>     long v,count;
>     int n;
>     if ( argc < 2) {
>         printf("Please give an argument");
>         return -1;
>     }
>     n=atoi(argv[1]);
>     c0 = clock();
>     for (int j=0;j < n;j++) v=func1(n);
>     c1 = clock();
>     printf ("\tCPU time :%.2f sec", (float)(c1 - c0)/CLOCKS_PER_SEC);
>     printf("\n\tThe value : %ld\n", v);
> }

At least you're consistent, using an iteration count equal to the
length of the array again. But that just means that it's equally
meaningless.

Did you know that Python's timeit and C's clock don't even measure the
same thing?

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
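
The three criticisms in this reply (eval(), the bare "except:", and an iteration count tied to the array length) could be addressed along the following lines. This is a rough sketch, not a reference implementation: `parse_count`, `cpu_time_per_call` and `main` are hypothetical names, and the fixed default of 1000 repetitions is an arbitrary choice. By default timeit uses `time.perf_counter` (wall-clock time), while the C standard describes `clock()` as processor time. The sketch therefore passes `time.process_time` as the timer to bring the two measurements closer, on the assumption that the platform's `clock()` follows the standard.

```python
import sys
import time
import timeit


def parse_count(argv):
    """Return the positive integer in argv[1], or exit with a usage message."""
    try:
        # int() only parses a number; unlike eval() it cannot run arbitrary code.
        n = int(argv[1])
    except (IndexError, ValueError):
        # Catch only the two failures expected here, not every exception.
        sys.exit("needs integer as argument")
    if n < 1:
        sys.exit("needs a positive integer as argument")
    return n


def cpu_time_per_call(func, number=1000):
    """Average CPU seconds per call, with a repetition count independent of n."""
    total = timeit.timeit(func, number=number, timer=time.process_time)
    return total / number


def main(argv):
    n = parse_count(argv)
    data = range(1, n + 1)
    seconds = cpu_time_per_call(lambda: sum(data))
    print(f"sum: {sum(data)}, CPU time per call: {seconds:.3e} sec")
```

Calling `main(sys.argv)` at the bottom of a script would restore the command-line behaviour of the original program. It is left out here so that the functions can be imported without side effects.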
Re: C is it always faster than nump?
On 2/25/22 4:12 AM, BELAHCENE Abdelkader wrote:
> Hi,
> a lot of people think that C (or C++) is faster than python, yes I agree,
> but I think that's not the case with numpy, I believe numpy is faster than
> C, at least in some cases.

My understanding is that numpy is written in C, so for it to be faster than
C, you are saying that C is faster than C.

The key point is that numpy was written by skilled programmers who carefully
optimized their code to be as fast as possible for the major cases. Thus it
is quite possible for the numpy code to be faster in C than code written by
a person without that level of care and effort.

There are similar packages available for many languages, including C/C++, to
let mere mortals get efficient numerical processing.

--
Richard Damon
--
https://mail.python.org/mailman/listinfo/python-list
C is it always faster than nump?
Hi,
a lot of people think that C (or C++) is faster than Python. Yes, I agree,
but I think that's not the case with numpy; I believe numpy is faster than
C, at least in some cases.

Is there another explanation? Or where can I find a doc speaking about the
subject?
Thanks a lot
Regards

Numpy implements vectorization for arrays, unless I'm wrong. Anyway, here is
an example. Let's look at the following case.

Here is the result on my laptop (i3):

Labs$ python3 tempsExe.py 5
sum with Python: 1250025000 and NumPy 1250025000
time used Python Sum: 37.28 sec
time used Numpy Sum: 1.85 sec
Labs$ ./tt5
    CPU time :7.521730
    The value : 1250025000

This is the Python3 program:

    import timeit as it
    import numpy as np
    import sys
    try:
        n=eval(sys.argv[1])
    except:
        print ("needs integer as argument") ; exit()
    a=range(1,n+1)
    b=np.array(a)
    def func1(): return sum(a)
    def func2(): return np.sum(b)
    print(f"sum with Python: {func1()} and NumPy {func2()} ")
    tm1=it.timeit(stmt=func1, number=n)
    print(f"time used Python Sum: {round(tm1,2)} sec")
    tm2=it.timeit(stmt=func2, number=n)
    print(f"time used Numpy Sum: {round(tm2,2)} sec")

and here the C program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    long func1(int n){
        long r=0;
        for (int i=1; i<= n;i++) r+= i;
        return r;
    }
    int main(int argc, char* argv[]){
        clock_t c0, c1;
        long v,count;
        int n;
        if ( argc < 2) {
            printf("Please give an argument");
            return -1;
        }
        n=atoi(argv[1]);
        c0 = clock();
        for (int j=0;j < n;j++) v=func1(n);
        c1 = clock();
        printf ("\tCPU time :%.2f sec", (float)(c1 - c0)/CLOCKS_PER_SEC);
        printf("\n\tThe value : %ld\n", v);
    }

--
https://mail.python.org/mailman/listinfo/python-list
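
Both programs call the summing function n times on n elements, so the reported totals grow roughly with n squared. The command line in the output is cut short, but the printed sum fixes n, because 1 + 2 + ... + n = n(n+1)/2. The sketch below recovers n from that sum and converts the reported totals into per-call times. The helper names are made up, and it assumes that the C figure is in seconds, which the output itself does not state.

```python
import math

# Totals copied from the output in the post; the unit of the C figure
# is assumed to be seconds.
REPORTED_SECONDS = {
    "Python sum": 37.28,
    "NumPy sum": 1.85,
    "C loop": 7.521730,
}


def n_from_sum(total):
    """Invert total = n * (n + 1) / 2 for a positive integer n."""
    root = math.isqrt(8 * total + 1)
    n = (root - 1) // 2
    if n * (n + 1) // 2 != total:
        raise ValueError("value is not a sum of 1..n")
    return n


def per_call_microseconds(totals, calls):
    """Convert total seconds over `calls` calls into microseconds per call."""
    return {name: seconds / calls * 1e6 for name, seconds in totals.items()}


if __name__ == "__main__":
    n = n_from_sum(1250025000)
    print(f"n = {n}")
    for name, usec in per_call_microseconds(REPORTED_SECONDS, n).items():
        print(f"{name}: {usec:.1f} usec per call")
```

Under those assumptions n is 50000, and one call takes about 746 usec for the built-in sum over a range, about 37 usec for np.sum, and about 150 usec for the C loop. Those figures are only as comparable as the timers and compiler settings behind them.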