Hi Andrew,

On 23/06/2015 19:08, Andrew E. Davidson wrote:
> sorry if this has been asked many times before. (maybe this can be
> added to the FAQ?)
>
> has anyone done any bench marking?

Yes.

>
> The idea of having a math package that is implemented pure java is
> very attractive. My experience with machine learning is that java is
> very slow. To go fast you need to take advantage of assembler or
> libraries written in fortran or C. For example http://jblas.org/
> <http://jblas.org/>

It is not that simple, and in some cases it can be slower ...

I could not find the benchmark I presented at several symposia in 2010, but here are some rough results. The tests were done on a QR decomposition followed by the solution of an A.X = B linear problem, with dense matrices. I did it for dimensions up to 4000x4000, if I remember well. The benchmark used the same underlying algorithm everywhere (but obviously different implementations). The results were, in order of increasing performance:

 - Numerical Recipes in Fortran, non-optimized
 - Numerical Recipes in Fortran, optimized
 - LAPACK with ATLAS as a BLAS implementation (almost no difference between non-optimized and optimized)
 - Apache Commons Math!

Well, we were only about 2% faster than LAPACK, and it was on only one algorithm type, on my machine. I was happy and in fact surprised: I did not expect we could reach LAPACK performance. A more realistic picture would require looking at other algorithms as well.

Answering your question for the general case is more difficult, however, and here I don't have real benchmarks, only some general feelings. I would say that across different domains, the speed differences that can be observed are typically a factor of 1.5 or 2 (Java being slower), which is a significant difference but clearly not as large as most people think. In fact, there are many factors other than language whose effect is also in this range of 1.5 or 2.

The lesson I learnt here is *not* that we are faster (for most operations, I am sure we are slower), but rather that language is only one factor for speed. Change the algorithm and you change the speed. Change the compiler and you change the speed.
Change the optimizer and you change the speed. Change the human developer and you change the speed. Change your computer for one that is only a few months more recent and you change the speed ...

Attempting to use a Java-Fortran native interface to get speed is almost always a bad idea. The reason is that the layer between the languages is difficult to go through and really slow. You will spend much of the time in this layer rather than in real processing code. This is especially true for matrices, because a double[][] is not packed as a contiguous run of doubles in some specified order after an initial pointer: you often have to copy between Java arrays (which are objects) and C or Fortran arrays, and you lose a lot of time doing these copies.

A Java-Fortran native interface is useful, but not for speed. From my experience, it is more useful for interfacing with libraries that have only one implementation and that you cannot afford to port (because they are huge, because they are highly domain specific and nobody else uses them, because they have been validated and you cannot take the risk of introducing a bug by porting them, because you don't have the time, or because you don't have the money).

In my domain (space systems), we make extensive use of Apache Commons Math and some higher-level Java libraries, and over the past few years we have replaced many older Fortran and C libraries. In all cases, we are either as fast or much faster. This is mainly because when developing these replacement libraries, we chose different architectures, used newer algorithms, and made different trade-offs between memory and processing than what was available to engineers 20 or 30 years ago. For sure, if they were to develop their libraries again in Fortran today, they would also improve their results. So what matters is what you can achieve now, using present algorithms and present languages.
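
To make the point about double[][] layout concrete, here is a minimal sketch of the kind of copy a native call tends to require. It is only an illustration under stated assumptions: the class `PackedCopy` and its methods `columnCount`, `toRowMajor` and `toColumnMajor` are hypothetical names, not part of Apache Commons Math, jblas or any native interface, and the actual hand-over to C or Fortran code is left out entirely.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Apache Commons Math or jblas.
final class PackedCopy {

    private PackedCopy() { }

    // Returns the common row length, or throws if the rows differ in length.
    // A double[][] is an array of references to separate row arrays, so
    // nothing in the language guarantees a rectangular shape.
    static int columnCount(double[][] a) {
        int cols = (a.length == 0) ? 0 : a[0].length;
        for (int i = 0; i < a.length; i++) {
            if (a[i].length != cols) {
                throw new IllegalArgumentException(
                    "row " + i + " has " + a[i].length + " entries, expected " + cols);
            }
        }
        return cols;
    }

    // Packs the matrix row after row (row-major, the usual C layout).
    static double[] toRowMajor(double[][] a) {
        int cols = columnCount(a);
        double[] flat = new double[a.length * cols];
        for (int i = 0; i < a.length; i++) {
            System.arraycopy(a[i], 0, flat, i * cols, cols);
        }
        return flat;
    }

    // Packs the matrix column after column (column-major, the Fortran layout).
    static double[] toColumnMajor(double[][] a) {
        int rows = a.length;
        int cols = columnCount(a);
        double[] flat = new double[rows * cols];
        for (int j = 0; j < cols; j++) {
            for (int i = 0; i < rows; i++) {
                flat[j * rows + i] = a[i][j];
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        double[][] a = { {1.0, 2.0, 3.0}, {4.0, 5.0, 6.0} };
        System.out.println(Arrays.toString(toRowMajor(a)));    // [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
        System.out.println(Arrays.toString(toColumnMajor(a))); // [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
    }
}
```

For a 4000x4000 matrix, each such copy moves 16 million doubles (128 MB), and any matrix result usually has to be copied back the same way. That overhead is paid before and after the native routine does any real work.
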

If your work is really focused on linear algebra, there are other Java libraries that are faster than Apache Commons Math for this specific domain (some use a native interface, some don't). Linear algebra is one of our weak points. Apache Commons Math is a library with broad coverage, not one specialized in linear algebra only.

So, as a summary: yes, there have been some benchmarks. Yes, Java can be fast (and it can also be slow, depending on how well the code is developed, just like all other languages).

best regards,
Luc

>
> Kind Regards
>
> Andy