Re: Software drag racing

2021-08-10 Thread Andrew Rowley
I wrote up the results of trying Dave Plummer's software drag racing on 
z/OS.


The focus is mainly Java vs C++ because I do SMF processing in Java, and 
people like to tell me Java is too slow. Results are here:


https://www.blackhillsoftware.com/news/2021/08/10/java-vs-c-drag-racing-on-z-os/

The TLDR summary:

- Initially, the Java solution was much faster than C++
- With some work, C++ matched the speed of Java
- Rewriting the Java code to match the faster C++ code made Java slower, 
to the point where the results were about what you would expect in a C++ 
vs Java comparison
- If you have a zIIP and your CPs are not full speed, Java might be the 
fastest language on your system by a big margin.


--
Andrew Rowley
Black Hill Software

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Re: Software drag racing

2021-07-08 Thread Mike Schwab
First episode of counting number of primes in various languages has been 
uploaded to 
https://www.youtube.com/c/DavesGarage/videos . 
https://www.youtube.com/watch?v=tQtFdsEcK_s

Only 45 languages.  1500 languages claimed on
 http://www.99-bottles-of-beer.net/ 
 http://www.99-bottles-of-beer.net/0.html ...



Re: Software drag racing

2021-06-28 Thread Andrew Rowley

On 29/06/2021 11:14 am, David Crayford wrote:


I'm using Rocket's Python which is definitely CPython (the same as 
IBM's). When they say "native" they mean it's a native z/OS port.




Interesting... I found a pull request that claims a 65-130x faster 
Python implementation with details here:

https://github.com/PlummersSoftwareLLC/Primes/pull/40

When I compared Python solution_1 to solution_2, solution_2 was about 
200x faster (Ubuntu under VMWare on Windows).


The Python solution_2 was just ahead of the C++ solution where I removed 
vector. C++ using vector was about 45% faster.


I guess it shows how much difference a knowledge of language features 
and picking the right structures etc. can make.
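
As an illustration of how much the choice of language feature can matter in CPython, here is a minimal sketch that clears multiples in two ways: one element at a time, and with a single extended-slice assignment per prime. Whether the pull request above uses this exact technique is an assumption; the function names are made up for the example.

```python
def sieve_loop(limit):
    """Clear multiples one element at a time (limit must be >= 2)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    factor = 2
    while factor * factor <= limit:
        if flags[factor]:
            for multiple in range(factor * factor, limit + 1, factor):
                flags[multiple] = 0
        factor += 1
    return flags


def sieve_slices(limit):
    """Clear all multiples of a prime with one slice assignment (limit >= 2)."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    factor = 2
    while factor * factor <= limit:
        if flags[factor]:
            start = factor * factor
            # An extended slice needs a replacement of exactly the same length.
            count = len(range(start, limit + 1, factor))
            flags[start::factor] = bytes(count)
        factor += 1
    return flags
```

Both functions return the same flags; the slice version moves the inner loop out of interpreted bytecode, which is the kind of change that can make a large difference in CPython.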


--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-28 Thread David Crayford

On 28/06/2021 1:15 pm, Andrew Rowley wrote:


Well, I'm seeing different results on z/OS. There is no PyPy on z/OS.



I don't know. IBM describe it as a "Native Python compiler for z/OS" 
so maybe it is JIT compiled? I guess it's a good thing if it is. 


I'm using Rocket's Python which is definitely CPython (the same as 
IBM's). When they say "native" they mean it's a native z/OS port.






Re: Software drag racing

2021-06-28 Thread Bill Ogden
Being a bit more curious, I set  //GO.SYSOUT DD DUMMY to remove the 
overhead of formatting and printing, and the CPU time for the 3355440 
version went to 1.10 seconds. Very impressive.

Bill





Re: Software drag racing

2021-06-28 Thread Bill Ogden
Not being a COBOL person, I could not resist  trying the prime number 
COBOL program previously listed. Working on a zPDT system (based on a 
rather large laptop) I tried the NA-LINE OCCURS 26214 version and this 
took .14 seconds CPU, producing about 2800 lines of output. I then tried 
the NA-LINE OCCURS 3355440 version and this took 4.97 seconds CPU and 
exceeded my output line count by about 228,000 lines.  In both cases the 
"printed" prime numbers looked like real prime numbers although I did not 
try to factor any of them.  (I did not try the NA-LINE OCCURS 50000000 
version!) The CPU times are the emulated IBM Z times for the compiled 
program execution under z/OS 2.4.

 Being very definitely not a COBOL person, I needed to review what goes in 
"section A" vs "section B" of a COBOL source program.

Very interesting.

Bill




Re: Software drag racing

2021-06-27 Thread Andrew Rowley

On 28/06/2021 2:46 pm, David Crayford wrote:


Well, I'm seeing different results on z/OS. There is no PyPy on z/OS.



I don't know. IBM describe it as a "Native Python compiler for z/OS" so 
maybe it is JIT compiled? I guess it's a good thing if it is.


--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-27 Thread David Crayford

On 28/06/2021 12:31 pm, Andrew Rowley wrote:

On 28/06/2021 1:51 pm, David Crayford wrote:

That's not particularly fast either. How can Python be faster than C?

danielspaangberg_1of2;2756;5.000862;1;algorithm=base,faithful=yes,bits=1



How did you run the Python version? From the comments it looks like 
running it using PyPy improves the speed about 10x (JIT compiled I 
think). People have also created Python versions using numeric 
libraries which are much faster than the original. I'm not sure what 
optimizations have been included.


TS8004@RSD6 ~/Primes/PrimePython/solution_2 (drag-race) [12:42:56]
> python3 PrimePY.py -t 5
Passes: 3451, Time: 5.001404762268066, Avg: 0.0014492624637114072, Limit: 1000000, Count: 78498, Valid: True

ssovest; 3451;5.001404762268066;1;algorithm=base,faithful=yes,bits=8



The original video was C++ vs C# vs Python:
https://www.youtube.com/watch?v=D3h62rgewZM

Python was slower as expected.


Well, I'm seeing different results on z/OS. There is no PyPy on z/OS.



Re: Software drag racing

2021-06-27 Thread Andrew Rowley

On 28/06/2021 1:51 pm, David Crayford wrote:

That's not particularly fast either. How can Python be faster than C?

danielspaangberg_1of2;2756;5.000862;1;algorithm=base,faithful=yes,bits=1



How did you run the Python version? From the comments it looks like 
running it using PyPy improves the speed about 10x (JIT compiled I 
think). People have also created Python versions using numeric libraries 
which are much faster than the original. I'm not sure what optimizations 
have been included.


The original video was C++ vs C# vs Python:
https://www.youtube.com/watch?v=D3h62rgewZM

Python was slower as expected.

--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-27 Thread David Crayford

On 28/06/2021 9:56 am, Andrew Rowley wrote:

On 26/06/2021 4:50 pm, David Crayford wrote:


That's an exaggeration. C is just over 3x faster.

C

    mckoss-c830;14285;5.0;1;algorithm=wheel,faithful=yes,bits=1



The C implementations are confusing because there are various 
algorithm changes as well, rather than just the language change. This 
one uses the wheel algorithm instead of the base (original). I don't 
know enough math or algorithms to say what difference that makes.


I think the solution_2/sieve_1of2.c is the one that corresponds to the 
C++ code.



That's not particularly fast either. How can Python be faster than C?

danielspaangberg_1of2;2756;5.000862;1;algorithm=base,faithful=yes,bits=1

BTW, if you want to compile any of the programs in solution_2 you will 
need the following header file as clock_gettime() and gettimeofday() are 
not implemented on z/OS.



#ifndef TIME_ZOS_H
#define TIME_ZOS_H 1
#include <errno.h>   // errno
#include <stdint.h>  // uint64_t
#include <time.h>    // struct timespec
#define clock_gettime(n,ts) gettimeofdayMonotonic(ts)
static inline int gettimeofdayMonotonic(struct timespec* Output) {
  // The POSIX gettimeofday() function is not available on z/OS. Therefore,
  // we call stcke and other hardware instructions to implement an equivalent.
  // Note that nanoseconds alone will overflow when reaching the new epoch in 2042.

  struct _t {
    uint64_t Hi;
    uint64_t Lo;
  };
  struct _t Value = {0, 0};
  uint64_t CC = 0;
  // Store the extended TOD clock, then extract the condition code.
  asm(" stcke %0\n"
      " ipm %1\n"
      " srlg %1,%1,28\n"
      : "=m"(Value), "+r"(CC)::);
  if (CC != 0) {
    errno = EMVSTODNOTSET;
    return CC;
  }
  uint64_t us = (Value.Hi >> 4);  // microseconds since 1900
  uint64_t ns = ((Value.Hi & 0x0F) << 8) + (Value.Lo >> 56);
  ns = (ns * 1000) >> 12;  // sub-microsecond part, in nanoseconds
  us = us - 2208988800000000;  // shift the epoch from 1900 to 1970
  register uint64_t DivPair0 asm("r0"); // dividend (upper half), remainder
  DivPair0 = 0;
  register uint64_t DivPair1 asm("r1"); // dividend (lower half), quotient
  DivPair1 = us;
  uint64_t Divisor = 1000000;  // microseconds per second
  asm(" dlgr %0,%2" : "+r"(DivPair0), "+r"(DivPair1) : "r"(Divisor) :);
  Output->tv_sec = DivPair1;
  Output->tv_nsec = DivPair0 * 1000 + ns;
  return 0;
}
#endif
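
The arithmetic that follows the stcke in the header can be restated in Python as a rough check. This is only a sketch of the conversion, not a way to read the clock: it assumes the input is already a count of microseconds since 1900, and it uses the standard 1900-to-1970 offset of 2,208,988,800 seconds. The function and constant names are made up for the example.

```python
# 70 years of 365 days plus 17 leap days, in seconds.
EPOCH_1900_TO_1970_SECONDS = 2_208_988_800


def tod_micros_to_timespec(micros_since_1900, extra_ns=0):
    """Split microseconds since 1900 into (tv_sec, tv_nsec) relative to 1970."""
    micros_since_1970 = micros_since_1900 - EPOCH_1900_TO_1970_SECONDS * 1_000_000
    # Quotient is whole seconds, remainder is the leftover microseconds.
    seconds, remainder_us = divmod(micros_since_1970, 1_000_000)
    return seconds, remainder_us * 1000 + extra_ns
```

The divmod plays the role of the dlgr instruction: the quotient becomes tv_sec and the remainder, scaled to nanoseconds, becomes tv_nsec.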



Re: Software drag racing

2021-06-27 Thread Andrew Rowley

On 25/06/2021 11:35 pm, Andrew Rowley wrote:


I set the limit to 100,000,000 (the source validates results for 
multiples of 10) and got 10 passes in 5 seconds on my laptop. At 
100,000,000 the bit map doesn't fit in processor cache which might be 
a significant penalty. 1,000,000 gives better granularity in the results.


For kicks I tried a limit of 1,000,000,000 which did 1 pass for 
50,847,534 primes in 6.7 seconds.




I tried the larger numbers on z/OS as well. z/OS numbers are not really 
comparable system to system, because the single thread difference 
between slowest and fastest Z systems is something like 20x. However the 
relative numbers are interesting.


C++:

Primes to 1,000,000: 4829 passes
Primes to 100,000,000: 28 passes
Primes to 1,000,000,000: 2 passes in 5.1 seconds

The drop in performance with larger numbers was less than for the 
laptop, my guess is that is due to the larger CPU caches.


Java (zIIP offline so running on CP):

1,000,000: 4800 passes
100,000,000: 14 passes
1,000,000,000: 1 pass in 9 seconds

The drop in performance for Java was unexpected, my guess is that the 
boolean array is not optimized for 100,000,000 or 1,000,000,000 entries. 
If e.g. each entry is a byte instead of a bit the CPU cache impact would 
be much greater.
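
To put rough numbers on that guess, here is a small sketch of the storage needed for one flag per integer, assuming either a whole byte or a single bit per flag. How this particular JVM actually lays out a boolean array is not confirmed here, and real sieves may store only odd numbers; the function name is made up for the example.

```python
def flag_storage_bytes(limit, bits_per_flag):
    """Bytes needed to hold one flag for every integer in 0..limit."""
    return ((limit + 1) * bits_per_flag + 7) // 8


for limit in (1_000_000, 100_000_000, 1_000_000_000):
    as_bytes = flag_storage_bytes(limit, 8)
    as_bits = flag_storage_bytes(limit, 1)
    print(f"{limit:>13,}: {as_bytes / 2**20:8.1f} MiB as bytes, "
          f"{as_bits / 2**20:8.1f} MiB as bits")
```

Under these assumptions the byte-per-flag layout is eight times larger at every limit, so it would outgrow any given cache level at a much smaller limit.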


I decided to create a Java version using the same bit storage as the C++ 
version i.e. a byte array manipulating individual bits. Again results 
were interesting:


1,000,000: 2675 passes
100,000,000: 14 passes
1,000,000,000: 2 passes in 7 seconds

So it was slower for small numbers, but approached C++ speeds for the 
search to 1,000,000,000.
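
For reference, the bit storage idea looks roughly like this in Python. It is a sketch of the general technique (one bit per number, eight to a byte), not the Java or C++ code I ran; the names are made up for the example.

```python
def sieve_bits(limit):
    """Count primes <= limit using one bit per number, packed into bytes."""
    bits = bytearray(b"\xff") * (limit // 8 + 1)

    def clear(n):
        # Byte index is n / 8, bit position within the byte is n % 8.
        bits[n >> 3] &= ~(1 << (n & 7)) & 0xFF

    def test(n):
        return (bits[n >> 3] >> (n & 7)) & 1

    clear(0)
    clear(1)
    factor = 2
    while factor * factor <= limit:
        if test(factor):
            for multiple in range(factor * factor, limit + 1, factor):
                clear(multiple)
        factor += 1
    return sum(test(n) for n in range(limit + 1))
```

Every access costs a shift and a mask on top of the load or store. That extra work per access is one plausible reason for the slower small case, while the smaller footprint helps once the array no longer fits in cache.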


--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-27 Thread Andrew Rowley

On 26/06/2021 4:50 pm, David Crayford wrote:


That's an exaggeration. C is just over 3x faster.

C

    mckoss-c830;14285;5.0;1;algorithm=wheel,faithful=yes,bits=1



The C implementations are confusing because there are various algorithm 
changes as well, rather than just the language change. This one uses the 
wheel algorithm instead of the base (original). I don't know enough math 
or algorithms to say what difference that makes.
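
As a rough guide to what a wheel buys in general: it skips every candidate that is a multiple of the first few primes, so those numbers are never stored or tested. The sketch below only counts which residues survive for a given wheel. It says nothing about how the mckoss-c830 solution is actually written, and the function name is made up for the example.

```python
from math import gcd


def wheel_survivors(primes):
    """Return (wheel size, residues that share no factor with the wheel)."""
    size = 1
    for p in primes:
        size *= p
    return size, [r for r in range(size) if gcd(r, size) == 1]


for wheel in ((2,), (2, 3), (2, 3, 5)):
    size, residues = wheel_survivors(wheel)
    print(f"wheel {wheel}: {len(residues)} of every {size} numbers remain")
```

A wheel on 2, 3 and 5 leaves 8 candidates in every 30, compared with 15 in every 30 when only even numbers are skipped.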


I think the solution_2/sieve_1of2.c is the one that corresponds to the 
C++ code.


--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-26 Thread David Crayford

On 26/06/2021 2:50 pm, David Crayford wrote:


On 26/06/2021 2:38 pm, David Crayford wrote:


I think the C++ compiler/library is a large part of it. On other 
platforms vector was apparently a big performance improvement, 
on z/OS removing it was a big improvement. So if there is something 
smart happening there on other platforms, there might be something 
not so smart on z/OS. xlclang++ might recover that performance. 


I think the C++ code is sub-optimal. The PrimesC code is 5x faster 
than Java so if I were to compile that code with the C++ compiler 
would it qualify as C++?



That's an exaggeration. C is just over 3x faster.

C

    mckoss-c830;14285;5.0;1;algorithm=wheel,faithful=yes,bits=1

Java

    MansenC;4257;5.00;1;algorithm=base,faithful=yes

NodeJS (lolz! So much for IBMs V8 JIT engine)

    fvbakelnodejs;887;5.004;1;algorithm=base,faithful=yes,bits=1



golang

    bundgaard;3749;5.000486;1;algorithm=base,faithful=yes

C++

    davepl;3460;5.001122;1;algorithm=base,faithful=yes,bits=1

Python3

    ssovest; 3519;5.0006608963012695;1;algorithm=base,faithful=yes,bits=8

    Wow! The Python code has step table optimizations so skips a lot of 
unnecessary calculations
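
The result lines above are easier to compare once parsed. A minimal sketch, assuming the fields are label, passes, duration in seconds, thread count and a comma-separated tag list; the field names are my reading of the output, not something defined in this thread.

```python
def parse_result(line):
    """Split 'label;passes;duration;threads;tags' into a dict."""
    label, passes, duration, threads, tags = (
        part.strip() for part in line.split(";", 4)
    )
    return {
        "label": label,
        "passes": int(passes),
        "duration": float(duration),
        "threads": int(threads),
        "tags": dict(tag.split("=", 1) for tag in tags.split(",")),
    }


def passes_per_second(result):
    return result["passes"] / result["duration"]


c_run = parse_result("mckoss-c830;14285;5.0;1;algorithm=wheel,faithful=yes,bits=1")
java_run = parse_result("MansenC;4257;5.00;1;algorithm=base,faithful=yes")
ratio = passes_per_second(c_run) / passes_per_second(java_run)
print(f"C vs Java: {ratio:.2f}x")
```

The tags also make it visible that these two runs use different algorithms (wheel vs base), so the ratio is not a pure language comparison.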




Re: Software drag racing

2021-06-26 Thread David Crayford

On 26/06/2021 2:38 pm, David Crayford wrote:


I think the C++ compiler/library is a large part of it. On other 
platforms vector was apparently a big performance improvement, 
on z/OS removing it was a big improvement. So if there is something 
smart happening there on other platforms, there might be something 
not so smart on z/OS. xlclang++ might recover that performance. 


I think the C++ code is sub-optimal. The PrimesC code is 5x faster 
than Java so if I were to compile that code with the C++ compiler 
would it qualify as C++?



That's an exaggeration. C is just over 3x faster.

C

    mckoss-c830;14285;5.0;1;algorithm=wheel,faithful=yes,bits=1

Java

    MansenC;4257;5.00;1;algorithm=base,faithful=yes

NodeJS (lolz! So much for IBMs V8 JIT engine)

    fvbakelnodejs;887;5.004;1;algorithm=base,faithful=yes,bits=1



Re: Software drag racing

2021-06-26 Thread David Crayford

On 26/06/2021 11:59 am, Andrew Rowley wrote:

On 25/06/2021 8:58 pm, Scott Chapman wrote:
If other platforms don't JIT as quickly or aggressively, or if their 
JIT compiler isn't as smart as IBM's then their results may not be 
the same. Similarly, if the IBM C compiler isn't as optimized as it 
is on other platforms, it might underperform.


I don't know how long it takes for JIT to optimize the loop, 5 seconds 
is a lot of passes and in CPU terms is a very long time so it's not 
surprising that it is pretty good. The original Java code ran for 10 
seconds. I reduced it to 5 to match the C++ implementation. I was 
expecting a performance reduction due to warm up time but it wasn't 
noticeable. The timing starts after the Java code is running so JVM 
startup is not measured.


I think the C++ compiler/library is a large part of it. On other 
platforms vector was apparently a big performance improvement, 
on z/OS removing it was a big improvement. So if there is something 
smart happening there on other platforms, there might be something not 
so smart on z/OS. xlclang++ might recover that performance. 


I think the C++ code is sub-optimal. The PrimesC code is 5x faster than 
Java so if I were to compile that code with the C++ compiler would it 
qualify as C++?


FWIW, I'm happy that Java stacks up so well on z/OS as I'm pretty much 
coding 90% on the JVM these days. IBM have done a brilliant job 
optimizing. But these bench tests are more about comparing 
implementations of algorithms than equivalent implementations in 
different programming languages.






Re: Software drag racing

2021-06-25 Thread Andrew Rowley

On 25/06/2021 8:58 pm, Scott Chapman wrote:

If other platforms don't JIT as quickly or aggressively, or if their JIT 
compiler isn't as smart as IBM's then their results may not be the same. 
Similarly, if the IBM C compiler isn't as optimized as it is on other 
platforms, it might underperform.


I don't know how long it takes for JIT to optimize the loop, 5 seconds 
is a lot of passes and in CPU terms is a very long time so it's not 
surprising that it is pretty good. The original Java code ran for 10 
seconds. I reduced it to 5 to match the C++ implementation. I was 
expecting a performance reduction due to warm up time but it wasn't 
noticeable. The timing starts after the Java code is running so JVM 
startup is not measured.


I think the C++ compiler/library is a large part of it. On other 
platforms vector<bool> was apparently a big performance improvement, on 
z/OS removing it was a big improvement. So if there is something smart 
happening there on other platforms, there might be something not so 
smart on z/OS. xlclang++ might recover that performance.


The PrimeSieve object and boolean array are reallocated each pass, but 
it is really the best case situation for Java GC with no long lived objects.


Andrew Rowley



Re: Software drag racing

2021-06-25 Thread Andrew Rowley

On 25/06/2021 9:26 pm, Dave Jousma wrote:

I looked at the video, and looks like he is running for 5 seconds, but cannot 
tell how many prime numbers he calculated on the platforms he was testing.   I 
had an old COBOL program laying around from years ago that calculates prime 
numbers.   This morning I cranked it up to a limit of 50,000,000 and it 
completed in 6 seconds.  This is on enterprise cobol 6 on z15.


The program searches for all primes up to 1,000,000, throws away the 
result and repeats as many times as it can in 5 seconds (it's a drag 
race, the task doesn't need to be useful). My laptop (i5-7200U) does 
6500 passes. I think the algorithm has been tweaked and improved since 
he made the video.
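
In outline the test looks something like the sketch below. This is a plain Python restatement of the description above, not any of the actual drag-race solutions, and the function names are made up for the example.

```python
import time


def count_primes(limit):
    """Sieve of Eratosthenes; returns how many primes are <= limit."""
    if limit < 2:
        return 0
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    factor = 2
    while factor * factor <= limit:
        if is_prime[factor]:
            for multiple in range(factor * factor, limit + 1, factor):
                is_prime[multiple] = 0
        factor += 1
    return sum(is_prime)


def drag_race(limit=1_000_000, seconds=5.0):
    """Repeat the sieve until the time budget is used up."""
    passes = 0
    start = time.perf_counter()
    while True:
        count_primes(limit)  # the result is thrown away each pass
        passes += 1
        elapsed = time.perf_counter() - start
        if elapsed >= seconds:
            return passes, elapsed
```

The score is simply the number of passes completed, so a smaller limit gives more passes and finer granularity in the results.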


I set the limit to 100,000,000 (the source validates results for 
multiples of 10) and got 10 passes in 5 seconds on my laptop. At 
100,000,000 the bit map doesn't fit in processor cache which might be a 
significant penalty. 1,000,000 gives better granularity in the results.


For kicks I tried a limit of 1,000,000,000 which did 1 pass for 
50,847,534 primes in 6.7 seconds.


Someone wrote a GNUCobol solution - I have no idea how good it is:
https://github.com/PlummersSoftwareLLC/Primes/tree/drag-race/PrimeCOBOL/solution_1

z/OS COBOL and assembler solutions would be interesting.

Andrew Rowley






Re: Software drag racing

2021-06-25 Thread Dave Jousma
I looked at the video, and looks like he is running for 5 seconds, but cannot 
tell how many prime numbers he calculated on the platforms he was testing.   I 
had an old COBOL program laying around from years ago that calculates prime 
numbers.   This morning I cranked it up to a limit of 50,000,000 and it 
completed in 6 seconds.  This is on enterprise cobol 6 on z15.

PGM-NAME   EXCP   CPU   SRB   CLOCK
PRIMECB6  17971.08.00.11

000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID.  PRIMECOB.
000300*  EXAMPLE OF IN-LINE PERFORMS, END-IF, INITIALIZE STATEMENTS
000400 DATA DIVISION.
000500 WORKING-STORAGE SECTION.
000600 01  NUMBER-ARRAY.
000700     05 NA-LINE OCCURS 50000000.
000700*    05 NA-LINE OCCURS 3355440.
000710*    05 NA-LINE OCCURS 26214.
000720*  ABOVE OCCURS CLAUSE IS LARGEST THAT WILL WORK FOR COBOL/VS
000800        10 NA-NUMBER PIC X.
000900           88 IS-PRIME VALUE '1'.
001000           88 IS-NOT-PRIME VALUE ZERO.
001100        10 NA-PROOF  PIC S9(8)  BINARY.
001200 01  CANDIDATE    PIC S9(8)  BINARY.
001300 01  CPRIME       PIC S9(8)  BINARY.
001400 01  ARRAY-SIZE   PIC S9(8)  BINARY.
001500
001600 PROCEDURE DIVISION.
001700 MAINLINE.
001800     INITIALIZE NUMBER-ARRAY REPLACING ALPHANUMERIC BY '1'
001900     INITIALIZE NUMBER-ARRAY REPLACING NUMERIC BY ZERO
002000
002100     COMPUTE ARRAY-SIZE = LENGTH OF NUMBER-ARRAY / 5
002200
002300     PERFORM VARYING CPRIME FROM 2 BY 1
002400             UNTIL CPRIME > ARRAY-SIZE
002500         IF IS-PRIME (CPRIME)
002600             COMPUTE CANDIDATE = CPRIME + CPRIME
002700             PERFORM UNTIL CANDIDATE > ARRAY-SIZE
002800                 SET IS-NOT-PRIME(CANDIDATE) TO TRUE
002900                 MOVE CPRIME TO NA-PROOF(CANDIDATE)
003000                 ADD CPRIME TO CANDIDATE
003100             END-PERFORM
003200         END-IF
003300     END-PERFORM
003400
003500     PERFORM VARYING CANDIDATE FROM 1 BY 1
003600             UNTIL CANDIDATE > ARRAY-SIZE
003700         IF IS-PRIME (CANDIDATE)
003800             DISPLAY CANDIDATE ' IS PRIME'
003900*        ELSE
004000*            DISPLAY CANDIDATE ' DIVISIBLE BY '
004100*                    NA-PROOF(CANDIDATE)
004200         END-IF
004300     END-PERFORM.
004400     STOP RUN.
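
For readers who don't speak COBOL, the logic of the program is roughly the following. This is a hand translation into Python for illustration only; the function name is made up and list index 0 is unused so the indexes line up with the 1-based COBOL table.

```python
def sieve_with_proof(array_size):
    """Mirror of the COBOL loops: flags plus the last prime that hit each entry."""
    is_prime = [True] * (array_size + 1)  # index 0 unused
    proof = [0] * (array_size + 1)
    for cprime in range(2, array_size + 1):
        if is_prime[cprime]:
            # Start at 2 * cprime and step by cprime, as the PERFORM UNTIL does.
            for candidate in range(cprime + cprime, array_size + 1, cprime):
                is_prime[candidate] = False
                proof[candidate] = cprime
    return is_prime, proof


is_prime, proof = sieve_with_proof(100)
primes = [n for n in range(2, 101) if is_prime[n]]
print(len(primes), "primes up to 100; 91 is divisible by", proof[91])
```

Because the outer loop runs all the way to ARRAY-SIZE, NA-PROOF ends up holding the largest prime factor of each composite. Entry 1 is never cleared, so the COBOL display loop, which starts at 1, lists it as well; the sketch starts its listing at 2.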



Re: Software drag racing

2021-06-25 Thread Scott Chapman
That is a bit surprising given that it looks like that only runs for 5 seconds? 
But there's not much code there and it's all in a loop. My recollection is that 
the JIT compiler will step in after x repeated executions, which will happen 
pretty quickly here. And the compiled code has the potential to be as fast (or 
maybe faster) than any other compiled code. Without looking too hard, it 
doesn't appear that there's really any object allocation going on inside that 
loop, so the overhead of Java managing objects on the heap appears to be a 
non-factor here as well.

So I think it would certainly be possible that Java would be similar to any 
other compiled language, if the test ran sufficiently long such that the time 
to get the code JITed is relatively short compared to the overall execution 
time. And IBM did a whole lot of work to speed up JVM startup. Still, it is 
surprising to me that it works that well over a 5 second test. 

If other platforms don't JIT as quickly or aggressively, or if their JIT 
compiler isn't as smart as IBM's then their results may not be the same. 
Similarly, if the IBM C compiler isn't as optimized as it is on other 
platforms, it might underperform.

Scott Chapman



Re: Software drag racing

2021-06-24 Thread David Crayford
I profiled the C++ code using APA. It's spending all its time bit 
twiddling and doing mod division. The Java version uses a much more 
efficient implementation that uses an array of booleans which isn't 
initialized to turn the bits on.


On 25/06/2021 1:01 pm, Andrew Rowley wrote:

On 25/06/2021 2:34 pm, David Crayford wrote:
If you've got the XLC 2.4.1 it will be installed in the 
/usr/lpp/cbc/xlclang/exe directory. It's 64-bit only. We use this 
compiler exclusively now. It's really cool and has neat features such 
as type checking printf() format flags at compile time.




I have /usr/lpp/cbclib/xlclang/exe/, but:

ls -l /usr/lpp/cbclib/xlclang/exe/
total 16
erwxrwxrwx   1 OMVSKERN OMVSGRP    7 Jun 12  2019 clcdrvr -> CLCDRVR

I think that is linking to a program outside the filesystem.

I found IBM XL C/C++ V2.4.1 for z/OS V2.4 web deliverable with a 
program directory. It has a z/OS dataset that goes in the linklist, I 
can't see that on my system. I suspect I have the OMVS bits but need 
to request the z/OS components be attached to my system (Dallas RDP 
system).







Re: Software drag racing

2021-06-24 Thread Andrew Rowley

On 25/06/2021 2:34 pm, David Crayford wrote:
If you've got the XLC 2.4.1 it will be installed in the 
/usr/lpp/cbc/xlclang/exe directory. It's 64-bit only. We use this 
compiler exclusively now. It's really cool and has neat features such 
as type checking printf() format flags at compile time.




I have /usr/lpp/cbclib/xlclang/exe/, but:

ls -l /usr/lpp/cbclib/xlclang/exe/
total 16
erwxrwxrwx   1 OMVSKERN OMVSGRP    7 Jun 12  2019 clcdrvr -> CLCDRVR

I think that is linking to a program outside the filesystem.

I found IBM XL C/C++ V2.4.1 for z/OS V2.4 web deliverable with a program 
directory. It has a z/OS dataset that goes in the linklist, I can't see 
that on my system. I suspect I have the OMVS bits but need to request 
the z/OS components be attached to my system (Dallas RDP system).



--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-24 Thread David Crayford

On 25/06/2021 12:44 pm, Ed Jaffe wrote:

On 6/24/2021 9:34 PM, David Crayford wrote:
If you've got the XLC 2.4.1 it will be installed in the 
/usr/lpp/cbc/xlclang/exe directory. It's 64-bit only. We use this 
compiler exclusively now. It's really cool and has neat features such 
as type checking printf() format flags at compile time.


If you're referring to IBM's LLVM-based C/C++ compiler, I believe it 
is in a closed beta test status right now.



I'm referring to what's shipped with XL C/C++ 
https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/xlCC++V241ForZOsV24?OpenDocument




Re: Software drag racing

2021-06-24 Thread Ed Jaffe

On 6/24/2021 9:34 PM, David Crayford wrote:
If you've got the XLC 2.4.1 it will be installed in the 
/usr/lpp/cbc/xlclang/exe directory. It's 64-bit only. We use this 
compiler exclusively now. It's really cool and has neat features such 
as type checking printf() format flags at compile time.


If you're referring to IBM's LLVM-based C/C++ compiler, I believe it is 
in a closed beta test status right now.



--
Phoenix Software International
Edward E. Jaffe
831 Parkview Drive North
El Segundo, CA 90245
https://www.phoenixsoftware.com/





Re: Software drag racing

2021-06-24 Thread David Crayford
If you've got the XLC 2.4.1 it will be installed in the 
/usr/lpp/cbc/xlclang/exe directory. It's 64-bit only. We use this 
compiler exclusively now. It's really cool and has neat features such as 
type checking printf() format flags at compile time.


On 25/06/2021 12:22 pm, Andrew Rowley wrote:

On 25/06/2021 1:45 pm, David Crayford wrote:
Interesting! Try compiling using xlclang++ instead of xlc and see how 
you go. xlclang++ has a far superior standard library.




I'm not sure whether it is installed on my system. I tried it and it 
fails, I think it is looking for module CLCDRVR. Any suggestions where 
that might be located e.g. a LLQ? I tried CLC* and **.SCLC* but no luck.






Re: Software drag racing

2021-06-24 Thread Andrew Rowley

On 25/06/2021 1:45 pm, David Crayford wrote:
Interesting! Try compiling using xlclang++ instead of xlc and see how 
you go. xlclang++ has a far superior standard library.




I'm not sure whether it is installed on my system. I tried it and it 
fails, I think it is looking for module CLCDRVR. Any suggestions where 
that might be located e.g. a LLQ? I tried CLC* and **.SCLC* but no luck.


--
Andrew Rowley
Black Hill Software



Re: Software drag racing

2021-06-24 Thread David Crayford
Interesting! Try compiling using xlclang++ instead of xlc and see how 
you go. xlclang++ has a far superior standard library.


Nothing matches the speed of C :)

mckoss-c830;14259;5.0;1;algorithm=wheel,faithful=yes,bits=1

On 25/06/2021 10:22 am, Andrew Rowley wrote:
Dave Plummer has a series of Software Drag Racing videos, using a 
program to search for prime numbers as a simple speed test for 
different languages and/or hardware. The "drag race" description 
acknowledges that it isn't a comprehensive benchmark, just a test of 
speed at one particular simple task.


I thought it would be fun to try it on z/OS. I modified the C++ 
version to compile on z/OS, and there was a Java version that ran 
without modification.


Results were interesting.

- Java was much faster than C++ on z/OS. I modified the C++ version to 
change the vector<bool> to a byte/bit array (as was used in his first 
version) and performance was much better. However it still only 
matched Java, it didn't beat it. On other platforms C++ was much 
faster than Java e.g. 15-50%, maybe more.

- 31 bit code was about 10% faster than 64 bit, for both C++ and Java.

I configured my zIIP offline to make sure the Java code was running on 
a regular CP.


I'm interested to know why C++ didn't outperform Java. C++ isn't my 
language, so I might be missing something obvious. Any ideas?


Software Drag Racing video:
https://www.youtube.com/watch?v=l1j-aF_wyzU

C++ to run on z/OS:
https://github.com/andrew890/Primes/tree/drag-race/PrimeCPPzOS/solution_1

Java version:
https://github.com/andrew890/Primes/tree/drag-race/PrimeJava/solution_1


