On 8/21/22 12:17 PM, Dave Jones wrote:
Now I am wondering if, perhaps, the time is right for IBM to reconsider that decision. On modern z processors, we already have IEEE floating point instructions in the hardware, Linux (a popular option for Intel-based number-crunching systems), and support for PCIe. IBM is already allowing 3rd-party SSD drives to be attached and accessed by the o/s.

Which "floating point" instruction are you referring to? My understanding is that there are many.

My understanding is that Intel CPUs also have many different floating point instructions in the hardware.

What if we were able to connect a number, say 40, of GPU cards (like the Nvidia Tesla 1000) to a z box, and have the I/O system pass the GPU cards directly to an LPAR running on the system?

I would wonder about the ratio of GPUs to systems from a failure-domain standpoint.

Compare 40 GPUs per system vs. 8 GPUs per system: if there is a system failure, the former takes out all of the GPUs, while the latter takes out only 1/5 of them.
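To put numbers to that, a trivial back-of-the-envelope sketch (plain C; the 40-GPU total and the two layouts are just the figures from the example above):

    /* Failure-domain comparison: 40 GPUs total, packed into one big
     * system vs. spread across five 8-GPU systems. */
    #include <stdio.h>

    int main(void) {
        const int total_gpus = 40;
        const int layouts[] = {40, 8};  /* GPUs per system */
        for (int i = 0; i < 2; i++) {
            int per_system = layouts[i];
            printf("%d system(s) of %d GPUs: one failure loses %d%% of capacity\n",
                   total_gpus / per_system, per_system,
                   100 * per_system / total_gpus);
        }
        return 0;
    }

One failure in the big box costs you 100% of your GPU capacity; in the five-box layout it costs you 20%.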

Porting the CUDA drivers over to Linux (or z/OS, or CMS for that matter) does not appear to be that difficult, and the hardware changes should be transparent to the o/s.

That sounds like programming effort spent on porting, whereas the requisite programs already exist on the Open Systems side.

Linux already supports a large number of scientific software applications and runs the latest versions of popular scientific languages (FORTRAN, C/C++, Python).

I believe the IBM z would be well-suited to this, as the density of cards in the PCIe cages is far greater than the density that could be obtained in normal PCs.

I question the veracity of that statement.

There are multiple commercially available systems that will hold eight GPUs in a 3 RU server. That means it would be possible to put 104 GPUs in a standard 40 RU cabinet (thirteen such servers occupy 39 RU, at eight GPUs each).
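For what it's worth, the arithmetic behind that 104 figure, assuming the 3 RU, eight-GPU server quoted above:

    /* Rack-density check: GPUs per standard 40 RU cabinet. */
    #include <stdio.h>

    int main(void) {
        const int cabinet_ru = 40, server_ru = 3, gpus_per_server = 8;
        int servers = cabinet_ru / server_ru;  /* 13 servers, 39 RU used */
        printf("%d GPUs per cabinet\n", servers * gpus_per_server);  /* 104 */
        return 0;
    }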

The last time I checked, the smallest IBM Z would fit in a standard 19" cabinet, but it took up a considerable portion of it. So how many I/O drawers, and therefore GPUs, are you going to fit in that cabinet alongside the CEC(s)?

Then there's the fact that in most of the GPU-based computing that I've seen, a disproportionate amount of the computing is done by the GPU, and the CPU does little more than shuffle data around. In some ways, the GPU might be thought of as the processor and the CPU as an I/O controller. So with this in mind, just how much CPU is needed for an I/O controller? Is it really worth consuming RUs that could otherwise be dedicated to more GPUs?
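To illustrate how thin the CPU's role is in that model, here is a minimal, hypothetical CUDA host program (the scale kernel is made up for illustration); everything the CPU does is allocate, stage data, launch, and copy results back:

    /* Minimal CUDA pattern: the CPU only stages data and launches work;
     * the GPU does all of the arithmetic. */
    #include <cuda_runtime.h>

    __global__ void scale(float *v, float k, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) v[i] *= k;  /* the actual compute */
    }

    int main(void) {
        const int n = 1 << 20;
        float *host = new float[n], *dev;
        for (int i = 0; i < n; i++) host[i] = 1.0f;

        cudaMalloc(&dev, n * sizeof(float));
        cudaMemcpy(dev, host, n * sizeof(float),
                   cudaMemcpyHostToDevice);              /* CPU: shuffle data in */
        scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);   /* CPU: launch */
        cudaMemcpy(host, dev, n * sizeof(float),
                   cudaMemcpyDeviceToHost);              /* CPU: shuffle data out */
        cudaFree(dev);
        delete[] host;
        return 0;
    }

The host side is essentially an I/O loop, which is why the GPU-to-CPU ratio can be pushed so high.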

This, combined with the strong sysplex clustering ability of z/OS (or SSI on z/VM), could allow the System z platform to pack more computing power into a smaller footprint than a comparable Intel-based Linux cluster, while being easier to use, as programs would not have to be rewritten to take advantage of the system's clustering.

Despite z/OS having impressive clustering abilities, I don't think that GPU-based computing would take advantage of them.

It might be an easy sell on its energy-reduction merits alone, since everyone is now worried about how much energy data centers consume.

I don't have any numbers to back it up, but I question the veracity of that.

Thoughts/comments/objections welcome, of course. Full disclosure: this idea was first suggested to me by Enzo Damato after his tour of the POK lab.

I believe that putting GPUs in the mainframe would be very interesting and would probably have some interesting applications. But I don't think that using a mainframe to drive GPUs is going to be the next big thing in GPU-heavy computing.

Also, look at all the Bitcoin (et al.) miners out there that use PCIe expanders (fan-out) to connect many GPUs to a single CPU. I've seen GPU-to-CPU ratios of ten to one or greater. So, again, the CPU isn't where the demand is, yet CPU capacity is precisely what the mainframe brings to the table.

I think this is an interesting thought experiment, but I don't think it will compete in the market, partly because, if it could, I suspect it would be doing so already.



--
Grant. . . .
unix || die

