On Tue, 8 Aug 2023 at 00:06, Jon Perryman <jperr...@pacbell.net> wrote:
> > On Thu, 20 Jul 2023 at 09:01, Rob van der Heij <rvdh...@gmail.com> wrote:
> >
> > It would be interesting to see your evidence of IBM Z not performing
> > well with Linux.
>
> Linux on z performs better than Linux on most other hardware. My point is
> that Linux wastes much of z hardware.
>
> Since I haven't seen Linux on z, I have to make some assumptions. It's
> probably fair to say the Linux filesystem still uses block allocation.
> Let's say it's a 10 disk filesystem and 100 people are writing 1 block
> repeatedly at the same time. After each writes 10 blocks, where are the
> 10 blocks for a specific user? In z/OS you know exactly where those
> blocks would be in the file. If you read that file, are these blocks
> located sequentially? While the filesystem can make a few decisions, it's
> nothing close to the planning provided by SMS, HSM, SRM and other z/OS
> tools. Like MS Windows disks, Linux filesystems can benefit from defrag.
> Also consider when Linux needs more CPUs than available. Clustering must
> be implemented on Linux to increase the number of CPUs, and a cluster
> does not share the filesystem. In z/OS, a second box has full access to
> all files because of Sysplex.

I used to say that with several layers of virtualization, performance is
rarely intuitive, and often counterintuitive to the uninformed. The famous
case is where Linux "CPU wait" goes down when you give it *fewer* virtual
CPUs. Not having looked at it may not give you the best foundation for an
opinion.

Linux (on any platform) uses a "lazy write" approach where data is kept in
memory (the page cache) briefly after a change, to see whether it's going
to be changed again. A typical case would be where you're copying a lot of
files into a directory, and for each file added, the operating system
modifies the (in-memory) directory.
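You can see that lazy write from user space: a plain write() only lands in
the page cache and returns immediately, and fsync() is the barrier that
forces the data to stable storage before you continue with destructive
changes. A minimal sketch (Python just for brevity; the temporary file is
illustrative):

```python
import os
import tempfile

# write() copies the data into the page cache and returns at once;
# the kernel flushes the dirty pages later, at its own convenience.
fd, path = tempfile.mkstemp()
os.write(fd, b"some block of data")

# fsync() blocks until the file's data and metadata are on stable
# storage -- the barrier an application uses before destructive changes.
os.fsync(fd)
os.close(fd)

# Reading it back now is guaranteed to see the persisted data.
with open(path, "rb") as f:
    print(f.read())  # b'some block of data'
os.remove(path)
```

Databases and journaling file systems rely on exactly this ordering
guarantee; everything between the fsync calls is fair game for reordering.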
Eventually, the "dirty" blocks are written to disk (we may worry about data loss around an outage, but that's a different discussion - there are mechanisms to ensure data is persistent before you continue with destructive changes). Because Linux will write out blocks at its own convenience, the device driver can order the data to create runs of consecutive blocks in a single I/O operation. Most serious use cases use journaling file systems on Linux and stripe the file systems over multiple disks, so I'm not entirely sure what you aim at with the blocks of a single extent being close together. Yes, I used to worry about the typical stripe that does not align with the 3390 track length, but as 3390 stopped rotating 30 years ago, the control unit cache is not aligned by track either. I don't think anyone on Linux will defrag a file system, especially not because a lot is either on SSD or virtualized on RAID devices. The data-heavy applications often use FCP (SCSI) rather than FICON attached disk because the logical I/O model doesn't take full advantage of the complexity and cost of running your own channel programs. The common scenario is to run Linux in a guest on z/VM so you can size the virtual machine to meet the application requirements. And z/VM Single System Image lets you move the running virtual machine from one member of the cluster to the other to exploit the full capacity available in multiple physically separate IBM Z hardware configurations. Since Linux is popular on small devices, a lot of applications scale horizontally rather than vertically: when your web traffic increases, you fire up a few more Linux guests to spread the load, rather than triple the size of a single Linux instance and expect everything in the application to scale. It is rare to have a Linux application that can consume a full IBM Z configuration. > I'm sure IBM has made improvements but some design limitations will be > difficult to resolve without the correct tools. 
> For instance, can DB2 for Linux on z share a database across multiple z
> frames? It's been a while since I last looked, but DB2 for z/OS was used
> because it outperformed DB2 for Linux on z.

I expect "outperformed" depends on the type of workload and the API. When
you have a COBOL application intimately connected to DB2 to the point
where they share the same buffers and such, that's different from an API
that is transaction based and communicates through DRDA over TCP/IP, as if
the application and the data could be in different places. You get away
with a lot of bad things in application design when latency is negligible.
Customers have Linux applications use DB2 on z/OS because the data is
there, not because of performance.

Rob