Re: How to unformat a dasd drive?
Mark Post writes:
> >>> On 5/19/2016 at 07:33 AM, Malcolm Beattie <beatt...@uk.ibm.com> wrote:
> > my $dev = sprintf("/dev/disk/by-path/ccw-0.0.%04s", $devno);
>
> You should not assume that the first two pieces of the busid will always be
> 0.0. Even today, it can be 0.1 or 0.2, etc., depending on what CSS the
> device is in.

OK, best make it

    my $dev = "/dev/disk/by-path/ccw-$devno";

and have the caller ensure the argument is passed in canonical form.

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group

--
For LINUX-390 subscribe / signoff / archive access instructions,
send email to lists...@vm.marist.edu with the message: INFO LINUX-390
or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
--
For more information on Linux on System z,
visit http://wiki.linuxvm.org/
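[Editor's note: the "canonical form" expected by the caller could be normalised with a small helper; this is a hypothetical sketch, not from the original post, and the accepted css/ssid digit ranges are an assumption.]

```python
import re

def canonical_busid(arg):
    """Normalise a device argument to canonical bus-ID form.

    Accepts either a bare device number ("777"), assumed to live in
    css/subchannel-set 0.0, or a full bus ID such as "0.1.0777".
    The allowed css (one hex digit) and ssid (0-3) ranges are an
    assumption for illustration.
    """
    m = re.fullmatch(r"(?:([0-9a-f])\.([0-3])\.)?([0-9a-f]{1,4})", arg.lower())
    if not m:
        raise ValueError(f"not a valid device bus-ID: {arg!r}")
    css, ssid, devno = m.group(1) or "0", m.group(2) or "0", m.group(3)
    return f"{css}.{ssid}.{devno.zfill(4)}"
```

With this, "777" becomes "0.0.0777" while an explicit "0.1.0777" is left alone, addressing the CSS concern raised above.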
Re: How to unformat a dasd drive?
Mark Post writes:
> >>> On 5/18/2016 at 04:27 PM, Malcolm Beattie <beatt...@uk.ibm.com> wrote:
> -snip-
> > So an ad hoc and quick and dirty way to write a couple of tracks
> > of a lone R0 that'll be treated by Linux as "n/f" is (starting
> > with DASD device 777 offline):
>
> Cool, it does indeed work. So I wrapped a script around that and uploaded it
> to http://wiki.linuxvm.org/wiki/Projects_and_Software/Scripts for anyone
> that's interested in it.
>
> Bug reports are welcome, but don't expect rapid turnaround. :)

Having the following "llunformat" as the low-level script that does the
unformatting (invoke as "llunformat devno") may be preferable in terms of
keeping people's lunch down compared to the Perl one-liner:

    #!/usr/bin/perl
    # Copyright 2016 IBM United Kingdom Ltd
    # Author: Malcolm Beattie, IBM
    # Last update: 19 May 2016
    # Sample code - NO WARRANTY
    #
    use strict;

    sub ckd {
        my ($c, $h, $r, $key, $data) = @_;
        my $count = pack("nnCCn", $c, $h, $r, length($key), length($data));
        return $count . $key . $data;
    }

    sub track {
        my ($data) = @_;
        return $data . ("\xff" x 8) . ("\0" x (65536 - 8 - length($data)));
    }

    my $devno = shift @ARGV or die "Usage: llunformat devno\n";
    my $dev = sprintf("/dev/disk/by-path/ccw-0.0.%04s", $devno);
    my @cmd = (qw(dd bs=65536 oflag=direct), "of=$dev");
    open(DD, "|-", @cmd) or die "dd: $!\n";
    for (my $h = 0; $h < 2; $h++) {
        print DD track(ckd(0, $h, 0, "", "\0" x 8));
    }

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group
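[Editor's note: for anyone who'd rather examine the track-image byte layout off-mainframe first, the ckd/track construction above can be sketched in Python. This is a cross-check of the layout only; writing it to a device still needs the raw_track_access and dd steps described in the thread.]

```python
import struct

TRACK_SIZE = 64 * 1024  # Linux raw_track_access track-image size

def ckd_record(cyl, head, rec, key, data):
    # 8-byte count field: CC HH R KL DL (big-endian), then key and data,
    # mirroring the Perl pack("nnCCn", ...) above
    count = struct.pack(">HHBBH", cyl, head, rec, len(key), len(data))
    return count + key + data

def track_image(records):
    # records concatenated, terminated by 8 bytes of 0xff, zero-padded to 64 KiB
    body = b"".join(records) + b"\xff" * 8
    return body + b"\x00" * (TRACK_SIZE - len(body))

# A track holding only a standard R0 (no key, 8 bytes of zero data),
# which Linux treats as "n/f" (not formatted)
t = track_image([ckd_record(0, 0, 0, b"", b"\x00" * 8)])
```

The resulting 65536-byte image has the 8-byte count field, 8 zero data bytes, the 8-byte 0xff end-of-track marker, and zero padding, matching what the Perl script pipes into dd.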
Re: How to unformat a dasd drive?
Mark Post writes:
> >>> On 5/17/2016 at 11:51 AM, Scott Rohling <scott.rohl...@gmail.com> wrote:
> > I'm wondering if something like this would work?:
> >
> > You can add 'count=xx' to only write so many blocks... not sure how many
> > it takes to wipe formatting info - nothing to play with at the moment..
> >
> > dd if=/dev/zero of=/dev/dasdX iflag=nocache oflag=direct bs=4096
>
> It doesn't seem to work. You're still going to be limited by the
> dasd_eckd_mod driver to writing in the formatted space and not the raw device
> itself. I even tried turning on the raw_track_access and that didn't help
> either. Trying to use both that and oflag=direct caused an I/O error and the
> dd aborted.

You do indeed need to enable raw_track_access (must be done while the device
is offline to Linux) but you also need to write valid track images and use
64KB O_DIRECT I/Os. Linux interprets track images as starting from the R0
(no key, 8 bytes of \0 data), not starting from the 5-byte HA as done in
AWS format, and having 8 0xff bytes terminate the track data (followed by
padding to 64KB).

So an ad hoc and quick and dirty way to write a couple of tracks of a lone
R0 that'll be treated by Linux as "n/f" is (starting with DASD device 777
offline):

    # chccwdev -a raw_track_access=1 -e 777
    # perl -e 'for ($h=0;$h<2;$h++){printf "\0\0\0%c\0\0\0\x8%s",$h,(("\0"x8).("\xff"x8).("\0"x65512))}' \
        | dd bs=65536 count=2 oflag=direct of=/dev/disk/by-path/ccw-0.0.0777

Then take the device offline and online (in normal mode) again with

    # chccwdev -d 777
    # chccwdev -a raw_track_access=0 -e 777

Works for me. Note that you need the explicit resetting of raw_track_access
back to zero since the attribute is "sticky" across varying offline/online.
--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group
Re: How to find a memory leak?
Alan Altmark writes:
> On Thursday, 07/09/2015 at 04:25 EDT, Mark Post <mp...@suse.com> wrote:
> > The next question is - can this ever be done by a non-root user? I tried
>
> No.
>
>     # ls -l /proc/sys/vm/drop_caches
>     -rw-r--r-- 1 root root 0 Jul 9 16:23 /proc/sys/vm/drop_caches
>
> Thank heavens! That's all we need -- unprivileged users messing with the
> cache

Even unprivileged programs have limited and controlled access to
influencing the caching behaviour for files that they deal with, whether
via read/write or mapped into memory. There are the POSIXy interfaces:

    madvise(..., MADV_RANDOM) and posix_fadvise(..., POSIX_FADV_RANDOM)
    madvise(..., MADV_SEQUENTIAL) and posix_fadvise(..., POSIX_FADV_SEQUENTIAL)

similarly WILLNEED and DONTNEED, and a few extras like:

    fsync(...)
    fdatasync(...)

and one or two where the APIs or functionality aren't as standardised or
common, like readahead(...).

Linux has per-open-file tracking of readahead window information and
per-page marks in the page cache itself, and does a good job of deducing
the right amount of sync/async readahead based on access pattern and memory
pressure in most common cases. However, it's nice to be able to give it a
hint or two (e.g. "I'm going to stream through this file once and then
won't need it again") while continuing to use the usual simple file APIs,
without having to mess around reinventing your own buffering or fiddling
with separate threads, async I/Os or separate access methods (or
equivalent) in OSes where caching is all-or-nothing or
privileged-control-only.

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group
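[Editor's note: the "stream through once, then won't need it again" pattern described above maps directly onto os.posix_fadvise in Python on Linux; a minimal sketch using a throwaway temp file.]

```python
import os
import tempfile

# Hint the kernel that we'll read a file sequentially, then tell it the
# cached pages can be dropped once we're done -- unprivileged, per-file
# cache influence, exactly as described above.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"x" * (1024 * 1024))
    f.flush()
    fd = f.fileno()
    # offset=0, length=0 means "the whole file"
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    data = os.pread(fd, 65536, 0)          # ...stream through the file...
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)  # done: drop the cache
```

Both calls are plain POSIX hints: the kernel is free to ignore them, and no special privilege is needed.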
Re: Single User mode Linux Guest
Michael MacIsaac writes:
> What I see is a prompt for the root password:
>
> INIT: Going single user
> INIT: Sending processes the TERM signal
> Give root password for maintenance
> (or type Control-D to continue):

If it's under z/VM, you can do

    ipl ... parm init=/bin/sh

and it'll just start up a shell on the console without even starting up the
real init. Then you can

    passwd root

followed by a

    #cp signal shutdown

If you're not under z/VM then you only get to specify the loadparm from the
HMC Load panel, so put prompt (no quotes) in the loadparm, do the Load and
then, in the Operating System Messages/SCLP applet, respond to the menu
prompt with

    1 init=/bin/sh

(where the number 1 is the menu entry to use rather than a runlevel
number).

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group
Re: Enabling SSL for access to a z/OS guest
Cameron Seay writes:
> This may be a question for the z/OS board, but since all of our z/OS lives
> inside z/VM guests I will ask it here first.

The IBMVM mailing list would have been better still, but there's a
good-sized overlap between the IBMVM and LINUX-390 lists.

> Our IT staff wants us to use SSL so that outside users can access the z/VM
> LPAR without having to get vpn accounts. Currently they do. We access z/OS
> via logging into a VM LPAR and then dialing into the z/OS guest. The 3270
> client we use has SSL capability. What needs to be enabled/turned on on
> the VM side to allow a connection via SSL? The IT folks are going to open
> a port for this purpose.

Follow the "Configuring the SSL Server" chapter of the z/VM TCP/IP Planning
and Customization manual to get the base SSL and TLS support set up with
your certificate and to get the SSL service virtual machine(s) set up.
There were significant changes brought in in z/VM 6.2 for SSL (e.g.
multiple server pools) so the exact method depends on what level of z/VM
you're using, and if you're still on 5.4 then there'll be a bit of tweaking
you'll need to remember to do to the configuration when you upgrade.

Then for SSL-secured tn3270 access you follow the "Configuring the TCP/IP
Server" chapter. You need to choose one or both of:

(a) having z/VM TCPIP and the tn3270 clients negotiate SSL via TLS (no need
for a separate port). You use INTERNALCLIENTPARMS statements to configure
it: TLSLABEL to choose your certificate label and SECURECONNECTION
NEVER|REQUIRED|PREFERRED|ALLOWED to set your policy on whether clients
can/must negotiate SSL. Some tn3270 clients that support SSL don't support
TLS-negotiated SSL, and some of those that support TLS have problems
depending on which end tries to negotiate first, so that may influence your
SECURECONNECTION or TLS choice.

(b) having the tn3270 client make an immediate SSL-protocol connection, in
which case you need a separate port and add SECURE your_cert_label to the
relevant "portnum TCP INTCLIEN" line in the PORT section of your PROFILE
TCPIP.

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant, zChampion
IBM UK Systems and Technology Group
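[Editor's note: option (b) might look roughly like this in PROFILE TCPIP. The port number and certificate label here are made up for illustration; check the exact syntax against the TCP/IP Planning and Customization manual for your z/VM level.]

```
PORT
  992 TCP INTCLIEN SECURE MYTN3270CERT   ; SSL-only tn3270 on its own port
   23 TCP INTCLIEN                       ; ordinary tn3270
```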
Re: NFS migration
Jake anderson writes:
> Recently we did a migration from one NFS storage server to another. During
> this migration all the copied files ended up owned by root. The new NFS
> storage server no longer offers an FTP server option, so we have mounted
> the NFS storage on a Linux guest running on VMware infrastructure (as an
> FTP server). When we try to change the owner of any file on the mount we
> get "permission denied", even as root (that is the only message we get).
> "ls -l" clearly shows that all the files are owned by root. Has anyone run
> into this situation? Why can't root change the owner (root) to some other
> ID, since the files have the user and group copied from the previous NFS
> storage? Isn't there any way to change the owner and group from Linux?

It's the NFS server that's forbidding it. It's very common in all but the
snazziest of NFS environments for the NFS server to "squash" the root user
of NFS clients and treat it as an unprivileged, anonymous user. This avoids
having a root user on any NFS client getting root-level access to all
exported files on the server. For a Linux-based NFS server, the export
options root_squash (which is the default) and all_squash (probably not the
case here) do this. You need an explicit export option no_root_squash to
allow root on the chosen NFS clients to chown and access exported files as
though they were uid 0 on the server. Other NFS servers or appliances may
present the option differently.

--Malcolm
--
Malcolm Beattie
Linux and System z Technical Consultant
IBM UK Systems and Technology Group
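[Editor's note: on a Linux NFS server the export options discussed above live in /etc/exports; a hypothetical example (the path and client address are made up).]

```
# root on the trusted admin client keeps uid 0 (and so can chown files);
# root from all other clients is mapped to the anonymous user (the default)
/export/data  10.0.0.5(rw,no_root_squash,sync)  *(rw,root_squash,sync)
```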
Re: Bash script for FTP to mainframe
Bauer, Bobby (NIH/CIT) [E] writes:
> That's interesting, it is prompting for the password!?
>
> ftp -inv $HOST <<EOF
> user $USER
> PASS $PASS

Many of the user-visible commands to ftp clients are the same as or similar
to the underlying ftp protocol commands that the ftp client sends over the
network to the server. That sometimes makes it easy to conflate the two
concepts and can cause confusion about what's actually going on.

In the ftp protocol itself, the client program sends a "user ..." followed
by a "pass ..." to the server to complete the logon process. However, the
ftp client program gets the information from the end user differently. In
your case, you're using the -i option (or else it would prompt
interactively for the username) and you're using the -n option so it's not
auto-logging in with username/password from the ~/.netrc file. (You might
wish to consider holding the password there instead of stashing it in the
script.)

So the program starts up and you use the end-user "user username
[password]" command. The program uses the username component and sends a
"user username" protocol command. However, it then needs the password to
send. To get it, it either takes it from the second argument of your "user"
command or, if not there, prompts you on the terminal for it (bypassing
stdin). Although the ftp client program then sends a "pass ..." protocol
command to the server, it's not an end-user command which can be used.

So, to return to your original try:

> HOST=nih
> USER=me
> PASS=password
> ftp -inv $HOST <<EOF
> user $USER $PASS
> [...]
> Remote system type is MVS.
> (username) 331 Send password please.
> 530 new passwords are not the same
> Login failed.
>
> I know the password is correct. I don't know what it is doing/complaining
> about when it says the new password is not the same. Anybody know how to
> do this?

I look up the "530 new passwords are not the same" error in the z/OS
Communications Server IP and SNA Codes manual and find:

    530 new passwords are not the same
    Explanation: The PASS command was issued using the format
    old_password/new_password/new_password to change the password of the
    user ID, but the second "new password" was not identical to the first
    "new password". Both "new passwords" must be the same.

So I wonder if there may have been a slash character in your password? The
z/OS ftp server then interprets a slash as an implicit
logon-and-change-password request, which is failing. I'm mildly surprised
it would have decided to check the equality of the two new passwords (and
return the error) before verifying that the actual password (to the left of
the first slash) was valid.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Putting devices offline automatically
Mauro Souza writes:
> Hi guys, I have a client with a peculiar problem. They have 4 z/VM
> partitions sharing the same LCUs; every partition sees every DASD, and
> each partition has its own range of disks, defined with Offline_at_IPL in
> SYSTEM CONFIG. They use this setup because sometimes they need to access
> in one partition a disk belonging to another. When this need arises, they
> issue a VARY ON 2345 and get the disk online, use it, copy something,
> VARY OFF 2345. Works fine.
>
> The problem: some channels are going offline from time to time. We know
> that people are always messing with the cables, and every so often one
> fiber or another gets loose, they fix it, and so on. And when the device
> comes back online, every single DASD on that CHPID comes online too,
> ignoring the Offline_at_IPL statement. As it should, because
> Offline_at_IPL is just for IPL.
>
> We are thinking on a method and apparatus for getting those DASDs offline
> when the CHPID gets back to life. I already have a REXX (I just deleted a
> lot of lines of CPSYNTAX and added a few more) that parses SYSTEM CONFIG,
> checks whether a given DASD is in the Offline_at_IPL range, and can put a
> DASD offline. I just could not make the exec run by itself (or by MAINT,
> or another CMS machine) every time a channel status changes. I tried to
> set up PROP, and it looks fine, except it doesn't react at all. My PROP
> RTABLE is configured to run my exec, but when the channel comes back, it
> does nothing. If I send the message by hand from MAINT, the exec runs, and
> puts the device offline if it is in the Offline_at_IPL range. I guess I
> will have to read everything about PROP again (I could find little
> documentation and few examples), in case I have missed something.
>
> I saw the NOTACCEPTed statement for DEVICES in SYSTEM CONFIG, but it looks
> like it will take the device offline forever, and we will need to bring it
> online sometimes. Does anyone have any idea for us?
NOTACCEPTED is reversible: provided you have

    FEATURES ENABLE SET_DEVICES

in your SYSTEM CONFIG, you could try doing

    CP VARY OFFLINE rdev
    CP SET DEVICES NOTACCEPTED rdev

and see if that prevents the disappearance/reappearance of the channel
triggering it coming online again. The CP Commands and Utilities Reference
describes the behaviour as:

    NOTACCEPTed tells CP not to accept the specified device or devices when
    the device(s) is dynamically added to VM from another partition. When
    VM is running second level it will prevent a device(s) from being
    accepted when a device is attached to the virtual machine in which VM
    is running. If VM dynamically defines a device for its own partition
    the NOTACCEPTed designation is overridden.

and it's not completely clear to me which case is covered by "channel
reappears". Worth a try though.

To bring it online again, you should only need a

    CP VARY ONLINE rdev

CP will still retain some knowledge of the device though, so if that
doesn't work and you want CP to forget about it even more (though still not
entirely), you could try

    CP VARY OFFLINE rdev
    CP SET RDEVICE rdev CLEAR
    CP SET DEVICES NOTSENSED rdev
    CP SET DEVICES NOTACCEPTED rdev

After that lot, to bring it back with default DASD options you'll want
something like

    CP SET DEVICES SENSED rdev
    CP VARY ONLINE rdev

or, if you want non-default settings for the rdev's setting of SHARED, EQID
or MDC, you'll need instead an explicit

    CP SET RDEVICE rdev ... TYPE DASD ...

which includes your options, followed by

    CP VARY ONLINE rdev

(maybe preceded by

    CP SET DEVICES ACCEPTED rdev
    CP SET DEVICES SENSED rdev

for the sake of completeness).

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Issues using VMUR
Shumate, Scott writes:
> That worked, but I'm having issues with making it perm. I added it to
> /etc/zipl.conf and reran zipl. I rebooted but it was still on the black
> list.

Have a look at the output of

    # cat /proc/cmdline

and check the syntax closely. For example, there must be no spaces within
the cio_ignore= value, there must be an exclamation mark to remove the
device number rather than add it, the !0.0.000c has to appear after the
"all", not before, and I can imagine that the device number may have to be
spelled out in the full canonical form of zero dot zero dot four hex
digits. Also check you edited the zipl.conf stanza for the kernel you then
actually booted. Since the contents of /proc/cmdline show the information
from the current boot, you'd be able to tell if it's the wrong stanza
because it wouldn't have your edits in place. Send the output (along with
the output of "cio_ignore -l" and "lscss" for good measure) if it's still
not clear.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Issues using VMUR
Shumate, Scott writes:
> Contents of /etc/zipl.conf
> [...]
> parameters=root=/dev/mapper/VolGroup00-lv_root rd_DASD=0.0.0701
> rd_DASD=0.0.0700 rd_NO_LUKS rd_DASD=0.0.0702 LANG=en_US.UTF-8 rd_NO_MD
> KEYTABLE=us cio_ignore=all,!0.0.0009,!0.0.000c
> rd_LVM_LV=VolGroup00/lv_root SYSFONT=latarcyrheb-sun16 crashkernel=auto
> rd_LVM_LV=VolGroup00/lv_swap rd_NO_DM

That shows the !0.0.000c and looks fine, so 00c should be available after
the next reboot, provided zipl is run after the edit and before rebooting.

> Output from cat /proc/cmdline (I don't see !0.0.000c)
>
> [root@wil-zvmdb01 ~]# cat /proc/cmdline
> root=/dev/mapper/VolGroup00-lv_root rd_DASD=0.0.0701 rd_DASD=0.0.0700
> rd_NO_LUKS rd_DASD=0.0.0702 LANG=en_US.UTF-8 rd_NO_MD KEYTABLE=us
> cio_ignore=all,!0.0.0009,!0.0.0009 rd_LVM_LV=VolGroup00/lv_root
> SYSFONT=latarcyrheb-sun16 crashkernel=auto rd_LVM_LV=VolGroup00/lv_swap
> rd_NO_DM BOOT_IMAGE=0

This shows two copies of !0.0.0009 in place for the current boot. Since the
original was cio_ignore=all,!0.0.0009 it looks as though an additional
!0.0.0009 had been added, giving cio_ignore=all,!0.0.0009,!0.0.0009,
instead of the second one being !0.0.000c. I've just tried on an RHEL 6.2
system with the exact same kernel as yours and !0.0.000c works fine for me.
Any chance the second !0.0.0009 was added, then zipl run (which writes the
change to the boot block), and only then was it noticed and changed to
!0.0.000c without then rerunning zipl and rebooting?

> Output from cio_ignore -l
>
> Ignored devices:
> =
> 0.0.0000-0.0.0008
> 0.0.000a-0.0.000b
> 0.0.000d-0.0.06ff

This, though, shows that 00c is not being ignored and should be usable at
the moment. Given that it isn't in the kernel cmdline shown above for the
current boot, it must have been dynamically set via a cio_ignore command,
done directly or indirectly. Red Hat uses various scripts triggered from
udev hot-plug rules to fiddle with cio_ignore for things like DASD, but I
hadn't thought they'd done anything as polished for vmur.

> Output from zipl
>
> [root@wil-zvmdb01 ~]# zipl
> Using config file '/etc/zipl.conf'
> Run /lib/s390-tools/zipl_helper.device-mapper /boot/
> Building bootmap in '/boot/'
> Building menu 'rh-automatic-menu'
> Adding #1: IPL section 'linux-2.6.32-220.el6.s390x' (default)
> Preparing boot device: dasdb.
> Done.

Running zipl doesn't just tell you the current configuration, it writes the
current zipl.conf settings into the boot block to be used for the next
reboot. It looks as though maybe some of the various steps and tests
(dynamic cio_ignore, editing zipl.conf, running zipl, rebooting, seeing
/proc/cmdline and the current cio_ignore table, and accessing the device)
weren't done in the intended order. If (1) zipl.conf really does have the
!0.0.000c in it as shown and (2) zipl has been run after that change was
made, then you should find that after the next reboot you see the newly
stamped setting via /proc/cmdline and the corresponding omission of
0.0.000c in the output from cio_ignore -l.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Issues using VMUR
Shumate, Scott writes:
> I'm having issues wondering if someone could help me out. I'm trying to
> receive files from my reader. I'm currently running RHEL6.
> [...]
> I list the rdr with vmur
>
> [root@wil-zvmdb01 dev]# vmur li
> ORIGINID FILE CLASS RECORDS CPY HOLD DATE TIME NAME TYPE DIST
> RSCS     0007 B PUN 0006    001 NONE 03/04 16:11:39 SCOTT EXEC SYSPROG
> RSCS     0008 B PUN 0006    001 NONE 03/04 17:19:44 SCOTT EXEC SYSPROG
> LXP10001 0004 T CON 0744    001 NONE 02/26 17:04:46 LXP10001

This works because listing the contents of the reader is done via a DIAG
call rather than Linux prodding the device itself.

> I try to bring rdr online
>
> [root@wil-zvmdb01 dev]# chccwdev -e 000c
> Device 0.0.000c not found

This is because Red Hat defaults to ignoring all devices via the cio_ignore
blacklist and only enabling those that are explicitly removed from the
blacklist. To do this dynamically, do:

    # cio_ignore -r 00c

You can then bring it online with chccwdev. You can ensure that the device
is not ignored at the next boot by modifying the cio_ignore parameter in
the kernel command line in /etc/zipl.conf and rerunning zipl. For example,
you could change

    cio_ignore=all,!0.0.0009

to

    cio_ignore=all,!0.0.0009,!0.0.000c

or get rid of the cio_ignore blacklist entirely and allow all the virtual
machine's devices to be seen.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Speed of BASH script vs. Python vs. Perl vs. compiled
John McKown writes:
> This is more a curiosity question. I have written a bash script which
> reads a bzip2 compressed set of files. For each record in the file, it
> writes the record into a file name based on the first two words in the
> record and the generation number from the input file name. Due to the
> extreme size of the input (47 files, each of which would be around 120 Gb
> to 180 Gb expanded, or 23 to 27 million lines - very large). Basically
> there are probably around 50 or so (don't know) possible combinations of
> the words. I'm wondering if rewriting the script into either Python or
> Perl (both basically interpreted) would be worth my while.

Perl (and Python) aren't simply interpreted. In the case of perl, it
compiles the source into an internal op tree (rather like bytecode) while
performing a decent amount of cheap optimisation (peephole optimisation
mostly) and then runs that internal structure. Python will do something
similar but the internal representation is different. Most of this isn't
relevant to your situation here though.

> Or should I go with a compiler such as C/C++? Or, lastly, is it basically
> irrelevant due to the extremely large number of records and the minimal
> processing; which means that I/O will dominate the application.

It's not I/O that's dominating in your implementation below, it is (as
others have spotted) the opening and closing of the relevant file on every
single line of input. Either Perl or Python will let you remove this cost
entirely. In fact, in your bash script below, bash seems to read each
character from its uncompressed input in a separate read syscall, which is
dreadful but may be fixable.

> If you're interested, the bash script looks like:
>
> #!/bin/bash
> for i in irradu00.g*.bz2;do
>   gen=${i#irradu00.};  # remove prefix
>   gen=${gen%.bz2};     # remove suffix, leaving generation
>   bzcat $i |\
>   while read line;do
>     fn=${line%% *}   # remove all trailing characters after a space
>     ft=${line:9:8}   # get second word
>     ft=${ft%% *}     # and remove trailing spaces
>     echo ${line} >> ${fn}.${ft}.${gen}.tx2;
>   done;
> done

This Perl program (or an analogue in Python or whatever) is likely to give
(and strace on some small test data shows) much, much better behaviour for
larger input files:

    #!/usr/bin/perl
    use strict;
    use IO::File;

    my %fhcache;

    sub newfh {
        my $filename = shift;
        my $fh = IO::File->new($filename, "a") or die "$filename: $!\n";
        $fhcache{$filename} = $fh;
        return $fh;
    }

    sub getfh {
        my $filename = shift;
        return $fhcache{$filename} || newfh($filename);
    }

    foreach my $infile (<irradu00.g*.bz2>) {
        open(IN, "bzcat $infile|") or die "bzcat $infile: $!\n";
        my ($gen) = $infile =~ /\.(.*)\.bz2/;
        while (<IN>) {
            my ($fn, $ft) = split;
            getfh("$fn.$ft.$gen.tx2")->print($_);
        }
        close(IN);
    }

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
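[Editor's note: the open-once-per-file idea above carries over directly to Python. This is a rough analogue of the %fhcache approach, not from the original post; the output directory argument and helper names are added for illustration.]

```python
import os

class FileHandleCache:
    """Keep output files open across records instead of reopening per line."""
    def __init__(self):
        self._handles = {}

    def get(self, path):
        fh = self._handles.get(path)
        if fh is None:
            fh = open(path, "a")   # append mode, like the Perl IO::File "a"
            self._handles[path] = fh
        return fh

    def close_all(self):
        for fh in self._handles.values():
            fh.close()
        self._handles.clear()

def split_records(lines, outdir, gen):
    # Route each record to <word1>.<word2>.<gen>.tx2, as in the Perl version,
    # opening each output file only once however many lines map to it.
    cache = FileHandleCache()
    for line in lines:
        fn, ft = line.split()[:2]
        out = os.path.join(outdir, f"{fn}.{ft}.{gen}.tx2")
        print(line, file=cache.get(out))
    cache.close_all()
```

With ~50 distinct word combinations, the cache tops out at ~50 open handles, so there's no need for an eviction policy here.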
Re: Sending a signal along with associated data.
Thomas Anderson writes:
> You are correct in that the use of signals is pretty limited and there
> isn't a convenient way to pass data to the target application.

POSIX real-time signals allow an int or pointer's-worth of data to be sent
along with the signal, which is then queued and also carries the sender's
uid, gid and pid. See the "Real-time Signals" section of signal(7) for the
care needed to select an unused RT signal number and see sigqueue(3) for
how to send them.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
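[Editor's note: the queuing and sender-identification behaviour is easy to see from Python on Linux. This sketch raises the signal with plain kill(2), so it demonstrates queuing and si_pid but not the sigqueue(3) int/pointer payload, which has no direct stdlib wrapper; the choice of SIGRTMIN+3 as an "unused" number is an assumption.]

```python
import os
import signal

# Real-time signals queue (rather than coalesce like standard signals)
# and each delivered instance carries the sender's identity in siginfo.
RTSIG = signal.SIGRTMIN + 3               # pick an unused RT signal number
signal.pthread_sigmask(signal.SIG_BLOCK, {RTSIG})  # hold them pending

for _ in range(3):                        # three sends -> three queued instances
    os.kill(os.getpid(), RTSIG)

received = []
for _ in range(3):
    # sigtimedwait returns a struct_siginfo with si_pid/si_uid of the sender
    received.append(signal.sigtimedwait({RTSIG}, 1.0))
```

Had this used a standard (non-RT) signal, the three sends would have collapsed into a single pending instance.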
Re: SSH and LDAP/RACF
Florian Bilek writes:
> 2.) In principle the login via SSH is working very well. I recently
> encountered a kind of weakness in the configuration: a RACF user that
> uses its own RSA keys to log into the system. When I do a RACF revoke on
> that user, it seems that the LDAP check does not take place and the user
> can still log in. What can be done about that?

There's a section of the sshd(8) man page beginning:

    Regardless of the authentication type, the account is checked to
    ensure that it is accessible. An account is not accessible if it is
    locked, listed in DenyUsers or its group is listed in DenyGroups. The
    definition of a locked account is system dependant. Some platforms...

which then (as I try to ignore the misspelling of "dependent") gives
O/S-specific ways that it checks for locked accounts, usually by special
contents of a directly-accessed shadow password field such as "*LK",
"Nologin" or "!". From that, I'd guess that sshd may not invoke PAM in a
way that would let you use pam_ldap to do the appropriate lookup via LDAP.

What about, as a workaround, creating a RACF group named NOLOGIN,
connecting revoked users to that group (an extra step, but that's why I
called it a workaround, not a proper solution) and then putting

    DenyGroups nologin

in your sshd_config? If z/VM LDAP doesn't special-case group membership
lookups for revoked users then I think that may work.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Start networkadapter without reboot
Ursula Braun writes:
> correct, state HARDSETUP should be a temporary state during online setting
> for a qeth device only. If the device stays in state HARDSETUP something
> unexpected occurred. To understand the reason, the trace file
> /sys/kernel/debugfs/qeth_setup/hex_ascii should be checked by a qeth
> expert. It's a wrap-around buffer; thus it needs to be saved immediately
> after the problem shows up.

I can reproduce this easily on SLES11 SP1, kernel 2.6.32.12-0.7-default
(not up to date on service at all). I'll send you a transcript.

--Malcolm
--
Malcolm Beattie
Mainframe Systems and Software Business, Europe
IBM UK
Re: Start networkadapter without reboot
Ursula Braun writes:
> looking through your transcript, I find the definition of the NIC (vmcp
> def nic 800 type qdio), but I do not see a vmcp couple command to bind
> the created NIC to a VSWITCH or GuestLAN. This would explain the
> failing STARTLAN command for this qeth device.

I intentionally didn't bother with a COUPLE since I was trying to reproduce Berry's problem and was also expecting the vNIC to act like a normal NIC and let me configure it, and even ifconfig it up, before plugging it into a switch. I'd thought that that used to work, but maybe not. Would it be considered a bug or an unimplemented feature that it doesn't act that way?

Actually, even when I then couple it to a VSWITCH, the state remains in HARDSETUP (even after I do an echo 1 > recover too) and an ifup eth1 still fails. That makes it even more unlike a normal NIC and seems very inconvenient. I'll send you the trace for that too in case that part is unexpected.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Start networkadapter without reboot
van Sleeuwen, Berry writes:
> I've found some more information when looking through the /sys/
> directory. The state for this device is in HARDSETUP.
>
> nlzlx204:~ # cat /sys/devices/qeth/0.0.0f10/state
> HARDSETUP
>
> Searching for this I've found some information in patch reports, for
> instance at git390.marist.edu and kerneltrap.org. This status is a
> result of a boot without the actual device available. Indeed, what we
> have here now. When the cable is plugged (or the vswitch connected) the
> state should switch to SOFTSETUP or eventually even to UP. But it
> doesn't. Would it be possible to get it to UP dynamically or is this a
> bug that can be resolved with a reboot or network restart only? (kernel
> level is at 2.6.32.12-0.7).

Try

    # echo 1 > recover

You may need to take the device offline first (echo 0 > online) and then bring it back again after the recovery attempt (echo 1 > online).

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: bash question.
McKown, John writes:
> Very nice! Thanks. I guess that I'm going to end up dedicating a
> weekend day to just read the entire output from info bash. Luckily, I
> can create a text file from it, convert it to PDF format, then read the
> PDF directly on my Kindle DX or Android tablet.

In case you weren't aware of it already, the utilities used to process the *roff macros used in man pages support typesetting to PostScript as well as generating simple text output. So typing

    man -t bash > bash-man.ps

will generate a nicely formatted PostScript version of the man page in bash-man.ps, fancy fonts and all, instead of what you'd get from just taking the text version. That's suitable for direct printing, but you can instead just

    ps2pdf bash-man.ps

to produce your bash-man.pdf PDF version.

Using info bash instead of man bash uses a slightly different source of documentation (the FSF document their own programs in their own GNU info format instead of man pages) but you'll nearly always find that some nice people have already ensured that your distro has man pages for the programs as well, and that they have either exactly the same content or are close enough for most purposes. There are ways of generating various typeset-like formats from info format too, but I forget what they are and I don't think they are as simple as just adding -t to your man command invocation.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: New book: Linux Health Checker 1.0 User's Guide
David Boyes writes: Do we have any equivalent System Health Checker for z/VM? Would be an interesting Summer of Code project if someone were willing to mentor the student. You'd need a college that still had a VM system, though -- which pretty much limits it to a few candidates. I provide Linux guests and second level z/VM systems, not just z/OS, on the Zeus hub used by EMEA universities in the z Academic Initiative program. I'd have thought the US zAI folks would likely do the same on their hubs. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 -- For more information on Linux on System z, visit http://wiki.linuxvm.org/
Re: Oracle in virtual environments
Harder, Pieter writes:
> Yes, this policy of non-support of Oracle on virtual systems applies to
> VMware as well. That is why we still run our main Oracle on Sparc iron.
> Political trouble ahead ;-)

Berry van Sleeuwen writes:
> In a discussion with our Oracle group they claim that there is no
> support for Oracle in virtual systems and that they therefore will not
> support it on virtual systems either. So when a (performance) problem
> is found they first advise to migrate to a dedicated server, and
> increase resources, before they attempt to solve the problem. This is
> not (only) true for z but for other virtual systems as well; we
> discovered this because of the advice to migrate off of cloud systems.
> So basically any 'cloud' service is not advised to run Oracle. [...]
> Can anyone confirm this statement? Is it Oracle or is it the
> interpretation of our Oracle group? Is there a formal statement from
> Oracle itself?

I'm not Oracle and I should think you'll get an official response soon, but just to give you the good news as soon as possible: I've seen public statements from Oracle that System z virtualisation is handled specially by them and is fully supported, unlike most other virtualisation. The nearest public statement I have to hand is the 11th foil (labelled 22) of Oracle's SHARE presentation from 13 August 2008, titled "Virtualizing Oracle Servers with Linux on IBM System z" by Barry Perkins of Oracle and IBM's own Kathryn Arrell, which says:

    IBM System z Server Virtualization - A Proven Platform
    ...
    System z virtualization is fully supported by Oracle Database,
    Real Application Clusters, Fusion middleware

(and that statement is highlighted in red to indicate its importance).
--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Porting old application- numeric keypad
Smith, Ann (ISD, IT) writes:
> They have tried diff.

As John says, GNU diff, as available on any Linux, provides a lot of powerful functions, including recursion support (specified explicitly via the -r option). I'd encourage anyone using diff to use the -u option to produce unified-diff format output, which is much more human-readable and provides more context information, used by patching software to behave more robustly in the face of applying patches to slightly modified files. Using diff to do diff -ur dir1 dir2 and suchlike is something I do fairly frequently and I've never found any glaring omissions in its functionality.

> Has some functions but apparently not all that dircmp -d provides.

What functionality do they think is missing from GNU diff? I wouldn't be surprised if education plus possibly some minor pre/post-processing with other utilities solved their problems.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
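To make the diff -ur behaviour concrete, here is a self-contained sketch with made-up throwaway files (nothing here comes from Ann's environment):

```shell
# Build two small trees to compare (hypothetical content).
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1" "$tmp/dir2"
printf 'alpha\n'        > "$tmp/dir1/common.txt"
printf 'alpha\nbeta\n'  > "$tmp/dir2/common.txt"
printf 'only in dir1\n' > "$tmp/dir1/extra.txt"

# -u: unified output with context lines; -r: recurse into subdirectories.
# diff exits 1 when differences are found, so guard against set -e.
out=$(diff -ur "$tmp/dir1" "$tmp/dir2" || true)
echo "$out"
rm -rf "$tmp"
```

The output shows a unified diff for common.txt (the added "beta" line prefixed with "+") plus an "Only in .../dir1: extra.txt" line for the file missing from the other tree, which is the dircmp-like report.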
Re: DB2 Performance Questions
Shedlock, George writes:
> We are conducting a proof of concept for one of our divisions. This
> includes DB2 9.7 and SUSE 10 SP3 with CKD dasd. The first opportunity
> is the defines of tablespaces. In our x86 environment, this runs in
> less than 3-4 minutes (yes, there are a lot of tables). In our z/Linux
> guest that same set of defines runs in about 3-4 HOURS! The only thing
> we have seen as far as activity is a very large number of disk I/Os to
> dasd. The tables define approx. 800-900 GB. What we think we are seeing
> is that the table spaces are being formatted. We have tried the "no
> file system caching" option on the define, but it is flagged as an
> invalid option. IBM is saying that this is the default, but after the
> tables are defined and we look at the tables, we see that the option is
> turned on. If we then try to turn it off, it is again flagged as an
> invalid option.

Unless things have changed recently, DB2 V9 for LUW on Linux for System z only supports no file system caching (i.e. direct I/O) when using FCP SCSI disk access, not with ECKD disk access. I find this documented in Table 16 ("Supported configuration for table spaces without file system caching") on pp. 160-161 of the DB2 V9 Administration Guide: Implementation manual (SC10-4221-00). My copy of the manual is from quite a while back so a newer one may have some changes. Assuming the restriction is still in place, I share your pain.

Within the constraints of doing without direct I/O, you can at least try to ensure that your DB2 data is spread and striped across a large enough number of device numbers (virtual and real, possibly including PAV devices of various flavours if appropriate) and across enough channels, back-end disks, ranks and whatever other objects your back-end DASD subsystem needs to ensure good performance. There are presentations around which describe the various configuration issues you need to cover.
Without one of those and/or the help of someone else who knows where to look, you can easily run into bottlenecks due to configuration rather than fundamental hardware or software. --Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Showing a process running in BG
Tom Duerbusch writes:
> I have a process that may or may not be running in background. When I
> use any of the forms of ps, it shows the process running, but I don't
> understand if any of the fields being displayed indicate that this is a
> BG process. It all looks the same to me G. If the process is running in
> the background, I need to follow the path of how did it get there (bg).
> If the process isn't running in background, I have a different problem
> altogether.

Use the -j or j option of ps to list the process group, session ID and controlling terminal of your processes. So if you prefer your ps options SysV-flavoured, do

    ps -ejww

or if you prefer them BSD-flavoured, do

    ps ajxww

You may be able to get away without knowing the precise details of how processes, process groups, sessions and controlling terminals interact. The common cases are:

(1) The process is a daemon: no controlling terminal ("?" in the TTY column), pid = pgid = sid.

(2) The process is an interactive shell: has a controlling terminal, pid = pgid = sid.

(3) The process is part of a foreground or background job of an interactive shell: has a controlling terminal (that of the shell that started the job), and its sid is the same as the shell's sid. The leader of the process group (pid = pgid) is usually the first command in a pipeline (e.g. for a | b | c, the pgid will be the pid of a; b and c will have the same pgid but, of course, different pids).

The difference between foreground and background is whether the pgid is the one set on the controlling terminal. The shell uses tcsetpgrp() on the controlling terminal to switch between foreground and background. The important consequences are things like: hitting Ctrl/C on the terminal sends an interrupt signal to its process group (the foreground processes), and processes attempting I/O to their controlling terminal get refused, with SIGTTOU sent, if their process group doesn't match (i.e. background processes), although that behaviour is configurable.
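As a quick, hedged illustration of those fields (run here against the current shell; exact format-specifier support varies between ps implementations, so treat this as a sketch):

```shell
# Show pid, process group, session, state and command for this shell.
# With procps ps, a "+" in the STAT column marks a process that is in
# the foreground process group of its controlling terminal; background
# job members lack the "+".
out=$(ps -o pid,pgid,sid,stat,comm -p $$)
echo "$out"
```

Comparing the PID and PGID columns against each other (and against the shell's SID) distinguishes the three common cases described above.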
--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Standby storage and user direct
Brad Hinson writes:
> Anyone know if there's support in z/VM's user direct file for defining
> standby storage/memory when defining a user?

There's no specific keyword so you do it via

    COMMAND DEFINE STORAGE AS size STANDBY size

You can include a RESERVED size in there too, but unless you're wanting to simulate closely an LPAR environment where other LPARs have used up the spare memory before you, it doesn't seem much use.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
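For concreteness, a hedged sketch of how that COMMAND statement might sit in a directory entry (the guest name, password field and sizes here are entirely made up; only the DEFINE STORAGE line comes from the post above):

```
USER LINDEMO XXXXXXXX 2G 4G G
  COMMAND DEFINE STORAGE AS 2G STANDBY 2G
  ...
```

The COMMAND statement is executed by CP at logon, before the guest's operating system is IPLed, so the guest sees the standby storage from the start.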
Re: Z10 vs x86 or Sparc
Dave Jones writes:
> z196: quad core chips
> chip speed: 5.2 GHz
> L1: 64K I / 128K D private/core
> L2: 1.5M I+D private/core
> L3: 24MB/chip - shared

plus

L4: 192MB/book - shared between the 20 or 24 cores (then multiply by the 1-4 books).

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: selinux question
Neale Ferguson writes:
> Thanks. I used the chcon command to change the context but am still
> having problems and seeing this in the audit log:
>
> type=AVC msg=audit(1296596790.809:1547): avc: denied { execute } ...
>
> Now it's complaining about execute whereas before it was only
> complaining about read.

I'm no expert here, but I believe the types of objects are in general different from the types of subjects for Type Enforcement, which is the usual SELinux policy. If you look in the selinux-policy SRPM (just do a build-prepare with rpmbuild -bp), you'll find the source for the snmpd policy in directory serefpolicy-3.7.19/policy/modules/services in files snmp.fc, snmp.if and snmp.te for, respectively, the contexts for particular directory names (for use with restorecon), the interfaces and the underlying types. I'm looking at Fedora 13 but it's probably close. I see stuff in there for it reading lib files and executing init scripts and so on, but I see nothing for loading dynamic modules. If you want to solve this properly rather than using a blunt hammer, then you could maybe look at the apache.* policy files in the same directory, see how the httpd_modules_t type is implemented there to handle Apache DSOs, and use similar type and interface definitions for snmpd.

--Malcolm -- Malcolm Beattie IBM Mainframe Systems and Software Business, Europe IBM UK
Re: Notification when a spool file arrives
Eddie Chen writes:
> Over the weekend I took a look at VMUR.CCP source to insert FD_ZERO(),
> FD_SET() and select(). I thought adding it would be trivial. However,
> when I took a look at the device driver VMUR.C on the internet, I found
> that the OPEN calls diag_read_next_file and if there is no data (no
> reader file) it returns ENODATA (No data). That means when I open
> /dev/00c to get the file descriptor it will come back "no data", thus
> no file descriptor for the select(). Also I notice that it does not
> allow OPEN in write mode as well.

(Subject line changed to clarify thread contents.)

Indeed, the interface to unit record devices (whether via channel programs or via DIAG) is not a blocking one: you try a read from the device and it either gives you data or it comes back immediately saying there's no data. Hence my wanting to add an asynchronous notification that a new file has appeared in the reader. The I/O model is that when this happens, CP (presumably modelling what would happen with real hardware) presents an unsolicited interrupt to the driver. I've looked at two ways in which that could conveniently be sent through to userland.

First of all, I've added an "arrived" attribute to ur devices which gets incremented each time a file arrives, so

    cat /sys/bus/ccw/drivers/vmur/0.0.000c/arrived

contains a number checkable by scripts or programs (useful in case of wakeups or restarts of an app so it can check if anything really happened).

One of the notification methods is via a sysfs KOBJ_CHANGE uevent which can be caught by the udev subsystem. I added this on Friday and it seems to work nicely. In practice, it means you can either have a script which sits waiting for a file to appear by doing udevmonitor (or udevadm monitor, depending on kernel) to wait for such events, or you can add a udev rule to /etc/udev/rules.d with something like

    BUS=="ccw", DRIVERS=="vmur", ACTION=="change", RUN+="/do/something"

to trigger a program to run when a file arrives.
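As a hedged sketch of how such a rule might be packaged (the rules filename and helper path below are hypothetical, not from the original post):

```
# /etc/udev/rules.d/99-vmur-arrived.rules (hypothetical filename)
# Run a helper whenever the vmur device reports a change (file arrival).
BUS=="ccw", DRIVERS=="vmur", ACTION=="change", RUN+="/usr/local/bin/reader-arrived"
```

udevd re-reads rules files automatically on most distributions; otherwise a udevadm control or service reload would pick the new rule up.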
There's a fancy netlink API to uevents too if scripting doesn't appeal.

The other notification method is indeed via a blocking I/O, but not for the device itself. The good news is that sysfs *now* allows a driver to wake up readers of a sysfs attribute by triggering poll() to see a POLLPRI condition. I can add that easily to the "arrived" attribute, meaning that something roughly like

    fd = open("/sys/bus/ccw/drivers/vmur/0.0.000c/arrived", O_RDONLY);
    struct pollfd pi = { .fd = fd, .events = POLLPRI };
    rc = poll(&pi, 1, -1);

will block nicely and wake up with POLLPRI in pi.revents when a new spool file arrives. The only annoyance is that when I say *now*, I mean in modern kernels, because that API (sysfs_notify and, even easier, sysfs_notify_dirent) is not in kernels of SLES10 SP2 era. I need to think about how best to ensure that people with older kernels can distinguish clearly what's available and what isn't (depending on how much backporting various people want to do). Of course, nothing here should be taken as meaning that IBM is committing to add this functionality, and I'm only here talking about what I tried on my own test system.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Notification when a spool file arrives
Alan Altmark writes:
> On Tuesday, 12/07/2010 at 09:07 EST, Malcolm Beattie
> beatt...@uk.ibm.com wrote:
> > First of all, I've added an "arrived" attribute to ur devices which
> > gets incremented each time a file arrives so cat
> > /sys/bus/ccw/drivers/vmur/0.0.000c/arrived contains a number
> > checkable by scripts or programs (useful in case of wakeups or
> > restarts of an app so it can check if anything really happened).
>
> But you're not opening a spool file, you're opening a special character
> device.

No, I'm opening a sysfs file--maybe I was unclear. The notification mechanism I'm talking about is via the sysfs driver model file (/sys/...) and not the special character device file (/dev/...). The latter is, as both you and I have written, a non-blocking model. In the Linux driver model, device-related notifications can be sent as uevents (broadcast over an AF_NETLINK socket, one of whose listeners is udevd with its configurable rulesets under /etc/udev) or, more recently, via a POLLPRI condition on a descriptor opened on a sysfs file. Neither of these affects the open/read/write/close behaviour of the special character device file in /dev. Or I may have misunderstood your point?

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: Question to smsgiucv
Florian Bilek writes:
> I am looking for a possibility of using the virtual reader under z/VM
> in z/Linux. The idea is to process files received from z/OS via RSCS.
> Of course I could regularly start VMUR to poll the RDR, but couldn't
> that be done much smarter with an event starting VMUR?

I've kept meaning to add select() support or similar to vmur since I wrote the original, but it's never quite made it to the top of my priority list. It should just be a few lines of code (catch the unsolicited interrupt and wake any waiters) in the right place. I'll try to take a look soon if nobody gets in there first.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: IBM zEnterprise System announced???
Eric Spencer writes:
> [Mark Post writes:]
> > On 7/22/2010 at 06:19 PM, Marcy Cortes
> > marcy.d.cor...@wellsfargo.com wrote:
> > > Sounds like it will be talking over that private IP network rather
> > > than some sort of CP co-processor though, so anything is possible.
> > I didn't see any mention of an IP network, just that it was private.
> > That could mean a lot of things.
> I think it's private as in not visible outside the box(es); it's not a
> part of your IP network in general. It is not proprietary; I believe
> it's using standard IP protocols.

The IEDN (intraensemble data network) is a flat, VLAN-aware, 10GbE switched network and you can use IPv4 or IPv6 as you wish. You can, if you want, have the IEDN behind the z196 and bring all your external network connections into ordinary OSA-Express ports on the z196. In this case, connectivity between the z196 and the IEDN is via OSA-Express ports configured with a special CHPID type (OSX instead of OSD). However, you also have the option to bring your external data network directly into the IEDN via the TOR switch, in which case it is the customer's particular responsibility to configure VLANs correctly via the TOR switch and consider firewalling requirements.

In case there's any confusion, the other special network is the INMN (intranode management network) and that indeed is private: it connects the HMCs, SEs, zBX (at hardware, firmware and hypervisor level) and CPC (via OSA-Express ports configured as CHPID type OSM) for private chatter of management stuff. It's still IP though (IPv6 link-local, in fact). For more details, see section 7.4 of the IBM zEnterprise System Technical Guide redpiece (SG24-7833).
--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: IBM zEnterprise System announced???
Dave Jones writes:
> Thanks, Alan, that's what I wanted to know. We still treat these blades
> as distributed servers, only they're connected to the z via a secure,
> fast, internal network. Excellent.

But wait, there's more... Once a blade is purchased and entitled to be put in the zBX, as soon as it's put in the zBX it becomes part of the z world. Assuming there's the usual z hardware support in place from IBM, the support immediately changes to 24x7 for the blade, it integrates into the call home mechanism of the box, it's monitored and watched just like any other z component, and if anything goes wrong the usual z CE comes out and does the repair/replacement. Similarly, all firmware/hypervisor changes are done via the z HMC in the same way as, for example, channel cards, crypto cards and so on.

I've already heard of one customer that's considering adding a zBX to a coupling-facility-only footprint (even though there's going to be no app data connectivity between the z196 and the zBX) purely to get the benefits of moving that level of management of the blade estate into the arena of z technology and z support. Oh, and when the zBX is installed, it doesn't just get dumped at the data centre door by the truck driver (as I'm told some blade chassis arrive)--it counts as z and so the full installation gets done in the same way as other z hardware.

--Malcolm -- Malcolm Beattie Mainframe Systems and Software Business, Europe IBM UK
Re: automatic email?
John McKown writes:
> On Mon, 23 Feb 2009, Adam Thornton wrote:
> > On Feb 23, 2009, at 8:13 PM, John McKown wrote:
> > > Many thanks for any ideas.
> > I wrote a very simple MTA based on Net::SMTP in perl to do this. It's
> > straightforward. Net::SMTP makes it very, very easy, assuming you
> > already speak Perl. However: Most Linux distros let you configure an
> > MTA to use a remote host as its smarthost with a couple of clicks. I
> > would not recommend sendmail for anything other than an emetic in
> > this day and age, but certainly Debian's packaging of Exim lets you
> > set up a mailserver pointing to a smarthost trivially, and I believe
> > I remember that it's a single line in postfix as well.
> > Adam
> Thanks for the pointers to Exim. I do know Perl, somewhat. I'll look at
> Net::SMTP as my needs are minimal. I'm not the sysadmin on this
> particular box (I support a vendor application), so installing software
> is a bit difficult. I need to request it from the sysadmin and then it
> needs to be approved by corporate security (believe it or not).

The advantage to having an MTA running locally is that it handles all the corner cases of SMTP, queueing and logging so that you don't have to: what to do when the server is unavailable, what to do when the server is responding slowly, what to do when the server sends a temporary 4xx error, what to do when the server plays protocol games (more usual for externally facing servers, but still...), what to do when you suddenly want to send lots of mails at once. It's your local MTA's job to queue them for you locally, keep track of them, ensure they get sent out reliably eventually and let you know exactly whether they have been received successfully. If someone blames you for an email going missing and the remote folks can't/won't find what happened from their logs (maybe that doesn't happen these days but it sure used to...)
then it's nice to be able to grep what happened out of your logs (exigrep is a useful utility if you use Exim) and tell them exactly when the email was sent across and exactly what their server said. Can you tell this is from the heart and from real experience? ;-) You don't need to have the MTA listening on an external interface (just localhost) so you don't have to worry about incoming mail and mail relay security. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK
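For the "single line in postfix" smarthost setup Adam mentions, a minimal hedged sketch might look like this (the relay hostname is hypothetical):

```
# /etc/postfix/main.cf -- forward all outbound mail via a smarthost
relayhost = [smtp.example.com]

# listen only on localhost, so there is no incoming-mail or relay exposure
inet_interfaces = loopback-only
```

The square brackets around the relay host suppress MX lookups and deliver straight to that host; the loopback-only setting matches the "just localhost" point above.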
Re: Udev ressize 256 too short
Marcy Cortes writes:
> Does anyone know what these mean? This is SLES10 SP2 - new install. The
> volume groups are very large - vg02-vg06 each contain 22 mod 54 volumes
> (almost a terabyte). vg01 contains 12 mod 54. Volume group system is
> pretty small at 6G. There was some discussion on this list a year ago,
> but no resolution was posted.
>
> Waiting for udev to settle...
> Scanning for LVM volume groups...
> Reading all physical volumes. This may take a while...
> Found volume group vg02 using metadata type lvm2
> Found volume group vg01 using metadata type lvm2
> Found volume group vg04 using metadata type lvm2
> Found volume group vg05 using metadata type lvm2
> Found volume group system using metadata type lvm2
> Found volume group vg06 using metadata type lvm2
> Found volume group vg03 using metadata type lvm2
> Activating LVM volume groups...
> 1 logical volume(s) in volume group vg02 now active
> udevd-event[4269]: run_program: ressize 256 too short
> [lots more ressize 256 too short lines snipped]

Some initial googling makes it look like udevd is invoking a program with util_run_program() or udev_exec() and the caller is expecting that program to produce a result (by writing to stdout) of no more than 256 characters. The called program is producing too much stdout, which makes its caller sad. A search of the udev codebase shows #defines of length 256 for some filename and directory name buffers and one or two other things. You could try running udevmonitor while the scan takes place and then work through which udev rules and external programs get invoked, to try to catch the one constructing the undesirably long output.

--Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK
Re: z10 BC is Here
Bruce Hayden writes:
> It must be you. It is basically a single frame z10 EC in size and is
> not a rack. It does use an MCM, but like the z9 BC, it isn't mounted in
> a book and you can't add another.

Actually, it's not an MCM (Multi Chip Module) in the z10 BC, it's six SCMs (Single Chip Modules): 4 separate SCMs for the 4 separate Enterprise Quad Core chips (3 active in each) and 2 other SCMs for the SC (System Controller) chips. For pictures and more detail, the draft of the redbook IBM System z10 Business Class Technical Overview (SG24-7632) is now available at http://www.redbooks.ibm.com/redpieces/abstracts/sg247632.html

--Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK
Re: var subdirectory
Gentry, Stephen writes:
> I think I've painted myself into a corner. My root subdirectory has
> gotten full. When I built this Linux, /opt and a couple of other
> subdirs were installed as separate mount points. Now, I'd like to move
> /var to a separate dasd/mount point. When I try to rm the var
> subdirectory, I use the -rf command. However, 5 subdirs won't delete.
> They are:
>   `var/lib/nfs/rpc_pipefs/statd'
>   `var/lib/nfs/rpc_pipefs/portmap'
>   `var/lib/nfs/rpc_pipefs/nfs'
>   `var/lib/nfs/rpc_pipefs/mount'
>   `var/lib/nfs/rpc_pipefs/lockd'
> I get an "operation not permitted" message. I figure that maybe there
> is a task running, so I go and kill some tasks that look like they
> might be related to this, but still no luck. I need to remove this
> subdir so I can mount the new one (/var). I do not have a 2nd Linux
> running; therefore, I cannot mount this disk on a 2nd one and delete
> /var. Basically, all I have is a Linux command line in a 3270 session.
> I can't putty into this Linux under existing conditions. Does anyone
> have any suggestions?

You will find that there is a (pseudo)filesystem mounted on /var/lib/nfs/rpc_pipefs which supports some fancy NFS functionality. You will need to unmount it first, or else avoid descending into it when attempting to remove files under /var. I'm surprised that going down to runlevel 1 doesn't unmount it, but perhaps the init.d scripts don't tidy up everything, or else some nfs-related kernel module keeps some refcount on it. After stopping all nfs-related services via their init.d scripts, try a

    fuser -m /var/lib/nfs/rpc_pipefs

to see if any processes are still around. Once those are stopped, you should be able to

    umount /var/lib/nfs/rpc_pipefs

unless there's a kernel refcount held on it somehow.

--Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK
Re: NTP daemon problem (was: Weird application freeze problem)
Edmund R. MacKenty writes: Does anyone know of a Linux tool that would give more accurate information about process wake-ups? It would be nice to be able to profile Linux daemons like this and see which ones play nice in a VM environment, because ntpd sure doesn't! Try strace -tt -T -o strace.log -p $pid and use filter options to avoid too much output. man strace for details. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: DDR'ing 3390 DASD To Remote Location
Rob van der Heij writes: When we saw the first new z/VM installations with Linux show up, I proposed a new feature for the Linux disk driver that would allow arbitrary tracks to be read and written (like the pipeline stages). That way a Linux guest could be used to backup the VM packs along with the Linux data. And for D/R restore you could first IPL one Linux guest native, restore the VM packs (from TSM) and then IPL VM again. Something like that would fit your needs. The design of the driver appeared to be very simple after a few beers, but next morning it turned out to be harder. I wrote writetrack and readtrack kernel modules for Linux 5 years ago which implement ioctls to do that along with simplistic userland utilities and they worked OK for me to transfer various VM and z/OS disks as images via plain Linux files. The internal kernel API for DASD driver disciplines was a bit icky back then so they would need a good polish (or simply a rewrite based on the existing template). The unit record driver I did was eventually noticed and requested by enough customers (I assume) that Boe asked me for it, polished it and pushed it upstream so perhaps a similar thing might work if people are interested in full track read/write for Linux. No guarantees, since I don't know how those requests were routed and prioritised before they ended up as a request to me. I suggest requests be sent via whatever the usual official route is for customer requests (i.e. not directly to me, I'm afraid). --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: DDR'ing 3390 DASD To Remote Location
Rob van der Heij writes: On Thu, Jun 19, 2008 at 5:11 PM, Malcolm Beattie [EMAIL PROTECTED] wrote: I wrote writetrack and readtrack kernel modules for Linux 5 years ago which implement ioctls to do that along with simplistic userland utilities and they worked OK for me to transfer various VM and z/OS disks as images via plain Linux files. The internal kernel API for DASD driver disciplines was a bit icky back then so they would need a good polish (or simply a rewrite based on the existing template). IMHO the disadvantage of that approach is that you need another userspace tool to get the data in and out of the driver. It is harder to integrate in existing backup processes. The design we discussed back then was to show the cylinders of the volume as files in a directory (bonus points when arranged according to CMS formatted minidisks). That way you could backup the data as any other Linux data, and have automatically a way for incremental (per cylinder) backups etc. Should be straightforward to use FUSE to add such a filesystem wrapper around the underlying readtrack/writetrack ioctl. --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Measuring CPU performance? Which is right?
CHAPLIN, JAMES (CTR) writes: On the zLinux guest (ZP013), using sar I get a CPU usage of about 15%: [...] But under Perfkit (zVM) we get the following exception message, 33.5% CPU: 11:51:51 FCXUSL317A User ZP013 %CPU 33.5 exceeded threshold 30.0 for 5 min. [...] We have two IFLs defined to the guest. [...] Why are the numbers from PERFKIT different from the zLinux environment? PerfKit percentages are calculated as percentage of one engine. Linux percentages calculate percentage of CPU resource available to the image. For your Linux guest with 2 engines, Linux tells you it's using ~15% of its 2-engines'-worth. PerfKit spells that as ~30% of a nominally-100%-utilised single engine. Same resource usage, different way of displaying the measurement. [For the purposes of this posting, I'm treating any remaining few percent difference as a second order effect or else we'd muddy the waters with discussing a bunch of more complex measurement issues.] --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
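The conversion above is easy to sanity-check by hand; a throwaway sketch using the figures from this thread (15% reported by Linux, 2 virtual IFLs):

```shell
# Linux (sar) reports utilisation as a fraction of the guest's own CPUs;
# PerfKit reports it as a fraction of one engine.  With 2 virtual IFLs:
linux_pct=15
ncpus=2
echo "$((linux_pct * ncpus))% of one engine"   # what PerfKit shows: 30%
```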
Re: recover root password
RPN01 writes: To be completely compliant, everything done by / with root will need to be logged, showing what was done, and by whom. Can you do that now, with two or more people logging into root? Can you do it with even one person logging into root? Not on any distribution I know today. Quick plug: I'll be covering Linux native tools for auditing (auditd/auditctl), accounting (acct/sa) and other things beginning with A[1] in my technical session at the z Tech Conference in Dresden next month. There are trade-offs involved in enabling such things but if you really want to audit everything root does, you can. --Malcolm [1] ACLs and Activity reporting. -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: how can I mount /tmp as a tmpfs in SLES9?
Rob van der Heij writes: On Tue, Mar 18, 2008 at 5:18 AM, Mark Post [EMAIL PROTECTED] wrote: The normal usage of Linux in /tmp is pretty limited, so I don't think I'd be scared about a few MBs there. But since those files probably remain in page cache while you need them, you do not win anything there. Others have discussed a lot of the aspects of /tmp configuration in this thread but I'll just point out that there is a much bigger win with tmpfs beyond the data in page cache that would apply even with /tmp on a normal filesystem on a fast block device. Linux internally models the whole filesystem hierarchy (directories, sub-directories, files, etc) with its VFS layer and caches it in the structures in its dcache. A normal filesystem has to take those internal structures and record them into blocks (and read them from blocks) so that the block layer can do the I/O. Directory contents have to be squished into a format which can be used as metadata blocks and recorded on the block device, as does inode data like last access time and so on. tmpfs doesn't have to do any of that at all since it's just a thin layer around the VFS. That reduces the path length for filesystem operations from file op -> VFS -> fs -> block layer -> device driver (e.g. DIAG) to file op -> VFS+mm. That's a particularly big win for metadata-intensive operations. A surprising number of applications do indeed use /tmp (often creating and immediately unlinking the file so you may not see them around much) and I think there are metadata-heavy ones too although my experience with those is out of date. Such apps do things like extracting tar files to /tmp and then walk/read through the results. I've just tried out an example: make a script which untars a tar file with ~4000 files of about 10KB each (I used /etc) into a directory and then does rm -rf on it.
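A sketch of that test script (file count, sizes and paths here are illustrative, not the exact ones used; point TARGET at a directory on tmpfs and then at one on a disk-backed filesystem to compare):

```shell
#!/bin/bash
# Create ~1000 small files, tar them up, then time "untar + walk + rm -rf"
# into $TARGET.  Deliberately metadata-heavy.
TARGET=${TARGET:-/tmp/untar-test.$$}
SRC=$(mktemp -d)
TARBALL=$(mktemp)
for i in $(seq 1 1000); do
    head -c 10240 /dev/zero > "$SRC/f$i"
done
tar -cf "$TARBALL" -C "$SRC" .
mkdir -p "$TARGET"
start=$(date +%s)
tar -xf "$TARBALL" -C "$TARGET" && find "$TARGET" >/dev/null && rm -rf "$TARGET"
end=$(date +%s)
echo "elapsed: $((end - start))s"
rm -rf "$SRC" "$TARBALL"
```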
The only system I can do the test on at the moment is dreadful for proper measurement (tiny SLES10SP1 under z/VM 4.4 as a capped guest under z/VM 5.x hence no DIAG either so there's dasd_fba driver overhead there that wouldn't be present). Running a couple of those test scripts in parallel, I get that tmpfs is twice as fast as ext2 (mounted noatime) on VDISK (FBA not DIAG though) with the CPU pegged at its cap of ~30% but that system setup is so unusual it's probably not very useful. An internal throughput test of a similar nature on a proper system would be interesting. (Trying a different mail setup; let's see if it works.) --Malcolm -- Malcolm Beattie System z SWG/STG, Europe IBM UK -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Dynamic configuration changes
Mark Post writes: If an additional CPU gets DEFINEd (via CP), or configured online (in an LPAR), you then need to bring it online to the Linux system by echoing a 1 into /sys/devices/system/cpu/cpu?/online, where ? equals the number of the CPU, (starting with 0). Now that Mark's explained nicely what to do, I'd like to follow up with a big fat warning about what *not* to do: do not use CP DETACH CPU n to take a virtual CPU out of your guest's configuration once present. A DETACH CPU will immediately trigger the effects of a CP SYSTEM CLEAR and your guest will be dead in the water with all its memory zeroed. Once a CPU has been DEFINEd into the guest's configuration, only use Linux sysfs to vary it online or offline and don't DETACH it. Another minor warning: I seem to remember one version of SLES9 having problems with hot CPU support. I'm not sure which service pack level; it might even have been before SP1. The symptom was that you could vary offline (and then online) CPUs that existed and were online when the guest was IPLed but if you did a dynamic CP DEFINE CPU then a following echo 1 .../online would fail to add it and would leave an uninterruptible process around or something similar. Since the additional_cpus and possible_cpus boot parameters came along, I think things now all work as they should. --Malcolm -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
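The sysfs vary-on Mark describes can be rehearsed without a spare CPU by pointing the same echo at a scratch directory standing in for sysfs (on a real guest the path is /sys/devices/system/cpu/cpuN/online and writing it needs root):

```shell
#!/bin/sh
# Mock of /sys/devices/system/cpu/cpu2/online so the vary-on/vary-off
# can be shown without root.  Substitute the real sysfs path on a guest.
sys=$(mktemp -d)/devices/system/cpu/cpu2
mkdir -p "$sys"
echo 0 > "$sys/online"     # CPU present but offline
echo 1 > "$sys/online"     # vary it online (never CP DETACH it instead)
cat "$sys/online"
```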
Re: cmsfs and cmsfs.o
Mark Perry writes: is there a way to use the bus/device address - such as 0.0.0190 rather than having to find out the /dev/dasdx? The udev configuration on SLES9 and SLES10 gives you /dev/disk/by-path/ccw-0.0.0190 and, if you're careful to avoid duplicate volsers (as seen by the Linux guest) then you can use /dev/disk/by-id/VOLSER on SLES9 or /dev/disk/by-id/ccw-VOLSER on SLES10. Or tweak your udev configuration to follow your own naming conventions. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Moodle
Stephen Frazier writes: Is anyone running Moodle? Moodle is a course management system (CMS) - a free, Open Source software package designed using sound pedagogical principles, to help educators create effective online learning communities. Yes, I use it on Zeus (the hub supporting the European universities in the System z University Program for Europe/Academic Initiative). I started using it last year and am rather impressed with it. (Still on leave; still posting from home.) --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: How can we print to a VM virtual printer ? SUSE 9 and VM 00E ?
David Boyes writes: Would like to send/route print from z/Linux guest to the guests virtual printer 00E. No RSCS, not VTAM. Sure we could FTP but processing the spooled print output in CMS REXX is so much simpler. Any suggestions would be appreciated. There is no supported unit-record driver for Linux (Malcolm Beattie wrote one long ago, but AFAIK it hasn't been updated for 2.6 kernels, so really isn't much help any more). I did a 2.6 version which was taken up by Boeblingen late last year and is, perhaps, headed for mainline--I haven't heard recently. I don't know the timescale or how many changes it'll go through first. I'm on leave at the mo, so won't be checking right now. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: scp question.
Alan Altmark writes: On Tuesday, 02/20/2007 at 10:13 CST, McKown, John [EMAIL PROTECTED] wrote: Is there any definitive documentation, such as an RFC, which states how scp handles the files that it transfers? In particular, I have the Cygwin scp on my Windows XP system. I am running IBM's Ported Tools version of OpenSSL and SSHD server on z/OS 1.6. When I do a simple: scp file [EMAIL PROTECTED]:file The contents of the file on z/OS have automagically been converted from ASCII to EBCDIC. This just seems __wrong__ to me. Start with RFC 4251, the Secure Shell Protocol Architecture. It will lead you to other RFCs. ssh data transfer has no concept of text or binary. It just moves bytes around. Here's some more rather surprising behaviour, using just ssh:
thinkpad% echo -n ABC | ssh zos 'od -t x1'
0000000 C1 C2 C3
0000003
So the three bytes (0x41, 0x42, 0x43) sent by the ssh client end up being read on stdin by od as 0xc1, 0xc2, 0xc3, i.e. converted from ASCII to EBCDIC. There's no scp there, just a stream of bytes to move around. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM Europe System z -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
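For comparison, the same pipeline against a local od shows the bytes arriving untranslated; the EBCDIC values only appear once the z/OS side's locale handling has converted the stream:

```shell
# On an ASCII Linux box the bytes stay as sent: 41 42 43
echo -n ABC | od -An -tx1
```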
Re: a history question
Richard Troth writes: Like Rd said, no timestamps in BASH history. (Other shells have history too, and still no timestamps.) Er, both bash and tcsh can timestamp history. tcsh does so by default and bash does so if you set HISTTIMEFORMAT. man bash says:
HISTTIMEFORMAT
If this variable is set and not null, its value is used as a format string for strftime(3) to print the time stamp associated with each history entry displayed by the history builtin. If this variable is set, time stamps are written to the history file so they may be preserved across shell sessions.
Often a bit of care needs to be taken to consider what behaviour is wanted for saving history between sessions, multiple shells, login v. non-login shells etc. The man pages for tcsh and bash go into the details. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
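A minimal demonstration of the above (driving an interactive bash over a here-document, since non-interactive shells don't record history by default; --norc keeps the user's startup files out of it):

```shell
# Set HISTTIMEFORMAT, run a command, then show the timestamped entries.
bash --norc -i <<'EOF' 2>/dev/null
HISTTIMEFORMAT='%F %T  '
echo demo >/dev/null
history 2
EOF
```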
Re: FCP
McKown, John writes: No. Only z/VM and Linux understand FCP connected DASD. z/OS and z/VSE cannot access it. Actually, z/VSE as of 3.1 can indeed access SCSI disks via FCP. There's a chapter in the z/VSE Planning guide (chapter 9: Using SCSI Disks With Your z/VSE System ) which is a good place to start reading about it. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: possible to boot zLinux (SLES8) in different user mode?
Peter 1 Oberparleiter writes: [EMAIL PROTECTED] wrote: is it possible to boot zLinux (SLES8) in different user mode (2)? We're having the problem that a process is hanging during startup, unfortunately before the sshd daemon gets started... Newer distributions support a boot menu which can be activated at IPL time to specify an additional command line parameter. Unfortunately, as far as I know, this feature is not available on SLES8 systems. If you're running under VM and don't mind delving into a few CP commands then you can use the following quick and dirty way. I saw it on an IBM internal forum recently. I've added some extra explanation, shown some example output and mentioned the case where the parmline is EBCDIC instead of ASCII. I've tested it (the ASCII one anyway) and it works for me but no guarantees. Create a trace trap for when Linux starts running: CP TRACE I R 1 Then IPL from your normal boot device: IPL vdev You'll almost immediately see the trace fire:
Tracing active at IPL - 0001 BASR 0DD0CC 0
Display the current kernel command line (the first 100 bytes) as hex and ASCII: D TX10480.100 If your parmline is in ASCII (see below if not), it'll show something like
R00010480 64617364 3D323830 302D3238 30462072 *dasd=2800-280F r*
R00010490 6F6F743D 2F646576 2F646173 64613120 *oot=/dev/dasda1 *
R000104A0 766D706F 66663D4C 4F474F46 4620766D *vmpoff=LOGOFF vm*
R000104B0 68616C74 3D4C4F47 4F46460A 00000000 *halt=LOGOFF.....*
R000104C0 00000000 00000000 00000000 00000000 *................*
R000104D0 to 0001057F suppressed line(s) same as above
The parmline is terminated with a newline character (ASCII 0x0A above). You can append a space and S to the parmline (which tells Linux to boot into single user mode) by overwriting the newline with space+S+newline.
Do that as follows (still assuming your parmline is ASCII), replacing the address with where your trailing newline character lives: STORE S104BB 20530A Note that the leading S before the address 104BB (no intervening space) says that the following data (20530A) is hex for a byte string (in ASCII, 0x20 = space, 0x53 = S, 0x0A = newline). Repeat the display to check you got it right: D TX10480.100
R00010480 64617364 3D323830 302D3238 30462072 *dasd=2800-280F r*
R00010490 6F6F743D 2F646576 2F646173 64613120 *oot=/dev/dasda1 *
R000104A0 766D706F 66663D4C 4F474F46 4620766D *vmpoff=LOGOFF vm*
R000104B0 68616C74 3D4C4F47 4F464620 530A0000 *halt=LOGOFF S...*
R000104C0 00000000 00000000 00000000 00000000 *................*
R000104D0 to 0001057F suppressed line(s) same as above
Now let the guest continue the boot process: B and you'll see the usual boot messages as it comes up in single user mode. You can then unregister the trace with #CP TRACE END ALL As mentioned above, Linux can also cope with EBCDIC kernel command lines so that you can easily create the parmline from CMS for example. I think most people who've completed a full install will be using zipl to write the text in ASCII but, for completeness, if your parmline data is in EBCDIC, you'll need to use D T10480.100 instead of D TX10480.100 (i.e. with no X) in order to interpret the data being displayed correctly and you'll need to use EBCDIC for the hex codes to append to the parmline. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: VM Shutdown
Post, Mark K writes: Depending on what version of z/VM and Linux you're running, updating /etc/inittab to have something like this: # What to do at the Three Finger Salute. ca::ctrlaltdel:/sbin/shutdown -t5 -r now Change that -r (meaning reboot) into -h (meaning halt) so that the SIGNAL SHUTDOWN magic described elsewhere in this thread behaves as expected. For cleanliness, it's also good to include vmpoff=LOGOFF into the kernel parmline so that when the guest finishes shutting down and does a halt -p (a power-off halt), the kernel will do a CP LOGOFF. This logs the guest off; CP then knows that the guest has finished its signal shutdown processing cleanly and will log a nice message to say so. [The vmpoff assumes that your distribution uses a halt -p in its shutdown scripts, as SLES does. If your distribution ends up doing a halt without the -p then the relevant parmline addition would be vmhalt=LOGOFF. I usually add both...belt and braces.] --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
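Putting both suggestions together, a zipl.conf stanza would look something like this (device numbers and paths are illustrative, not taken from the thread):

```
[defaultboot]
default = linux

[linux]
target = /boot/zipl
image = /boot/image
ramdisk = /boot/initrd
parameters = "dasd=0201 root=/dev/dasda1 vmhalt=LOGOFF vmpoff=LOGOFF"
```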
Re: tar up directory structure but not contents
John Campbell writes: I would probably write off tar and, instead, do: cd (somewhere); find . -xdev -type d -print | cpio -ocv > tree.cpio And combining this with the NUL-separation to ensure whitespace in filenames is handled correctly, this would become find . -xdev -type d -print0 | cpio -0ocv > tree.cpio And if you use -H ustar instead of the -c option (which corresponds to -H newc) then cpio will write out a tar archive which tar can extract for you: find . -xdev -type d -print0 | cpio -0ov -H ustar > tree.tar --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
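A self-contained run of the last pipeline, against a scratch tree built on the fly (requires GNU find and GNU cpio for -print0/-0):

```shell
#!/bin/sh
# Build a small tree, archive only its directory skeleton, list the result.
src=$(mktemp -d)
out=$(mktemp)
mkdir -p "$src/a/b" "$src/c"
echo data > "$src/a/file.txt"
( cd "$src" && find . -xdev -type d -print0 | cpio -0o -H ustar 2>/dev/null ) > "$out"
listing=$(tar -tf "$out")
echo "$listing"        # directories only; file.txt is not in the archive
rm -rf "$src" "$out"
```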
Re: Deleting files in a directory more than 5 days old
Post, Mark K writes: Finally, there is a subtle difference between doing an -exec rm and piping the output of find to an xargs rm command. The difference there is that the find command will invoke the rm command once for each file that it finds that matches your criteria. The xargs version will batch them up to the maximum line length that is allowed on your system, and invoke rm once for each maximum number of arguments, thus reducing the amount of system overhead required for process creation and destruction, etc. I tend to use that a lot these days. It really does speed things up when there are a lot of objects to be handled. However, if you use xargs be extremely careful about the possibility of whitespace in filenames. If you have a file called "old price.list" and use a pipe such as find -print | xargs rm (or, equivalently, omit the -print since it's the default action) then xargs will parse its input stream (foo, bar, old price.list, baz, ...) for arguments to rm by separating at whitespace and end up attempting to remove the file "old" (which probably doesn't exist) and the file "price.list" (which may be your new file which you definitely don't want removed). It's much safer to use find -print0 | xargs -0 rm (those are zeroes) which are GNU extensions that force find to print the filenames terminated with NUL (a.k.a. \0 a.k.a. ASCII code 0) and force xargs to split its input stream at the \0 character (which cannot appear in filenames) and thus safely remove exactly the right files. It also therefore handles filenames containing \n correctly which, although not a common mistake, can form part of a malicious attack against some programs which mis-parse such things. Talking of attacks, there were examples elsewhere in the thread of using find to traverse a directory such as /tmp to clean things up. I should warn people that there are race conditions that are easy to miss when doing such recursive operations on filesystems which are writable by potential attackers.
These involve the order in which directories are read, lists built up, directories and symbolic links traversed and the resulting actions executed. There have been known exploits in the past resulting from such automation. An example includes versions of some of the automated /tmp cleanup scripts run from cron in various older distributions. Any of you whose threat models require you to give attention to possible attacks from local users should be careful how such automated scripts are coded. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
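The whitespace hazard described above is easy to see concretely (scratch directory, invented filenames):

```shell
#!/bin/sh
# A filename with a space survives -print0/-0 parsing as one argument.
d=$(mktemp -d)
touch "$d/old price.list" "$d/keep.me"
find "$d" -name '*.list' -print0 | xargs -0 rm -f
remaining=$(ls "$d")
echo "$remaining"      # only keep.me is left; "old price.list" went as one name
rm -rf "$d"
```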
Re: web server proof of concept - code set translation problem
SOLLENBERGER, JUSTIN writes: Most of the binaries are .pdf, .ppt, .gif, .jpg, etc. that are linked to by the web pages. Shouldn't be anything that would cause a problem. What about looking at this from the other direction? Do a recursive wget -r from the Linux system to pull the data directly from the original web server and let the web server decide how to serve the files up to a web client wanting ASCII. Provided all the data is static and doesn't have any server-side includes, hierarchy oddities or complex permissions, you'll have a starting point with the data in the right format. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Adoption of UML Copy-On-Write
Matt Zimmerman writes: On Fri, Jun 18, 2004 at 04:42:43PM -0700, Brandon Darbro wrote: Huh? EVMS or LVM2 has a method of adding a writable layer to read only dasd? EVMS supports writable snapshots, using copy-on-write from an EVMS volume. I haven't tried it with read-only DASD, but in theory it should be possible for it to be used this way. Unfortunately, LVM2 snapshots (I haven't looked at EVMS to see how they do it) are the wrong way around for this use. When you write to a volume that you are snapshotting, the original data from the block is written to the snapshot volume and then the new block data is written back to the underlying original volume. The advantage is that the original volume always contains up to date data but the disadvantage is that you can't have the original volume readonly. A further benefit of a "new block data gets written to the new volume" device mapper would be that you could take raw copies (e.g. with DDR) of un-quiesced (i.e. mounted live and somewhat active) Linux (journalled) filesystems and be confident that mounting the resulting copy will (after having replayed the journal, and provided the copy lives on a writable volume itself) give you a filesystem with consistent metadata. (The result corresponds to a point-in-time shutdown of the filesystem, which journalling is designed to cope with.) Atomic snapshotting at device level (e.g. ESS FlashCopy) can give you similar functionality but is less widely applicable or convenient. Writing such a device mapper shouldn't be too hard given the nice new dm infrastructure (which I haven't looked at in detail and really must sometime): reads/writes in non-snapshot mode go straight through to the original volume. In snapshot mode, a write to block m causes the new block data to be written to the next free block, n, on the snapshot device along with an entry in a map at the front of the snapshot device mapping m -> n.
Reads look up the map on the snapshot device and get the redirected block, or else fall through to a read of the original block if unmapped. You then need a feature to merge back the new data into the original volume when desired. And you really, really don't want the snapshot device to fill up because the failure mode is much, much worse than letting an archive-style LVM snapshot fill up. I'd love to hear if EVMS has a mapping facility in that direction already. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: Linux Partition - DR readiness
Adam Thornton writes: On Wed, 2004-06-09 at 20:44, Ranga Nathan wrote: Thanks Adam I have given our zipl.conf below. From what you say, we should be OK. Well, as long as 230A is the *first* DASD that's detected, you should be. If something else comes up as /dev/dasda you have problems. The line in zipl.conf was parameters=dasd=2300-230F root=/dev/dasda1 which means that Linux will allocate a slot for each abstract device number from 2300 to 230F regardless of whether each is available, online or whatever. So /dev/dasda will always refer to device 2300, /dev/dasdb will always refer to 2301 and device 230A would always be /dev/dasdk. That means that it is not going to work if the root disk suddenly becomes 230A. The best solution, as others have suggested, is probably to arrange for the device numbers to be the same at both sites. In the absence of that, you can remove the dependency on device numbers for all non-root filesystems either by using mount-by-filesystem-label (for ext2/ext3, using e2label and LABEL=foo in /etc/fstab instead of the device name) or LVM (which pools together all PVs (physical volumes) and sorts out the logical volumes itself). However, that doesn't help for the root filesystem which needs explicit coding in /etc/zipl.conf. In the absence of a boot-time choice of image and arguments, the only way would be to create a little initrd (with mkinitrd) on which you can put a little script or program to query where you're running and choose your own root filesystem early in the boot procedure. I still think it would be easier all around to get the device numbers to match though [subliminal message: VM, VM, VM]. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
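For the non-root filesystems, the label-based approach looks like this (labels and mount points invented for illustration; a label is set with e.g. e2label /dev/dasdb1 LXVAR):

```
# /etc/fstab fragment using labels instead of device names
LABEL=LXVAR   /var   ext3   defaults   1 2
LABEL=LXSRV   /srv   ext3   defaults   1 2
```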
Re: IBM 3900/4000 printers in linux
David Boyes writes: If you're not running Linux under VM, you'll need to modify the UR driver that Mr. Beattie wrote to emit the right CCWs for printer devices, then define the 3900 printer to CUPS as spool:/dev/printer (or whatever you tell the driver to do). If anyone is missing functionality they need from the ur driver, please let me know and I can probably add it without too much difficulty. I've nearly finished porting it to the 2.6 kernel (well, it compiles, so it's all over bar the shouting :-) and the new driver model means I can add features fairly cleanly. For printers, I'd guess the best thing would be an attribute like echo 1 > /sys/bus/ccw/drivers/ur/0.0.001E/carriage-control where the first character of each line is taken to be the CCW command for that line (i.e. what traditional printer data already includes as carriage control). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself -- For LINUX-390 subscribe / signoff / archive access instructions, send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
Re: VDISK for /tmp
Rob van der Heij writes: But this does not mean it is always wise to do. I don't know enough about Linux to tell whether it is important to have high bandwidth to /tmp. More important is low latency. Lots of creation, small reads and writes and unlinks. Compilation with gcc, for example, uses /tmp to store its partial assembly and object files (unless you use the -pipe option). Since files in /tmp don't need to survive across a reboot (assuming either sanity, compliance with LSB or both) having a filesystem which doesn't even try to dribble them out to disk can be convenient. That is the reason for the existence of the tmpfs filesystem. If you do mount -t tmpfs none /tmp then you get a filesystem which exists only in page cache. You can set the size limit for the filesystem at mount time (see man page for mount) or else it defaults to half of (what Linux thinks is) main memory. When compared with a normal filesystem backed by VDISK or by a DCSS, it'll produce a different mixture of pressure and behaviour but it's not clear under what circumstances it may provide a win. With tmpfs all the /tmp pages would be mixed with everything else but at least would be backed by a nice fast paging hierarchy (one hopes). With a normal filesystem on VDISK the /tmp activity would all be focussed on one memory area but would have a longer path length (through the block layer and into CP for VDISK or just the block layer and some page faults to DCSS). I can't remember off the top of my head whether ext2 will allocate blocks just released by an unlinked file or whether it'll allocate in fresh blocks (you mention this point elsewhere but I've snipped it now). If it reuses just-released blocks (or can be persuaded to do so) then the memory footprint on the VDISK would be much friendlier and the usual tricks like mounting noatime and nodiratime would help avoid flurries of metadata writes that you don't need to hit disk, er, backing memory.
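For reference, a /tmp-on-tmpfs mount with an explicit size cap looks like this (256m is an invented figure; as noted above, the default is half of memory):

```
# /etc/fstab entry; equivalent one-off command:
#   mount -t tmpfs -o size=256m,mode=1777 none /tmp
none   /tmp   tmpfs   size=256m,mode=1777   0 0
```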
--Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: ipl boot-disk clear, SINGLE USER mode??
Rob van der Heij writes: vmlinx linux007 201 /mnt/tmp Yes, vmlinx is a little bash script that issues the CP LINK command through hcp, adds the new device to /proc/dasd/devices and issues the mount. If you replace the invocation of mount with something that just outputs a map line like -fstype=auto :/dev/$dasd$partition then you can turn it into an automount script for autofs. Then you can have vi /mdisk/linux007.201/etc/inittab automount the filesystem for you and auto-unmount it when you haven't touched it for a while. For an encore, have a similar script which detects if anyone else has linked to the minidisk and waits until it's free before outputting the line. Use another automount daemon over a separate mountpoint with a timeout of only a few seconds. Then you have a somewhat basic shared filesystem that at least can be used for letting things like cp /mdiskshare/lxconfig.300/someconfig /etc/someconfig report_summary /mdiskshare/lxconfig.300/`hostname`.`date +%j` be automated. Yes, it's fragile and any guest that holds a file open on it or cd's into a directory on it can block other guests indefinitely but I can still imagine it being useful in some environments. Naming is a bit finicky (best to enforce canonical naming, say lower case guest name, non-zero-padded lower-case hex and partition number as :2,:3,:4 with an enforced omission defaulting to partition 1, otherwise the automounter's keys and mapping give rise to a few interesting problems). Similar things are doable for /devno/123 and /volser/ABCDEF too (with the latter having further interesting namespace issues with the duplicate volsers that tend to happen on minidisks rather than real volumes). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
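The executable-map idea above can be sketched as follows. This is not the real vmlinx script: the key format (guest.devno[:partition], partition defaulting to 1), the devno_to_dasd lookup, and the commented-out CP LINK step are all assumptions and placeholders.

```shell
#!/bin/bash
# Sketch of an executable autofs map for minidisks, per the scheme
# described in the posting.  Key format and helper names are assumed.

devno_to_dasd() {
  # Site-specific: map a device number to its dasd name.  Stubbed
  # here; a real version might parse /proc/dasd/devices.
  echo dasda
}

mapline() {
  local key=$1 guest rest devno part=1
  guest=${key%%.*}
  rest=${key#*.}
  devno=${rest%%:*}
  case $rest in *:*) part=${rest#*:};; esac
  # A real script would first link and register the minidisk, e.g.:
  #   hcp link $guest $devno $devno rr
  #   echo "add device range=$devno" > /proc/dasd/devices
  # Then emit the map entry; autofs does the mount itself.
  echo "-fstype=auto :/dev/$(devno_to_dasd "$devno")$part"
}

mapline "${1:-linux007.201}"
```

Invoked by autofs with a key such as linux007.201:2, it would print a map line like -fstype=auto :/dev/dasda2 (with a real devno-to-dasd lookup in place).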
Re: ipl boot-disk clear, SINGLE USER mode??
[following up to myself, sorry] Malcolm Beattie writes: line. Use another automount daemon over a separate mountpoint with a timeout of only a few seconds. Except the timeout from non-usage of the filesystem will only trigger automount into unmounting it and the underlying minidisk will still be linked. automount doesn't provide a hook there, as far as I know. Darn. Needs thought. It's certainly solvable (e.g. have a daemon sitting around and looking at the directory of real mountpoints and doing the unlinks when necessary while avoiding races) but it's not nice (not sure if autofs creating the mountpoint will trigger a dnotify event which could be waited for) and probably not as easy and clean as I'd hoped. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Can't locate module char-major-10-224
Peter E. Abresch Jr. - at Pepco writes: I am running SuSE Linux SLES8 with the latest patches applied on a native IBM 9672-R26 LPAR. I keep receiving the following message on my console and logs: Mar 12 07:30:00 mainpepl modprobe: modprobe: Can't locate module char-major-10-224 What is causing this and how can I correct this problem or eliminate the message? Thanks. Something is trying to open a character device with major number 10 and minor number 224. If you cat /proc/devices you'll see that major 10 belongs to the misc device driver (which is a way that simple device drivers can ask for a single character device simply). According to LANANA (www.lanana.org), minor 224 is assigned to some TCPA chip which is unlikely to be what you're using so something else has usurped that number. Assuming that the device node for it has been created in /dev, do ls -lR /dev | grep '10,' and look for the name of the device node associated with 10, 224 to see if it reminds you of what's been installed. Whatever software it is, it'll be expecting you to have put a line alias char-major-10-224 foo in /etc/modules.conf so that whenever something tries to open the device, the kernel will automatically do modprobe foo instead of modprobe char-major-10-224. When the device driver is loaded, it'll register via the misc device API and then a cat /proc/misc will show the association between its minor number and name. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
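The diagnosis steps above, as a concrete sequence (the module name "foo" is a placeholder for whatever the device node turns out to belong to):

```shell
# 1. Confirm major 10 is the misc driver:
grep ' misc$' /proc/devices

# 2. Find which /dev node has major 10, minor 224; its name should
#    hint at the software that created it:
ls -lR /dev | grep '10, *224'

# 3. Map auto-load requests to the right module in /etc/modules.conf
#    (2.4-era modutils syntax; "foo" is the module you identified):
#      alias char-major-10-224 foo

# 4. After the driver loads, check the misc minor/name table:
cat /proc/misc
```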
Re: filesystem overhead
Post, Mark K writes: Since ext3 is advertised as ext2 with a journal, I would say that the journal is the only additional overhead you'll see in terms of disk space usage. Hmm, sort of. From the point of view of purely disk space usage, that's true, as you say. Further, ext3 is indeed ext2 with a journal, so you're right there too. The reason for the "sort of" is that what "ext2" means there is a little more ambiguous than you might imagine. During ext2's development history, a number of changes have been made to improve performance. Some of those have been folded into ext3 and some haven't. If you want to make an accurate comparison of ext2 and ext3, you may find in practice that those differences become significant for some workloads. Three of the changes which spring to mind are * htree indexing (from Daniel Phillips) for faster directory lookups * the Orlov allocator for choosing where to allocate disk blocks for new files (i.e. when do you put them near recently created files to get good locality of reference and when do you put them far away in order to allow room for the files to expand without fragmentation) * locking (if I recall, the locking requirements for the journalling sometimes mean the kernel has to (or wants to) do the locking in the ext3 filesystem differently from ext2). What I can't remember is which changes were carried across from ext2 to ext3 and when. There's also the difference that the journal I/Os will affect the I/O scheduling for the ordinary filesystem I/Os themselves unless the journal has been placed somewhere else carefully enough. All in all, I'd suggest people do some thinking, testing and measuring when moving from ext2 to ext3 if the workload is I/O intensive enough that it might make a difference. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
Re: z/VM access to EMC (was: Accessing DASD on a Shark from Linux under z/VM)
Jim Sibley writes: I'm curious. One of the benefits touted, and true, about Linux on zSeries vs. some other platform, is the zSeries' strength in I/O. Is this still true with FCP attached SCSI DASD? Why would the zSeries drive SCSI DASD better than Intel or Sun? John McKown Senior Systems Programmer Basically you can attach more dasd space and have more simultaneous (NOT just concurrent) data transfers going on at the same time. The I/O advantage of the mainframe is that it usually has more paths (256 channels) to more devices (65,536), thus giving a lot more parallel I/O, not that any particular device is more efficient. If you have a lot of threads active, more I/O can be done in parallel than on most Intel and other boxes. With 256 channels at say 12 MB/sec (Shark), the total aggregate rate of the mainframe would be about 3 GB/sec. Obviously, that's limited by the 2 GB backend bus on the T-Rex. The general idea is right but the bus limit is wrong: 2GB/sec I/O for an entire box would be very poor. Rather than have zSeries damned with faint praise, allow me to hype up its I/O capabilities a bit more. 2GByte/sec is the speed of a single STI bus and the smallest T-Rex (one book) has 12 STI buses while the largest (four books) has 48 STI buses for a total of 96GByte/sec bandwidth. Channel cards, whether ESCON or FICON, are spread over domains/slots to take advantage of the STI buses available. You can't fill all of that bandwidth with DASD I/O (there's a limit of 120 x FICON 2Gbit/sec ports--60 features on z990--making a nominal 24Gbyte/sec) but it's way more than 2GB/sec. ESCON hits the limit of number of channels way before any hardware bandwidth limit but even so you only have 16 ESCON ports per card. Each STI bus fans out to four slots and, for ordinary I/O, gets multiplexed down to 333MByte/s, 500MByte/sec or 1000MByte/sec as appropriate. 
For ESCON, it uses 333Mbyte/s (which nicely encompasses the 16 x 20MByte/s nominal signalling for an ESCON card) and for FICON, 500MByte/sec (which nicely encompasses the 2 x 200MByte/s nominal for the dual-port FICON Express cards). The buses and features are, IMHO, very well designed to ensure that there are no bottlenecks or caps right through to the backend memory bus adapters (MBAs) of the memory subsystem. For those interested in the details, Chapter 3 of the z990 Technical Guide redbook (SG24-6947) from www.redbooks.ibm.com elaborates on this and describes it very well. Also, the mainframe typically has 2 processors dedicated to driving the devices (SAPs), so less real cpu is used for I/O. In fact, not just the SAPs (which deal with initiating the I/Os). Each channel card also is fairly powerful and has the responsibility of doing much of the I/O work itself. For example, each z900 FICON card has two 333MHz PowerPC processors (cross-checked for reliability) to do the work. Again, for lots of detail, see the z900 I/O subsystem paper by Stigliani et al in the z900 edition of the IBM Journal of R&D (Vol 46 No 4/5 Jul/Sept 2002). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
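The bandwidth figures above can be sanity-checked with some quick arithmetic (treating GByte as 1000 MByte for round numbers):

```shell
echo $(( 12 * 2 ))     # one-book z990: 12 STI buses x 2 GByte/sec = 24 GByte/sec
echo $(( 48 * 2 ))     # four books:    48 STI buses x 2 GByte/sec = 96 GByte/sec
echo $(( 120 * 200 ))  # 120 FICON ports x 200 MByte/sec = 24000 MByte/sec, i.e. 24 GByte/sec
```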
Re: nfs hangs on NetApp NAS device
Adam Thornton writes: On Wed, 2004-02-25 at 12:42, McKown, John wrote: If you do not recommend the soft option (at least for R/W), what else is possible? If the NFS server dies or is unavailable for some reason, does that mean that all the client boxes which use it should die as well? Yes. If you're mounting files you need to have read-write, and the underlying filesystem goes away, you absolutely do not want to continue operations with the files you have open. If you do keep going, e.g. with a soft mount, you're looking at Data Corruption City. To expand on this a little: there are two independent two-way choices for "how do I want the NFS filesystem to behave when it stops behaving like the local filesystem it's pretending to be?". One choice is soft v. hard, the other choice is intr v. nointr. The defaults are hard and nointr. The four combinations have the following properties: hard,nointr The default. Makes the filesystem behave (a little more) like a local filesystem in the sense that a read or write of n bytes will wait uninterruptibly until it has fully succeeded or failed[*]. hard,intr The useful alternative. Weakens the pretence of local filesystem semantics but only a little. If an interrupt (SIGINT, Ctrl/C, ...) occurs during a read(), then it returns with errno EINTR or a short read (not sure if NFS will actually do the latter). This doesn't usually confuse applications since EINTR must be handled anyway in the case it arrives just before the read and if the application is designed to cope with reading from terminals, pipes or devices then it needs to cope with short reads anyway. An EINTR in the middle of a write() is a bit nastier since you don't know what happened server-side (but then if you cared about exactly what data is on the server you'd either take more care of the NFS server or not use NFS). 
soft,nointr (or soft,intr I suppose) This weakens the pretence of a normal local filesystem even more, at least insofar as people trust quality of implementation as well as the letter of the law. If the NFS server times out (either because it's down or because the network's congested or because various timeout values have been tweaked) then the read()/write() returns with errno EIO meaning an I/O error. Now, many applications follow the methodology of "if you can't handle it, don't test for it" and others follow the methodology of being coded by a lazy git who doesn't even test for errors, in which case your data is toast. Yes, it would also be toast if the local filesystem started giving I/O errors but such things are normally handled at a different level (shout at whoever implemented the RAID solution and/or the hardware vendor). Of the choices available, hard,intr tends to give much more useful and safe semantics than soft but, even so, needs careful thought and effort which could have been prevented by more effort in making the NFS server more reliable. A default hard mount will pick up the read/write transparently when the server comes back up again given the statelessness of NFS[*] so it's only long outages that matter. --Malcolm [*] Yes, those are lies but are close enough for this explanation. -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
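The combinations above translate into mount options along these lines (server and path names are illustrative):

```shell
# One-off mount with the recommended semantics:
mount -o hard,intr nfssrv:/export/data /data

# /etc/fstab equivalents of the combinations discussed:
#   nfssrv:/export/data  /data  nfs  hard,intr  0 0   # safe, interruptible
#   nfssrv:/export/data  /data  nfs  hard       0 0   # the default: hard,nointr
#   nfssrv:/export/data  /data  nfs  soft       0 0   # times out with EIO
```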
Re: Basevol/Guestvol
On Thu, 26 Feb 2004 09:45:53 -0500, Scully, William P [EMAIL PROTECTED] wrote: I recently gave a presentation at HillGang which describes a simplified approach for Basevol/Guestvol in the SuSE operating system. If you forward me at William dot Scully at CA dot COM your e-mail address, I'll fire off to you a copy of an HTML document which describes the approach I used. (I believe Mark Post also has a copy and intends to put it on the LinuxVM.org site, when he next updates those pages.) Bob writes: Thanks for the reply. Yes, I saw that presentation material and between that material and the Redbook I have been able to understand and setup everything except for where to put the (mount --bind)'s for the guestvol packs into the rc.d directory structure to have them so that the mounts are done at the correct time. I'd like the presentation too, please. As for the boot time (and shutdown time) details: tweaking RedHat's scripts and ordering was the main nuisance when I was designing basevol+guestvol. It turned out to be rather easier for SLES7 but I didn't have a chance to do it properly (or for SLES8) since my test VM/Linux system is very tight on disk space and I don't have the time/focus of a residency period to extend things. I wish Al Viro would finish off the unionfs he's been talking on and off about writing for years: we could do plenty of marvellous sharing setups with that. Even a cut-down version would be almost as useful (two layers only, bottom layer only read-only, no merging, no white-outs for unlink(), just mkdir to create an empty directory on top of one below). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: ulimit settings
Kevin Ellsperman writes: I need to be able to set the ulimit for nofile - the maximum number of open files. It defaults to 1024 and WebSphere V5 needs this value set significantly higher. I can change this value from the command line as root, but I cannot set this value in /etc/security/limits.conf. Anything I specify in limits.conf just gets ignored. This is especially crucial for us because we do not run WebSphere as root, but as a non-root user. Has anybody been able to change this value on a permanent basis for a user? The configuration file /etc/security/limits.conf is only used if the method you use to start the user session uses the PAM module pam_limits. In SLES8, for example, the default configurations are such that login and sshd use pam_limits but su doesn't. Look in /etc/pam.d/sshd and /etc/pam.d/login and you'll see that the last line of each is session required pam_limits.so which is what sets the resource limits based on /etc/security/limits.conf. If your WebSphere start up script uses su to get from root to the non-root user (or if it does its own setgroups/setgid/setuid stuff) then nothing will be looking at limits.conf. Another thing to note is that pam_limits will fail to grant the session at all if the attempt to set the chosen limits fails. In particular (as I've just found out by testing), if you put lines in limits.conf which have foo hard nofile 11000 then you will no longer be able to log in to username foo by ssh because the ssh daemon itself has inherited the default limit of 1024 from its parent shell and so can't increase its child's limit beyond its own. Similarly, if you add the line session required pam_limits.so to /etc/pam.d/su then you will not be able to su to a username which has a limit higher than 1024 for nofile configured in limits.conf. The answer for sshd is to start the daemon off with a higher limit of its own, e.g. 
add lines to /etc/sysconfig/ssh (which in SLES8 anyway gets sourced at sshd startup time): ulimit -H -n 2 ulimit -S -n 2 to set the process' hard and soft open files limits to 2 before the sshd itself gets execed. For su, you're going to have to set the limits before the su which means it's probably not worth using limits.conf at all: if you have to raise the limits before su'ing then you might as well set them to the right values to start with and not bother using pam_limits and limits.conf. In other words, just edit the startup script for WebSphere (or an /etc/sysconfig file if it's nicely behaved enough to source one) to set the limits higher before it starts up, using ulimit commands as above for bash. Note that the exact syntax is shell-dependent since such commands are necessarily shell builtins (it's no good calling out to a separate program because the rlimits are inherited only by children and so your own shell wouldn't have its own limits changed). For Bourne flavoured shells, ulimit is what you want; for csh flavoured shells you'd use limit with a different syntax (not that you'd ever be writing scripts in csh of course, but just fyi for interactive use). The sysctl fs.file-max (equivalently /proc/sys/fs/file-max) is a system-wide limit which you may want to raise too if you think it's in danger of being reached. For SLES8 (at least), it appears to be 9830 by default which is rather more than the per-user value of 1024 that you're hitting first but still may be worth increasing if there are going to be a number of processes all wanting more than a couple of thousand or so open files. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
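The inheritance constraint described above is easy to demonstrate without touching sshd: a child shell can only move its limits within what it inherited, which is exactly why the daemon must be started with a high enough limit before it can grant one.

```shell
# A child bash lowers its own soft nofile limit; the parent's copy is
# untouched, and the child cannot raise it above the inherited hard limit.
bash -c 'ulimit -S -n 512; ulimit -S -n'   # prints 512
```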
Re: Chan Attached Tape Major Minors Redux
James Tison writes: Back a couple of summers ago, I recall Sergey Korzhevsky, myself, and maybe a couple others involved in trying to figure out what the tape majors and minors were. IIRC, Sergey finally put all the pieces together. I just want to review them now that I've had a chance to actually channel attach a few 3490s and run tests on the driver's devices. By the way, I'm running SLES 8.0 without a maintenance agreement, so I could easily be wrong. There just seems to be no good document where all this (very simple) stuff is written down. I don't run the devfs, either. It's all documented in the Device Drivers and Installation Commands manual (LNUX-1313-02, Chapter 5 Channel-attached tape device driver) which is available directly as http://www10.software.ibm.com/developerworks/opensource/linux390/docu/lx24jun03dd01.pdf which is the link on the Linux on zSeries Library web page at http://www-1.ibm.com/servers/eserver/zseries/os/linux/library/index.html The tape device major -- whether block or character -- is always 254. Not necessarily: they are dynamically allocated (presumably nobody got around to getting a number allocated from LANANA) which means that the driver will look for the first free number available starting at 254 and going downwards. For example, if you have cpint loaded first then cpint would allocate char major 254 for itself and the tape char device would get major 253 whilst the tape block device would get major 254 (assuming that no other block device had been loaded that had snaffled major 254 first). Rather than guess, look in /proc/devices after the driver is loaded and look for the allocated numbers in there. The block device minors are always single within the major. For example, /dev/btibm0 is 254:0, /dev/btibm1 is 254:1, etc. Hmm, TFM says Character device [...] The minor number for the non-rewind device is the tape device number of /proc/tapedevices multiplied with 2. 
The minor number for the rewind device is the non-rewind number +1. Block device [...] The device nodes have the same minor as the matching non-rewinding character device. which would imply that block device minors would be 0,2,4,... The character device minors come in pairs, and they're sequential within the device major. The rewindable member of the pair is ODD. The non-rewindable member of the pair is EVEN. That agrees with the manual. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
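The manual's scheme can be tabulated for the first few drives:

```shell
# For tape device number n (its position in /proc/tapedevices):
# non-rewind char minor = 2n, rewind char minor = 2n+1, block minor = 2n.
for n in 0 1 2; do
  echo "tape $n: minors $((2*n)) (non-rewind), $((2*n+1)) (rewind), $((2*n)) (block)"
done
```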
Re: RE : signal shutdown
Monteleone writes: This is what i get when i run bootshell lnxtrs7 login: /sbin/bootshell: /sbin/bootshell: cannot execute binary file [...] - run gcc -c ./bootshell-1.3.cc -o /sbin/bootshell [...] The -c option produces an object file, not an executable. Leave out the -c option and gcc will also do the link stage and create an executable for you. Another option for consideration may be the ext_int kernel module I wrote which lets you trap the external interrupt number of your choice and have it deliver a signal of your choice to the PID of your choice. I used that when doing the Large Scale Linux Deployment redbook to be able to trigger a remote shutdown of a Linux guest before the SIGNAL SHUTDOWN support was widely available. See section 9.8 of that redbook for details. Using it to trigger a shutdown is nice and simple since you only need to deliver a SIGINT (signal 2) to init (PID 1) and init will then do the ctrlaltdel line in your /etc/inittab (similar to how the SIGNAL SHUTDOWN does it, except that that communicates extra data (timeout info) out of band rather than just being the external interrupt). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: RE : RE : signal shutdown
Monteleone writes: Have a look please to the response i get when i try to compile ext_int: lnxtrs7:/ext_int # gcc ext_int.c -o ext_int [...] Is there a particularity to compile this module ? Yes, it's a kernel module so it's not the same as an ordinary userland executable. For longer modules, I normally provide nice READMEs and Makefiles but this one was so short I didn't. Sorry. The following is the sort of thing you need gcc -D__KERNEL__ -I/lib/modules/2.4.19-3suse-SMP/build/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce -DMODULE -c ext_int.c That works on SLES8 which makes the necessary kernel include files available in /lib/modules/2.4.19-3suse-SMP/build/include (for the kernel version I have). If you can't find an appropriate directory in /lib/modules for your kernel version (or it doesn't have a build subdirectory) then we'll have to play games installing the kernel source package in which case let me know what distribution you have (and it may have a kernel-includes package). An older convention for kernel include files was to put them in /usr/include/linux and /usr/include/asm or to use them from a source tree in /usr/src/linux/include but that can lead to hard-to-find problems when you have multiple kernels or source trees installed. Given that this module only uses four particular kernel functions, it's not really sensitive to versioning differences but I don't want to do anything tasteless like send a binary module around. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Too Many Open Files
Craig, Roger C [ITS] writes: We are running Linux under VM on a mainframe. We keep running into this Too many open files problem with one of our WebLogic Servers: Sep 8, 2003 5:58:25 AM CDT Notice WebLogicServer 000203 Reopening listen socket on port 7201 Sep 8, 2003 5:59:30 AM CDT Critical WebLogicServer 000204 Failed to listen on port 7201, failure count: 79, failing for 1,063,013,260 seconds, java.net.SocketException: Too many open files Sep 8, 2003 5:59:30 AM CDT Critical WebLogicServer 000206 Attempting to close and reopen the server socket on port 7201 We end up having to bounce the server (or Linux image) when we get this condition. Has anyone experienced this? Also is there a good way to display the number of open files? It's just an administrative limit these days (either via /etc/security/limits.conf for sessions initiated via PAM using pam_limits or via a default of, usually, 1024). Raise the limit in whichever way you like: WebLogic may have a preferred way of doing this depending on how its username starts up a session or you can use limits.conf (if WebLogic goes via a PAM config that includes pam_limits) or else use ulimit (bash) or limit (tcsh) in your daemon startup script. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: db2 using tsm
Noll, Ralph writes: db2 error message using tsm anyone seen this db2 = backup db police online use tsm DB21019E An error occurred while accessing the directory /root. db2 = /root is typically root's home directory. If you were running db2 as a non-root username then you would not have permission to access /root. This might happen, for example, if you used su db2user rather than su - db2user from root (which would leave the HOME environment variable set to /root) and db2 then tried to access some per-user configuration file living under $HOME. Type env and see which environment variables refer to /root. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
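The effect is easy to see without DB2: HOME simply follows whether the su was a login ("-") su or not. A root-free illustration, setting HOME explicitly to simulate the two cases (the db2user home path is an assumption):

```shell
# Plain "su db2user" keeps the caller's environment, so HOME stays /root;
# "su - db2user" starts a login shell and resets it.  Simulated here:
HOME=/root sh -c 'echo $HOME'           # prints /root  (what "su db2user" leaves)
HOME=/home/db2user sh -c 'echo $HOME'   # prints /home/db2user  (like "su - db2user")
```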
Re: adding dasd under SUSE Enterprise 8...
Stefan Kopp writes: more minidisks). I only have to reboot Linux and the disks are online. You don't have to reboot Linux It was answered here yesterday, use echo "add device range=xxx-yyy" > /proc/dasd/devices Ooops, sorry, you're right. I've always thought I had to reboot when I've updated the user.direct because the new addresses were not active. Now I've spent some time with the bookmanager, nice thingy. A #cp define mdisk returns "Invalid option - MDISK", which I've solved with the entry OPTION DEVMAINT for the designated z/VM user. Now I can enter #cp define mdisk 205 1 1500 xyz - wohaa - Linux recognizes the new disk. Ouch, you don't want to do that. DEFINE MDISK is intended for a privileged user to bypass the table of real minidisks and just carve out any extent at all from a device. Dangerous stuff and rarely needed. Take that OPTION DEVMAINT off the directory entry because all you need to do is CP LINK * 205 205 W on the Linux guest and it will pick up the changes to the directory which were made behind its back (adjust link mode to taste). This will also trigger Linux into noticing the presence of the new disk and it will bring it online (if it's in the list of eligible DASD devices and hasn't had a "set device range=... off" done on it). Regards, --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
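The suggested sequence, as run from the Linux guest itself (using cpint's hcp on a 2.4-era kernel, matching the /proc/dasd/devices interface mentioned in the thread):

```shell
# Re-link the guest's own 205 minidisk to pick up directory changes
# made behind its back (adjust the link mode to taste):
hcp link '*' 205 205 W

# If Linux doesn't bring the disk online by itself, do it manually
# via the 2.4 DASD driver interface:
echo "add device range=205" > /proc/dasd/devices
```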
Re: Far 3390-mod 9's
James Melin writes: Using this strategy from the link on Linux VM mount /dev/dasdh1 /mnt cd /var tar -clpSf - . | (cd /mnt; tar -xpSf - ) produced 3 errors tar: ./lib/mysql/mysql.sock: socket ignored tar: ./run/printer: socket ignored tar: ./run/.nscd_socket: socket ignored Those three errors translated into missing items in the copy rockhopper:/var # diff -r /var /mnt Binary files /var/db2/.fmcd.lock and /mnt/db2/.fmcd.lock differ Only in /var/lib/mysql: mysql.sock Only in /var/run: .nscd_socket Only in /var/run: printer I am concerned that such things are not being copied in this manner. Is there a way to make TAR grab these as well? These are Unix domain sockets which are created when an application binds an AF_UNIX socket into the filesystem namespace. They do not have any use outside of the context of the process which bound it or clients which connect to it (assuming the process even exists any more). They are not copyable and don't contain data that you need to be concerned about. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HELP - using sed to translate upper to lower
Adam Thornton writes: On Mon, 2003-06-16 at 10:55, McKown, John wrote: As best as I can tell, the following sed script should change all the upper case to lower case. It is not working (SLES7) echo XX | sed 'y/[A-Z]/[a-z]/' What am I doing wrong? I always use tr: echo XX | tr '[A-Z]' '[a-z]' Note that if you need to enter the murky waters of i18n then you also need to distinguish between tr A-Z a-z which will only lowercase the 26 unadorned uppercase letters and tr '[:upper:]' '[:lower:]' which will also lowercase accented characters for reasonably straightforward locale settings. If you want to handle more complex Unicode lowercasing then you want to be using Perl's tr operator (and/or uc(), lc(), regexps etc.). If you want *really* weird Unicode stuff in all its full glory then even Perl may not get you there (and you'll also have my full sympathy). (Actually, the y/// syntax in Perl is a synonym for tr/// for those who like the sed syntax plus you still get the nicer range behaviour and hence echo XX | perl -pe 'y/A-Z/a-z/' works as you'd expect it would.) --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
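For the record, the reason the original sed attempt fails is that sed's y/// command transliterates literal characters only, with no range expansion: y/[A-Z]/[a-z]/ maps just the five characters [, A, -, Z, ] to [, a, -, z, ]. A quick side-by-side:

```shell
echo AXZ | sed 'y/[A-Z]/[a-z]/'         # prints aXz: only A and Z are in the literal list
echo AXZ | tr '[A-Z]' '[a-z]'           # prints axz: tr expands the range
echo AXZ | tr '[:upper:]' '[:lower:]'   # prints axz, and handles locales too
echo AXZ | perl -pe 'y/A-Z/a-z/'        # prints axz: Perl's y/// takes ranges
```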
Re: batch compiles
McKown, John writes: OK, so I have a corrupted mindset, coming from MVS grin. But suppose that I want to compile a LOT of programs. In MVS, I code up some JCL and submit it to run later. When it completes, I get a notify to my TSO id and look at the output in SDSF. I repeat this for however many compiles that I want to do. Perhaps doing the submissions over a period of time. How do I do that in Linux (or any UNIX)? In VM/CMS, I remember a CMSBATCH virtual machine which worked a bit like the MVS initiator. The best that I can think of to do in Linux is: I'm surprised I haven't yet seen anyone else mention the batch command that comes as part of the at suite. It'll probably already be installed. It's very useful to be able to do at now somecommand and at will package up your current environment variables and arrange for the command to run now in the background, with all stdout ending up sent to you as mail to your username once the job is finished. For more complex resource control and timing, batch lets you set up queues which run at particular times and when the load average is low enough. It's certainly not as powerful as JES but it may suffice for basic batch usage. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
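A sketch of the at/batch workflow described above (the make invocation and paths are illustrative):

```shell
# Queue a compile to run immediately in the background; stdout/stderr
# are mailed to your username when the job finishes:
echo 'make -C ~/proj all' | at now

# Or defer it until the system load average drops below the threshold:
echo 'make -C ~/proj all' | batch

# Inspect and manage the queue:
atq          # list pending jobs
atrm 4       # remove a job by number (the 4 is illustrative)
```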
Re: reset a computer
Tzafrir Cohen writes: On Sun, Mar 16, 2003 at 02:25:49AM +0200, Tzafrir Cohen wrote: Trying to explain the question once again On Thu, Mar 13, 2003 at 05:59:03PM +0200, Tzafrir Cohen wrote: Hi Short version of the question: How do I do a hard-reset of a Linux guest from within Linux? Note that I don't mean to IPL the boot specific device: I need to re-run profile.exec from CMS. I know I can do that using hcp. (As if the user has logged off and re-logged on) The answer is, of course, hcp 'i cms'. I have no idea why it didn't work for me when I first tried it (it got the system stalled in CMS, so I figured it as yet another one of the things that don't work). You may want to do hcp 'i cms parm autocr' so that CMS doesn't wait for you to hit Enter. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: QDIO for virtual NIC
Gregg C Levine writes: Has anyone actually tried out IPv6 under Linux, while it's running on an appropriate Z-Box, and his relatives? I've tried it over the last couple of days after reading up on it again (the last time I played with IPv6 and the 6bone was before/during the first address bit allocation wars, which dates my activity somewhat :-). The good news is that I can set up a sit tunnel from a Linux/ia32 box to a Linux SLES8 guest under VM and it works fine. SLES8 comes out of the box with the basic IPv6 tools and ping6/tracepath6 work fine. That shows the general IPv6 stuff is OK. The bad news is that I can't get a virtual QDIO interface (i.e. on a QDIO GuestLAN) to work with IPv6. This may well have something to do with the fact that the kernel logs the line "qeth: IPv6 not supported on eth0" but I ploughed on regardless. This is z/VM 4.3 service level 0202 running 64-bit (second level) on a z900. The GuestLAN is type QDIO (i.e. not HIPER) and each of two Linux guests has a virtual NIC defined and coupled to it. They run SLES8 (I've tried with both the shipped qeth driver and the qeth-susekernel-2.4.19-s390-1 driver which developerworks implies is later and fixes a few bugs). One of the bugs fixed in that claims to be "MAC address could not be determined for VM Guest LAN interfaces" but even with the new driver "ip link show eth0" still shows zeroes for the MAC address. Both those drivers work fine with IPv4. 
As far as IPv6 is concerned, an "ip -6 addr ls" shows

1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    inet6 ::1/128 scope host
3: eth0: <MULTICAST,UP> mtu 1492 qdisc pfifo_fast qlen 100
    inet6 fe80::200:ff:fe00:0/10 scope link
4: tr0: <BROADCAST,MULTICAST,UP> mtu 1492 qdisc pfifo_fast qlen 100
    inet6 fe80::a00:5aff:fe0c:c6aa/10 scope link

from which we can determine that the link-local IPv6 address for tr0 is behaving (with the low bits correctly calculated from its MAC address) but that the link-local address for eth0 (the GuestLAN interface) doesn't look right (especially when an "hcp q nic 7000" shows its (faked) MAC address as 00-04-AC-00-00-00). Interestingly, the other Linux guest (still using the original SLES8 qeth module) shows exactly the same link-local address (oops), which led to a short "hooray, ping6 of the other guest's IPv6 link-local address works" before I realised that the duplicate address actually meant it was pinging itself. Regardless of the link-local address, I tried adding site-local addresses (fec0::2 and fec0::9) with appropriate routes to the guests sharing the GuestLAN but although setting the addresses and routes didn't give any errors, a ping6 from one guest to the other just sat there (no errors; the behaviour you'd get from packets dropped on the floor). The latest Device Drivers and Installation Commands manual (for the May 2002 stream) says about the qeth driver "Support for IPv6 applies to Gigabit Ethernet (GbE) and Fast Ethernet (FENET) only", which may mean we don't support GuestLAN NICs or may mean we do support GuestLAN NICs because they're the virtual equivalent of a real Gigabit Ethernet NIC. Given the ultra-concise "qeth: IPv6 not supported on eth0" message, it's possibly the former but, unfortunately, I can't go check the code to tell. Does anyone know any further detail for sure about IPv6 support for QDIO (GuestLAN and otherwise)? --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
Re: Network Problems with new kernel....
Geyer, Thomas L. writes: I am running SLES7 under zVM 4.3 using a Guest Lan. The current kernel is 2.4.7. I have built kernel 2.4.19; when I reboot with the new kernel I see the following errors:

Initializing random number generator ... done
modprobe: modprobe: Can't locate module eth0

modprobe looks for a module or alias called eth0, looks up its module dependencies and then tries to load it/them. Check whether you have a line "alias eth0 qeth" in /etc/modules.conf or else modprobe won't even look for qeth. Since you later say it works for an earlier kernel, I guess this isn't the problem. [...] When I log on to the virtual machine through TN3270, I see (using the lsmod command) that the qdio.o and qeth.o modules have not been loaded. I then use the insmod command to load qdio.o and qeth.o followed by the ifconfig and route commands to get the Linux virtual machine on the network. If you are using insmod on qdio then qeth then you are resolving the module dependencies yourself. I suspect if you tried "modprobe qeth" (without loading qdio) then you might run into the same problem. The table of module dependencies is per-kernel-version-tree. You'll need to run a "depmod -a" to rebuild the dependencies for a new kernel. You may need to fiddle with explicit options to depmod to ensure you build the dependencies for the right kernel and put them in the right place. Look at the man page for depmod for details. Often, distributions will run an automatic depmod sometime during boot. This normally removes the need to do a manual depmod but equally makes it easy to forget when one *does* need to do one. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
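A hedged sketch of the two fixes described above (2.4-era modutils). The grep target here is a scratch copy so the sketch is self-contained; on a real system you would edit /etc/modules.conf itself and run depmod/modprobe as root.

```shell
# 1. Make sure modprobe knows what "eth0" means:
cp /dev/null /tmp/modules.conf.example        # stand-in for /etc/modules.conf
echo 'alias eth0 qeth' >> /tmp/modules.conf.example
if grep -q '^alias eth0 qeth' /tmp/modules.conf.example; then
    found=yes
else
    found=no
fi
echo "found=$found"
# 2. On the real system, after installing the new kernel:
#      depmod -a       # rebuild /lib/modules/<version>/modules.dep
#      modprobe qeth   # now loads qdio first via the dependency table
```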
Re: TCPDUMP
Eddie Chen writes: I am looking at output from a tcpdump, and I found that the fragments of a fragmented datagram are sent last fragment first. Is this correct?

(frag 9311:920@8880) (DF)
(frag 9311:1480@7400+) (DF)
(frag 9311:1480@5920+) (DF)
(frag 9311:1480@4440+) (DF)
(frag 9311:1480@2960+) (DF)
(frag 9311:1480@1480+) (DF)
1472 proc-7 (frag 9311:1480@0+)

Yes, it's a useful performance optimisation. It means the recipient can allocate a network buffer just the right size for the whole datagram as soon as it receives the first fragment. That saves it having to reallocate larger and larger buffers for each fragment that comes in. IIRC, it used to confuse one or two grotty old embedded TCP/IP stacks but that was years ago and I'd hope that everything today can handle it. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: Thanks Malcolm. I checked my chandev.conf and it did contain the underscore. I probably messed up my original post. I have now defined a new hipersocket and when trying to initialize it I get:

qeth: Trying to use card with devnos 0x963/0x964/0x965
qeth: received an IDX TERMINATE on irq 0x14/0x15 with cause code 0x08
qeth: IDX_ACTIVATE on read channel irq 0x14: negative reply
qeth: There were problems in hard-setting up the card.

At least it is a different cause code. Better make that triple of device numbers start on an even boundary. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: Quoting Malcolm Beattie [EMAIL PROTECTED]: > Better make that triple of device numbers start on an even boundary. --Malcolm > Why? I'm sure I've seen somewhere that it's a requirement but I can't remember exactly which part of the system requires it, and the only reference I can find at the moment is one which only mentions the requirement for OSE and not OSD (i.e. for non-QDIO). However, something does look a bit odd about your new try:

Adapter 0963  Type: HIPER  Name: UNASSIGNED  Devices: 3
Port 0  MAC: 00-04-AC-00-00-0C  LAN: SYSTEM LNXLAN02  MFS: 16384
Connection Name: HALLOLE  State: Session Established
Device: 0964  Unit: 001  Role: CTL-READ
Device: 0965  Unit: 002  Role: CTL-WRITE
Device: 0963  Unit: 000  Role: DATA

Notice that VM shows that the triple of device numbers 963,964,965 has been switched around to the order 964,965,963 in order for the first even number to become the CTL-READ device. The error message from your Linux guest was

qeth: Trying to use card with devnos 0x963/0x964/0x965
qeth: received an IDX TERMINATE on irq 0x14/0x15 with cause code 0x08
qeth: IDX_ACTIVATE on read channel irq 0x14: negative reply
qeth: There were problems in hard-setting up the card.

and it may be worth checking whether Linux has decided to switch around the device numbers in the same way, perhaps by checking in /proc/subchannels or /proc/chandev whether subchannel 0x14 really is the control read device. On the other hand, it may be simpler just to enforce the even-boundary constraint, if only to avoid having those permuted device numbers appearing. I guess that there may even be other differences since this time you're using a hipersockets device instead of a qdio one and it'll have a different portname and so on (which is case sensitive and so may be worth checking too: even if your OS/390 people see/quote it in upper case, it's possible that the underlying portname could be lower case). 
Setting up QDIO/Hipersockets connections involves quite a few subtle little requirements and getting any of them wrong can lead to the sort of errors you're seeing. It's a bit of a nuisance but usually it's just a question of checking every little thing one more time to find the one that you're running into. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: HiperSockets and Guest LAN
Jørgen Birkhaug writes: Ok - I've ditched the uneven device and reverted back to an even boundary. z/VM now sees the following *after* trying to initialize the qeth module:

Q NIC DETAILS
Adapter 0960  Type: HIPER  Name: UNASSIGNED  Devices: 3
Port 0  MAC: 00-04-AC-00-00-0E  LAN: SYSTEM LNXLAN02  MFS: 16384
Connection Name: HALLOLE  State: Startup
Device: 0960  Unit: 000  Role: CTL-READ
Unassigned Devices:
Device: 0961  Unit: 001  Role: Unassigned
Device: 0962  Unit: 002  Role: Unassigned

The dev numbers do match the contents of /proc/subchannels. I'm slightly perplexed as to why the NIC is in State: Startup and why 0961 and 0962 are Unassigned. Linux, on the other hand, reports:

qeth: Trying to use card with devnos 0x960/0x961/0x962
qeth: received an IDX TERMINATE on irq 0x11/0x12 with cause code 0x17
qeth: IDX_ACTIVATE on read channel irq 0x11: negative reply
qeth: There were problems in hard-setting up the card.

Back to scratch. OK, let's keep going at it. What's the output of "# cat /proc/chandev" on the Linux side (1) when you've freshly rebooted it, (2) after you've caused the chandev settings to take effect (whether you use SuSE's rcchandev, echo a read_conf to /proc/chandev or whatever) and also (3) after you do the "modprobe qeth"? --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Virtual network topology questions...
Nix, Robert P. writes: 9672, so no hiper-sockets. In trial mode, so no money to buy a distribution or support, but with the potential to do so if / when it goes into production. Potentially running DB2 and WebSphere, so SuSE instead of RedHat, as IBM supports SuSE more so than RedHat, in our experience. I'd like to work within the confines I have. You don't need physical hiper-sockets hardware for the GuestLAN and virtual hipersockets provided by z/VM 4.3. GuestLAN (or virtual hsi) simplifies many things. Unless there's absolutely no way for you to use z/VM 4.3, you're good to go. Part 2 of the ...zSeries... Large Scale Linux Deployment redbook (SG246824) covers these sorts of issues and includes chapters on Hipersockets and z/VM GuestLAN, TCP/IP direct connection and TCP/IP routing. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: More NSS Info
David Boyes writes: You would need at least one non-root/swap address mounted as /config or something for storing the configuration of what goes where, and you'd have to move at least a few of the utilities (eg mount, ifconfig, etc) from /usr to /sbin (generating statically linked versions) and include /sbin in the root filesystem. The basevol+guestvol environment I describe in the ...zSeries...Large Scale Linux Deployment redbook (SG246824) (I really ought to bind that phrase to a single keystroke :-) lets you have a readonly root filesystem which is linked to (readonly) and booted by any number of clones. The boot process then mounts a (potentially very) small guest-specific readwrite volume (whatever disk is at devno 777) and binds all the necessary writable directories into the filesystem. Other parts of the redbook then describe how you can then bootstrap yourself to get other information (via a PROP guest and then via LDAP). We can do better than Sun since we have shared disks in known, manageable namespaces at boot time and since we have Al Viro's namespace support in Linux for bind mounts (again, described in the redbook for those unfamiliar with the concept). [Next is updates-in-place with CLONE_NEWNS and pivot_root() and/or immediate kernel-to-kernel reboots when kexec() is stable...] I'll set up the NSS stuff on my own VM system and get it to work nicely with basevol+guestvol (which I've just got working properly with SuSE SLES7; the original redbook environment having some dependencies on the RedHat boot scripts). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
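A minimal sketch of the bind-mount mechanism mentioned above (binding writable directories from the guest-specific volume into a read-only root). The paths are illustrative only, not the redbook's layout, and the mount needs root, so the sketch degrades gracefully without it:

```shell
# Simulate a guest-specific writable volume and a read-only root tree:
mkdir -p /tmp/guestvol/var-log /tmp/root/var/log
touch /tmp/guestvol/var-log/messages
# Bind the writable directory so it appears inside the "root" filesystem:
if mount --bind /tmp/guestvol/var-log /tmp/root/var/log 2>/dev/null; then
    ls /tmp/root/var/log        # shows "messages" via the bind mount
    umount /tmp/root/var/log
    bound=yes
else
    bound=no                    # not root, or bind mounts unavailable
fi
echo "bound=$bound"
```

The boot-time scripts in the basevol+guestvol scheme do essentially this for each directory that must be writable per guest.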
Re: OSA express gb adapter
Adam Thornton writes: On Wed, Nov 06, 2002 at 09:07:07PM -0500, David Boyes wrote: Does Red Hat include the OCO modules for QDIO on an OSA? Thanks. Kyle Stewart The Kroger Co. No. However, IBM does supply the modules built for RH, and they also have a procedure for building a new initrd with those modules: http://oss.software.ibm.com/developerworks/opensource/linux390/special_oco_rh_2.4.shtml Plus there's a detailed practical run-through of a RedHat+OCO install in Appendix B of the ...zSeries...Large Scale Linux Deployment redbook, SG246824 (go to http://www.redbooks.ibm.com and type large scale linux into the search field). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: OSA express gb adapter
Crowley, Glen L writes: I have an LPAR that shares a gb ethernet osa express adapter with OS/390 LPARs. I am using the SUSE distribution as it is the only one that I found that includes the OCO modules. I get the error that follows when setting up my network definition. I have on occasion been able to get this to work, but 99% of the time it fails. Anybody have any ideas that might help me?

Enter the device addresses for the qeth module, e.g. '0xf800,0xf801,0xf802' or auto for autoprobing (auto):
Starting with microcode level 0146, OSA-Express QDIO requires a portname to be set in the device driver. It identifies the port for sharing with other OS images, for example the PORTNAME dataset used by OS/390.
Do you have OSA Express microcode level 0146 or higher? y
Note: If you share the card, you must use the same portname on all guests/LPARs using the card.
Please enter the portname (must be 1 to 8 characters) to use: osa1

That name might be case-sensitive; I can't remember if I've ever tried without explicit uppercase so it's only a guess. Does trying OSA1 in caps make a difference?

qeth: Trying to use card with devnos 0xC40/0xC41/0xC42
qeth: received an IDX TERMINATE on irq 0xAF4/0xAF5 with cause code 0x22 -- try another portname

Are those device numbers right for the card you were intending to use? If anyone adds another OSA to your LPAR without telling you, you may end up with the wrong one. It might be safer to give the triple of device numbers explicitly at the "Enter the device addresses" prompt instead of letting it autodetect. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Probably the first published shell code example for Linux/390
Jan Jaeger writes: When we are talking about storing (ie overlaying) programs (trojans) on the stack space, then only hardware protection can really help. One would need to come to a model where instructions cannot be executed from the stack. One can achieve this in S/390 by making the stack space a separate space, which is only addressable through an access register (like an MVS data space). This way instructions can never be executed from the stack space; however, I am afraid that such an implementation would break a few things. Solar Designer did a non-executable stack patch for Linux/ia32 (using segment protection for the stack space since ia32 page-level protection does not distinguish read from execute). The things that a non-executable stack breaks are mainly (1) gcc trampolines (used for nested functions), (2) signal delivery and (3) application-specific run-time code generation. He handled (1) and (2) by detecting such code and disabling the non-exec stack on the fly (yes, this is a slight exposure). For (3), he supported an ELF executable marker which disabled non-exec stack for the whole program. It was fairly popular and worked well against the sort of attacks which it was designed to prevent. Needless to say, people then worked out how to do some exploits even with a non-exec stack (return into libc et al). The arms race continues, as always. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Plagued by YAST 3270 problems
Post, Mark K writes: Hey, the card reader on my Linux/390 guests has always worked, particularly with Malcolm Beattie's Unit Record device driver. Having VM helps with that, but David Boyes did verify its functionality with a piece of real UR hardware. <plug> And for those who weren't aware: the latest version of the driver, along with a userland utility (complete with man page), are now documented in Appendix A of the ... zSeries ... Large Scale Linux Deployment Redbook and available for download from the redbooks site: The UR device driver provides a Linux character device interface to an attached unit record device for a Linux guest. The UR utility provides a user interface to the UR device driver. Using the UR driver and utility, it is possible to exchange files between a Linux guest and a z/VM virtual machine (initiated within the Linux guest). The UR utility provides an interface for copying files between UR devices (typically the reader, punch, and printer defined by the virtual machine). It can handle any file block size and record length, and will perform EBCDIC-to-ASCII conversion as required. The UR device driver and utility can be downloaded from the Internet as described in Appendix D, "Additional material", on page 279. For the ur utility, the syntax is:

ur copy [ -tbf ] [ infile | - ] [ outfile | - ]
ur info devfile
ur list
ur add minor devno blksz reclen flags [ devname [ perm ] ]
ur remove minor

with the last two lines providing dynamic device support. The Redbook is available online (HTML and PDF) by going to http://www.redbooks.ibm.com/ and entering SG246824 in the search box at the top. The direct URL to the HTML online version is http://www.redbooks.ibm.com/redbooks/SG246824.html and the direct URL to the PDF version is http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg246824.pdf </plug> --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Antwort: Max number of dasd devices
Jim Sibley writes: When will SuSE have devfs as the default for zSeries so we don't have to compile the kernel to use it and get away from the double mapping we have to do between device and device node? It is a real nuisance to try and map 100 devices per LPAR for 7 or 8 LPARs. Then try moving 20 or 30 of those volumes to another LPAR when business needs dictate! W/O devfs, I can vouch that it is both a pain and error prone. devfs is not the only way of handling these device management issues. devfs carries along with it a certain amount of design and implementation history. Let's just say that distributions wouldn't gratuitously omit it just to make your life harder. There are two issues: the cleanliness of the kernel side and device management in userland. They only overlap slightly. In the medium to long term, the "stick together multiple majors and index everything into arrays of stuff" issue on the kernel side should be solved via the combination of 32-bit dev_t (12 bit major, 20 bit minor), nice struct device, struct gendisk or whatever and devicefs. This assumes that Al Viro and co make the scramble before the 2.5 feature freeze next week (or get it in afterwards anyway :-). Linus gave him the OK two weeks ago so I have high hopes. For the userland issue, I've often wondered why someone hasn't done a version of scsidev for z/Linux (presumably dasddev would be the obvious name). It would simply go look at all the DASD information available via /proc/dasd/devices, /proc/partitions, query all the volumes for their volsers and build up a set of nodes and symlinks so you can refer to your volumes by label, /dev/dasdvol/VLABEL, or devno, /dev/dasdno/2345, and so on. I must admit, I haven't quite wondered hard enough for it to reach the top of my todo list though... --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
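The hypothetical "dasddev" idea above can be sketched in a few lines of shell: read the kernel's DASD table and build /dev/dasdno/<devno> symlinks. Sample 2.4-style /proc/dasd/devices lines are baked in here so the sketch is self-contained; the field layout on a real system may differ, and the volser-based /dev/dasdvol links would additionally need each volume queried for its label.

```shell
tmp=$(mktemp -d)
# Stand-in for /proc/dasd/devices (2.4-era format, assumed):
cat > "$tmp/devices" <<'EOF'
0201(ECKD) at ( 94:  0) is dasda   : active at blocksize: 4096
0202(ECKD) at ( 94:  4) is dasdb   : active at blocksize: 4096
EOF
mkdir -p "$tmp/dasdno"          # would be /dev/dasdno on a real system
while read -r line; do
    devno=${line%%"("*}                        # e.g. 0201
    name=$(echo "$line" | awk '{print $7}')    # e.g. dasda
    ln -s "/dev/$name" "$tmp/dasdno/$devno"    # devno -> device node
done < "$tmp/devices"
ls -l "$tmp/dasdno"
```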
Re: VSAM or Lightweight Database?
Paul Raulerson writes: Pretty much the last time I tried to use it for anything serious was in Solaris 7, but yes, I am thinking of what was delivered with BSD 4.1 and above. So you are saying the -ldb will give me multi-user, multi-key, transactional access to record based data under Linux/390? Multi-user: yes. Transactional: yes. Multi-key: weeell, it depends on what you mean by multi-key. Of the four current access methods (hash, btree, queue, recno), the btree and hash ones are the generic key-based ones. If by multi-key, you mean you want to have fields k1 and k2 so that lookup by the pair (k1, k2) is fast and so is a lookup by (k1), then you can use a btree with a flattened key field consisting of the concatenation of the k1 and k2 fields (with canonicalised length). The btree will mean that you can look up by (k1) and, by locality of reference, walk through all ordered (k1,k2) tuples nicely. If instead you want multiple independent key fields to data then you'd have to build your own indices with either a natural primary key for the main data or else the recno access method and then manage separate index databases of key-to-record_id (recno or primary key) mappings yourself. Although db3 will do the ACID stuff for you, it won't do all the fancy constraint and index management that a proper relational database will do (whether DB2, PostgreSQL or whatever) but from my minimal knowledge of ISAM, I don't think that does either. libdb last time I looked was just a disk based associative array handler... Time to look again. Start, for example, at http://www.sleepycat.com/docs/ref/am_conf/intro.html --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Determining the 'mass' of a file system tree.
James Melin writes: Is there a good tool to say analyze part of a file system tree and report how much space it is using? Say like /usr/sbin - which is not in its own file system but part of a larger one. du -s /usr/sbin Useful variations on a theme are "du -s /foo/bar/*/" to get subtotals of each subdirectory (note the trailing / to force the glob to match only directories) and including the option --total to print a grand total. I'm trying to size a new deployment based on another and adjust for growth. I am limited at the moment to a mod-9 drive size, so it's kinda critical to know what parts of the root FS contain the most mass. I am also limited on the number of volumes I can actually have, so I'm trying to figure out the best distribution of limited resources. My thought was this: Your suggested breakdown of filesystems doesn't fit with usual practice. If you want to split your filesystem amongst many volumes (and there are frequently good reasons for doing this), then start with separate filesystems for: /, swap, /usr, /tmp, /var, /opt, /home and /usr/local. These need not be full 3390-3 or 3390-9 volumes but can be partitions instead (by using the CDL disk layout to get up to 3 partitions on each volume). Typically, you would want to keep the root filesystem smallish in such a setup. For a larger Linux system, you would mount extra volumes wherever needed (application specific data filesystems might want to be on /var/lib/foo/data123, /opt/foo/data/blah, /home/biggroupname, /usr/local/foo or a variety of other conventions). This all assumes that you have a large enough Linux system to make it worth the complexity of splitting everything up. You can go a long way with a single filesystem for the entire base system, a swap partition and, if needed, a separate /usr before you necessarily need to consider splitting off /var, /tmp, /opt or whatever. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... 
...from home, speaking only for myself
Re: Firewall for zSeries Linux?
David J. Chase writes: I tried to search the archives and was unable to get in and I need the information as soon as I can find it so I'm going to ask here and beg your indulgence :-) I am going to use words I don't understand, so please try to read into my question if it doesn't make sense :-) :-) A customer has the SuSE distribution but feels that the default firewall doesn't have as many features as they want. It seems to only do network address translation and they are also looking for packet filtering. Is there a commercial firewall program available for Linux for zSeries? Is there anything else you can tell me? I tried searching linuxvm.org but couldn't find what I was looking for. If you really want a commercial solution then there's the new StoneGate product (from Stonesoft at http://www.stonesoft.com/ ) which is a firewall and VPN solution and there's also zGuard (from FBIT at http://www.fbit.de/ ) but I'm not certain about current availability. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Linux scalability on SGI Itanium 2 prototype
Alan Altmark writes: On Tue, 2002-09-10 at 16:23, Phil Payne wrote: This was forwarded to me by a co-worker. It's interesting, and sort of echoes IBM's experience with 64-bit Linux on zSeries. IBM mainframes have a maximum of 16 processors per box, but they also saw linear scalability when running a 2.4 kernel in 64-bit mode. This is very nice verification of those results. I didn't think Linux supported a 16-way image. I would be remiss if I didn't point out that IBM only *sells* boxes with a maximum of 16 CPUs. That is not an architectural maximum of zSeries. Consider that z/VM guest virtual machines can have a maximum of 64 virtual CPUs (again, an implementation limit, not architecture)! Granted, it isn't useful to have more virtual CPUs than you have real ones, but I just don't want anyone to get the idea that mainframes have some sort of inherent CPU limit. I can confirm that a Linux guest (RedHat 7.1 s390x) boots OK with 32 CPUs (virtual ones) under z/VM 4.3. What amused me was that when I ran top it only had room to show 3 or 4 process lines at the bottom because the whole top section was taking 20 or so lines showing the per-CPU usage information. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Attempting patches after Linux/390 Crash
Kittendorf, Craig writes: That corrected, everything appeared to be successful until I IPLed. I got several messages along the lines of: modprobe: modprobe: Can't locate module eth0 Your boot time scripts are trying to configure network device eth0 but there's no driver built into the kernel that recognises eth0. So the kernel tries to dynamically load a driver module for it. modprobe tries to load a module named eth0 but can't find one. You need to put a line in /etc/modules.conf saying alias eth0 whatever_your_driver_module_is_called so that modprobe knows what to load when the kernel asks for eth0. In the meantime, for the current boot session, load it manually with modprobe whatever If you don't know what module you need to load, go back a few steps and look in other docs and messages about network configuration (I haven't been following this thread so I don't know if you're OSA, Guest LAN, LCS, CTC, IUCV or anything else). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Kernel commentary/books
Asher Glynn writes: Has anyone read a book on the Linux kernel that they would recommend buying? Depending on what parts of the kernel you're interested in, Linux Device Drivers (Rubini) and Understanding the Linux Kernel (Bovet & Cesati) are worth reading. Both published by O'Reilly. There tends to be version skew in such things (between kernel releases and book edition releases) so don't expect them to be amazingly up to date for the latest kernels. Also, there tend to be intentional and unintentional ia32-isms in them but that's probably to be expected. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: reiser file system vs ext2 file system
Denis Beauchemin writes: This happens all the time because when directories are created in the copy process their size is optimal to hold the files present there. Old directories hold many deleted entries but the space isn't reclaimed until a new file has to be created there. Thus they are larger. There's also a difference between how ext2 and reiserfs allocate space for the underlying files/directories. ext2 uses a fairly traditional block allocation. Reiserfs is very different (balanced trees for directory lookup instead of the traditional name-to-inode list; packs tails of files together into blocks). It's not surprising that the same files take up a different amount of raw space on the disk. There may be a contribution to the difference from the non-shrinking directory structures that ext2 uses but most of it will be from the difference in underlying filesystem layout. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: behaviour of tar
David Boyes writes: Thanks to Malcom for the tip on logger -- I hadn't seen that one before. Could I have my second l back in Malcolm please? When I was doing mainly Perl development, all the ls that ever vanished from my name always seemed to end up as gratuitous additions to Randal's name (people tended to write Randall Schwartz instead of Randal). Now that I'm mostly doing Linux and VM stuff there must be someone else who's built up a whole stock of extra ls by now :-) ObLogger: Don't try using logger with early versions of Digital UNIX/Tru64 UNIX/OSF/1/RIP: there was a bug whereby it ignored the facility and priority arguments and always sent its output as a system-wide emergency message to the terminal of every single logged-in user. Eek. --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
Re: Max # of files
Tim Verhoeven writes: > On Mon, 13 May 2002, Maciej Ksij?ycki wrote: > > How can I find out what is the maximum number of files a single user can open concurrently? And how can I change this value? I am using SuSE SLES 7 (2.4.7 beta). > There isn't a user limit but a kernel-level limit. You can find the numbers of the limit in /proc/sys/fs/. There *is* a user limit as well (in the sense of a non-system-wide one): it's one of the inherited setrlimit() resource limits. To see a shell's current resource limits (soft and hard limits respectively) you can use ulimit -Sa and ulimit -Ha (for bash-flavoured shells) or limit and limit -h (for tcsh-flavoured shells). Interactive login shells will typically have user limits set lower than the system-wide maximum, e.g. at 1024. System administrators may be able to configure this on a per-subsystem basis (e.g. for services such as sshd that use pam_limits, by editing /etc/security/limits.conf). --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
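The two limits Malcolm distinguishes can be inspected side by side from a bash-flavoured shell. The specific numbers in the comments are typical defaults, not guarantees; they vary by distribution and kernel:

```shell
# Per-process (user) limits on open files, inherited via setrlimit():
ulimit -Sn                 # soft limit for this shell (often 1024)
ulimit -Hn                 # hard limit; the soft limit may be raised up to this

# System-wide kernel limit on open file handles:
cat /proc/sys/fs/file-max

# Raise this shell's soft limit as far as the hard limit allows:
ulimit -Sn "$(ulimit -Hn)"
```

An unprivileged user can raise the soft limit only up to the hard limit; raising the hard limit itself (or /proc/sys/fs/file-max) needs root, e.g. via the limits.conf mechanism mentioned above.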
Re: V-DISK swap space?
Sergey Korzhevsky writes: > 02.05.2002 00:01:17 Sivey, Lonny wrote: > > What does linux do when it runs out of swap space? > It will start killing processes and not permit creating new ones. An important thing to note is that the behaviour depends heavily on kernel version. One of the main areas of change (and that's putting it mildly) in Linux between early 2.4.x and recent 2.4.x has been the memory management subsystem. Its behaviour under load is one of the noticeable consequences of those changes. There are two main issues: (1) how does the performance of the system change as you increase virtual memory activity over and above the amount of physical memory available but less than the total virtual amount available (i.e. including swap)? (2) does the system allow overbooking of virtual memory allocation (cf. airlines allocating seats) and, if it does, how does it choose which processes to kill (cf. whom to bounce)? Historically, Linux has been fairly poor (relatively speaking) at (1) and for (2) has always allowed overbooking and invoked an oom killer (out-of-memory killer) to handle its unfortunate bouncees. Recent kernels of various flavours (Riel VM, AA VM, rmap additions, whatever is in mainline today) improve (1) hugely, and oom behaviour has flitted all over the place across 2.4.x versions (and vendor versions). Alan Cox has coded up bookkeeping/beancounting which, when enabled, prevents the system from overbooking allocations and so avoids dependence on an oom killer (always? almost always?), but I don't know off-hand whether it's in, or destined for, mainline. Just like airlines overbooking seats, it's nasty when bounces happen, but strict booking can leave the common case with a surprisingly small fraction of the resources you have available. 
I have a feeling Alan's latest patches are a lot cleverer than what strict allocation did in the old days (where each fork() had to be prepared for the child to scribble over all of its shared MAP_PRIVATE library mappings, and modern applications have tens of those on average). I really must go and look harder (unless I've provoked him enough to give a few quick comments here...) --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself
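The overbooking-versus-strict-booking choice Malcolm describes is exposed as a sysctl. On the 2.4 kernels under discussion only heuristic (0) and always-overcommit (1) modes existed; the strict-accounting work along the lines Alan Cox describes later surfaced in mainline as mode 2, paired with overcommit_ratio. Exact semantics vary by kernel version, so treat this as a sketch rather than a definitive reference:

```shell
# Current overcommit policy:
#   0 = heuristic overcommit (the historical default)
#   1 = always overcommit, never refuse an allocation
#   2 = strict accounting, refuse allocations past the commit limit
cat /proc/sys/vm/overcommit_memory

# Only meaningful under strict accounting (mode 2); may not exist
# on older kernels, hence the fallback.
cat /proc/sys/vm/overcommit_ratio 2>/dev/null || true

# As root, one could switch policy (illustrative only -- check the
# documentation for the kernel actually running before doing this):
# echo 2 > /proc/sys/vm/overcommit_memory
```

Under mode 2 an over-large malloc() simply fails up front instead of the OOM killer bouncing some process later, which is exactly the airline-style strict booking trade-off described above.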
Re: HELP-2.4.18 Kernel upgrade
Post, Mark K writes: > Are you aware of a web resource anywhere that has a table of what all the DIAG codes are? That would be a good addition to my links page. They are all documented in the CP Programming Services manual. The z/VM 4.2 version, for example, is listed on the base publications page http://www.vm.ibm.com/pubs/pdf/vm420bas.html with a direct link to http://www.vm.ibm.com/pubs/pdf/hcse5a40.pdf --Malcolm -- Malcolm Beattie [EMAIL PROTECTED] Linux Technical Consultant IBM EMEA Enterprise Server Group... ...from home, speaking only for myself