Re: ABENDS
On Tue, 4 Feb 2014 08:42:34 +0100 Carsten Otte co...@de.ibm.com wrote:

According to the program check table in my Principles of Operation reference summary, interruption code (= program check code) 0x60004 is a protection exception. The program is writing to memory that is mapped as read-only. This is a user program, therefore the next step is to create a core dump and see why it did that.

Tom, if you have the abrt tool installed in your Fedora, you should already have all potentially useful information collected. To list the crashes use abrt-cli list. For more information please see https://github.com/abrt/abrt/wiki/overview

Dan

with kind regards
Carsten Otte
System z firmware development / Boeblingen lab
---
Every revolution was first a thought in one man's mind; and when the same thought occurs to another man, it is the key to that era.
- Ralph Waldo Emerson, Essays: First Series, 1841

Tom Huegel tehuegel@gmail.com
Sent by: Linux on 390 Port linux-...@vm.marist.edu
To: LINUX-390@vm.marist.edu
cc:
Subject: ABENDS
04.02.2014 00:07
Please respond to Linux on 390 Port linux-...@vm.marist.edu

I'm a LINIX dummy. What does this mean? I just installed s390x FEDORA 20 on a z196.

[ 5372.889929] User process fault: interruption code 0x60004 in libc-2.18.so [4a7a25+1ae000]
[ 5372.889936] failing address: 4A7A2ED000
[ 5372.889953] CPU: 0 PID: 45008 Comm: getent Not tainted 3.12.8-300.fc20.s390x #1
[ 5372.889956] task: 3dc208a8 ti: 13aa8000 task.ti: 13aa8000
[ 5372.889964] User PSW : 070520018000 004a7a2890ee (0x4a7a2890ee)
[ 5372.889967] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 EA:3
User GPRS: 004a7a3ca8a2 004a7a2edfe1 004a7a3ca32e
[ 5372.889987] 0011 03fff6cd7bdc 004a7a404000 03a89f4d
[ 5372.889992] 03a89f4d 03a88d50 03a88cd8
[ 5372.889996] 004a7a402000 004a7a3bd328 004a7a2890e2 03a88cb0
[ 5372.890006] User Code: 004a7a2890dc: c0e5000324b0 brasl %r14,4a7a2eda3c
 004a7a2890e2: c01a0be0 larl %r1,4a7a3ca8a2
 #004a7a2890e8: c03a0923 larl %r3,4a7a3ca32e
 004a7a2890ee: d20d20001000 mvc 0(14,%r2),0(%r1)
 004a7a2890f4: b904002a lgr %r2,%r10
 004a7a2890f8: c0e500021fe0 brasl %r14,4a7a2cd0b8
 004a7a2890fe: b9020062 ltgr %r6,%r2
 004a7a289102: a78401c8 brc 8,4a7a289492
[ 5372.890105] Last Breaking-Event-Address:
[ 5372.890110] [004a7a2eda5c] 0x4a7a2eda5c
[ 5372.891317] User process fault: interruption code 0x60004 in libc-2.18.so [4a7a25+1ae000]
[ 5372.891325] failing address: 4A7A2ED000
[ 5372.891330] CPU: 0 PID: 45006 Comm: getent Not tainted 3.12.8-300.fc20.s390x #1
[ 5372.891335] task: 3d7788a8 ti: 13a7 task.ti: 13a7
[ 5372.891343] User PSW : 070520018000 004a7a2890ee (0x4a7a2890ee)
[ 5372.891348] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 EA:3
User GPRS: 004a7a3ca8a2 004a7a2edfe1 004a7a3ca32e
[ 5372.891360] 0011 03fff6e02bdc 004a7a404000 03984f4d
[ 5372.891432] 03984f4d 039842b0 03984238
[ 5372.891435] 004a7a402000 004a7a3bd328 004a7a2890e2 03984210
[ 5372.891441] User Code: 004a7a2890dc: c0e5000324b0 brasl %r14,4a7a2eda3c
 004a7a2890e2: c01a0be0 larl %r1,4a7a3ca8a2
 #004a7a2890e8: c03a0923 larl %r3,4a7a3ca32e
 004a7a2890ee: d20d20001000 mvc 0(14,%r2),0(%r1)
 004a7a2890f4: b904002a lgr %r2,%r10
 004a7a2890f8: c0e500021fe0 brasl %r14,4a7a2cd0b8
 004a7a2890fe: b9020062 ltgr %r6,%r2
 004a7a289102: a78401c8 brc 8,4a7a289492
[ 5372.891457] Last Breaking-Event-Address:
[ 5372.891460] [004a7a2eda5c] 0x4a7a2eda5c
[ 5372.891556] Pid 45006(getent) over core_pipe_limit
[ 5372.891558] Skipping core dump

-- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390
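The interruption code in the log can be pulled apart with a few lines of Python. This is only a sketch: it assumes the upper halfword of the value is the instruction length in bytes and the lower halfword is the program-interruption code, which would fit the 6-byte mvc at the failing PSW address. The table holds just a handful of codes from the Principles of Operation, and the function name is made up for illustration.

```python
from typing import Tuple

# Small subset of z/Architecture program-interruption codes
# (not the full table from the Principles of Operation).
PROGRAM_CODES = {
    0x0001: "operation exception",
    0x0004: "protection exception",
    0x0005: "addressing exception",
    0x0006: "specification exception",
    0x0010: "segment-translation exception",
    0x0011: "page-translation exception",
}


def decode_interruption_code(code: int) -> Tuple[int, str]:
    """Split a value such as 0x60004 into (instruction length, exception name).

    Assumption: upper halfword = instruction length in bytes,
    lower halfword = program-interruption code.
    """
    ilc = (code >> 16) & 0xFFFF
    pgm = code & 0xFFFF
    return ilc, PROGRAM_CODES.get(pgm, "unknown (0x%04x)" % pgm)


print(decode_interruption_code(0x60004))
```

Under that reading, 0x60004 is a protection exception raised by a 6-byte instruction.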
EMC request for GTFTRACE on z/VM and zLinux
We are currently having issues with a db2 server running under SLES 11 SP2 using zfcp to access an EMC SAN. EMC has requested a GTFTRACE trace. Is there even a way to run GTF under z/VM and would it make any sense since this seems to be a Linux issue? If there is a way, what is the process to run a GTF trace (I would assume it would have to be run under maint or some other z/VM support ID). Chris Will Systems Software (313) 549-9729 cw...@bcbsm.com The information contained in this communication is highly confidential and is intended solely for the use of the individual(s) to whom this communication is directed. If you are not the intended recipient, you are hereby notified that any viewing, copying, disclosure or distribution of this information is prohibited. Please notify the sender, by electronic mail or telephone, of any unintended receipt and delete the original message without making any copies. Blue Cross Blue Shield of Michigan and Blue Care Network of Michigan are nonprofit corporations and independent licensees of the Blue Cross and Blue Shield Association. -- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www.marist.edu/htbin/wlvindex?LINUX-390 -- For more information on Linux on System z, visit http://wiki.linuxvm.org/
Re: EMC request for GTFTRACE on z/VM and zLinux
Chris,

I don't think that this request makes sense. With V=V passthrough, z/VM isn't involved in data transfer at all. I think that tracing needs to be done in Linux.

with kind regards
Carsten Otte
System z firmware development / Boeblingen lab
---
Every revolution was first a thought in one man's mind; and when the same thought occurs to another man, it is the key to that era.
- Ralph Waldo Emerson, Essays: First Series, 1841
Re: EMC request for GTFTRACE on z/VM and zLinux
Another side effect of System z = z/OS mentality. This request makes no sense at all. If it's a direct attach FCP device, you'll have to run the trace in Linux, and the GTF utilities don't run there. Look at the CP TRACE command and then process the resulting CP monitor data if this is an EDEV.
Re: ABENDS
On Mon, 3 Feb 2014 15:07:59 -0800 Tom Huegel tehue...@gmail.com wrote:

I'm a LINIX dummy. What does this mean? I just installed s390x FEDORA 20 on a z196.

[ 5372.889929] User process fault: interruption code 0x60004 in libc-2.18.so [4a7a25+1ae000]
[ 5372.889936] failing address: 4A7A2ED000

One more idea: the address looks like a wrap over kernel memory pages, where no page is allocated for 4A7A2ED000 but a page is there for 4A7A2EC000, where the target is being copied. It could be an over-optimized version of memcpy() or something like that. I already debugged this kind of crash with an older glibc in RHEL.

Dan
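The page-wrap idea can be checked with a little address arithmetic. The sketch below assumes 4 KiB pages and assumes that %r2, the target of the 14-byte mvc, holds 0x4a7a2edfe1. That register value is inferred from the GPR dump, which lost its leading zeros in transit, so treat it as a guess rather than a fact.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_base(addr: int) -> int:
    """Return the address of the page that contains addr."""
    return addr & ~(PAGE_SIZE - 1)


def crosses_page(addr: int, length: int) -> bool:
    """True if a copy of `length` bytes starting at addr spans two pages."""
    return page_base(addr) != page_base(addr + length - 1)


target = 0x4A7A2EDFE1  # assumed value of %r2 (inferred from the dump)
length = 14            # from "mvc 0(14,%r2),0(%r1)"

print(hex(page_base(target)))
print(crosses_page(target, length))
```

If the register reading is right, the page base equals the reported failing address and the copy does not cross a page boundary, so it may be worth checking how the whole page at 4A7A2ED000 is mapped.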
Re: ABENDS
Dan Horák dho...@redhat.com
Sent by: Linux on 390 Port LINUX-390@vm.marist.edu

One more idea: the address looks like a wrap over kernel memory pages, where no page is allocated for 4A7A2ED000 but a page is there for 4A7A2EC000, where the target is being copied. It could be an over-optimized version of memcpy() or something like that. I already debugged this kind of crash with an older glibc in RHEL.

Good point. Can this be reproduced with the command running in gdb? If so, what is in /proc/pid/maps at the memory location?

cheers,
Carsten
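Looking up an address in /proc/pid/maps can be scripted. The following is a minimal sketch: the helper name, the Mapping type and the sample text are all made up for illustration, and the sample addresses are not taken from Tom's system.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def find_mapping(maps_text: str, addr: int) -> Optional[Mapping]:
    """Return the maps entry that contains addr, or None if it is unmapped."""
    for line in maps_text.splitlines():
        # Fields: start-end perms offset dev inode [path]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= addr < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            return Mapping(start, end, fields[1], path)
    return None


# Hypothetical sample, not real data from the failing system.
SAMPLE = """\
00001000-00003000 r-xp 00000000 00:00 0 /hypothetical/lib.so
00003000-00004000 rw-p 00002000 00:00 0 /hypothetical/lib.so
"""

print(find_mapping(SAMPLE, 0x2ABC))
```

For a live process the text would come from reading /proc/<pid>/maps while the program is stopped in gdb. A store into an entry whose permissions lack "w" would be consistent with a protection exception.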
Re: ABENDS
Interesting, this is a new install, fresh from the download.
Oracle and Virtual CPU's
Running Oracle 11 on Linux on z. The z/VM LPAR is assigned 4 IFLs. We have a Linux guest that gets really busy for about 10-15 minutes nightly (an Oracle process). The guest has 2 virtual CPUs defined.

1) Our Oracle DBA (a consultant, and I believe he is coming from an Intel world) says we need more CPUs. I say no. Who's right and why?

2) On another guest on the same LPAR, we have 4 CPUs defined just to run Oracle (for PeopleSoft). I've never seen the CPUs go above 250% (out of 400%). Should we drop it down to 3? (The Oracle DBA says no and wants more.)

Thanks!
Re: Oracle and Virtual CPU's
We have a linux guest that got really busy for about 10-15 minutes nightly. Oracle process. Guest has 2 virtual CPU's defined. 1) Our Oracle DBA (consultant and I believe he is coming from an intel world) says we need more CPU's. I say no. Who's right and why?

This is not a problem unique to VM -- VMWare and Xen suffer the same issue. You're both partially right. The workload may need more REAL CPUs (in that there may not be enough real cycles available to meet the demand at a point in time), but defining more virtual CPUs will probably make the problem worse (your dispatch timeslice for the whole virtual machine is divided as equally as possible between the # of virtual CPUs defined, so defining more virtual CPUs actually DECREASES the amount of processing time available to each virtual CPU per timeslice).

It also depends a lot on what the Oracle instance is being asked to do - some activities in Oracle aren't really very MP-friendly, so even if you DID add the virtual CPUs, it wouldn't make any difference because the code won't care (the task is scheduled on a virtual CPU and just runs until the timeslice is exhausted). If you have lots of tasks like that, the number of CPUs is irrelevant; the code is only going to use one at a time.

Monitor data on the VM side will tell you more about how the real CPUs are being used in total; the performance data inside the VM will tell you how Linux is allocating workload to the virtual CPUs it sees, but that data alone is totally unreliable for capacity planning. It can only reliably see the division of labor, not the overall available machine usage.

2) on another guest on the same LPAR, we have 4 CPU's defined just to run Oracle (for PeopleSoft). I've never seen the CPU's 250% (out of 400%). Should we drop it down to 3 (The oracle DBA says no and wants more).

See above. If he's just looking at data from inside the virtual machine, more virtual CPUs make the problem worse. Ask him what the problem workload is. If it's single long-running queries (Peoplesoft does a lot of those, and they're often stupidly constructed), more CPUs won't help. He'll likely get more bang for the buck optimizing the queries or adding indexes, but that's more work for him.
Re: EMC request for GTFTRACE on z/VM and zLinux
On Tuesday, 02/04/2014 at 09:17 EST, Will, Chris cw...@bcbsm.com wrote:

We are currently having issues with a db2 server running under SLES 11 SP2 using zfcp to access an EMC SAN. EMC has requested a GTFTRACE trace. Is there even a way to run GTF under z/VM and would it make any sense since this seems to be a Linux issue? If there is a way, what is the process to run a GTF trace (I would assume it would have to be run under maint or some other z/VM support ID).

Oh, my. How embarrassing for EMC. (I'm blushing on their behalf.) That request can best be translated as, "Please get me some z/OS data [gtf trace] for a device that z/OS can't talk to [SCSI] on a platform that isn't even running z/OS [Linux]."

Go back to EMC and ask for someone who is knowledgeable of the z/VM and Linux environment. There ARE such people in EMC. With a competent person working on your case, you will get a coherent request for data using tools you have.

Alan Altmark
Senior Managing z/VM and Linux Consultant
IBM System Lab Services and Training
ibm.com/systems/services/labservices
office: 607.429.3323
mobile: 607.321.7556
alan_altm...@us.ibm.com
IBM Endicott
Re: ABENDS
Reproduced! I get dozens of these, similar, but different addresses and instructions. I tried starting over and reinstalling, but to no avail.
Re: Oracle and Virtual CPU's
Oracle DBA == more ;-)

Lee Stewart ● VM System Support ● Visa ● Phone: 6(750)4601 - +1-303-389-4601 ● lstew...@visa.com
Re: ABENDS
On Tue, 4 Feb 2014 09:42:03 -0800 Tom Huegel tehue...@gmail.com wrote:

Reproduced! I get dozens of these, similar, but different addresses and instructions. I tried starting over and reinstalling, but to no avail.

When does it happen? After the installation? What application is it? It looks like the getent tool, maybe run in a scriptlet during installation.

Dan
Re: ABENDS
This is the first one during boot. Then they just never stop. Eventually I see a logon message, but attempting to log on only causes more messages...

Starting Show Plymouth Boot Screen...
[0.985347] User process fault: interruption code 0x40004 in libc-2.18.so [4010029000+1ae000]
[0.985352] failing address: 40100C6000
[0.985355] CPU: 0 PID: 358 Comm: plymouthd Not tainted 3.12.8-300.fc20.s390x #1
[0.985357] task: 7ff4b3f0 ti: 0166 task.ti: 0166
[0.985360] User PSW : 070500018000 0040100b11e4 (0x40100b11e4)
[0.985362] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:0 PM:0 EA:3
User GPRS: 0001 0040100cdef8 0040100c6fd5 0040
[0.985366] 0005 0022 03fffd6f6000 0002
[0.985368] 03f30d60 005c 005c 03f30da0
[0.985370] 0040101db000 00401019caf0 0040100b11d4 03f30c48
[0.985379] User Code: 0040100b11d8: a73a ahi %r3,-1
 0040100b11dc: 5030b0bc st %r3,188(%r11)
 #0040100b11e0: a774ffef brc 7,40100b11be
 0040100b11e4: 92002000 mvi 0(%r2),0
 0040100b11e8: c0195f20 larl %r1,40101dd028
 0040100b11ee: e3201004 lg %r2,0(%r1)
 0040100b11f4: b9040032 lgr %r3,%r2
 0040100b11f8: eb361030 csg %r3,%r6,0(%r1)
[0.985397] Last Breaking-Event-Address:
[0.985399] [0040100c6a5c] 0x40100c6a5c
[FAILED] Failed to start Show Plymouth Boot Screen.
See 'systemctl status plymouth-start.service' for details.
DASD Interrupt Errors Related to XRC I/O Delays?
Hello List,

We have a SLES11 SP4 Linux server running under z/VM 6.2 on a zEC12 that's generating the following message repeatedly during periods of peak I/O activity:

kernel: dasd_erp(3990): 0.0.01bn: dasd_3990_erp_action_4: first time retry

where n=1, 6 or 9. All 3 of these minidisks reside on the same Linux formatted VM volume (ECKD).

I discovered that these messages are generated by the IBM DASD Error Recovery Procedure (ERP). After looking at the Linux code surrounding this ERP error (action 4), it appears that it can potentially retry up to 256 times, but I only ever see this "first time retry" message, although repeatedly. That suggests to me that these delays are of a fairly short duration (the code indicates the use of a 20 second timer between retries after the first retry).

I also discovered that XRC DASD mirroring was made active just last week for the volume that these minidisks reside on. We have been running XRC DASD mirroring for a long time between MVS systems, but just started using it with our Linux production volumes. So, I suspect that these messages are the result of I/O interrupts caused by delays due to channel extenders, which are a part of the XRC configuration that associates this server's DASD volume with a counterpart at another data center over 700 miles away. Also, this server runs DataStage and gets pretty busy during peak periods due to extract, transform and load (ETL) processing.

So, I have a couple of questions related to these ERP errors:

1. Have others seen these ERP messages related to XRC activity during peak I/O periods, or can anyone confirm that this is what they are likely related to?

2. If it is related, then are these messages typical? If typical, is there a tuning knob that can adjust the amount of delay more appropriately to allow for this XRC delay during peak periods?

Any help is greatly appreciated!

Thanks in advance,

Jim Moling
IT Specialist, z/VM Linux on z
Mainframe Services Branch
Division of Platform Services
Information and Security Services
Bureau of the Fiscal Service
Department of the Treasury
james.mol...@fiscal.treasury.gov
202-874-9566

This E-mail and its attachments (if any) are intended solely for the use of the addressee(s) and may contain sensitive but unclassified information. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution, or use of the information contained herein (including any reliance thereon) is strictly prohibited. If you have received this E-mail in error, please notify the sender immediately and destroy the E-mail and any attachments.
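To see whether the retries cluster on one minidisk or at particular times, the messages can be counted per device from the system log. This is a rough sketch: the regular expression follows the message text quoted in the post, while the sample lines (timestamps, host name) and the function name are invented for illustration.

```python
import re
from collections import Counter

# Matches the ERP message quoted in the post and captures the device bus ID.
ERP_RE = re.compile(
    r"dasd_erp\(3990\):\s+(?P<dev>[0-9a-f.]+):\s+"
    r"dasd_3990_erp_action_4: first time retry"
)


def count_first_retries(lines):
    """Count 'first time retry' messages per DASD bus ID."""
    counts = Counter()
    for line in lines:
        match = ERP_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


# Hypothetical log lines, not taken from the real server.
SAMPLE = [
    "Feb  4 02:00:01 host kernel: dasd_erp(3990): 0.0.01b1: dasd_3990_erp_action_4: first time retry",
    "Feb  4 02:00:03 host kernel: dasd_erp(3990): 0.0.01b6: dasd_3990_erp_action_4: first time retry",
    "Feb  4 02:00:07 host kernel: dasd_erp(3990): 0.0.01b1: dasd_3990_erp_action_4: first time retry",
    "Feb  4 02:00:09 host kernel: some unrelated message",
]

print(count_first_retries(SAMPLE))
```

Feeding it the lines of the real log file instead of SAMPLE would give a per-device tally that can be compared with the XRC activity windows.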
Re: Oracle and Virtual CPU's
On Wed, Feb 5, 2014 at 5:49 AM, David Boyes dbo...@sinenomine.net wrote: your dispatch timeslice for the whole virtual machine is divided as equally as possible between the # of virtual CPUs defined, so defining more virtual CPUs actually DECREASES the amount of processing time available to each virtual CPU per timeslice That is not really correct. If that was true, you'd never be able to go over 100%. Each virtual CPU is separately dispatchable and competes with all other VCPUs from all other guests for a timeslice on a logical processor. SHARE value is taken into account by dispatcher to determine the relative priority of all VCPUs for that guest. That will determine how often in a given period of time will that guest get a timeslice on any of its VCPUs. If you want to turn single-VCPU guest into a multi-VCPU one, you need to multiply the relative SHARE value by the number of VCPUs that you defined for a guest in hope that each VCPU gets about the same number of timeslices in a given period of time as the single one was getting on a single-VCPU guest. Adding a virtual CPUs to a guest increases the ratio of total number of VCPUs per logical processor and puts unnecessary burden on dispatcher if it (additional VCPU) is not effectively used. There is a finite number of time slices on all logical processors put together and with more VCPUs, the competition for those timeslices is higher. You should add another VCPU to a guest only if the processes running in it can effectively use it. Let's not forget that if you run in a LPAR that uses shared physical processors, then LPAR's logical processors compete for a timeslice (or running time) on a physical processor with other LPARs that use shared processors. CPU percentages may be misleading in that environment unless you have a good monitoring tool. CPU seconds per minute may be a better gauge. 
Ivica Brodaric
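The SHARE arithmetic described in the message above can be sketched in a few lines of Python. This is only an illustration of the multiplication rule, assuming a simplified model in which a guest's relative SHARE is spread evenly over its VCPUs; the function names and the example value of 100 are hypothetical and not part of any z/VM interface.

```python
def scaled_relative_share(single_vcpu_share: int, vcpus: int) -> int:
    """Relative SHARE to set on a guest that grows from 1 VCPU to `vcpus`,
    so that each VCPU keeps roughly the weight the single VCPU had.
    Simplified model: the guest's share is spread evenly over its VCPUs."""
    if single_vcpu_share <= 0 or vcpus < 1:
        raise ValueError("share must be positive and vcpus must be >= 1")
    return single_vcpu_share * vcpus


def per_vcpu_weight(relative_share: int, vcpus: int) -> float:
    """Weight each VCPU ends up with under the same simplified model."""
    if vcpus < 1:
        raise ValueError("vcpus must be >= 1")
    return relative_share / vcpus


if __name__ == "__main__":
    old_share = 100  # illustrative value for the single-VCPU guest
    for n in (1, 2, 4):
        unchanged = per_vcpu_weight(old_share, n)
        new_share = scaled_relative_share(old_share, n)
        print(f"{n} VCPU(s): share left at {old_share} -> {unchanged:.1f} per VCPU; "
              f"share raised to {new_share} -> {per_vcpu_weight(new_share, n):.1f} per VCPU")
```

Under this model, leaving the share unchanged while adding VCPUs lowers the weight of each one, which is why the multiplication is suggested. It says nothing about whether the workload can actually use the extra VCPUs.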
Re: DASD Interrupt Errors Related to XRC I/O Delays?
Hi Jim. Most certainly XRC can slow your I/O down, and it may be too much for Linux. There is a parm somewhere on the GDPS side called write pacing that may be in play here. It keeps you from writing too fast and bogging down the XRC network so much that RPO is missed. Are you sharing datamovers with VM and MVS? Do you see a difference in your I/O response time numbers compared with before? Talk to your folks in charge of replication, and definitely open a problem with IBM if they can't help. If you do get 256 of those, your Linux file system will go read-only, and that's not fun at all.

Marcy. Sent from my BlackBerry.

- Original Message -
From: james.mol...@fiscal.treasury.gov [mailto:james.mol...@fiscal.treasury.gov]
Sent: Tuesday, February 04, 2014 06:02 PM Central Standard Time
To: LINUX-390@VM.MARIST.EDU
Subject: [LINUX-390] DASD Interrupt Errors Related to XRC I/O Delays?

Hello List,

We have a SLES11 SP4 Linux server running under z/VM 6.2 on a zEC12 that's generating the following message repeatedly during periods of peak I/O activity:

kernel: dasd_erp(3990): 0.0.01bn: dasd_3990_erp_action_4: first time retry

where n = 1, 6 or 9. All 3 of these minidisks reside on the same Linux-formatted VM volume (ECKD).

I discovered that these messages are generated by the IBM DASD Error Recovery Procedure (ERP). After looking at the Linux code surrounding this ERP error (action 4), it appears that it can potentially retry up to 256 times, but I only see this "first time retry" message, although repeatedly. That suggests to me that these delays are of a fairly short duration (the code indicates the use of a 20-second timer between retries after the first retry).

I also discovered that XRC DASD mirroring was made active just last week for the volume that these minidisks reside on. We have been running XRC DASD mirroring for a long time between MVS systems, but just started using it with our Linux production volumes.
So, I suspect that these messages are the result of I/O interrupts caused by delays due to channel extenders, which are a part of the XRC configuration that associates this server's DASD volume with a counterpart at another data center over 700 miles away. Also, this server runs DataStage and gets pretty busy during peak periods due to extract, transform and load (ETL) processing.

So, I have a couple of questions related to these ERP errors:

1. Have others seen these ERP messages related to XRC activity during peak I/O periods, or can anyone confirm that this is what they are likely related to?
2. If it is related, are these messages typical? If typical, is there a tuning knob that can adjust the amount of delay more appropriately to allow for this XRC delay during peak periods?

Any help is greatly appreciated!

Thanks in advance,
Jim Moling
IT Specialist, z/VM Linux on z
Mainframe Services Branch, Division of Platform Services
Information and Security Services, Bureau of the Fiscal Service
Department of the Treasury
james.mol...@fiscal.treasury.gov 202-874-9566
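To see how often each minidisk logs the ERP message quoted above, and whether it clusters around peak I/O periods, a small script can tally the lines per device. The following is a rough Python sketch: the regular expression is derived only from the single message format shown in the post, the sample lines are that message with n replaced by 1, 6 and 9, and the function name is hypothetical. Real syslog lines will carry a timestamp and host prefix, which the pattern ignores.

```python
import re
from collections import Counter

# Pattern built from the one message format quoted in the post; other
# kernel versions may word the line differently (assumption).
ERP_LINE = re.compile(
    r"dasd_erp\(3990\):\s+(?P<dev>[0-9a-f]\.[0-9a-f]\.[0-9a-f]{4}):\s+"
    r"dasd_3990_erp_action_4:\s+(?P<msg>.+)$"
)


def count_erp_action4(lines):
    """Count dasd_3990_erp_action_4 messages per device bus ID."""
    counts = Counter()
    for line in lines:
        match = ERP_LINE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


# Sample input: the quoted message with n = 1, 6, 9 substituted.
SAMPLE = [
    "kernel: dasd_erp(3990): 0.0.01b1: dasd_3990_erp_action_4: first time retry",
    "kernel: dasd_erp(3990): 0.0.01b6: dasd_3990_erp_action_4: first time retry",
    "kernel: dasd_erp(3990): 0.0.01b1: dasd_3990_erp_action_4: first time retry",
    "kernel: dasd_erp(3990): 0.0.01b9: dasd_3990_erp_action_4: first time retry",
    "kernel: some unrelated message",
]

if __name__ == "__main__":
    for dev, n in sorted(count_erp_action4(SAMPLE).items()):
        print(f"{dev}: {n} ERP action 4 message(s)")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `count_erp_action4(open(path))` with `path` pointing at wherever the system keeps its kernel messages.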