Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread James Litchfield

 A recent S10 kernel patch *drastically* reduced the time consumed
by ::memstat. On large systems, it will often take just a minute
or two. I just tried it on a lightly loaded 512GB M9K and it was
less than 3 minutes.

Jim



On 10/29/10 03:55 PM, Phil Harman wrote:

Oracle often seems to recommend 1:1 (which is often not enough, especially with 
DISM). You don't even have 1:1.

Solaris also uses free memory as part of its swap space allocation. Locked 
memory, such as ISM/DISM eats free memory, and so reduces your available swap 
further.

You should confirm that DISM is off by running "pmap -x" against a process from each of 
your DBs (the shared memory should appear as "ism")

Commands like "swap -s" and good ol' "vmstat 5" are useful for monitoring swap. You should 
also run "echo ::memstat | mdb -k" from time to time to get a feel for how your RAM is being used 
(on large machines, I've seen it take up to an hour to complete, and it will hog a CPU for the duration, but it 
seems to have little other impact on the system).

On 29 Oct 2010, at 23:37, Robin Cotgrove  wrote:


This is what Oracle says about swap for 11gR2. The comment about
subtracting ISM is not correct. A simple test shows that ISM does
consume swap (even if it's not DISM). Think about what happens when a
memory segment is created (before it goes to ISM), if someone happens
to attach in non-ISM mode, and when everyone detaches from the segment
and it ceases to be ISM. In the first and last stage swap space is
*required* and the VM system reserves the space needed when the
segment is first created.

I agree with you. In our case disabling the use of DISM really helped to make 
the platform more stable and helped with overall memory usage.

By the way, we're using Oracle 10.2.0.4. No use of Oracle 11gR2 yet.

We have 192GB of physical memory and 96GB of swap device. The SGA/PGA  sizes of 
all the Oracle DB's fit well within the 192GB leaving a consistent ~50GB spare. 
Memory consumption stays stable on the platform and doesn't go up and down. 
This is the nature of the Oracle DB's allocating memory at start-up.


I would be cautious about Oracle assurances...

Yep

Jim
---


go to the following for the full list of available Oracle books:

http://www.oracle.com/pls/db112/homepage

which links to the 11gR2 install guide (DB install guides):

http://www.oracle.com/pls/db112/portal.portal_db?selected=11&frame=

which links to the following section on memory:

http://download.oracle.com/docs/cd/E11882_01/install.112/e17163/pre_install.htm#sthref62



--
2.2.1 Memory Requirements

The following are the memory requirements for installing Oracle
Database 11g Release 2.

  * At least 4 GB of RAM

    To determine the RAM size, enter the following command:

    # /usr/sbin/prtconf | grep "Memory size"

    If the size of the RAM is less than the required size, then you
    must install more memory before continuing.

  * The following table describes the relationship between installed
    RAM and the configured swap space recommendation:

    Note:
    On Solaris, if you use non-swappable memory, like ISM, then you
    should deduct the memory allocated to this space from the available
    RAM before calculating swap space.

    RAM                     Swap Space
    Between 4 GB and 16 GB  Equal to the size of RAM
    More than 16 GB         16 GB



On 10/29/2010 2:01 PM, Jim Mauro wrote:

Thanks Mike. Good point on the script.

Indeed, use of speculative tracing would be a

better

fit here. I'll see if I can get something together

and

send it out.

Thanks,
/jim

On Oct 29, 2010, at 4:45 PM, Mike Gerdts wrote:


On Fri, Oct 29, 2010 at 2:50 PM, Robin

Cotgrove   wrote:

Sorry guys. Swap is not the issue. We've had this

confirmed by Oracle and I can clearly see there is
96GB of swap awailable on the system and ~50GB of
main memory.

By who at Oracle?  Not everyone is equally

qualified.  I would tend to

trust Jim Mauro (who co-wrote the books[1] on

Solaris internals,

performance,&   dtrace) over most of the people you

will get to through

normal support channels.

1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/

How do you know that available swap doesn't

momentarily drop?  I've

run into plenty of instances where a system has

tens of gigabytes of

free memory but is woefully short on reservable

swap (virtual memory,

as Jim approximates).  Usually "vmstat 1" is

helpful in observing

spikes, but as I said before this could miss very

short spikes.  If

you've already done this to see that swap is

unlikely to be an issue,

knowing that would be useful to know.  If you are

measuring the amount

of reservable swap with "swap -l", you are doing

it wrong.

I do agree that there can be other shortfalls that

can cause this.

This may call for speculative tracing of stacks

across the fork entry

and return calls, displaying results only when the

fork fails with

EAGAIN.  Jim's second script is similar to what I

suggest, except th

Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Phil Harman
Oracle often seems to recommend 1:1 (which is often not enough, especially with 
DISM). You don't even have 1:1.

Solaris also uses free memory as part of its swap space allocation. Locked 
memory, such as ISM/DISM eats free memory, and so reduces your available swap 
further.

You should confirm that DISM is off by running "pmap -x" against a process from 
each of your DBs (the shared memory should appear as "ism")

Commands like "swap -s" and good ol' "vmstat 5" are useful for monitoring swap. 
You should also run "echo ::memstat | mdb -k" from time to time to get a feel 
for how your RAM is being used (on large machines, I've seen it take up to an 
hour to complete, and it will hog a CPU for the duration, but it seems to have 
little other impact on the system).
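The pmap check above can be scripted. Below is a minimal sketch that counts DISM segments; the pmap -x style lines embedded here are illustrative sample data, not captured from a real system (on a live box you would pipe `pmap -x <pid>` of an Oracle shadow process instead):

```shell
# Sample pmap -x style lines (illustrative only): on Solaris, ISM segments
# are tagged "ism" and DISM segments "dism" in the mapping description.
pmap_sample='0000000380000000 16777216 16777216 - 16777216 rwxsR  [ ism shmid=0xf ]
00000003C0000000  8388608  8388608 -        - rwxs-  [ dism shmid=0x10 ]'

# Count DISM mappings; a non-zero count means DISM is still in use.
dism_count=$(printf '%s\n' "$pmap_sample" | grep -c ' dism ')
echo "dism segments: $dism_count"
```

A non-zero count for any DB instance means DISM is not actually off.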

On 29 Oct 2010, at 23:37, Robin Cotgrove  wrote:

>> This is what Oracle says about swap for 11gR2. The
>> comment about 
>> subtracting ISM is not
>> correct. A simple test shows that ISM does consume
>> swap (even if it's 
>> not DISM). Think
>> about what happens when a memory segment is created
>> (before it goes to 
>> ISM), if someone
>> happens to attach in non-ISM mode and when everyone
>> detaches from the 
>> segment and it
>> ceases to be ISM). In the first and last stage swap
>> space is *required* 
>> and the VM system
>> reserves the space needed when the segment is first
>> created.
> 
> I agree with you. In our case disabling the use of DISM really helped to make 
> the platform more stable and helped with overall memory usage. 
> 
> By the way, we're using Oracle 10.2.0.4. No use of Oracle 11gR2 yet. 
> 
> We have 192GB of physical memory and 96GB of swap device. The SGA/PGA  sizes 
> of all the Oracle DB's fit well within the 192GB leaving a consistent ~50GB 
> spare. Memory consumption stays stable on the platform and doesn't go up and 
> down. This is the nature of the Oracle DB's allocating memory at start-up. 
> 
>> 
>> I would be cautious about Oracle assurances...
> 
> Yep

Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Phil Harman
+1

I have seen many instances of this. It is trivial to add swap, but I'm simply 
tired of the number of times DBAs have protested "we have enough" or even 
tried FUD like "Oracle won't support us if we add more" (yes, I had that one 
within the last year). Just do it!

As has already been pointed out, Solaris has a swap reservation model. It's a 
bit like car insurance: you have to have it to drive on the road, but you hope 
you'll never need it. Solaris won't let you drive underinsured.

On 29 Oct 2010, at 22:29, James Litchfield  wrote:

> This is what Oracle says about swap for 11gR2. The comment about subtracting 
> ISM is not
> correct. A simple test shows that ISM does consume swap (even if it's not 
> DISM). Think
> about what happens when a memory segment is created (before it goes to ISM), 
> if someone
> happens to attach in non-ISM mode and when everyone detaches from the segment 
> and it
> ceases to be ISM). In the first and last stage swap space is *required* and 
> the VM system
> reserves the space needed when the segment is first created.
> 
> I would be cautious about Oracle assurances...
> 
> Jim
> ---
> 
> 
> -- 
> 
> James Litchfield | Senior Consultant
> Phone: +1 4082237059 | Mobile: +1 4082180790 
> Oracle ACS
> California

Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Robin Cotgrove
> This is what Oracle says about swap for 11gR2. The
> comment about 
> subtracting ISM is not
> correct. A simple test shows that ISM does consume
> swap (even if it's 
> not DISM). Think
> about what happens when a memory segment is created
> (before it goes to 
> ISM), if someone
> happens to attach in non-ISM mode and when everyone
> detaches from the 
> segment and it
> ceases to be ISM). In the first and last stage swap
> space is *required* 
> and the VM system
> reserves the space needed when the segment is first
> created.

I agree with you. In our case disabling the use of DISM really helped to make 
the platform more stable and helped with overall memory usage. 

By the way, we're using Oracle 10.2.0.4. No use of Oracle 11gR2 yet. 

We have 192GB of physical memory and 96GB of swap device. The SGA/PGA  sizes of 
all the Oracle DB's fit well within the 192GB leaving a consistent ~50GB spare. 
Memory consumption stays stable on the platform and doesn't go up and down. 
This is the nature of the Oracle DB's allocating memory at start-up. 

> 
> I would be cautious about Oracle assurances...

Yep
> 
> Jim
> ---
> 

Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread James Litchfield


  
  
This is what Oracle says about swap for 11gR2. The comment about
subtracting ISM is not
correct. A simple test shows that ISM does consume swap (even if
it's not DISM). Think
about what happens when a memory segment is created (before it goes
to ISM), if someone
happens to attach in non-ISM mode and when everyone detaches from
the segment and it
ceases to be ISM). In the first and last stage swap space is
*required* and the VM system
reserves the space needed when the segment is first created.

I would be cautious about Oracle assurances...

Jim
---


-- 
  
James Litchfield | Senior Consultant
Phone: +1 4082237059 | Mobile: +1 4082180790
Oracle ACS, California

___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org

Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Robin Cotgrove
> On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove wrote:
> > Sorry guys. Swap is not the issue. We've had this confirmed by Oracle
> > and I can clearly see there is 96GB of swap available on the system
> > and ~50GB of main memory.
> 
> By who at Oracle?  Not everyone is equally qualified.  I would tend to
> trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
> performance, & dtrace) over most of the people you will get to through
> normal support channels.

Agreed. The normal support channel told us the GUDS script would be better 
for capturing the root cause than producing a memory dump.

> 
> 1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/
> 
> How do you know that available swap doesn't momentarily drop?

Because I have been monitoring it during the issues with vmstat, and I also 
understand the workload on the platform well enough to know that nothing with 
huge memory requirements is suddenly starting. This is a VCS cluster with 
Oracle Database Resource Groups. DISM is not in use by the various Oracle DBs, 
as we ran into a bug with it some months ago. We've since patched the system, 
but we don't need DISM on this dev/test Oracle VCS cluster.

> I've run into plenty of instances where a system has tens of gigabytes of
> free memory but is woefully short on reservable swap (virtual memory,
> as Jim approximates).  Usually "vmstat 1" is helpful in observing
> spikes, but as I said before this could miss very short spikes.  If
> you've already done this to see that swap is unlikely to be an issue,
> knowing that would be useful to know.  If you are measuring the amount
> of reservable swap with "swap -l", you are doing it wrong.

Agreed. I don't use it and I don't trust the output from the top utility either 
:-) 

-- 
This message posted from opensolaris.org
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Jim Mauro
Thanks Mike. Good point on the script.

Indeed, use of speculative tracing would be a better
fit here. I'll see if I can get something together and 
send it out.

Thanks,
/jim

On Oct 29, 2010, at 4:45 PM, Mike Gerdts wrote:

> On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove  wrote:
>> Sorry guys. Swap is not the issue. We've had this confirmed by Oracle and I 
>> can clearly see there is 96GB of swap available on the system and ~50GB of 
>> main memory.
> 
> By who at Oracle?  Not everyone is equally qualified.  I would tend to
> trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
> performance, & dtrace) over most of the people you will get to through
> normal support channels.
> 
> 1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/
> 
> How do you know that available swap doesn't momentarily drop?  I've
> run into plenty of instances where a system has tens of gigabytes of
> free memory but is woefully short on reservable swap (virtual memory,
> as Jim approximates).  Usually "vmstat 1" is helpful in observing
> spikes, but as I said before this could miss very short spikes.  If
> you've already done this to see that swap is unlikely to be an issue,
> knowing that would be useful to know.  If you are measuring the amount
> of reservable swap with "swap -l", you are doing it wrong.
> 
> I do agree that there can be other shortfalls that can cause this.
> This may call for speculative tracing of stacks across the fork entry
> and return calls, displaying results only when the fork fails with
> EAGAIN.  Jim's second script is similar to what I suggest, except that
> it doesn't show the code path taken between syscall::forksys:entry and
> syscall::forksys:return.
> 
> Also, I would be a little careful running the second script as is for
> long periods of time if you have a lot of forksys activity with unique
> stacks.  I think that as it is @ks may grow rather large over time
> because the successful forks are not cleared.
> 
> -- 
> Mike Gerdts
> http://mgerdts.blogspot.com/
> ___
> dtrace-discuss mailing list
> dtrace-discuss@opensolaris.org

___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Mike Gerdts
On Fri, Oct 29, 2010 at 2:50 PM, Robin Cotgrove  wrote:
> Sorry guys. Swap is not the issue. We've had this confirmed by Oracle and I 
> can clearly see there is 96GB of swap available on the system and ~50GB of 
> main memory.

By who at Oracle?  Not everyone is equally qualified.  I would tend to
trust Jim Mauro (who co-wrote the books[1] on Solaris internals,
performance, & dtrace) over most of the people you will get to through
normal support channels.

1. http://www.amazon.com/Jim-Mauro/e/B001ILM8NC/

How do you know that available swap doesn't momentarily drop?  I've
run into plenty of instances where a system has tens of gigabytes of
free memory but is woefully short on reservable swap (virtual memory,
as Jim approximates).  Usually "vmstat 1" is helpful in observing
spikes, but as I said before this could miss very short spikes.  If
you've already done this to see that swap is unlikely to be an issue,
knowing that would be useful to know.  If you are measuring the amount
of reservable swap with "swap -l", you are doing it wrong.
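To make the "swap -s", not "swap -l" point concrete: the "available" figure from "swap -s" counts both unreserved physical swap and reservable free RAM, while "swap -l" only lists swap devices. A small sketch parsing a sample "swap -s" line (the numbers below are made up for illustration):

```shell
# Typical shape of Solaris "swap -s" output (sample values, not real):
swap_s_output='total: 10485760k bytes allocated + 2097152k reserved = 12582912k used, 100663296k available'

# Pull out the "available" figure, i.e. how much VM fork() can still reserve.
avail_kb=$(echo "$swap_s_output" | sed 's/.* \([0-9]*\)k available/\1/')
echo "available swap: ${avail_kb} KB"
```

Sampling this value frequently (say, once a second) is what catches the momentary dips that make fork() fail with EAGAIN.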

I do agree that there can be other shortfalls that can cause this.
This may call for speculative tracing of stacks across the fork entry
and return calls, displaying results only when the fork fails with
EAGAIN.  Jim's second script is similar to what I suggest, except that
it doesn't show the code path taken between syscall::forksys:entry and
syscall::forksys:return.

Also, I would be a little careful running the second script as is for
long periods of time if you have a lot of forksys activity with unique
stacks.  I think that as it is @ks may grow rather large over time
because the successful forks are not cleared.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread James Litchfield


  
  
I would start with adding swap. oracle's swap recommendations are
utterly bogus.

Jim
===

On 10/29/2010 11:27 AM, Jim Mauro wrote:

  Mike is correct. Pretty much every time I've seen this, it's
VM (VM = virtual memory = swap) related.

There's a DTrace script below you can run when you hit this
problem that will show us which system call is failing with an
EAGAIN error. It is most likely fork(2) (and yes, I know printing
the errno in the return action is superfluous given we use it
in the predicate - it's me being OCD and sanity checking).

A second DTrace script further down should provide a kernel
stack trace if it is a fork(2) failure.

Or (disk is cheap) "swap -a" (add swap space) and see if the
problem goes away.

Thanks
/jim


#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall:::entry
{
	self->flag[probefunc] = 1;
}
syscall:::return
/self->flag[probefunc] && errno == 11/
{
	printf("syscall: %s, arg0: %d, arg1: %d, errno: %d\n\n",probefunc,arg0,arg1,errno);
	self->flag[probefunc] = 0;
}




#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall::forksys:entry
{
	self->flag = 1;
	@ks[stack(),ustack()] = count();
}
syscall::forksys:return
/self->flag && arg0 == -1 && errno != 0/
{
	printf("fork failed, errno: %d\n",errno);
	printa(@ks);
	clear(@ks);
	exit(0);
}


On Oct 29, 2010, at 12:00 PM, Robin Cotgrove wrote:


  
I need some assistance and guidance in writing a DTrace script or, even better, finding an example one which would help me identify what's going on in our system. Intermittently (we think it might be happening after about 60 days of uptime), on an E2900 (192GB, 24 cores) running Solaris 10 11/06 with a fairly new patch cluster (Generic_142900-13), processes suddenly fail to start with the error 'Resource temporarily unavailable'. This is leading to Oracle crash/startup issues.

I ran a simple du command at the time it was happening and got the following response.

‘du: No more processes: Resource temporarily unavailable’ 

Approximately 6500 TCP connections on the server at the time, and about 6000 UNIX processes. The max UNIX processes per user is set to 29995. 60GB of free physical memory and no swap being used. Absolutely baffling us at the moment.

We've not managed to truss a failing command when it happened yet because it's so intermittent in its nature.

We've checked all the usual suspects, including max processes per user, and cannot find the cause. We need a way to monitor all the internal kernel resources to see what we're hitting. Suggestions on a postcard, please. All welcome.

Robin Cotgrove
-- 
This message posted from opensolaris.org




-- 
James Litchfield | Senior Consultant
Phone: +1 4082237059 | Mobile: +1 4082180790
Oracle ACS
California

Oracle is committed to developing practices and
products that help protect the environment


Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Robin Cotgrove
Sorry guys. Swap is not the issue. We've had this confirmed by Oracle and I can 
clearly see there is 96GB of swap available on the system and ~50GB of free 
main memory. 

Not everything relating to forking problems is swap. We have had similar 
forking issues in the past: one we solved by adding a swap file, and in 
another case shared memory was being restricted by a Solaris project setting. 
File descriptor limits being hit is another good one, and max processes per 
user is another common one. All common causes, but this one is weird and we 
don't know what it is. 

I like the DTrace scripts, though. Very useful for making the values a lot 
clearer for people to interpret.
-- 
This message posted from opensolaris.org


Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Jim Mauro
Mike is correct. Pretty much every time I've seen this, it's
VM (VM = virtual memory = swap) related.

There's a DTrace script below you can run when you hit this
problem that will show us which system call is failing with an
EAGAIN error. It is most likely fork(2) (and yes, I know printing
the errno in the return action is superfluous given we use it
in the predicate - it's me being OCD and sanity checking).

A second DTrace script further down should provide a kernel
stack trace if it is a fork(2) failure.

Or (disk is cheap) "swap -a" (add swap space) and see if the
problem goes away.

Thanks
/jim


#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall:::entry
{
	self->flag[probefunc] = 1;
}
syscall:::return
/self->flag[probefunc] && errno == 11/
{
	printf("syscall: %s, arg0: %d, arg1: %d, errno: %d\n\n",
	    probefunc, arg0, arg1, errno);
	self->flag[probefunc] = 0;
}




#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall::forksys:entry
{
	self->flag = 1;
	@ks[stack(), ustack()] = count();
}
syscall::forksys:return
/self->flag && arg0 == -1 && errno != 0/
{
	printf("fork failed, errno: %d\n", errno);
	printa(@ks);
	clear(@ks);
	exit(0);
}


On Oct 29, 2010, at 12:00 PM, Robin Cotgrove wrote:

> I need some assistance and guidance in writing a DTRACE script or even 
> better, finding an example one which would help me identify what's going on 
> our system. Intermittently, and we think it might be happening after about 60 
> days, on an E2900, 192GB, 24 core, Solaris 10 11/06 system with a fairly new 
> patch cluster (Generic_142900-13), we are running into a problem whereby 
> processes suddenly fail to start with the error 'Resource temporarily 
> unavailable'. This is leading to Oracle crash/startup issues.
> 
> I ran a simple du command at the time it was happening and got the following 
> response.
> 
> ‘du: No more processes: Resource temporarily unavailable’ 
> 
> Approximately 6500 TCP connections on server at time. 6000 unix processes. 
> The max UNIX processes per user is set to 29995. 60GB free physical memory 
> and no swap being used. Absolutely baffling us at mo. 
> 
> Not managed to truss a failing command when it happened yet because it's so 
> intermittent in its nature.
> 
> We've checked all the usual suspects including max processes per user and 
> cannot find the cause. Need a way to monitor all the internal kernel 
> resources to see what we're hitting. Suggestions please on a postcard. All 
> welcome. 
> 
> Robin Cotgrove
> -- 
> This message posted from opensolaris.org


Re: [dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Mike Gerdts
On Fri, Oct 29, 2010 at 11:00 AM, Robin Cotgrove  wrote:
> I need some assistance and guidance in writing a DTRACE script or even 
> better, finding an example one which would help me identify what's going on 
> our system. Intermittently, and we think it might be happening after about 60 
> days, on an E2900, 192GB, 24 core, Solaris 10 11/06 system with a fairly new 
> patch cluster (Generic_142900-13), we are running into a problem whereby 
> processes suddenly fail to start with the error 'Resource temporarily 
> unavailable'. This is leading to Oracle crash/startup issues.
>
> I ran a simple du command at the time it was happening and got the following 
> response.
>
> ‘du: No more processes: Resource temporarily unavailable’

Does anything get logged to /var/adm/messages?

>
> Approximately 6500 TCP connections on server at time. 6000 unix processes. 
> The max UNIX processes per user is set to 29995. 60GB free physical memory 
> and no swap being used. Absolutely baffling us at mo.

Swap may not be used, but it is certainly reserved.  Note that Solaris
has multiple definitions of swap.  That disk space you allocated and
called "swap" is one thing.  The overall RAM and swap device backed
address space is another.

Unlike Linux (default config), Solaris does not allow memory to be
overcommitted.  If something does malloc(1024 * 1024 * 1024 * 1024),
the call will fail on Solaris unless you have 1 TB of free "swap"
(memory + swap devices).  On Linux, the malloc would likely succeed.
If you actually start writing to more pages of the allocated memory
than your system has in RAM + swap devices, the Linux Out of Memory
Killer will kick in and start selecting processes to kill to free up
memory.

We can see this with two runs of /opt/DTT/Mem/swapinfo.d on my
OpenSolaris system.  You can get this for Solaris 10 as part of the
DTraceToolkit.

# /opt/DTT/Mem/swapinfo.d
...
Swap ___Total  2496 MB
Swap Resv   619 MB
SwapAvail  1877 MB
Swap(Minfree)   222 MB

# /opt/DTT/Mem/swapinfo.d
...
Swap ___Total  2224 MB
Swap Resv  2047 MB
SwapAvail   176 MB
Swap(Minfree)   222 MB


One thing I just noticed - minfree does not become 176 MB as I would
have expected.  Be careful with that value!

Why was there such a big difference in Avail?  Because I ran this program:

/* Save as foo.c then compile with gcc -o foo foo.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
	if ( malloc(1024 * 1024 * 1700) == NULL ) {
		perror("malloc");
		exit(1);
	}
	sleep(5);
	exit(0);
}

A likely scenario that would cause a database server to temporarily
reserve a lot more swap is when a new oracle process is created.  When
a process forks, memory is reserved for all of the pages of memory
that are anonymous (e.g. not an mmapped file or device), read-write,
and not shared.  This is required to support the copy-on-write
mechanism used by the virtual memory system.  You can use pmap to take
a look at the memory mappings of a process to get an idea of how much
space this takes.
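
One way to watch these transient reservations directly is a sketch like
the following. Caveat: it hooks anon_resvmem(), the kernel routine that
reserves anon (swap) space, via the unstable fbt provider; the function
name and its arguments are implementation details that can change
between kernel releases, so verify them on your system first.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* anon_resvmem() reserves anon (swap) space; arg0 is the size in
 * bytes.  Unstable fbt interface - check your kernel. */
fbt::anon_resvmem:entry
{
	@resv[execname] = sum(arg0);
}

tick-10s
{
	printf("bytes of swap reserved per process name, last 10s:\n");
	printa(@resv);
	trunc(@resv);
}
```

A burst of reservations attributed to oracle at fork time would confirm
the copy-on-write scenario described above.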

To look at the amount of available swap that matters, refer to the
swap column of vmstat.  For things like this that are transient, you
may have trouble seeing it, even with "vmstat 1".  Note that while you
are looking at vmstat output, you should always ignore the first line
of output - it is a pretty much useless average since boot.  If you
need to get values at a higher resolution, you may want to adapt
swapinfo.d from the DTraceToolkit to use the profile provider to
quantize the available swap value.
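
A rough sketch of that adaptation (untested): the kernel variables are
the same unstable internals that swapinfo.d reads (`k_anoninfo,
`availrmem, `swapfs_minfree, `_pagesize), and their names or semantics
can change between Solaris releases.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* sample reservable swap ~100 times a second so short dips show up */
profile-97hz
{
	this->disk = (`k_anoninfo.ani_max -
	    `k_anoninfo.ani_phys_resv) * `_pagesize;
	this->mem = (`availrmem - `swapfs_minfree) * `_pagesize;
	@["reservable swap (MB)"] =
	    quantize((this->disk + this->mem) / 1048576);
}

tick-30s
{
	printa(@);
	exit(0);
}
```

The minimum bucket in the distribution is the value to watch: if it
dips toward zero around the fork failures, swap reservation is your
problem.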

>
> Not managed to truss a failing command when it happened yet because it's so 
> intermittent in its nature.
>
> We've checked all the usual suspects including max processes per user and 
> cannot find the cause. Need a way to monitor all the internal kernel 
> resources to see what we're hitting. Suggestions please on a postcard. All 
> welcome.

It seems quite likely to me that you will find that the swap that is
available to reserve temporarily dips to a minuscule value.  If this
is the case, adding more swap will help.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


[dtrace-discuss] Solaris Internals Resource Threshold being hit

2010-10-29 Thread Robin Cotgrove
I need some assistance and guidance in writing a DTrace script or, even better, 
finding an example one which would help me identify what's going on in our 
system. Intermittently (we think it might be happening after about 60 days of 
uptime), on an E2900 (192GB, 24 cores) running Solaris 10 11/06 with a fairly 
new patch cluster (Generic_142900-13), processes suddenly fail to start with 
the error 'Resource temporarily unavailable'. This is leading to Oracle 
crash/startup issues.
 
I ran a simple du command at the time it was happening and got the following 
response.
 
‘du: No more processes: Resource temporarily unavailable’ 
 
Approximately 6500 TCP connections on the server at the time, and about 6000 
UNIX processes. The max UNIX processes per user is set to 29995. 60GB of free 
physical memory and no swap being used. Absolutely baffling us at the moment. 
 
We've not managed to truss a failing command when it happened yet because it's 
so intermittent in its nature. 
 
We've checked all the usual suspects, including max processes per user, and 
cannot find the cause. We need a way to monitor all the internal kernel 
resources to see what we're hitting. Suggestions on a postcard, please. All welcome. 
 
Robin Cotgrove
-- 
This message posted from opensolaris.org