Re: CVS commit: src/sys/kern

2010-11-08 Thread Masao Uebayashi
The first step would be to analyze code, identify glocal resource
allocations, and replace them to local ones.  m_get() / m_clget()
would become per-socket, by allocating per-socket pool_cache.

On Tue, Nov 09, 2010 at 12:36:16PM +0900, Masao Uebayashi wrote:
> The root cause of such a problem is, we don't do I/O scheduling at all.
> 
> I think that pool(9) (and pool_cache(9)) is a great tool to manage
> limited resources.  The way to go would be to extend it to manage
> bandwidth in I/O subsystems.  (We already use pool_cache(9) for
> KVA.)
> 
> Resources are shared, and have dependencies.  I/O resources would
> look like some hierachical structure chained with pool backend
> callbacks.  Probably combined with device tree too.
> 
> That said, I have no idea how to *fix* our networking stack.  It's
> unmaintainable IMO...
> 
> On Thu, Oct 14, 2010 at 05:07:39PM +0900, o...@netbsd.org wrote:
> > Hello,
> > 
> > Mindaugas said:
> > > Can you explain what the actual problem M_WAIT causes i.e. how would
> > > it "freeze network stack"?
> > 
> > In M_WAIT case, m_reclaim() will run and run until get mbuf cluster
> > if mclpool limit reached.  If m_reclaim() repeatedly but cannot to
> > get new mbuf cluster, m_clget() will not return.
> > 
> > network stacks using mbufs is use with M_DONTWAIT, but it will failed
> > to get new mbuf cluster in this case.  "freeze" means that.
> > 
> > Generally NMBCLUSTERS is enough and SB_MAX is small, the problem
> > is rarely caused.  It depends parameter configuration, I think.
> > 
> > In other words, to cause this problem,
> > - many packets receives from the network, it will read by userland
> > - userland program sends many packets, but not read packet from the stack
> > - and mclpool limit reached.
> > - no one calls pool_put()
> > 
> > In M_DONTWAIT case, M_EXT flag is result of m_clget().
> > code of checking M_EXT is already prepared, but didn't used
> > it with M_WAIT.  I think this is careless miss.
> > 
> > Thanks.
> > --
> > Masaru OKI 

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread Simon Burge
David Holland wrote:

> On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
>  > Modified Files:
>  >src/sys/arch/mips/mips: locore_mips1.S
>  > 
>  > Log Message:
>  > In TLBRead, restore PID before doing the saves so that the caller's
>  > TLB entries are used instead of the PID given as the argument.
>  > 
>  > from Alessandro Forin
> 
> [ ... ]
> 
> Do you have any further explanations or reports of symptoms? I don't
> see anything on any of the mips lists.

I'm curious about this too.  That code hasn't changed for at least
14 years...

Cheers,
Simon.


Re: CVS commit: src/sys/kern

2010-11-08 Thread Masao Uebayashi
The root cause of such a problem is, we don't do I/O scheduling at all.

I think that pool(9) (and pool_cache(9)) is a great tool to manage
limited resources.  The way to go would be to extend it to manage
bandwidth in I/O subsystems.  (We already use pool_cache(9) for
KVA.)

Resources are shared, and have dependencies.  I/O resources would
look like some hierachical structure chained with pool backend
callbacks.  Probably combined with device tree too.

That said, I have no idea how to *fix* our networking stack.  It's
unmaintainable IMO...

On Thu, Oct 14, 2010 at 05:07:39PM +0900, o...@netbsd.org wrote:
> Hello,
> 
> Mindaugas said:
> > Can you explain what the actual problem M_WAIT causes i.e. how would
> > it "freeze network stack"?
> 
> In M_WAIT case, m_reclaim() will run and run until get mbuf cluster
> if mclpool limit reached.  If m_reclaim() repeatedly but cannot to
> get new mbuf cluster, m_clget() will not return.
> 
> network stacks using mbufs is use with M_DONTWAIT, but it will failed
> to get new mbuf cluster in this case.  "freeze" means that.
> 
> Generally NMBCLUSTERS is enough and SB_MAX is small, the problem
> is rarely caused.  It depends parameter configuration, I think.
> 
> In other words, to cause this problem,
> - many packets receives from the network, it will read by userland
> - userland program sends many packets, but not read packet from the stack
> - and mclpool limit reached.
> - no one calls pool_put()
> 
> In M_DONTWAIT case, M_EXT flag is result of m_clget().
> code of checking M_EXT is already prepared, but didn't used
> it with M_WAIT.  I think this is careless miss.
> 
> Thanks.
> --
> Masaru OKI 


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread Warner Losh

 On 11/08/2010 15:19, Antti Kantee wrote:

On Mon Nov 08 2010 at 21:40:49 +, David Holland wrote:

On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
  >  Modified Files:
  >  src/sys/arch/mips/mips: locore_mips1.S
  >
  >  Log Message:
  >  In TLBRead, restore PID before doing the saves so that the caller's
  >  TLB entries are used instead of the PID given as the argument.
  >
  >  from Alessandro Forin

This doesn't make any sense.

As far as I can tell, the real problem is that the code is not
attending to pipeline hazards properly; moving things around until it
experimentally seems to work is just going to create other odd
behavior sometime down the line under different timing circumstances.

I don't have a mips1-specific reference on hand, but my recollection
is that you need *three* nops when messing with cop0 registers for all
effects to flush through.

locore_mips1.S doesn't agree with your recollection, e.g. mips1_TLBUpdate
right above mips1_TLBRead.
mips1 had a 2 stage pipeline, so at most you'd need 1 nop.  3 nops is 
for the R4000's and similar.  Then it got weird with multi-issue, 
threads, and sometimes silicon bugs that required more before MIPS32r2 
and MIPS64r2 codified a single, special kind of nop that did the trick.


Warner


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread Antti Kantee
On Mon Nov 08 2010 at 21:40:49 +, David Holland wrote:
> On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
>  > Modified Files:
>  >src/sys/arch/mips/mips: locore_mips1.S
>  > 
>  > Log Message:
>  > In TLBRead, restore PID before doing the saves so that the caller's
>  > TLB entries are used instead of the PID given as the argument.
>  > 
>  > from Alessandro Forin
> 
> This doesn't make any sense.
> 
> As far as I can tell, the real problem is that the code is not
> attending to pipeline hazards properly; moving things around until it
> experimentally seems to work is just going to create other odd
> behavior sometime down the line under different timing circumstances.
> 
> I don't have a mips1-specific reference on hand, but my recollection
> is that you need *three* nops when messing with cop0 registers for all
> effects to flush through.

locore_mips1.S doesn't agree with your recollection, e.g. mips1_TLBUpdate
right above mips1_TLBRead.


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread David Holland
On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
 > Modified Files:
 >  src/sys/arch/mips/mips: locore_mips1.S
 > 
 > Log Message:
 > In TLBRead, restore PID before doing the saves so that the caller's
 > TLB entries are used instead of the PID given as the argument.
 > 
 > from Alessandro Forin

This doesn't make any sense.

As far as I can tell, the real problem is that the code is not
attending to pipeline hazards properly; moving things around until it
experimentally seems to work is just going to create other odd
behavior sometime down the line under different timing circumstances.

I don't have a mips1-specific reference on hand, but my recollection
is that you need *three* nops when messing with cop0 registers for all
effects to flush through.

Do you have any further explanations or reports of symptoms? I don't
see anything on any of the mips lists.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/tests/fs/tmpfs

2010-11-08 Thread David Holland
On Mon, Nov 08, 2010 at 03:25:50PM +, Antti Kantee wrote:
 > Module Name: src
 > Committed By:pooka
 > Date:Mon Nov  8 15:25:50 UTC 2010
 > 
 > Modified Files:
 >  src/tests/fs/tmpfs: t_mkdir.sh
 > 
 > Log Message:
 > Use su -m since _atf has nologin as its shell.
 > XXX: this will still fail if root's shell is csh

If you're going to go to the trouble of littering the system with test
users and groups, why not create a wrapper program that behaves in a
useful and predictable manner instead of using su and blowing up on
random user settings?

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/arch/powerpc

2010-11-08 Thread Masao Uebayashi
On Sun, Nov 07, 2010 at 01:46:06PM +1100, Simon Burge wrote:
> "Masao Uebayashi" wrote:
> 
> > Module Name:src
> > Committed By:   uebayasi
> > Date:   Sat Nov  6 16:36:27 UTC 2010
> > 
> > Modified Files:
> > 
> > src/sys/arch/powerpc/ibm4xx: pmap.c
> > src/sys/arch/powerpc/include/ibm4xx: vmparam.h
> > 
> > Log Message:
> > 
> > Merge from uebayasi-xip:
> > 
> > revision 1.60.2.5
> > date: 2010/08/14 02:09:57;  author: uebayasi;  state: Exp;  lines: +2 -1
> > Teach TLB miss handler (pmap_tlbmiss()) to map "Expansion ROM" area as
> > PA == VA.  Now we don't need to reserve a TLB entry for it.
> > 
> 
> At a quick glance, this seems to hard-wire in device addresses which may
> not be the same on every ibm4xx platform (the 0xef00 address).
> 
> Also I'm not sure I like the idea of modifying a general pmap to have
> knowledge of device-specific "Expansion ROM" address ranges.  Are these
> _always_ guaranteed to live in the address range you've specified or
> is there a chance a ROM could be mapped at different addresses on some
> boards?  The change as is also limits expansion ROMs to a bit over
> 256MB which mightn't be a problem now but also indicates a problem with
> hardcoding specific values in a general pmap.
> 
> I'm not sure what the "correct" way of doing this is.  I think I'd
> prefer to see some sort of registration of expansion ROM space that you
> want to reserve, but would need to put some more thought in to it.

I'm sure what the correct way should be.

This is part of fault handler specific to CPU.  If we decide to
map part kernel VA to fixed addreess backed by fixed pages on this
CPU, we *resolve* it first.  Otherwise we'll try more general path.

So, we follow the strict definition that pmap(9) manages per-virtual
space (proc) address translation information, this pmap_tlbmiss()
breaks the rule.

> 
> If this change was to remain in its current form, the comment
> immediately above the change you made in the tlb miss handler is now
> incorrect and would need to be adjusted.
> 
> Cheers,
> Simon.