Re: CVS commit: src/sys/arch/powerpc

2010-11-08 Thread Masao Uebayashi
On Sun, Nov 07, 2010 at 01:46:06PM +1100, Simon Burge wrote:
 Masao Uebayashi wrote:
 
  Module Name:src
  Committed By:   uebayasi
  Date:   Sat Nov  6 16:36:27 UTC 2010
  
  Modified Files:
  
  src/sys/arch/powerpc/ibm4xx: pmap.c
  src/sys/arch/powerpc/include/ibm4xx: vmparam.h
  
  Log Message:
  
  Merge from uebayasi-xip:
  
  revision 1.60.2.5
  date: 2010/08/14 02:09:57;  author: uebayasi;  state: Exp;  lines: +2 -1
  Teach TLB miss handler (pmap_tlbmiss()) to map Expansion ROM area as
  PA == VA.  Now we don't need to reserve a TLB entry for it.
  
 
 At a quick glance, this seems to hard-wire in device addresses which may
 not be the same on every ibm4xx platform (the 0xef00 address).
 
 Also I'm not sure I like the idea of modifying a general pmap to have
 knowledge of device-specific Expansion ROM address ranges.  Are these
 _always_ guaranteed to live in the address range you've specified or
 is there a chance a ROM could be mapped at different addresses on some
 boards?  The change as is also limits expansion ROMs to a bit over
 256MB which mightn't be a problem now but also indicates a problem with
 hardcoding specific values in a general pmap.
 
 I'm not sure what the correct way of doing this is.  I think I'd
 prefer to see some sort of registration of expansion ROM space that you
 want to reserve, but would need to put some more thought in to it.

I'm sure what the correct way should be.

This is part of fault handler specific to CPU.  If we decide to
map part kernel VA to fixed addreess backed by fixed pages on this
CPU, we *resolve* it first.  Otherwise we'll try more general path.

So, we follow the strict definition that pmap(9) manages per-virtual
space (proc) address translation information, this pmap_tlbmiss()
breaks the rule.

 
 If this change was to remain in its current form, the comment
 immediately above the change you made in the tlb miss handler is now
 incorrect and would need to be adjusted.
 
 Cheers,
 Simon.


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread David Holland
On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
  Modified Files:
   src/sys/arch/mips/mips: locore_mips1.S
  
  Log Message:
  In TLBRead, restore PID before doing the saves so that the caller's
  TLB entries are used instead of the PID given as the argument.
  
  from Alessandro Forin

This doesn't make any sense.

As far as I can tell, the real problem is that the code is not
attending to pipeline hazards properly; moving things around until it
experimentally seems to work is just going to create other odd
behavior sometime down the line under different timing circumstances.

I don't have a mips1-specific reference on hand, but my recollection
is that you need *three* nops when messing with cop0 registers for all
effects to flush through.

Do you have any further explanations or reports of symptoms? I don't
see anything on any of the mips lists.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread Antti Kantee
On Mon Nov 08 2010 at 21:40:49 +, David Holland wrote:
 On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
   Modified Files:
  src/sys/arch/mips/mips: locore_mips1.S
   
   Log Message:
   In TLBRead, restore PID before doing the saves so that the caller's
   TLB entries are used instead of the PID given as the argument.
   
   from Alessandro Forin
 
 This doesn't make any sense.
 
 As far as I can tell, the real problem is that the code is not
 attending to pipeline hazards properly; moving things around until it
 experimentally seems to work is just going to create other odd
 behavior sometime down the line under different timing circumstances.
 
 I don't have a mips1-specific reference on hand, but my recollection
 is that you need *three* nops when messing with cop0 registers for all
 effects to flush through.

locore_mips1.S doesn't agree with your recollection, e.g. mips1_TLBUpdate
right above mips1_TLBRead.


Re: CVS commit: src/sys/kern

2010-11-08 Thread Masao Uebayashi
The root cause of such a problem is, we don't do I/O scheduling at all.

I think that pool(9) (and pool_cache(9)) is a great tool to manage
limited resources.  The way to go would be to extend it to manage
bandwidth in I/O subsystems.  (We already use pool_cache(9) for
KVA.)

Resources are shared, and have dependencies.  I/O resources would
look like some hierachical structure chained with pool backend
callbacks.  Probably combined with device tree too.

That said, I have no idea how to *fix* our networking stack.  It's
unmaintainable IMO...

On Thu, Oct 14, 2010 at 05:07:39PM +0900, o...@netbsd.org wrote:
 Hello,
 
 Mindaugas said:
  Can you explain what the actual problem M_WAIT causes i.e. how would
  it freeze network stack?
 
 In M_WAIT case, m_reclaim() will run and run until get mbuf cluster
 if mclpool limit reached.  If m_reclaim() repeatedly but cannot to
 get new mbuf cluster, m_clget() will not return.
 
 network stacks using mbufs is use with M_DONTWAIT, but it will failed
 to get new mbuf cluster in this case.  freeze means that.
 
 Generally NMBCLUSTERS is enough and SB_MAX is small, the problem
 is rarely caused.  It depends parameter configuration, I think.
 
 In other words, to cause this problem,
 - many packets receives from the network, it will read by userland
 - userland program sends many packets, but not read packet from the stack
 - and mclpool limit reached.
 - no one calls pool_put()
 
 In M_DONTWAIT case, M_EXT flag is result of m_clget().
 code of checking M_EXT is already prepared, but didn't used
 it with M_WAIT.  I think this is careless miss.
 
 Thanks.
 --
 Masaru OKI o...@netbsd.org


Re: CVS commit: src/sys/arch/mips/mips

2010-11-08 Thread Simon Burge
David Holland wrote:

 On Mon, Nov 08, 2010 at 06:09:39PM +, Antti Kantee wrote:
   Modified Files:
  src/sys/arch/mips/mips: locore_mips1.S
   
   Log Message:
   In TLBRead, restore PID before doing the saves so that the caller's
   TLB entries are used instead of the PID given as the argument.
   
   from Alessandro Forin
 
 [ ... ]
 
 Do you have any further explanations or reports of symptoms? I don't
 see anything on any of the mips lists.

I'm curious about this too.  That code hasn't changed for at least
14 years...

Cheers,
Simon.


Re: CVS commit: src/sys/kern

2010-11-08 Thread Masao Uebayashi
The first step would be to analyze code, identify glocal resource
allocations, and replace them to local ones.  m_get() / m_clget()
would become per-socket, by allocating per-socket pool_cache.

On Tue, Nov 09, 2010 at 12:36:16PM +0900, Masao Uebayashi wrote:
 The root cause of such a problem is, we don't do I/O scheduling at all.
 
 I think that pool(9) (and pool_cache(9)) is a great tool to manage
 limited resources.  The way to go would be to extend it to manage
 bandwidth in I/O subsystems.  (We already use pool_cache(9) for
 KVA.)
 
 Resources are shared, and have dependencies.  I/O resources would
 look like some hierachical structure chained with pool backend
 callbacks.  Probably combined with device tree too.
 
 That said, I have no idea how to *fix* our networking stack.  It's
 unmaintainable IMO...
 
 On Thu, Oct 14, 2010 at 05:07:39PM +0900, o...@netbsd.org wrote:
  Hello,
  
  Mindaugas said:
   Can you explain what the actual problem M_WAIT causes i.e. how would
   it freeze network stack?
  
  In M_WAIT case, m_reclaim() will run and run until get mbuf cluster
  if mclpool limit reached.  If m_reclaim() repeatedly but cannot to
  get new mbuf cluster, m_clget() will not return.
  
  network stacks using mbufs is use with M_DONTWAIT, but it will failed
  to get new mbuf cluster in this case.  freeze means that.
  
  Generally NMBCLUSTERS is enough and SB_MAX is small, the problem
  is rarely caused.  It depends parameter configuration, I think.
  
  In other words, to cause this problem,
  - many packets receives from the network, it will read by userland
  - userland program sends many packets, but not read packet from the stack
  - and mclpool limit reached.
  - no one calls pool_put()
  
  In M_DONTWAIT case, M_EXT flag is result of m_clget().
  code of checking M_EXT is already prepared, but didn't used
  it with M_WAIT.  I think this is careless miss.
  
  Thanks.
  --
  Masaru OKI o...@netbsd.org

-- 
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635