Re: [Milkymist port] virtual memory management

2014-02-09 Thread Yann Sionneau

Hello Eduardo,

On 30/05/13 22:45, Eduardo Horvath wrote:

On Wed, 29 May 2013, Yann Sionneau wrote:


Hello NetBSD fellows,

As I mentioned in my previous e-mail, I may need from time to time a little
bit of help since this is my first full featured OS port.

I am wondering how I can manage virtual memory (especially how to avoid tlb
miss, or deal with them) in exception handlers.

There are essentially three ways to do this.  Which one you choose depends
on the hardware.

1) Turn off the MMU on exception

2) Keep parts of the address space untranslated

3) Lock important pages into the TLB


Turning off the MMU is pretty straightforward.  ISTR the PowerPC Book E
processors do this.  Just look up the TLB entry in the table and return
from the exception.  You just have to make sure that the kernel manages
page tables using physical addresses.
This seems like the easiest thing to do (because I won't have to think 
about recursive faults), but then if I put physical addresses in my 1st 
level page table, how does the kernel manage the page table entries?
Since the kernel runs with the MMU on, using virtual addresses, it cannot 
dereference physical pointers, so it cannot add/modify/remove PTEs, right?
I'm sure there is some kernel internal mechanism that I don't know about 
which could help me get the virtual address from the physical one; 
do you know which mechanism it would be?


Also, is it possible to make sure that everything (in kernel space) is 
mapped so that virtual_addr = physical_addr - RAM_START_ADDR + 
virtual_offset?
In my case RAM_START_ADDR is 0x40000000 and I am trying to use a 
virtual_offset of 0xc0000000 (everything in my kernel ELF binary is 
mapped at virtual addresses starting at 0xc0000000).
If I can ensure that this formula is always correct I can then use a 
very simple macro to statically translate a physical address to a 
virtual address.


Then I have another question, who is supposed to build the kernel's page 
table? pmap_bootstrap()?
If so, then how do I allocate pages for that purpose? using 
pmap_pte_pagealloc() and pmap_segtab_init() ?


FYI I am using those files for my pmap:

uvm/pmap/pmap.c
uvm/pmap/pmap_segtab.c
uvm/pmap/pmap_tlb.c

I am taking inspiration from the PPC Book-E (mpc85xx) code.

Thanks !

Regards,

--
Yann


Re: [Milkymist port] virtual memory management

2014-02-09 Thread Matt Thomas

On Feb 9, 2014, at 10:07 AM, Yann Sionneau yann.sionn...@gmail.com wrote:

 This seems like the easiest thing to do (because I won't have to think about 
 recursive faults) but then if I put physical addresses in my 1st level page 
 table, how does the kernel manage the page table entries?

BookE always has the MMU on and contains fixed TLB entries to make sure
all of physical ram is always mapped.

 Since the kernel runs with MMU on, using virtual addresses, it cannot 
 dereference physical pointers then it cannot add/modify/remove PTEs, right?

Wrong.  See above.  Note that on BookE, PTEs are purely a software 
construction and the H/W never reads them directly.

 I'm sure there is some kernel internal mechanism that I don't know about 
 which could help me getting the virtual address from the physical one, do you 
 know which mechanism it would be?

Look at __HAVE_MM_MD_DIRECT_MAPPED_PHYS and/or PMAP_{MAP,UNMAP}_POOLPAGE.


 Also, is it possible to make sure that everything (in kernel space) is mapped 
 so that virtual_addr = physical_addr - RAM_START_ADDR + virtual_offset?
 In my case RAM_START_ADDR is 0x40000000 and I am trying to use a virtual_offset 
 of 0xc0000000 (everything in my kernel ELF binary is mapped at virtual 
 addresses starting at 0xc0000000).
 If I can ensure that this formula is always correct I can then use a very 
 simple macro to statically translate a physical address to a virtual 
 address.

Not knowing how much RAM you have, I can only speak in generalities. 
But in general you reserve a part of the address space for direct-mapped
memory and then place the kernel above that.

For instance, you might have 512MB of RAM which you map at 0xa000.0000,
and then have the kernel's mapped VA space start at 0xc000.0000.

Then conversion from PA to VA is just adding a constant, while getting
the PA from a direct-mapped VA is just subtraction.

 Then I have another question, who is supposed to build the kernel's page 
 table? pmap_bootstrap()?

Some part of MD code.  pmap_bootstrap() could be that.

 If so, then how do I allocate pages for that purpose? using 
 pmap_pte_pagealloc() and pmap_segtab_init() ?

Usually you use pmap_steal_memory to do that.
But for mpc85xx I just allocate the kernel's initial segmap in the .bss.
The page tables, however, were allocated using uvm, which can do
prebootstrap allocations.

 
 FYI I am using those files for my pmap:
 
 uvm/pmap/pmap.c
 uvm/pmap/pmap_segtab.c
 uvm/pmap/pmap_tlb.c
 
 I am taking inspiration from the PPC Book-E (mpc85xx) code.



one time crash in usb_allocmem_flags

2014-02-09 Thread Alexander Nasonov
Hi,

I was running current amd64 (last updated a few weeks ago) when I got
a random crash shortly after switching to X mode. If my analysis is
correct, it crashed in usb_allocmem_flags inside this loop:

LIST_FOREACH(f, &usb_frag_freelist, next) {
        KDASSERTMSG(usb_valid_block_p(f->block, &usb_blk_fraglist),
            "%s: usb frag %p: unknown block pointer %p",
            __func__, f, f->block);
        if (f->block->tag == tag)
                break;
}

It couldn't access f->block->tag. I wasn't actively using any of
the usb devices at that time. I wonder if it's a known problem or
should I file a PR? Details of the analysis are below.

Thanks,
Alex

crash dmesg
...
fatal protection fault in supervisor mode
trap type 4 code 0 rip 808515e2 cs 8 rflags 13282 cr2
7f7ff5773020 ilevel 0 rsp fe80ca6f16c0
curlwp 0xfe811a8aaba0 pid 475.1 lowest kstack 0xfe80ca6ee000
panic: trap
cpu2: Begin traceback...
vpanic() at netbsd:vpanic+0x13e
printf_nolog() at netbsd:printf_nolog
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x9e
ehci_allocm() at netbsd:ehci_allocm+0x2c
usbd_transfer() at netbsd:usbd_transfer+0x5f
usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0xcb
uhidev_open() at netbsd:uhidev_open+0xb3
wsmouseopen() at netbsd:wsmouseopen+0xf3
cdev_open() at netbsd:cdev_open+0x87
spec_open() at netbsd:spec_open+0x183
VOP_OPEN() at netbsd:VOP_OPEN+0x33
vn_open() at netbsd:vn_open+0x1b0
do_open() at netbsd:do_open+0x102
do_sys_openat() at netbsd:do_sys_openat+0x68
sys_open() at netbsd:sys_open+0x24
syscall() at netbsd:syscall+0x9a
--- syscall (number 5) ---
7f7ff403af3a:
cpu2: End traceback...
rebooting in 10 9 8 7 6 5 4 3 2 1 0

crash dmesg|grep usb
usb0 at xhci0: USB revision 2.0
usb1 at ehci0: USB revision 2.0
uhub0 at usb0: NetBSD xHCI Root Hub, class 9/0, rev 2.00/1.00, addr 0
uhub1 at usb1: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00,
addr 1
usbd_transfer() at netbsd:usbd_transfer+0x5f
usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0xcb

crash x 0x808515e2
usb_allocmem_flags+0xfd:751a3948


$ objdump -d /netbsd
...
8085158b:       48 c7 c7 60 15 f8 80    mov    $0x80f81560,%rdi
80851592:       e8 69 42 d3 ff          callq  80585800 <mutex_enter>
80851597:       48 8b 05 c2 bf 69 00    mov    0x69bfc2(%rip),%rax
                                        # 80eed560 <usb_frag_freelist>
8085159e:       48 85 c0                test   %rax,%rax
808515a1:       75 3c                   jne    808515df <usb_allocmem_flags+0xfa>

/* You don't need to look at this block */
808515a3:       48 8d 4d c8             lea    -0x38(%rbp),%rcx
808515a7:       45 31 c0                xor    %r8d,%r8d
808515aa:       ba 40 00 00 00          mov    $0x40,%edx
808515af:       be 00 10 00 00          mov    $0x1000,%esi
808515b4:       48 89 df                mov    %rbx,%rdi
808515b7:       e8 f4 fb ff ff          callq  808511b0 <usb_block_allocmem>
808515bc:       89 c3                   mov    %eax,%ebx
808515be:       85 c0                   test   %eax,%eax
808515c0:       75 ac                   jne    8085156e <usb_allocmem_flags+0x89>
808515c2:       48 8b 4d c8             mov    -0x38(%rbp),%rcx
808515c6:       c7 41 38 00 00 00 00    movl   $0x0,0x38(%rcx)
808515cd:       bb 40 00 00 00          mov    $0x40,%ebx
808515d2:       31 d2                   xor    %edx,%edx
808515d4:       eb 57                   jmp    8085162d <usb_allocmem_flags+0x148>
/* end of block. */

/*  LIST_FOREACH(f, &usb_frag_freelist, next) { */
808515d6:       48 8b 40 10             mov    0x10(%rax),%rax
808515da:       48 85 c0                test   %rax,%rax
808515dd:       74 c4                   je     808515a3 <usb_allocmem_flags+0xbe>
808515df:       48 8b 10                mov    (%rax),%rdx
808515e2:       48 39 1a                cmp    %rbx,(%rdx)
808515e5:       75 ef                   jne    808515d6 <usb_allocmem_flags+0xf1>


crash ps
PID   LID  S  CPU  FLAGS   STRUCT LWP *   NAME WAIT
475 1 7   2 0   fe811a8aaba0   Xorg
72   1 2   3   902   fe811a709b80  xinit
43   1 2   3   802   fe811a709760 sh
437  1 2   3   802   fe811d311720ksh
420  1 2   2   802   fe811e2b6240  getty
435  1 2   0   802   fe811e2b6a80  getty
429  1 2   3   802   fe811e2b6660  login
412  1 2   0   802   fe811e4c1220  getty
390  1 2   0   802   fe8119a90b60   cron
407  1 2   0   802   fe811d767b00  inetd
342  1 2   3   802   fe811d311300  

Re: 4byte aligned com(4) and PCI_MAPREG_TYPE_MEM

2014-02-09 Thread Christos Zoulas
In article 52f7c96e.6000...@execsw.org,
SAITOH Masanobu  msai...@execsw.org wrote:
Hello, all.

 I'm now working to support the Intel Quark X1000.
This chip's internal com is MMIO (PCI_MAPREG_TYPE_MEM).
Our com and puc don't support this type of device yet.
To solve the problem, I wrote a patch.

 Registers of the Quark X1000's com are 4-byte aligned.
Some other machines have this type of device, so
I modified the COM_INIT_REGS() macro to support both
byte-aligned and 4-byte-aligned registers. This change reduces
the special modifications done in the atheros, rmi and
marvell drivers.

 One problem is the serial console on i386 and amd64.
These archs call consinit() three times. The function
is called in the following order:

   1) machdep.c::init386() or init_x86_64()
   2) init_main.c::main()
      (which calls uvm_init())
      (which calls extent_init())
   3) machdep.c::cpu_startup()

When consinit() is called in init386(), it calls:

  comcnattach()
   -> comcnattach1()
     -> comcninit()
       -> bus_space_map() with the x86_bus_space_mem tag
         -> bus_space_reservation_map()
           -> x86_mem_add_mapping()
             -> uvm_km_alloc()
which panics in KASSERT(vm_map_pmap(map) == pmap_kernel());

What should I do?
One solution is to check whether extent_init() has been called
or not. There is no easy way to know, so I added a global
variable, extent_initted. Is it acceptable?

Looks great, can't you use cold instead, or is that too late?

christos



Re: [Milkymist port] virtual memory management

2014-02-09 Thread Yann Sionneau

Thank you for your answer Matt,

On 09/02/14 19:49, Matt Thomas wrote:

On Feb 9, 2014, at 10:07 AM, Yann Sionneau yann.sionn...@gmail.com wrote:


This seems like the easiest thing to do (because I won't have to think about 
recursive faults) but then if I put physical addresses in my 1st level page 
table, how does the kernel manage the page table entries?

BookE always has the MMU on and contains fixed TLB entries to make sure
all of physical ram is always mapped.
My TLB hardware is very simple and does not give me the option to pin 
a TLB entry, so I won't be able to do that.
The lm32 MMU is turned off automatically upon exception (a TLB miss, for 
instance); I can then re-enable it if I want. In the end the MMU 
is re-enabled upon return from exception.



Since the kernel runs with MMU on, using virtual addresses, it cannot 
dereference physical pointers then it cannot add/modify/remove PTEs, right?

Wrong.  See above.
You mean that the TLB contains entries which map a physical address to 
itself? Like 0xabcd.0000 mapped to 0xabcd.0000? Or do you mean all RAM 
is always mapped, but to the (0xa000.0000 + physical_pframe) kind of virtual 
address you mention later in your reply?

Note that on BookE, PTEs are purely a software
construction and the H/W never reads them directly.
Here my HW is like BookE: I don't have a hardware page table walker; PTEs 
are only for the software to reload the TLB when there is an exception 
(TLB miss). The TLB will never read memory to find a PTE in my lm32 MMU 
implementation.



I'm sure there is some kernel internal mechanism that I don't know about which 
could help me getting the virtual address from the physical one, do you know 
which mechanism it would be?

Look at __HAVE_MM_MD_DIRECT_MAPPED_PHYS and/or PMAP_{MAP,UNMAP}_POOLPAGE.

For now I have something like that:

vaddr_t
pmap_md_map_poolpage(paddr_t pa, vsize_t size)
{
	const vaddr_t sva = (vaddr_t)pa - 0x40000000 + 0xc0000000;
	return sva;
}

But I guess it only works to access the contents of the kernel ELF (text and 
data), not dynamic runtime kernel allocations, right?




Also, is it possible to make sure that everything (in kernel space) is mapped 
so that virtual_addr = physical_addr - RAM_START_ADDR + virtual_offset?
In my case RAM_START_ADDR is 0x40000000 and I am trying to use a virtual_offset 
of 0xc0000000 (everything in my kernel ELF binary is mapped at virtual addresses 
starting at 0xc0000000).
If I can ensure that this formula is always correct I can then use a very simple 
macro to statically translate a physical address to a virtual address.

Not knowing how much ram you have, I can only speak in generalities.

I have 128 MB of RAM.

But in general you reserve a part of the address space for direct-mapped
memory and then place the kernel above that.

For instance, you might have 512MB of RAM which you map at 0xa000.0000,
and then have the kernel's mapped VA space start at 0xc000.0000.
So if I understand correctly, the first page of physical RAM 
(0x4000.0000) is mapped at virtual address 0xa000.0000 *and* at 
0xc000.0000?
Isn't it a problem that a physical address is mapped twice in the same 
process (here the kernel)?

My caches are VIPT; couldn't this generate cache aliasing issues?


Then conversion from PA to VA is just adding a constant, while getting
the PA from a direct-mapped VA is just subtraction.


Then I have another question, who is supposed to build the kernel's page table? 
pmap_bootstrap()?

Some part of MD code.  pmap_bootstrap() could be that.


If so, then how do I allocate pages for that purpose? using 
pmap_pte_pagealloc() and pmap_segtab_init() ?

Usually you use pmap_steal_memory to do that.
But for mpc85xx I just allocate the kernel's initial segmap in the .bss.
The page tables, however, were allocated using uvm, which can do
prebootstrap allocations.

Are you referring to the following code?

	/*
	 * Now actually allocate the kernel PTE array (must be done
	 * after virtual_end is initialized).
	 */
	const vaddr_t kv_segtabs = avail[0].start;
	KASSERT(kv_segtabs == endkernel);
	KASSERT(avail[0].size >= NBPG * kv_nsegtabs);
	printf(" kv_nsegtabs=%#"PRIxVSIZE, kv_nsegtabs);
	printf(" kv_segtabs=%#"PRIxVADDR, kv_segtabs);
	avail[0].start += NBPG * kv_nsegtabs;
	avail[0].size -= NBPG * kv_nsegtabs;
	endkernel += NBPG * kv_nsegtabs;

	/*
	 * Initialize the kernel's two-level page table.  This only wastes
	 * an extra page for the segment table and allows the user/kernel
	 * access to be common.
	 */
	pt_entry_t **ptp = &stp->seg_tab[VM_MIN_KERNEL_ADDRESS >> SEGSHIFT];
	pt_entry_t *ptep = (void *)kv_segtabs;
	memset(ptep, 0, NBPG * kv_nsegtabs);
	for (size_t i = 0; i < kv_nsegtabs; i++, ptep += NPTEPG) {
		*ptp++ = ptep;
	}



FYI I am using those files for my pmap:

uvm/pmap/pmap.c
uvm/pmap/pmap_segtab.c
uvm/pmap/pmap_tlb.c

I am taking inspiration from the PPC Book-E (mpc85xx) code.

Regards,

--
Yann