Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Dave Hansen
On Wed, 2005-04-20 at 14:34 +0200, Arjan van de Ven wrote:
> On Wed, 2005-04-20 at 21:02 +0900, KAMEZAWA Hiroyuki wrote:
> > Hi,
> > 
> > There are several types of PG_reserved pages,
> > (a) Memory Hole
> > (b) Used by Kernel
> > (c) Set by drivers
> > (d) Isolated by MCA
> > (e) used by perfmon
> > etc
> > 
> > I think it's useful to distinguish many types of PG_reserved pages.
> 
> I'm not so sure about this at all.

Neither am I, that's why I hoped somebody would figure out something
better :)

> > For example, Memory Hotplug can ignore (a).
> 
> Memory Hotplug can also use page_is_ram().

It uses this, to some degree, internally.  But, things like the e820
table don't get updated as memory hotplugs occur.

This should be a way to give more fine-grained information about what pages
are available as RAM at any point in time.  kdump would need something
like this to figure out which pages inside of /dev/mem are actually
valid to dump.  Here was another approach that used /proc files:

http://lkml.org/lkml/2005/3/24/11

> /dev/memstate really looks like a bad idea to me as well... I rather
> have less than more /dev/*mem*

Any other ideas?

-- Dave

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Dave Hansen
On Wed, 2005-04-20 at 16:30 +0200, Arjan van de Ven wrote:
> Why do you want this exported to userspace? There is absolutely no way
> you can get this exported race free without shutting the VM down, and
> without being race free this information has absolutely no meaning !!
> (and when you shut the VM down you really shouldn't depend on userspace
> anymore either)

The two cases where this is expected to be used are not concerned with
races.  The first is when a memory remove operation occurs.  It first
looks at the hotplug area, and removes all the pages that it can from
the allocator.  Then, it sets about migrating all of the other pages
that are being used for things like page cache or anonymous memory.

After that, the question sometimes remains why particular pages can't be
removed.  Kame's patch is an attempt to help figure that out.

That's one reason I suggested having an individual device file for each
of the memory areas that get added or removed.  It would keep the
confusion to a minimum, and you'd be more sure that what you were
looking at was information only for the memory area that is *almost*
removed.

I don't know what state the system is in when the kdump folks want to
read this information.

-- Dave



Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Kamezawa Hiroyuki
Arjan van de Ven wrote:
> On Wed, 2005-04-20 at 23:15 +0900, Kamezawa Hiroyuki wrote:
>
> MCA's probably shouldn't set PG_reserved; I don't see why they should.
> They could just steal the page and "leak" it.

Actually, leaked pages cannot be hot-removed or replaced, so we have to
track which pages were removed by MCA.
I think setting PG_reserved and setting page->private = Removed_by_MCA is a
simple idea.

> > > /dev/memstate really looks like a bad idea to me as well... I rather
> > > have less than more /dev/*mem*
> >
> > For showing page usage and its "location", I've thought of other
> > interfaces, sysfs, procfs...
> > But I have no idea.
>
> Why do you want this exported to userspace? There is absolutely no way
> you can get this exported race free without shutting the VM down, and
> without being race free this information has absolutely no meaning !!

No meaning?
Before a memory hot-remove, we can guess whether memory is hot-removable
or not.
As you say, this is not atomic and not fully reliable.

After a failed memory hot-remove, detecting why the hot-remove failed is
very important.
I think that when a memory hot-remove fails, the memory area stays isolated
until an operator pushes it back.
We can then get a real snapshot of the specified memory area.

Regards,
-- Kame


Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 23:15 +0900, Kamezawa Hiroyuki wrote:
> Arjan van de Ven wrote:
> 
> >>For example, Memory Hotplug can ignore (a).
> >>
> >>
> >
> >Memory Hotplug can also use page_is_ram().
> >  
> >
> Yes. We can use page_is_ram() for finding (a) memory holes.
> But I'd like to catch other removable PG_reserved pages, like (d) Isolated
> by MCA, (e) used by perfmon, and
> some of (b) used by kernel and (c) set by drivers.
> What I'm thinking of is detecting whether memory is hot-removable or not
> before actually removing it.

MCA's probably shouldn't set PG_reserved; I don't see why they should.
They could just steal the page and "leak" it.

> 
> >/dev/memstate really looks like a bad idea to me as well... I rather
> >have less than more /dev/*mem*
> >  
> >
> For showing page usage and its "location", I've thought of other
> interfaces, sysfs, procfs...
> But I have no idea.

Why do you want this exported to userspace? There is absolutely no way
you can get this exported race free without shutting the VM down, and
without being race free this information has absolutely no meaning !!
(and when you shut the VM down you really shouldn't depend on userspace
anymore either)





Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Kamezawa Hiroyuki
Arjan van de Ven wrote:
> > For example, Memory Hotplug can ignore (a).
>
> Memory Hotplug can also use page_is_ram().

Yes. We can use page_is_ram() for finding (a) memory holes.
But I'd like to catch other removable PG_reserved pages, like (d) Isolated
by MCA, (e) used by perfmon, and
some of (b) used by kernel and (c) set by drivers.
What I'm thinking of is detecting whether memory is hot-removable or not
before actually removing it.

> /dev/memstate really looks like a bad idea to me as well... I rather
> have less than more /dev/*mem*

For showing page usage and its "location", I've thought of other
interfaces, sysfs, procfs...
But I have no idea.
The physical memory area is a vast space, so I want to use lseek() or
ioctl() (I don't like ioctl()).
Do you have any recommendation?
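
A minimal userspace sketch of the lseek()-style access mentioned above,
assuming the layout from the [0/3] posting (file offset is the pfn, one
byte per page). The helper names page_state() and range_states() are
hypothetical, and the example uses an in-memory stand-in rather than a
real /dev/memstate, which exists only with the proposed patch applied.

```python
import io
import os


def page_state(memstate, pfn):
    """Return the state byte for `pfn`, or None if pfn is past the end.

    `memstate` is any seekable binary file object laid out like the
    proposed /dev/memstate: byte N describes page frame N.
    """
    memstate.seek(pfn, os.SEEK_SET)
    data = memstate.read(1)
    return data[0] if data else None


def range_states(memstate, start_pfn, count):
    """Return up to `count` state bytes starting at `start_pfn`."""
    memstate.seek(start_pfn, os.SEEK_SET)
    return memstate.read(count)


# Synthetic stand-in for the device: two invalid pages, one reserved
# page, one common page. This is NOT real /dev/memstate output.
fake = io.BytesIO(bytes([0xFF, 0xFF, 0x02, 0x00]))
```

With this shape, checking one page costs a single seek and a one-byte
read, so a tool only touches the part of the pfn space it cares about
instead of scanning the whole map.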

-- Kame


Re: [RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread Arjan van de Ven
On Wed, 2005-04-20 at 21:02 +0900, KAMEZAWA Hiroyuki wrote:
> Hi,
> 
> There are several types of PG_reserved pages,
> (a) Memory Hole
> (b) Used by Kernel
> (c) Set by drivers
> (d) Isolated by MCA
> (e) used by perfmon
> etc
> 
> I think it's useful to distinguish many types of PG_reserved pages.

I'm not so sure about this at all.

> For example, Memory Hotplug can ignore (a).

Memory Hotplug can also use page_is_ram().

/dev/memstate really looks like a bad idea to me as well... I rather
have less than more /dev/*mem*





[RFC][PATCH] nameing reserved pages [0/3]

2005-04-20 Thread KAMEZAWA Hiroyuki
Hi,

There are several types of PG_reserved pages,
(a) Memory Hole
(b) Used by Kernel
(c) Set by drivers
(d) Isolated by MCA
(e) used by perfmon
etc

I think it's useful to distinguish many types of PG_reserved pages.
For example, Memory Hotplug can ignore (a).

Two patches, [1/3] and [2/3], are for naming PG_reserved pages.
The type of a page is recorded in page->private.
I'm not sure whether this is safe or not, so currently only
reserved-at-boot pages are named.

Patch [3/3] is an interface to show the state of memmap, /dev/memstate.
In /dev/memstate, the file offset is the pfn and each byte represents the
state of a page.
In this patch, memory holes and reserved pages each have their own value.

Below is the output from my box.
0xff --- Invalid page
0x00 --- Common page
0x02 --- Reserved at boot page
[EMAIL PROTECTED] char]#  od  -t x1 -j 0 -N 65535 /dev/memstate
0000000 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
*
0001540 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 02
0001560 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
*
0002400 02 02 02 00 00 00 00 00 00 02 02 02 02 02 02 02
0002420 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
*
0003400 02 02 02 02 02 02 02 02 02 02 02 00 00 00 00 00
0003420 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0010000 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02
*
0010640 02 02 02 02 02 02 02 02 02 02 02 02 02 02 00 00
0010660 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
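
As an illustration of how a userspace tool might consume this format, here
is a minimal sketch that collapses a per-page state map into pfn ranges.
It assumes only what is stated above: one byte per page, offset equals
pfn, and the three values in the legend. The names STATE_NAMES and
summarize_states() are hypothetical and not part of the patches.

```python
from itertools import groupby

# State values taken from the legend in the mail above.
STATE_NAMES = {
    0xFF: "invalid",
    0x00: "common",
    0x02: "reserved-at-boot",
}


def summarize_states(states, first_pfn=0):
    """Collapse a state map into (start_pfn, end_pfn, name) runs.

    `states` is a bytes object in which byte i describes the page at
    pfn first_pfn + i. Values missing from the legend are reported as
    "unknown(0x..)" rather than guessed.
    """
    runs = []
    pfn = first_pfn
    for value, group in groupby(states):
        length = sum(1 for _ in group)
        name = STATE_NAMES.get(value, "unknown(0x%02x)" % value)
        runs.append((pfn, pfn + length - 1, name))
        pfn += length
    return runs
```

Feeding it the bytes read from the device (or a slice of them) gives the
same information as the od output, but as ranges that are easier to
compare against a hotplug area.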
This would be useful for Memory Hotplug and some other uses.
I think more detailed types can be supported.
Thanks.
-- Kame <[EMAIL PROTECTED]>