Re: user limits for 'security'?

2001-06-25 Thread LA Walsh

I suppose another question, related to the first, is: is 'limit' checking
part of the 'standard Linux security' that embedded Linux users might
find to be a waste of precious code-space?

-l

--
The above thoughts and    | I know I don't know the opinions
writings are my own.      | of every part of my company. :-)
L A Walsh, law at sgi.com | Sr Eng, Trust Technology
01-650-933-5338           | Core Linux, SGI



user limits for 'security'?

2001-06-25 Thread LA Walsh

I've seen some people saying that user-limits are an essential part of a
secure system to prevent local DoS attacks.  Given that, should
a system call like 'fork' return -EPERM if the user has reached their
limit?

My local manpage (SuSE 7.2 system) says this under fork:

ERRORS
       EAGAIN fork cannot allocate sufficient memory to copy the
              parent's page tables and allocate a task structure
              for the child.
-
Should the man page be updated to reflect that EAGAIN is also returned
when the user has reached their limit?  From a user-monitoring point
of view, it might be security-relevant to know whether an EAGAIN is being
returned because the system really is low on resources or because a user
is hitting their limit.
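
For a monitoring tool, one can at least make an educated guess from
userspace.  A minimal sketch (illustrative, not authoritative), assuming
the limit in play is the per-user process limit, RLIMIT_NPROC on Linux:

#include <errno.h>
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

/* Try to fork; on EAGAIN, guess whether we hit the per-user process
 * limit or a genuine memory shortage by checking RLIMIT_NPROC. */
pid_t fork_with_diagnosis(void)
{
    pid_t pid = fork();

    if (pid == -1 && errno == EAGAIN) {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NPROC, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY)
            fprintf(stderr, "fork: EAGAIN; RLIMIT_NPROC is %lu -- "
                    "possibly the per-user limit, not low memory\n",
                    (unsigned long)rl.rlim_cur);
        else
            fprintf(stderr, "fork: EAGAIN with no nproc limit set -- "
                    "probably a real resource shortage\n");
    }
    return pid;
}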

--
The above thoughts and    | I know I don't know the opinions
writings are my own.      | of every part of my company. :-)
L A Walsh, law at sgi.com | Sr Eng, Trust Technology
01-650-933-5338           | Core Linux, SGI



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread LA Walsh

"Eric W. Biederman" wrote:

> LA Walsh <[EMAIL PROTECTED]> writes:
>
> > Now for whatever reason, since 2.4, I consistently use at least
> > a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
> > like nscd running 7 copies that take 72M.  Seems like overkill for
> > a laptop.
>
> So the question becomes why you are seeing an increased swap usage.
> Currently there are two candidates in the 2.4.x code path.
>
> 1) Delayed swap deallocation: when a program exits after it
>    has gone into swap, its swap usage is not freed. Ouch.

---
Double ouch.  Swap is backing a non-existent program?

>
>
> 2) Increased tenacity of swap caching.  In particular, in 2.2.x if a page
>    that was in the swap cache was written to, the page in the swap
>    space would be removed.  In 2.4.x the location in swap space is
>    retained, with the goal of getting more efficient swap-ins.


But if the page in memory is 'dirty', you can't be efficient with swapping
*in* the page.  The page on disk is invalid and should be released, or am I
missing something?

> Neither of the known candidates for increasing the swap load applies
> when you aren't swapping in the first place.  They may aggravate the
> usage of swap when you are already swapping, but they do not cause
> swapping themselves.  This is why the initial recommendation for
> increased swap space size was made.  If you are swapping, we will use
> more swap.
>
> However what pushes your laptop over the edge into swapping is an
> entirely different question.  And probably what should be solved.


On my laptop, it is insignificant and to my knowledge has no measurable
impact.  It seems like there is always 3-5 Meg used in swap no matter what's
running (or not) on the system.
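
If anyone wants to watch this, a minimal sketch that prints swap usage
via sysinfo(2); field names are per the glibc headers:

#include <stdio.h>
#include <sys/sysinfo.h>

int main(void)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 1;
    /* totalswap/freeswap are counted in units of si.mem_unit bytes */
    printf("swap in use: %llu kB of %llu kB\n",
           (unsigned long long)(si.totalswap - si.freeswap) * si.mem_unit / 1024,
           (unsigned long long)si.totalswap * si.mem_unit / 1024);
    return 0;
}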

> > I think that is the point -- it was supported in 2.2, it is, IMO,
> > a serious regression that it is not supported in 2.4.
>
> The problem with this general line of arguing is that it lumps a whole
> bunch of real issues/regressions into one overall perception.  Since
> there are multiple reasons people are seeing problems, they need to be
> tracked down with specifics.

---
Uhhh, yeah, sorta -- it's addressing the statement that a "new requirement of
2.4 is to have double the swap space".  If everyone agrees that's a problem,
then yes, we can go into specifics of what is causing or contributing to it.
It's getting past the attitude of some people that 2xMem for swap is somehow
"normal and acceptable -- deal with it".  In my case, it seems like 10Mb of
swap would be all that would generally be used (I don't think I've ever seen
swap usage over 7Mb) on a 512M system.  To be told "oh, you're wrong, you
*should* have 1Gig or you are operating in an 'unsupported' or non-standard
configuration" -- I find that very user-unfriendly.


>
> The swapoff case comes down to dead swap pages in the swap cache.
> Which, by greatly increasing the number of swap pages, slows the system
> down, but since these pages are trivial to free we don't generate any
> I/O, so we don't wait for I/O and thus never enter the scheduler, making
> nothing else in the system runnable.

---
I haven't ever *noticed* this on my machine, but that could be
because there isn't much in swap to begin with?  Could be I was
just blissfully ignorant of the time it took to do a swapoff.
Hmmm...let's see...  Just tried it.  I didn't get a total lockup,
but cursor movement was definitely jerky:
> time sudo swapoff -a

real    0m10.577s
user    0m0.000s
sys     0m9.430s

Looking at vmstat, the needed space was taken mostly out of the
page cache (86M->81.8M) and about 700K each out of free and buff.


> Your case is significantly different.  I don't know if you are seeing
> any issues with swapping at all.  With a 5M usage it may simply be
> totally unused pages being pushed out to the swap space.

---
Probably -- I guess the page cache and disk buffers put enough pressure to
push some things off to swap.

-linda
--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Senior MTS, Trust Tech, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338



Re: Break 2.4 VM in five easy steps

2001-06-07 Thread LA Walsh

"Eric W. Biederman" wrote:

> There are certain scenarios where you can't avoid virtual mem =
> min(RAM,swap). Which is what I was trying to say (bad formula).  What
> happens is that pages get referenced evenly enough and quickly enough
> that you simply cannot reuse the on-disk pages.  Basically in the
> worst case all of RAM is pretty much in flight doing I/O.  This is
> true of all paging systems.


So, if I understand, you are talking about thrashing behavior,
where your active set is larger than physical RAM.  If that
is the case, then requiring 2X+ swap for "better" performance
is reasonable.  However, if your active set is truly larger
than your physical memory on a consistent basis, in this day
the solution is usually "add more RAM".  I may be wrong, but
my belief is that with today's computers people are used to having
enough memory to do their normal tasks, and that swap is for
"peak loads" that don't occur on a sustained basis.  Of course
I imagine this is my belief because it is my own practice/view.
I want to have considerably more memory than my normal working
set.  Swap on my laptop disk is *slow*.  It's low-power, low-RPM,
with a slow seek rate, all to conserve power (difference between
spinning/off = 1W).  So I have 50% of my phys mem as swap -- because
I want to 'feel' it when I go to swap and start looking for memory
hogs.  For me, the pathological case is touching swap *at all*.  So the
idea of the entire active set being >= phys mem is already broken
on my setup.  Thus my expectation of swap only as a 'warning'/'buffer'
zone.
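
To pin down the two accounting models being argued over, a worked
comparison with the laptop numbers above (512M RAM, 256M swap;
illustrative figures only):

    virtual mem = min(RAM, swap):  min(512M, 256M) = 256M usable
                                   (adding swap < RAM buys nothing)
    virtual mem = RAM + swap:      512M + 256M     = 768M usable
                                   (the additive model this setup expects)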

Now for whatever reason, since 2.4, I consistently use at least
a few Mb of swap -- stands at 5Meg now.  Weird -- but I notice things
like nscd running 7 copies that take 72M.  Seems like overkill for
a laptop.

> However just because in the worst case virtual mem = min(RAM,swap), is
> no reason other cases should use that much swap.  If you are doing a
> lot of swapping it is more efficient to plan on mem = min(RAM,swap) as
> well, because frequently you can save on I/O operations by simply
> reusing the existing swap page.

---
Agreed.  But planning your swap space for a worst-case
scenario that you never hit is wasteful.  My worst
case is using any swap.  The system should be able to live
with swap = 1/2 * phys in my situation.  I don't think I'm
unique in this respect.

> It's a theoretical worst case and they all have it.  In practice it is
> very hard to find a work load where practically every page in the
> system is close to the I/O point, however.

---
Well, exactly the point.  It was in such situations on some older
systems that some programs were swapped out and temporarily made
unavailable for running (they showed up in the 'w' column in vmstat).

> Except for removing pages that aren't used, paging with swap < RAM is
> not useful.  Simply removing pages that aren't in active use but might
> possibly be used someday is a common case, so it is worth supporting.

---
I think that is the point -- it was supported in 2.2, it is, IMO,
a serious regression that it is not supported in 2.4.

-linda

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Senior MTS, Trust Tech., Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338



Re: Break 2.4 VM in five easy steps

2001-06-06 Thread LA Walsh

"Eric W. Biederman" wrote:

> The hard rule will always be that to cover all pathological cases swap
> must be greater than RAM.  Because in the worst case all RAM will be
> in the swap cache.  That this is more than just the worst case in 2.4
> is problematic.  I.e. in the worst case:
> Virtual Memory = RAM + (swap - RAM).

Hmmmso my 512M laptop only really has 256M?  Um...I regularlly run
more than 256M of programs.  I don't want it to swap -- its a special, weird
condition if I do start swapping.  I don't want to waste 1G of HD (5%) for
something I never want to use.  IRIX runs just fine with swap You can't improve the worst case.  We can improve the worst case that
> many people are facing.

---
Other OS's don't have this pathological 'worst case' scenario.  Even
my Windows [vm]box seems to operate fine with swap < MEM.  On IRIX,
virtual space closely approximates physical + disk memory.

> It's worth complaining about.  It is also worth digging into and finding
> out what the real problem is.  I have a hunch that this whole
> conversation on swap sizes being irritating is hiding the real
> problem.

---
Okay, admission of ignorance.  When we speak of "swap space",
is this term inclusive of both demand-paging space and
swap-out-entire-programs space, or only one or the other?
-linda

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338



Re: ln -s broken on 2.4.5

2001-05-30 Thread LA Walsh

Marcus Meissner wrote:

> $ ln -s fupp/bar bar
> $ ls -la bar

---
Is it peculiar to a specific architecture?
What does strace show for args to the symlink cmd?
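
A minimal C check (sketch) that exercises the same symlink(2)/readlink(2)
path ln should be using -- if it prints "bar -> fupp/bar", the kernel call
looks fine and ln(1) is the suspect:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[256];
    ssize_t n;

    /* create the same link as "ln -s fupp/bar bar", then read it back */
    if (symlink("fupp/bar", "bar") != 0) {
        perror("symlink");
        return 1;
    }
    n = readlink("bar", buf, sizeof(buf) - 1);
    if (n < 0) {
        perror("readlink");
        return 1;
    }
    buf[n] = '\0';
    printf("bar -> %s\n", buf);
    return 0;
}
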
-l
--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338


[i386 arch] MTRR messages significant?

2001-05-08 Thread LA Walsh

I've been seeing these for a while now (2.4.4 back through at least 2.4.2),
coincident with a change to XFree86 4.0.3 from "MetroX" in the same time
frame.  I am not sure exactly when they started, but was wondering if they
were significant.  It seems some app is trying to delete or modify
something.  On the console and in syslog:

mtrr: no MTRR for fd000000,800000 found
mtrr: MTRR 1 not used
mtrr: reg 1 not used

while /proc/mtrr currently contains:

reg00: base=0x00000000 (   0MB), size= 512MB: write-back, count=1
reg01: base=0xfd000000 (4048MB), size=   8MB: write-combining, count=1

Could it be the X server trying to delete a segment when it starts up or
shuts down?  Is it an error in the X server to try to delete a non-existent
segment?  Does the kernel 'care'?  I.e. -- why is it printing out messages --
are they debug messages that perhaps should be off by default?
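
For reference, the /proc/mtrr write interface the X server would be using
(per the kernel's Documentation/mtrr.txt) -- a minimal sketch, error
handling omitted; the base/size values echo the reg01 entry above:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/mtrr", "w");

    if (!f)
        return 1;
    /* add a write-combining region for the framebuffer */
    fprintf(f, "base=0xfd000000 size=0x800000 type=write-combining\n");
    fflush(f);
    /* remove MTRR register 1; doing this twice (or for a register that
     * was never set) is what provokes "mtrr: reg 1 not used" */
    fprintf(f, "disable=1\n");
    fclose(f);
    return 0;
}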

Concurrent with these messages, and perhaps unrelated, is a new, unwelcome
behavior of X dying on display of some Netscape-rendered websites (cf. it
doesn't die under konqueror).

thanks,
-linda

Re: 2.4.4 code breaks compile of VMWare network bridging

2001-05-02 Thread LA Walsh

"Mohammad A. Haque" wrote:

> This was answered several hours ago. Check the list archives.

---
Many thanks -- it was in my never-ending backlog.

-l


2.4.4 code breaks compile of VMWare network bridging

2001-05-02 Thread LA Walsh

In 2.4.4, the define "skb_datarefp" in
include/linux/skbuff.h
(and the corresponding structure in
net/core/skbuff.c
) disappeared.

I'm not reporting this as a 'bug' as kernel internal interfaces are subject
to change, but more as an "FYI".  I haven't had a chance to try to
debug or figure out the offending bit of code to see exactly what it
was trying to do, but the offending code snippet follows.  I haven't yet
reported it to the folks at VMware, but their response to problem reports
against 2.4.x is "can you duplicate it against 2.2.x, we don't support
2.4.x yet".  Perhaps someone expert in the 'net/core' area could explain
what changed and what they shouldn't be doing anymore?

It appears the references:
#  define KFREE_SKB(skb, type)  kfree_skb(skb)
#  define DEV_KFREE_SKB(skb, type)  dev_kfree_skb(skb)
^^
are the offending culprits.

Thanks for any insights...
-linda

/*
 *--
 * VNetBridgeReceiveFromDev --
 *  Receive a packet from a bridged peer device
 *  This is called from the bottom half.  Must be careful.
 * Results:
 *  errno.
 * Side effects:
 *  A packet may be sent to the vnet.
 *--
 */
int
VNetBridgeReceiveFromDev(struct sk_buff *skb,
 struct device *dev,
 struct packet_type *pt)
{
   VNetBridge *bridge = *(VNetBridge**)&((struct sock *)pt->data)->protinfo;
   int i;

   if (bridge->dev == NULL) {
  LOG(3, (KERN_DEBUG "bridge-%s: received %d closed\n",
  bridge->name, (int) skb->len));
  DEV_KFREE_SKB(skb, FREE_READ);
  return -EIO;  // value is ignored anyway
   }

   // XXX need to lock history
   for (i = 0; i < VNET_BRIDGE_HISTORY; i++) {
  struct sk_buff *s = bridge->history[i];
  if (s != NULL &&
  (s == skb || SKB_IS_CLONE_OF(skb, s))) {
 bridge->history[i] = NULL;
 KFREE_SKB(s, FREE_WRITE);
 LOG(3, (KERN_DEBUG "bridge-%s: receive %d self %d\n",
 bridge->name, (int) skb->len, i));
 // FREE_WRITE because we did the allocation, it's not used anyway
 DEV_KFREE_SKB(skb, FREE_WRITE);
 return 0;
  }
   }
   skb_push(skb, skb->data - skb->mac.raw);
   VNetSend(&bridge->port.jack, skb);

   return 0;
}
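
Not an authoritative answer, but for what it's worth: in 2.4.4 the data
reference count moved out of reach of the old skb_datarefp(), and sharing
is now expressed through helpers such as skb_cloned()/skb_shared().  If
SKB_IS_CLONE_OF was comparing skb_datarefp() values, a sketch of an
equivalent test (untested) might be:

/* clones made by skb_clone() share one data buffer, so "skb is a clone
 * of s" can be approximated as: both are clones and both point at the
 * same data area. */
#define SKB_IS_CLONE_OF(skb, s) \
        (skb_cloned(skb) && skb_cloned(s) && (skb)->head == (s)->head)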

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338


Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread LA Walsh

Rik van Riel wrote:

> On Fri, 27 Apr 2001, LA Walsh wrote:
>
> > An interesting option (though with less-than-stellar performance
> > characteristics) would be a dynamically expanding swapfile.  If you're
> > going to be hit with swap penalties, it may be useful to not have to
> > pre-reserve something you only hit once in a great while.
>
> This makes amazingly little sense since you'd still need to
> pre-reserve the disk space the swapfile grows into.

---
Why?  Why not have a zero-length file that you grow only if you spill?
If you can't spill, you are out of memory -- or reserve a 'safety'
margin ahead -- like reserve 32k at a time and grow it.  It may make
little sense, but I believe it is what is used on pseudo-OS's
like Windows -- you *can* preallocate, but the normal case has
Windows managing the swap file and growing it as needed, up to
available disk space.  If it is doable in Windows, you'd think there'd
be some way of doing it in Linux, but perhaps Linux's complexity
doesn't allow for that type of feature.
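
A minimal sketch of the userspace half of such a scheme, assuming spill
files are prepared ahead of time with mkswap(8); names and thresholds are
illustrative:

#include <sys/swap.h>      /* swapon(2) */
#include <sys/sysinfo.h>   /* sysinfo(2) */

/* When free swap drops below a threshold, attach another pre-made
 * spill file (it must already carry a swap signature from mkswap).
 * Needs root; error handling omitted. */
int maybe_grow_swap(const char *spill_file, unsigned long long min_free)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return -1;
    if ((unsigned long long)si.freeswap * si.mem_unit < min_free)
        return swapon(spill_file, 0);
    return 0;
}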

As for disk-space reserves, if you have 5% reserved for
'root' on a 20G ext2 disk, that still amounts to 1G reserved for root.
Seems an automatically-sizing swap file might be just fine for some people
(not me -- I don't even like to use swap, but I'm not my mom using Windows
ME either).

But, conversely, if it's coming out of space I wouldn't normally
use anyway -- say the "5%", i.e. something I'd likely only use
under *rare* conditions -- then maybe.  I might have enough memory and
the right system load that I also 'rarely' use swap -- so reserving
1G + 1G (the root reserve plus 2xMEM swap) on my laptop, both of which
will rarely get used, seems like a waste of 2G.  I suppose if I put it
that way I might convince myself to use it...

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338



Re: 2.4 and 2GB swap partition limit

2001-04-27 Thread LA Walsh

Rogier Wolff wrote:

> > > On Linux any swap adds to the memory pool, so 1xRAM would be
> > > equivalent to 2xRAM with the old old OS's.
> >
> > no more true AFAIK
>
> I've always been trying to convince people that 2x RAM remains a good
> rule-of-thumb.

---
Ugh.  I like to view swap as "low-grade memory" -- i.e. I really
should spend 99.9% of my time in RAM -- if I spill, then it means
I'm running too much/too big for my computer and should get more RAM --
meanwhile, I suffer the performance degradation to remind me I'm really
exceeding my machine's physical memory capacity.

An interesting option (though with less-than-stellar performance
characteristics) would be a dynamically expanding swapfile.  If you're
going to be hit with swap penalties, it may be useful to not have to
pre-reserve something you only hit once in a great while.

Definitely only for systems where you don't expect to use swap (but
it could be there for "emergencies" up to some predefined limit or
available disk space).

--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338



Re: [PATCH] SMP race in ext2 - metadata corruption.

2001-04-27 Thread LA Walsh

Andrzej Krzysztofowicz wrote:

> I know a few people that often do:
>
> dd if=/dev/hda1 of=/dev/hdc1
> e2fsck /dev/hdc1
>
> to make an "exact" copy of a currently working system.

---
Presumably this isn't a problem if the source disks are either unmounted or
mounted 'read-only'?


--
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338



Re: [QUESTION] 2.4.x nice level

2001-04-02 Thread LA Walsh

Quim K Holland wrote:
> 
> > "BS" == BERECZ Szabolcs <[EMAIL PROTECTED]> writes:
> 
> BS> ... a setiathome running at nice level 19, and a bladeenc at
> BS> nice level 0. setiathome uses 14 percent, and bladeenc uses
> BS> 84 percent of the processor. I think, setiathome should use
> BS> max 2-3 percent.  the 14 percent is way too much for me.
> BS> ...
> BS> with kernel 2.2.16 it worked for me.
> BS> now I use 2.4.2-ac20
---
I was running 2 copies of setiathome on a 4-CPU server
@ work.  The two processes ran nice'd -19.  The builds we were 
running still took 20-30% longer than when setiathome wasn't
running (went from 45 minutes up to about an hour).  This machine
has 1G, so I don't think it was hurting from swapping.

I finally wrote a script that checked every 30 seconds -- if the
load on the machine climbed over '4', the script would SIGSTOP the
seti jobs.  Once the load on the machine fell below 2, it would 
send a SIGCONT to them.  
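
A minimal reconstruction of that watcher -- a sketch, not the actual
script; pids are given on the command line, thresholds as described:

#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Watch the 1-minute load average: SIGSTOP the given pids when it
 * climbs past 4, SIGCONT them once it falls back below 2. */
int main(int argc, char **argv)
{
    int stopped = 0;
    double load;
    int i;

    for (;;) {
        if (getloadavg(&load, 1) == 1) {
            if (!stopped && load > 4.0) {
                for (i = 1; i < argc; i++)
                    kill(atoi(argv[i]), SIGSTOP);
                stopped = 1;
            } else if (stopped && load < 2.0) {
                for (i = 1; i < argc; i++)
                    kill(atoi(argv[i]), SIGCONT);
                stopped = 0;
            }
        }
        sleep(30);
    }
}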

I was also running setiathome on my laptop for a short while --
but the fan kept coming on and the computer would get really hot,
so I stopped that.  Linux @ idle doesn't seem to ever kick on
the fan, but turn on a CPU-crunching program and it sure seemed
to heat up the machine.  I still wonder how many kilowatts or megawatts
go to running dispersed computation programs.  Just one of those
things I may never know...
-l

-- 
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338

Re: unistd.h and 'extern's and 'syscall' "standard(?)"

2001-04-01 Thread LA Walsh

Andreas Schwab wrote:
> Don't use kernel headers in user programs.  Just use syscall(3).
> 
> Andreas.
---
I'm on a SuSE71 system and have all the manpages installed:
law> man syscall
No manual entry for syscall

The problem is not so much for user programs as for library
writers who write support libraries for kernel calls.  For
example, there is libcap to implement POSIX capabilities on top
of the kernel call.  We have a libaudit to implement POSIX auditing
on top of a few kernel calls.  It's the "system" library to system-call
interface that's the problem, mainly.  On ia64, it doesn't seem
like there is a reliable, cross-distro, cross-architecture way of
interfacing to the kernel.

Saying "use syscall(3)" (which is undocumented on
my SuSE system, and on a RH61 system) implies it is in some
library.  I've heard rumors that the call isn't present in RH
distros, and they claim it's because it's not exported from glibc.
Then I heard glibc said it wasn't their intention to export it.
(This is all 2nd hand, so forgive me if I have parties or details
confused or mis-stated.)  It seems like the kernel source points to an
external source, the vendor points at glibc, and glibc says it's not
their intention.  Meanwhile, an important bit of kernel functionality --
being able to use _syscall0, _syscall1, _syscall2, etc. -- ends up
missing for those wanting to construct libraries on top of the
kernel.
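
For concreteness, the usage in question -- a minimal sketch that only
builds where libc actually provides syscall():

#include <unistd.h>
#include <sys/syscall.h>   /* SYS_* syscall numbers */

int main(void)
{
    /* equivalent to getpid(2), but routed through the generic stub */
    long pid = syscall(SYS_getpid);

    return pid > 0 ? 0 : 1;
}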

I end up being rather perplexed about the correct course
of action to take.  Seeing as you work for SuSE, would you know
where this 'syscall(3)' interface should be documented?  Is it
supposed to be present in all distros?  


Thanks,
-linda
-- 
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338

unistd.h and 'extern's and 'syscall' "standard(?)"

2001-04-01 Thread LA Walsh


I have a question.  Some architectures have "system calls"
implemented as library calls (calls that are "system calls" on ia32).
For example, the expectation on 'arm' seems to be that sys_sync
is in a library.  On alpha, sys_open appears to be in a library.
Is this correct?

Is it the expectation that the library that handles this
is the 'glibc' for that platform or is there a special "kernel.lib"
that goes with each platform?

Is there 1 library that I need to link my apps with to
get the 'externs' referenced in "unistd.h"?

The reason I ask is that in ia64 the 'syscall' call
isn't done with inline assembler but is itself an 'extern' call.
This implies that you can't do system calls directly w/o some 
support library.
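
For contrast, the 2.4-era ia32 idiom under discussion, modeled on the old
_syscall(2) man page example -- the _syscallN macros expand to inline
assembler, so no support library is involved:

#include <linux/unistd.h>   /* _syscallN macros and __NR_* numbers */
#include <linux/kernel.h>   /* struct sysinfo */

_syscall1(int, sysinfo, struct sysinfo *, info);   /* defines sysinfo() */

int main(void)
{
    struct sysinfo si;

    return sysinfo(&si);   /* traps directly into the kernel on ia32 */
}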

The implication of this is that developers working on
platform-independent system calls and library functions -- for
example, extended attributes, audit, or MAC -- can't provide
platform-independent patches w/o also providing their own
syscall implementation for ia64.

This came up as a problem when we wanted to provide a
new piece of code but found it wouldn't link on some distributions.
On inquiry, there seems to be some confusion regarding who is responsible
for providing the code/library to satisfy this 'unistd.h' extern.

Should something as basic as the 'syscall' interface be provided
in the kernel sources, perhaps as a kernel-provided 'lib'?  Or is
it expected that it will be provided by someone else, or that
each developer should provide their own syscall implementation for
ia64?

Thanks,
-linda
-- 
The above thoughts and   | They may have nothing to do with
writings are my own. | the opinions of my employer. :-)
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338

Re: 64-bit block sizes on 32-bit systems

2001-03-27 Thread LA Walsh

Jan Harkes wrote:
> 
> On Tue, Mar 27, 2001 at 01:57:42PM -0600, Jesse Pollard wrote:
> > > Using similar numbers as presented: if we are working our way through
> > > every single block in a petabyte filesystem, and the blocksize is 512
> > > bytes, then the 1us in extra CPU cycles because of 64-bit operations
> > > would add, according to my back-of-the-envelope calculation, 2199023
> > > seconds of CPU time -- a bit more than 25 days.
> >
> > Ummm... I don't think it adds that much. You seem to be leaving out the
> > overlap disk/IO and computation for read-ahead. This should eliminate the
> > majority of the delay effect.
> 
> 1024 TB should be around 2*10^12 512-byte blocks, divide by 10^6 (1us)
> of "assumed" overhead per block operation is 2*10^6 seconds, no I
> believe I'm pretty close there. I am considering everything being
> "available in the cache", i.e. no waiting for disk access.
---
If everything being used is only used from the cache, then
the application probably doesn't need 64-bit block support.

I submit that your argument may be flawed in the assumption that,
if an application needs multi-terabyte files and devices, most
of the data will be in the in-memory cache.
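
For reference, the back-of-the-envelope figures themselves check out; the
disagreement is over the premise, not the arithmetic:

    1 PB / 512 bytes per block        ~= 2.2 * 10^12 blocks
    2.2 * 10^12 blocks * 1 us apiece  ~= 2.2 * 10^6 s  ~= 25 days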
 

> The time to update the pagetables is identical to the time to update a
> 4KB page when the OS is using a 2MB pagesize. Of course it will take more
> time to load the data into the page, however it should be a consecutive
> stretch of data on disk, which should give a more efficient transfer
> than small blocks scattered around the disk.
---
Not if you were doing a lot of random reads where you only
need 1-2K of data.  The read time of the extra 2M-1K would seem
to eat into any performance boost gained by the large pagesize.

> 
> > Granted, 512 bytes could be considered too small for some things, but
> > once you pass 32K you start adding a lot of rotational delay problems.
> > I've used file systems with 256K blocks - they are slow when compared
> > to the throughput using 32K. I wasn't the one running the benchmarks,
> > but with a MaxStrat 400GB RAID, 256K-sized data transfers were much
> > slower (around 3 times slower) than with 32K. (The target application
> > was a GIS server using Oracle).
> 
> But your subsystem (the disk) was probably still using 512 byte blocks,
> possibly scattered. And the OS was still using 4KB pages, it takes more
> time to reclaim and gather 64 pages per IO operation than one, that's
> why I'm saying that the pagesize needs to scale along with the blocksize.
> 
> The application might have been assuming a small block size as well, and
> the OS was told to do several read/modify/write cycles, perhaps even 512
> times as much as necessary.
> 
> I'm not saying that the current system will perform well when working
> with large blocks, but compared to increasing the size of block_t, a
> larger blocksize has more potential to give improvements in the long
> term without adding an unrecoverable performance hit.
---
That's totally application-dependent.  Database applications
might tend to skip around in the data and do short reads/writes over
a very large file.  Large block sizes will degrade their performance.

This was the idea of making it a *configurable* option.  If
you need it, configure it.  Same with block size -- that should
likely have a wider range for configuration as well.  But
configuration (and ideally auto-configuration where possible)
seems the ultimate win-win situation.

-l
-- 
The above thoughts are my own and do not necessarily represent those
of my employer.
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 64-bit block sizes on 32-bit systems

2001-03-27 Thread LA Walsh

Ion Badulescu wrote:
> Are you being deliberately insulting, "L", or are you one of those users
> who bitch and scream for features they *need* at *any cost*, and who
> have never even opened up the book for Computer Architecture 101?
---
Sorry, I was borderline insulting.  I'm getting pressure on
personal fronts other than just here.  But my degree is in computer
science and I've had almost 20 years' experience programming things
as small as 8080s w/ 4K RAM on up.  I'm familiar with the 'cost' of
emulation.

> Let's try to keep the discussion civilized, shall we?
---
Certainly.
> 
> Compile option or not, 64-bit arithmetic is unacceptable on IA32. The
> introduction of LFS was bad enough, we don't need yet another proof that
> IA32 sucks. Especially when there *are* better alternatives.
===
So if it is a compile option, the majority of people
wouldn't be affected -- is that in agreement?  The default would
be to use the same arithmetic as we use now.

In fact, I posit that if anything, the majority of people
might be helped as the block_nr becomes a 'typed' value -- and
perhaps the sector_nr as well.  They remain the same size, but as
a typed value the kernel gains increased integrity from the increased
type checking.  At worst, it finds no new bugs and there is no impact
on speed.  Are we in agreement so far?

Now let's look at the sites that want to process terabytes of
data -- perhaps file systems up into the Petabyte range.  Often I
can see these being large multi-node setups (think 16-1024 node
clusters, as are in use today for large super-clusters).  If I were
to characterize their performance, I'd likely see the CPU pegged at
100%, with 99% usage in user space.  Let's assume that going to
64-bit block numbers increases the cost of disk accesses by as much
as 10% (you'll have to admit -- using a 64-bit quantity vs. a 32-bit
quantity isn't going to come close to increasing disk access times
by 1 millisecond, really, so it is going to be a much smaller
fraction when compared to the actual disk latency).

Ok...but for the sake of argument, using 10% -- that's still only
10% of the 1% spent in the system, or a slowdown of .1%.  And that's
using a really liberal figure of 10%.  If you look at the actual
speed of 64-bit arithmetic vs. 32-bit, we're likely talking -- upper
bound -- 10x the clocks for disk-block arithmetic, and disk-block
arithmetic is a small fraction of time spent in the kernel.  We have
to be looking at *maximum* slowdowns in the range of a few hundred,
maybe a few thousand, extra clocks.  1000 extra clocks on a 1GHz
machine is 1 microsecond, or approximately 1/5000th your average
seek latency on a *fast* hard disk.  So instead of a 10% slowdown we
are talking slowdowns in the 1/1000 range or less.  And that's a
slowdown in the 1% that was being spent in the kernel, so we've
slowed the total program speed by .001%, with the added benefit (to
that site) of being able to process those mega-gigs (Petabytes) of
information.  For a hit that is not noticeable to human perception,
they go from not being able to use super-clusters of IA32 machines
(for which HW and SW are cheap) to being able to use them.  That's
quite a cost savings for them.

Is there some logical flaw in the above reasoning?

-linda
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 64-bit block sizes on 32-bit systems

2001-03-26 Thread LA Walsh

Manfred Spraul wrote:
> 
> >4k page size * 2GB = 8TB.
> 
> Try it.
> If your drive (array) is larger than 512byte*4G (4TB) linux will eat
> your data.
---
I have a block device that doesn't use 'sectors'.  It
only uses the logical block size (which is currently set to
1K).  Seems I could up that to the max blocksize (4k?) and
get 8TB...no?

I don't use the generic block make request (have my
own).  

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: 64-bit block sizes on 32-bit systems

2001-03-26 Thread LA Walsh


Matthew Wilcox wrote:
> 
> On Mon, Mar 26, 2001 at 08:39:21AM -0800, LA Walsh wrote:
> > I vaguely remember a discussion about this a few months back.
> > If I remember, the reasoning was it would unnecessarily slow
> > down smaller systems that would never have block devices in
> > the 4-28T range attached.
> 
> 4k page size * 2GB = 8TB.
---
Drat...I was being more optimistic -- you're right,
the block_nr can be negative.  Somehow I thought page size could
be 8K...living in future land.  That just makes the limitations
even closer at hand...:-(

> you keep on trying to increase the size of types without looking at
> what gcc outputs in the way of code that manipulates 64-bit types.
---
Maybe someone will backport some of the features of the
IA-64 code generator into 'gcc'.  I've been told that in some
cases it's a 2.5x performance difference.  If 'gcc' is generating
bad code, then maybe the 'gcc' people will increase the quality
of their code -- I'm sure they are just as eagerly working on
gcc improvements as we are on kernel improvements.  When I worked
on the PL/M compiler project at Intel, I know our code-optimization
guy would spend endless cycles trying to get better optimization
out of the code.  He got great joy out of doing so -- and
that was almost 20 years ago, and code generation has come
a *long* way since then.

> seriously, why don't you just try it?  see what the performance is.
> see what the code size is.  then come back with some numbers.  and i mean
> numbers, not `it doesn't feel any slower'.
---
As for 'trying' it -- would anyone care if we virtualized
the block_nr into a typedef?  That seems like it would provide
for cleaner (type-checked) code at no performance penalty, and
would more easily allow such comparisons.

Well, this is my point: if I have disks > 8T, wouldn't
it be at *all* beneficial to be able to *choose* some slight
performance impact and access those large disks, vs. having no
choice?  Having it as a configurable would allow a given
installation to make that choice.  BTW, are block_nr's on RAID
arrays subject to this limitation?
> 
> personally, i'm going to see what the situation looks like in 5 years time
> and try to solve the problem then.
---
It's not the same, but SGI has had customers for over
3 years using >2T *files*.  The point I'm looking at is: if
the P-X series gets developed enough, and someone is using a
4-16P system, a corporate user might be approaching that limit
today or tomorrow.  Joe User might not for 5 years, but that's
what the configurability is about.  Keep Linux usable for both
ends of the scale -- "I love scalability"

-l

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



64-bit block sizes on 32-bit systems

2001-03-26 Thread LA Walsh

I vaguely remember a discussion about this a few months back.
If I remember, the reasoning was it would unnecessarily slow
down smaller systems that would never have block devices in
the 4-28T range attached.  

However, isn't it possible there will continue to be a series
of P-IV, V, VI, VII...etc. add-ons that will be used for some time
to come?  I've even heard it suggested that we might see
2 or more CPUs on a single chip as a way to increase CPU
capacity w/o driving up clock speed.  Given the cheapness of
.25T drives now, seeing the possibility of 4T drives doesn't seem
that remote (maybe 5 years?).

Side question: does the 32-bit block size limit also apply to 
RAID disks or does it use a different block-nr type?

So...is it the plan, or has it been thought about -- 'abstracting'
block numbers as a typedef 'block_nr', then at compile time
having it be selectable as to whether this was to be a 32-bit
or 64-bit quantity?  That way older systems would lose no
efficiency.  Drivers that couldn't be or hadn't been
ported to use 'block_nr' could default to being disabled if
64-bit blocks were selected, etc.

So has this idea been tossed about and/or previously thrashed out?

-l

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: NCR53c8xx driver and multiple controllers...(not new prob)

2001-03-25 Thread LA Walsh

Here is the 'alternate' output when the ncr53c8xx driver is
compiled in:

SCSI subsystem driver Revision: 1.00
scsi-ncr53c7,8xx : at PCI bus 0, device 8, function 0
scsi-ncr53c7,8xx : warning : revision of 35 is greater than 2.
scsi-ncr53c7,8xx : NCR53c810 at memory 0xfa101000, io 0x2000, irq 58
scsi0 : burst length 16
scsi0 : NCR code relocated to 0x37d6c610 (virt 0xf7d6c610)
scsi0 : test 1 started
scsi0 : NCR53c{7,8}xx (rel 17)
request_module[block-major-8]: Root fs not mounted
VFS: Cannot open root device "807" or 08:07
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 08:07
-
Note how this compares to the case where the driver is a module:

(note scsi0 was an IDE emulation in this setup -- something also removed in
the above setup)
ncr53c8xx: at PCI bus 0, device 8, function 0
ncr53c8xx: 53c810a detected
ncr53c8xx: at PCI bus 1, device 3, function 0
ncr53c8xx: 53c896 detected
ncr53c8xx: at PCI bus 1, device 3, function 1
ncr53c8xx: 53c896 detected
ncr53c810a-0: rev=0x23, base=0xfa101000, io_port=0x2000, irq=58
ncr53c810a-0: ID 7, Fast-10, Parity Checking
ncr53c810a-0: restart (scsi reset).
ncr53c896-1: rev=0x01, base=0xfe004000, io_port=0x3000, irq=57
ncr53c896-1: ID 7, Fast-40, Parity Checking
ncr53c896-1: on-chip RAM at 0xfe00
ncr53c896-1: restart (scsi reset).
ncr53c896-1: Downloading SCSI SCRIPTS.
ncr53c896-2: rev=0x01, base=0xfe004400, io_port=0x3400, irq=56
ncr53c896-2: ID 7, Fast-40, Parity Checking
ncr53c896-2: on-chip RAM at 0xfe002000
ncr53c896-2: restart (scsi reset).
ncr53c896-2: Downloading SCSI SCRIPTS.
scsi1 : ncr53c8xx - version 3.2a-2
scsi2 : ncr53c8xx - version 3.2a-2
scsi3 : ncr53c8xx - version 3.2a-2
scsi : 4 hosts.
  Vendor: SEAGATE   Model: ST318203LCRev: 0002
  Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sda at scsi2, channel 0, id 1, lun 0
  Vendor: SGI   Model: SEAGATE ST318203  Rev: 2710
  Type:   Direct-Access  ANSI SCSI revision: 02
Detected scsi disk sdb at scsi2, channel 0, id 2, lun 0
  Vendor: SGI   Model: SEAGATE ST336704  Rev: 2742


This is on a 4x550 PIII (Xeon) system.  The second two
controllers are on PCI bus 1.  The boot disk is sda, which is off of
scsi2 in the working example, or scsi1 in the non-working example.

It seems that compiling it in somehow causes controllers
1 and 2 (which are off of the 2nd PCI bus, "1") to get missed during
scsi initialization.  Is there a parameter I need to pass to the
ncr53c8xx driver to get it to scan the 2nd bus?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



NCR53c8xx driver and multiple controllers...(not new prob)

2001-03-24 Thread LA Walsh

I have a machine with 3 of these controllers (a 4 CPU server).  The
3 controllers are:
ncr53c810a-0: rev=0x23, base=0xfa101000, io_port=0x2000, irq=58
ncr53c810a-0: ID 7, Fast-10, Parity Checking
ncr53c896-1: rev=0x01, base=0xfe004000, io_port=0x3000, irq=57
ncr53c896-1: ID 7, Fast-40, Parity Checking
ncr53c896-2: rev=0x01, base=0xfe004400, io_port=0x3400, irq=56
ncr53c896-2: ID 7, Fast-40, Parity Checking
ncr53c896-2: on-chip RAM at 0xfe002000

I'd like to be able to make a kernel with the driver compiled in and
no loadable module support.  I don't see how to do this from the
documentation -- it seems to require a separate module loaded for
each controller.  When I compile it in, it only sees the 1st controller,
and the boot partition is, I think, on the 3rd.  Any ideas?

This problem is present in the 2.2.x series as well as 2.4.x (x up to 2).

Thanks,
-linda
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Is swap == 2 * RAM a permanent thing?

2001-03-15 Thread LA Walsh

Not reclaiming swap space is flawed in more than one instance.
Suppose my P1 and P2 have their swap reserved -- now both grow.
P3 is idle but can't fit in swap.  This is going to result in fragmentation,
no?  How is this fragmentation any better than just freeing swap?

Ever since RAM sizes got to about 256M, I've tended toward using swap spaces
about half my RAM size -- thinking of swap as an 'overflow' place that
really shouldn't get used much, if at all.  As you mention, not reclaiming
swap space, but keeping 'double-reservations' for previously swapped
programs, becomes a problem fast in this situation.  It makes the swap
much less flexible.

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: (struct dentry *)->vfsmnt;

2001-03-09 Thread LA Walsh

Alexander Viro wrote:
> No such thing. The same fs may be present in many places. Please,
> describe the situation - where do you get that dentry from?
> Cheers,
> Al
---

Al,
I'm getting it from various places: 1) if I want to know the
path relative to the root of the dentry at the end of 'path_walk'
or __user_path_walk (as used in truncate), and
2) if I've gotten a dentry, as in sys_fchdir/fchown/fstat/newfstat,
from a file descriptor, and I want the absolute path -- or, if multiple
paths exist (such as multiple mounts of the same fs in different
locations), the one that the user used to access the dentry.
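
(For case 2, what I'm after looks roughly like this under the 2.4
API, as I read fs/dcache.c -- a kernel-side sketch, error checking
omitted, with the vfsmount taken from the struct file rather than
the dentry:

    /* Sketch: absolute path from an fd, 2.4-style. */
    struct file *file = fget(fd);
    char *page = (char *)__get_free_page(GFP_KERNEL);
    char *path = d_path(file->f_dentry, file->f_vfsmnt, page, PAGE_SIZE);
    /* ... use path ... */
    free_page((unsigned long)page);
    fput(file);

...but that only works when I have a struct file in hand.)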

In 2.2 there was a way to get the path only from the
dentry (d_path) -- I'm looking for similar functionality for the
above cases.

Is it such that in 2.2 dentries were only relative to root,
whereas in 2.4 they are relative to their mount point, and instead of
duplicate dcache entries for each possible mount point, they get stored
as one?

If that's the case, then while I might get a path from a user-path
walk, if I just have an 'fd', it may not be possible to backtrace into
the path the user used to access the file?

Just some wild speculations on my part:-/...did
I refine the question enough?

thanks,
-linda


-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



(struct dentry *)->vfsmnt;

2001-03-09 Thread LA Walsh

Could someone enlighten me as to the purpose of this field in the
dentry struct?  There is no elucidating comment in the header for this
particular field and the name/type only indicate it is pointing to
a list of vfsmounts.  Can a dentry belong to more than one vfsmount?

If I have a 'dentry' and simply want to determine what the absolute
path from root is, in the 'd_path' macro, would I use 'rootmnt' of my
current->fs as the 'vfsmount' as well?

Thanks, in advance...
-linda


-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-53
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Elevator algorithm parameters

2001-03-08 Thread LA Walsh

I hate when that happens...

LA Walsh wrote:
> If you ask for code from me, it'll be a while -- My read and write
...Q's are rather full right now with some higher priority I/O...:-)
-l
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Elevator algorithm parameters

2001-03-08 Thread LA Walsh

I have a few comments/questions on the elv. alg. as it is now.  Some
of them may be based on a flawed understanding, but please be patient
anyway :-).

1) Read-ahead is given the same 'latency' [max-wait priority] as 'read'.
   I can see r-a as being less important than 'read' -- 'read' means
   some app is blocked waiting for input *now*; 'ra' means the
   kernel is being clever in hopes it is predicting a usage pattern where
   reading ahead will be useful.  I'd be tempted to give read-ahead
   a higher acceptable latency than reads, and possibly higher than
   writes.  By definition, 'ra' i/o is i/o that no one currently has
   requested be done.
   a) The code may be there, but if a read request comes in for a
      sector marked for ra, then the latency should be set to
      min(r-latency, remaining ra latency).

2) I seem to notice a performance boost on my laptop when setting the
   read latency down to 1/8th of the write latency (2048/16384) instead
   of the current 1:2 ratio.

   I am running my machine as an NFS server as well as doing local tasks
   and compiles.  I got better overall performance because NFS requests
   got serviced more quickly to feed a data-hungry dual-processor
   "compiler-server".  Also, my interactive processes, which need
   lots of random reads, performed better because they got 'fed' faster
   while some background data transfers (reads and writes) of large
   streams of data were going on.

3) It seems that the balance of optimal latency figures would vary
   based on how many processes are blocked on data reads, how many
   CPUs are reading from the same disk, the disk speed, the CPU speed,
   and the memory available for buffering.  Maybe there is a neat wiz-bang
   self-adjusting algorithm that can adapt dynamically to different
   loads (say it detects -- hmmm, we have 100 non-mergeable read
   requests plugged, should I wait for more?...well, only 1 active write
   request is running...maybe I should lower the read latency...etc.).
   However, in the interim, it seems having the values at least be
   tunable via /proc (rather than the current ioctl -- see the sketch
   after this list) would be useful -- just being able to echo some
   values into there at runtime.  I couldn't seem to find such a beast
   in /proc.
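
For reference, a sketch of what tuning looks like today through the
ioctl (the BLKELVGET/BLKELVSET calls that elvtune uses; the struct
and field names are from my reading of 2.4's <linux/blkdev.h>, so
verify against your own headers):

    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/blkdev.h>

    int main(void)
    {
        blkelv_ioctl_arg_t arg;
        int fd = open("/dev/hda", O_RDONLY);

        /* Read the current elevator settings, then adjust. */
        if (fd < 0 || ioctl(fd, BLKELVGET, &arg) < 0)
            return 1;
        arg.read_latency  = 2048;    /* the 1:8 read:write ratio from (2) */
        arg.write_latency = 16384;
        return ioctl(fd, BLKELVSET, &arg) < 0;
    }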

Comments/cares?

If you ask for code from me, it'll be a while -- My read and write 
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



setfsuid

2001-03-07 Thread LA Walsh

Why doesn't setfsuid return -EPERM when it can't perform the operation?
File: kernel/sys.c, 'sys_setfsuid', around line 779 depending on your
source version.

There is a check of capable(CAP_SETUID) that, when it fails, doesn't
return an error.  This seems inconsistent.  In fact, the manpage
I have on it states:

RETURN VALUE
   On success, the previous value of fsuid is  returned.   On
   error, the current value of fsuid is returned.
BUGS
   No error messages of any kind are returned to the  caller.
   At  the very least, EPERM should be returned when the call
   fails.

-l
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-06 Thread LA Walsh

Alan Cox wrote:
> 
> > support to function efficiently -- perhaps that technology needs to be
> > further developed on Linux so app writers don't also have to be kernel
> > experts and experts in all the various bus and device types out there?
> 
> You mean someone should write a libcdrom that handles stuff like that - quite
> possibly
---
More generally -- what if I want to know whether a DVD has been inserted, and
of what type; and/or whether a floppy has been inserted, or removable media of
type "X"; or, more generally still, not just whether a 'device' has changed,
but a file or directory?

I think that is what famd is supposed to do, but apparently it does so (I'm
guessing from the external description) by polling, and it says it needs
kernel support to be more efficient.  Famd was apparently ported to Linux from
Irix, where it had the kernel's ability to be notified of changed file-space
items (file-space = anything accessible w/a pathname).
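
(2.4 does have a piece of this -- directory-change notification via
fcntl F_NOTIFY, "dnotify" -- though it watches directories rather
than devices.  A minimal sketch, watching /tmp as an example:

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_change(int sig) { (void)sig; }

    int main(void)
    {
        int fd = open("/tmp", O_RDONLY);
        signal(SIGIO, on_change);
        /* Ask for SIGIO on create/delete/modify in this directory. */
        fcntl(fd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MODIFY | DN_MULTISHOT);
        pause();                   /* returns once a change is signalled */
        printf("directory contents changed\n");
        close(fd);
        return 0;
    }

It doesn't cover media changes, though, which is presumably why famd
still polls.)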


Now if I can just remember where I saw this mythical port of the 'file-access
monitoring daemon'...

-l

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-06 Thread LA Walsh

Alan Cox wrote:
> 
> > Then it seems the less ideal question is what is the "approved and
> > recommended" way for a program to "poll" such devices to check for
> > 'changes' and 'media type' without the kernel generating spurious
> > WARNINGS/ERRORS?
> 
> The answer to that could probably fill a book unfortunately. You need to use
> the various mtfuji and other ata or scsi query commands intended to notify you
> politely of media and other status changes
---
Taking myself out of the role of someone who knows anything about the
kernel -- and into that of someone who only knows application writing in the
fields of GUIs and audio -- what do you think I'm going to use to check if
there has been a playable CD inserted into the CD drive?

There is an application called 'famd' which says it needs some kernel
support to function efficiently -- perhaps that technology needs to be
further developed on Linux, so app writers don't also have to be kernel
experts and experts in all the various bus and device types out there?

Just an idea...?
-linda 
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-06 Thread LA Walsh

God wrote:
> 
> On Mon, 5 Mar 2001, Alan Cox wrote:
> 
> > > > this isnt a kernel problem, its a _very_ stupid app
> > > ---
> > > Must be more than one stupid app...
> >
> > Could well be. You have something continually trying to open your cdrom and
> > see if there is media in it
> 
> Gnome / KDE? does exactly that... (rather annoying too) ..  what app
> specificaly I don't know...
---
So I'm still wondering what the "approved and recommended" way is for a
program to be "automatically" informed of a CD or floppy change/insertion,
and to be informed of the media 'type', w/o kernel warnings/error messages.
It sounds like there is no kernel support for this so far?

Then it seems the less ideal question is: what is the "approved and
recommended" way for a program to "poll" such devices to check for 'changes'
and 'media type' without the kernel generating spurious WARNINGS/ERRORS?
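
(The closest thing I've found so far is the CDROM_DRIVE_STATUS /
CDROM_DISC_STATUS ioctls from <linux/cdrom.h> -- a sketch, assuming
the drive's driver actually implements them:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/cdrom.h>

    int main(void)
    {
        /* O_NONBLOCK lets the device open even with no medium present. */
        int fd = open("/dev/cdrom", O_RDONLY | O_NONBLOCK);
        if (fd < 0)
            return 1;

        if (ioctl(fd, CDROM_DRIVE_STATUS, CDSL_CURRENT) == CDS_DISC_OK) {
            int type = ioctl(fd, CDROM_DISC_STATUS, 0);
            printf("disc present, type %s\n",
                   type == CDS_AUDIO ? "audio" : "data/other");
        } else {
            printf("no disc\n");
        }
        return 0;
    }

Whether this stays quiet for a UDF DVD, I don't know.)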


-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-05 Thread LA Walsh

Alan Cox wrote:
> 
> > > this isnt a kernel problem, its a _very_ stupid app
> > ---
> >   Must be more than one stupid app...
> 
> Could well be. You have something continually trying to open your cdrom and
> see if there is media in it
---
Is there some feature they *should* be using instead to check for media
presence so I can forward it to their dev-team?

Thanks!
-l

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-05 Thread LA Walsh

LA Walsh wrote:
> 
> > this isnt a kernel problem, its a _very_ stupid app
> ---
> Must be more than one stupid app...
> 
> xena:/var/log# rpm -q magicdev
> package magicdev is not installed
> xena:/var/log# locate magicdev
> xena:/var/log#
> xena:/var/log# rpm -qa |grep -i magic
> ImageMagick-5.2.6-4
---

Maybe the stupid app is 'freeamp'?  It only happens when I run it...:-(


-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-05 Thread LA Walsh

> this isnt a kernel problem, its a _very_ stupid app
---
Must be more than one stupid app...

xena:/var/log# rpm -q magicdev
package magicdev is not installed
xena:/var/log# locate magicdev
xena:/var/log#
xena:/var/log# rpm -qa |grep -i magic
ImageMagick-5.2.6-4



-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Annoying CD-rom driver error messages

2001-03-05 Thread LA Walsh


Slightly less annoying -- when no CD is in the drive, I'm getting:

Mar  5 09:30:42 xena kernel: VFS: Disk change detected on device ide1(22,0)
Mar  5 09:31:17 xena last message repeated 7 times
Mar  5 09:32:18 xena last message repeated 12 times
Mar  5 09:33:23 xena last message repeated 13 times
Mar  5 09:34:24 xena last message repeated 12 times

(22,0 = /dev/hdc,cdrom)

Perturbing.

-l
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Annoying CD-rom driver error messages

2001-03-05 Thread LA Walsh

I have a music player (freeamp) running, playing MP3s.  It has a feature that
scans to see if a CD is in the drive and tries to look it up in CDDB.  Well, I
don't have a CD in the drive -- I have a DVD-ROM with a UDF file system on it.
Freeamp doesn't complain, but in my syslog/warnings file, every 5 seconds I get:

Mar  5 09:17:00 xena kernel: hdc: packet command error: status=0x51 { DriveReady SeekComplete Error }
Mar  5 09:17:00 xena kernel: hdc: packet command error: error=0x50
Mar  5 09:17:00 xena kernel: ATAPI device hdc:
Mar  5 09:17:00 xena kernel:   Error: Illegal request -- (Sense key=0x05)
Mar  5 09:17:00 xena kernel:   Cannot read medium - incompatible format -- (asc=0x30, ascq=0x02)
Mar  5 09:17:00 xena kernel:   The failed "Read Subchannel" packet command was:
Mar  5 09:17:00 xena kernel:   "42 02 40 01 00 00 00 00 10 00 00 00 "

Needless to say, this fills up messages/warnings fairly quickly.  If there's no
DVD in the drive, or if there is a CD in the drive, I don't notice this problem.

It seems like an undesirable feature for the kernel to write out a 7-line error
message every time a program polls for a CD and fails.  Is there a way to
disable this when I have a DVD-ROM disc in the drive?  (vanilla 2.4.2 kernel)

Thanks...
-l


-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



odd memory corrupt problem

2001-02-22 Thread LA Walsh

I have a kernel driver that has a variable (surprise) 'audit_state'.  It's statically
initialized to 0 in the C code.  The only way it can get set on is if the audit modules
are loaded and one makes a system call to enable it.

There is no 'driver' initialization performed.

This code seemed to work in 2.2.17, but not in the 2.4.x series.

Somehow the 'audit_state' variable is being mysteriously set to '1' (which,
with the driver not loaded, causes less than perfect behavior).

So I started sprinkling "if (audit_state) BUG();" in various places in the code.
It fails during the pcnet32 driver initialization (compiled in vs. module).  That
in turn calls pci init code, which calls net driver code.  That calls
net/core's register_netdevice, which finally ends up calling run_sbin_hotplug
in net/core/dev.c.  That tries to load the program /sbin/hotplug via
call_usermodehelper in kmod.c.  That 'schedules' the task and things are still
ok; then it goes down on the process sem to wait until it has started.  The
program it is trying to execute, "hotplug", doesn't exist on my machine... ok,
fine (the network interface seems to function just fine).  But when it gets
back from the down(), the value of "audit_state" has changed to 1.

Any ideas why?  Not that I'm whining, but a good debugger with a 'watch' capability
would do wonders at this point.  I'm trying to figure out code that has nothing to
do with my driver -- just happens to be randomly stomping on a key variable.  
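
As a poor man's watchpoint, one trick is to fence the variable with canaries in
the same object, so any linear overrun trips a check.  A sketch -- the guard
scheme is invented for illustration; only the audit_state name comes from this
message:

/* Guard-variable sketch for catching a stomper.  Keeping the guards and
 * the state in one struct guarantees adjacency in memory, so a linear
 * overwrite of the neighbourhood corrupts a canary. */
#include <linux/kernel.h>       /* for BUG() */

static struct {
        unsigned long guard_pre;
        int           state;
        unsigned long guard_post;
} audit = { 0xdeadbeefUL, 0, 0xdeadbeefUL };

static inline void audit_check(void)
{
        if (audit.guard_pre  != 0xdeadbeefUL ||
            audit.guard_post != 0xdeadbeefUL ||
            audit.state != 0)
                BUG();          /* someone scribbled here */
}

Sprinkled at the same call sites as the BUG() test, it at least distinguishes a
wild linear scribble from a precise write to the variable itself.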

I suppose something could be stomping on the checks to see if the module is loaded
and something is randomly calling the system call to turn it on, but that seems like
a less likely path.  Note that the system hasn't even gotten up to the point of calling
the 'boot' script yet.

I get the same behavior in 2.4.0, 2.4.1 and 2.4.2 (was hoping some memory corruption
bug got fixed along the way).  

Meanwhile, guess it's on to more debugging linux style -- insert printk's.  How
quaint.

Linda
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



interactive disk performance

2001-02-22 Thread LA Walsh

A problem that I seem to have noticed to one extent or another in the 2.4
series is that while the elevator algorithm may achieve the best disk-bandwidth
utilization, it seems to do so heavily at the expense of interactive use.

I was running a disk intensive program over nfs, so the nfsd's were quite busy --
usually 3/4 were in 'D' wait.

During this time, I tried to bring up this compose window for the email I am
writing.  It took over 2 minutes to come up.  The CPU was 66% idle, 31% in
idled -- meaning it was fairly inactive; everything was waiting on disk.

I'm sure that the file the nfsd's were writing out was one long contiguous stream --
most of which could be coalesced into large multi-block writes.  Somehow it seems
that the multi-block writer was getting 1 block in, then more blocks kept coming
in so fast that the Q would only unplug every once in a while -- and maybe 1
block of an interactive request would go through.

I don't remember the exact timeout or max wait/sector while blocks are being
coalesced, but it seems it heavily favors the heavy disk user.

In Unix design, the CPU algorithm was designed to lower the priority of CPU
intensive tasks such that interactive use got higher priority for short bursts.

Maybe a process should have a disk priority (and maybe a net priority while we
are at it) that adjusts with usage, the way the CPU algorithm adjusts -- then
the block structure could have an added 'priority' field holding the process's
priority at the time it wrote the block.  Thus even if a process goes away, its
blocks retain their priority.

Then the elevator algorithm would sort not just by locality but also by weight,
using the block's priority.  Perhaps it would be a make-time or run-time
configurable choice whether to optimize for disk throughput or for interactive
usage.  Perhaps even a 'nice' value that allows the user to subjectively
prioritize processes.
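
To make the weighting concrete, here is a sketch of what such a sort key might
look like.  The request structure and the sectors-per-nice-level constant are
invented for illustration; they are not the 2.4 block layer's actual fields:

/* Sketch of a priority-weighted elevator sort key.  Invented types and
 * constants: not the 2.4 block layer's real request structure. */
struct io_request {
        unsigned long sector;   /* target sector of the request */
        int           prio;     /* inherited from the submitting process,
                                 * e.g. -20 (favoured) .. 19 (niced down) */
};

/* Lower score is served first: locality still dominates for nearby
 * requests, but a favoured request gets a head start. */
static long request_score(const struct io_request *rq,
                          unsigned long head_pos)
{
        long seek = (long)(rq->sector > head_pos ? rq->sector - head_pos
                                                 : head_pos - rq->sector);
        return seek + (long)rq->prio * 1024;    /* 1024 sectors per nice
                                                 * level: a tunable guess */
}

The elevator would sort pending requests by this score instead of by sector
alone; a niced-down writer's blocks then yield to interactive reads unless
they are very close to the head.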

Possible?  Usefulness?

-l


-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: Linux stifles innovation...

2001-02-16 Thread LA Walsh

"David D.W. Downey" wrote:
> 
> Seriously though folks, look at who's doing this!
> 
> They've already tried once to sue 'Linux', were told they couldn't because
> Linux is a non-entity (or at least one that they can not effectively sue
> due to the classification Linux holds), ...
---
Not having a long memory on these things, do you have an article
or reference on this -- I'd love to read about that one.  Sue Linux?  For
what?  Competing?  

Perhaps by saying Open Source is a threat to the "American Way", they
mean they can't effectively 'sue', buy up or destroy it?  

-l

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



To Linus: kdb in 2.4?

2001-02-13 Thread LA Walsh

I'm wondering about the possibility of re-examining the idea of a kernel debugger
option distributed with 2.4.  

I'm thinking that it could be a great teaching tool to break and examine structures,
variables, process states, as well as an aid to people who may not have a grasp
of the entire kernel but need to write device drivers.

It's easy for someone who's "grown up" with Linux to know it all so thoroughly 
that such a tool seems fluff.  But even the best mechanics on new cars use complex
diagnostic tools to do car repair.  Sure there may be experts that designed the engine
that wouldn't need it, but large numbers of people need to repair cars or modify
them for their purposes.  Having tools to aid in that isn't so much a crutch as it is
a learning tool.  It's like being able to look at the characters of the alphabet
individually before one learns to comprehend the entirety of the writings of Buddha.

Certainly Buddha doesn't need to know how to read to know his own writings -- and
certainly, if everyone meditates and 'evolves' to their Buddha nature, they wouldn't
need to read the texts or recognize the letters either.  

But not everyone is at the same place on the mountain (or even the same mountain, for
that matter).

In wisdom, one would, I posit, understand others are in different places and may
find it useful to have tools to learn to read before they comprehend.  

Just my 2-4 cents on the matter...
-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Block driver design issue

2001-02-12 Thread LA Walsh

I have a block driver I inherited that I'm working on; it has a problem, and I
was wondering about cleaner solutions.

The driver can accept written characters either from userspace programs or from
the kernel.  From userspace it uses sys_write, which in turn calls block_write.
There are almost 100 lines of duplicated code: the driver carries a copy of the
block_write code called "block_writek", as well as duplicate code in
audit_write vs. audit_writek.  The only difference is down in block_write at
the "copy_from_user(p,buf,chars);", which becomes a "memcpy(p,buf,chars)" in
the "block_writek" version.

I find this duplication of code to be inefficient.  Is there a way to dummy up
the 'buf' address so that the "copy_from_user" will copy the buffer from kernel
space?  My assumption is that it wouldn't "just work" (which may also be an
invalid assumption).
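
For what it's worth, the idiom I've seen in 2.4-era code for this is to widen
the "user" address limit temporarily, so the copy_from_user() path accepts a
kernel pointer.  A sketch -- audit_writek is this message's name for the
kernel-side entry; checks that f_op and f_op->write exist are omitted:

/* 2.4-era idiom: let a copy_from_user() path read a kernel buffer by
 * temporarily raising the address limit.  Illustrative, not driver code. */
#include <linux/fs.h>
#include <asm/uaccess.h>

ssize_t audit_writek(struct file *f, const char *kbuf,
                     size_t len, loff_t *pos)
{
        mm_segment_t old_fs = get_fs();
        ssize_t ret;

        set_fs(KERNEL_DS);      /* copy_from_user now accepts kernel addresses */
        ret = f->f_op->write(f, kbuf, len, pos);
        set_fs(old_fs);         /* always restore the saved limit */
        return ret;
}

With that, block_write's copy_from_user(p,buf,chars) serves both callers, and
the duplicated block_writek/audit_writek bodies can go away.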

Suggestions?  Abuse?

Thanks!
-linda

-- 
L A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://vger.kernel.org/lkml/



question on comment in fs.h

2001-02-10 Thread LA Walsh

Excuse my ignorance, but in file include/linux/fs.h, 2.4.x source
in the struct buffer_head, there is a member:
unsigned short b_size;  /* block size */
later there is a member:
char * b_data;  /* pointer to data block (512 byte) */ 

Is the "(512 byte)" part of the comment in error or do I misunderstand
the nature of 'b_size'
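
My own reading -- an assumption, not checked against the source history -- is
that the "(512 byte)" is a stale remnant from when buffers were fixed at one
sector, and the pair would read better as:

unsigned short b_size;  /* block size, in bytes */
/* ... */
char * b_data;          /* pointer to data block (b_size bytes) */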

-l

-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



2.4.x Shared memory question

2001-02-04 Thread LA Walsh


Another oddity -- I notice things taking a lot more memory in 2.4.  This
coincides with 'top' consistently showing that I have 0 shared memory.  These
two observations have me wondering whether I have somehow misconfigured my
system to disallow sharing.  Note that /proc/meminfo also shows 0 shared
memory:

        total:     used:     free:   shared:  buffers:    cached:
Mem:  525897728 465264640  60633088         0  82145280  287862784
Swap: 270909440         0 270909440
MemTotal:   513572 kB
MemFree: 59212 kB
MemShared:   0 kB
Buffers: 80220 kB
Cached: 281116 kB
Active:  22340 kB
Inact_dirty:338996 kB
Inact_clean: 0 kB
Inact_target:0 kB
HighTotal:   0 kB
HighFree:0 kB
LowTotal:   513572 kB
LowFree: 59212 kB
SwapTotal:  264560 kB
SwapFree:   264560 kB 

Though it seems unrelated, I do have a filesystem of type shm mounted on
/dev/shm, as suggested for POSIX shared memory.


-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



2.4.2-test1 better on disk lock/freezups

2001-02-04 Thread LA Walsh

In trying to apply Jens's patch I upgraded to 2.4.2-pre1.  The figures on it
(2.4.2-pre1) look better at this point: a vmstat dump, same data... notice this
time it only took maybe 45 seconds to write out the data.  I also got better
interactive performance.  So write speed is up to about 3.5Mb/s.  Fastest reads
using 'hdparm' are in the 12-14Mb/s range.  Sooo... IDE hdparm block-dev reads
vs. file writes... a 3-4:1 ratio?

I honestly have little clue as to what would be considered 'good' numbers.

Note the maximum 'system freeze' seems under 10 seconds now -- a lot more
tolerable.

Note also, this was without my applying Jens's patch -- as I could not figure out how
to get it to apply cleanly  :-(.


 r  b  w   swpd   free   buff  cache  si  so  bi    bo   in    cs  us  sy  id
 0  0  0  0  77564  80220 280164   0   0 0   348  287  1367  10   7  83
 0  0  1  0  77560  80220 280164   0   0 0   304  193   225   0   1  99
 0  1  1  0  77572  80220 280156   0   0 0   162  241   354   4   2  95
 0  1  1  0  77572  80220 280156   0   0 0   156  218   182   0   1  99
 1  1  1  0  77560  80220 280164   0   0 0   165  217   218   0   1  99
 0  1  1  0  77328  80220 280164   0   0 0   134  213   215   1   1  97
 0  1  1  0  77328  80220 280164   0   0 0   138  217   177   0   1  98
 0  1  1  0  77328  80220 280164   0   0 0   206  215   178   0   1  99
 0  1  1  0  77332  80220 280164   0   0 0   166  219   206   1   1  98
 0  0  0  0  85632  80220 280172   0   01412  192   360   1   1  98
 
-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: System unresponsitive when copying HD/HD

2001-02-04 Thread LA Walsh

Alan Cox wrote:
> But try 2.4.1 before worrying too much. That fixed a lot of the block
> performance problems I was seeing (2.4.1 ruins the VM performance under paging
> loads but the I/O speed is fixed ;))

---
Seems to have gotten a bit worse.  Vmstat output after 'vmware' had completed
write -- but system unresponsive and writing out a 155M file...

 r  b  w   swpd   free   buff  cache  si  so  bi    bo   in    cs  us  sy  id
 1  0  0  0 113960  47528 277152   0   0 0 0  397   861   1  24  75
 1  0  0  0 114060  47560 277152   0   0 4   350  432  1435   4  17  79
 0  0  1  0 127380  47560 266196   0   0 0   516  216   435   7   3  90
 1  0  1  0 127380  47560 266196   0   0 0   240  203   173   0   1  99
 0  0  1  0 127380  47560 266196   0   0 0   434  275   180   0   2  98
 1  0  1  0 127376  47560 266196   0   0 0   218  204   173   0   2  98
 0  0  1  0 127376  47560 266196   0   0 0   288  203   174   0   0 100
 0  0  1  0 127376  47560 266196   0   0 0   337  230   176   0   1  99
 0  0  1  0 127376  47560 266196   0   0 0   267  241   177   0   1  99
 0  0  1  0 127376  47560 266196   0   0 0   210  204   173   0   1  99
 0  0  1  0 127376  47560 266196   0   0 0   204  203   173   0   1  99
 0  0  1  0 127376  47560 266196   0   0 0   216  212   250   0   1  99
 0  0  1  0 127376  47560 266196   0   0 0   208  205   172   0   2  98
 0  0  1  0 127372  47560 266196   0   0 0   225  203   160   0   2  98
 0  0  1  0 127372  47560 266196   0   0 0   316  214   212   0   1  99
 1  0  1  0 127144  47560 266196   0   0 0   281  218   304   1   2  96
 0  0  0  0 127144  47560 266196   0   0 0 1  161   240   1   0  99
 0  0  0  0 127144  47560 266196   0   0 0 0  101   232   0   1  99 
---
What is the meaning of having a process in the 'w' column?  On other
systems, I was used to that meaning an executable had been *swapped* out completely
(as opposed to no pages mapped in) and that it meant your system vm was 'thrashing'.
But that obviously isn't the case here.

Those columns are output from a 'vmstat 5', meaning it took about 70 seconds
to write out 158M, or about 2.2M/s.  That's probably not bad.  It still locks
up the system for over a minute, though -- which is really undesirable
performance for interactive use.  I'm guessing the vmstat output numbers are
showing 4K? 8K? blocks?  8K would about make sense for the 2.2M/s average.

-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: System unresponsitive when copying HD/HD

2001-02-03 Thread LA Walsh

I've noticed less responsive disk behavior on 2.4.0 vs. 2.2.17.  For example --
I run vmware and suspend it frequently when I'm not using it.  One of them requires
a 158Mb save file.  Before, I could suspend that one, then start another which
reads in a smaller 50M save file.  The smaller one would come up while the other
was still saving.  As of 2.4, the smaller one doesn't come up -- I can't even do
an 'ls' until the big save finishes.  

Now the big-image program has actually exited and I can close the window -- the
disk writes are going on from the disk cache, with 'kupdate' taking some minor
fraction (<1%) of the CPU and the rest of the system being mostly idle.

If I have vmstat running, I notice blocks trickling out to the disk; 5-sec
averages: 495, 142, 151, 155, 136, 257, 15, 0.  Note that the maximum read rate
(hdparm -t) of this disk is in the 12-14M/s range.  I'm getting about 1-5% of
that on output, with the system's disk subsystem apparently unable to do
anything else.

This is with IDE hard disk with DMA enabled.

a) is this expected performance on a large linear write?  
b) should I expect other disk operations to be denied service as long as
the write is 'flushing'?

-l
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Power usage Q and parallel make question (separate issues)

2001-02-01 Thread LA Walsh

Keith Owens wrote:
> 
> On Wed, 31 Jan 2001 19:02:03 -0800,
> LA Walsh <[EMAIL PROTECTED]> wrote:
> >This seems to serialize the delete, run the mod-installs in parallel, then run the
> >depmod when they are done.
> 
> It works, until somebody does this
> 
>  make -j 4 modules modules_install
---
But that doesn't work now.  

> There is not, and never has been, any interlock between make modules
> and make modules_install.  If you let modules_install run in parallel
> then people will be tempted to issue the incorrect command above
> instead of the required separate commands.
---

> 
>  make -j 4 modules
>  make -j 4 modules_install
> 
> You gain a few seconds on module_install but leave more room for user
> error.
---
A bit of documentation at the beginning of the Makefile would do wonders
for kernel-developer (not end user, please!) clarity.  I've oft asked the
question of what is really supported.  I've tried things like make dep bzImage
modules -- I noticed fairly quickly that it didn't work.  Same with
modules/modules_install -- people would probably figure that one out, but just
a bit of documentation would help even that.



-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Power usage Q and parallel make question (separate issues)

2001-01-31 Thread LA Walsh

Keith Owens wrote:
>
> The only bit that could run in parallel is this one.
> 
> .PHONY: $(patsubst %, _modinst_%, $(SUBDIRS))
> $(patsubst %, _modinst_%, $(SUBDIRS)) :
> $(MAKE) -C $(patsubst _modinst_%, %, $@) modules_install
> 
> The erase must be done first (serial), then make modules_install in
> every subdir (parallel), then depmod (serial).
---
Right...Wouldn't something like this work?  (Seems to)
--- Makefile.oldWed Jan 31 18:57:21 2001
+++ MakefileWed Jan 31 18:54:53 2001
@@ -351,8 +351,12 @@
 $(patsubst %, _mod_%, $(SUBDIRS)) : include/linux/version.h include/config/MARKER
	$(MAKE) -C $(patsubst _mod_%, %, $@) CFLAGS="$(CFLAGS) $(MODFLAGS)" MAKING_MODULES=1 modules
 
+modules_inst_subdirs: _modinst_
+   $(MAKE) $(patsubst %, _modinst_%, $(SUBDIRS))
+
+
 .PHONY: modules_install
-modules_install: _modinst_ $(patsubst %, _modinst_%, $(SUBDIRS)) _modinst_post
+modules_install: _modinst_post
 
 .PHONY: _modinst_
 _modinst_:
@@ -372,7 +376,7 @@
 depmod_opts:= -b $(INSTALL_MOD_PATH) -r
 endif
 .PHONY: _modinst_post
-_modinst_post: _modinst_post_pcmcia
+_modinst_post: _modinst_post_pcmcia modules_inst_subdirs
	if [ -r System.map ]; then $(DEPMOD) -ae -F System.map $(depmod_opts) $(KERNELRELEASE); fi
 
 # Backwards compatibilty symlinks for people still using old versions  
---
This seems to serialize the delete, run the mod-installs in parallel, then run the
depmod when they are done.  
-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Power usage Q and parallel make question (separate issues)

2001-01-31 Thread LA Walsh

I remember reading some time back that the difference between a pentium in HLT
and one running flat out was about 2-3 watts vs. 15-20 watts.  Does anyone
know the difference for today's CPUs?  P-III/P-IV or other archs?

How about the difference when calling the BIOS power-save feature?  With
the threat of rolling blackouts here in CA, I was wondering what the power
consumption might be of a 100,000 or 1,000,000 CPU's in HLT vs. doing complex
mathematical computation?

Separately -- Parallel Make's
--===
So, just about anyone I know uses make -j X [-l Y] bzImage modules, but I
noticed that make modules_install isn't parallel safe in 2.4 -- since it takes
much longer than the old one, it would make sense to want to run it in parallel
as well, but it has a delete-old, multiple sub-dirs, index-new for deps.
Those "3" steps can't be done in parallel safely.  Was this intentional, or
would a 'fix' be desired?

Is it the intention of the Makefile maintainers to allow a parallel or
distributed make?  I know for me it makes a noticeable difference even on a
1-CPU machine (CPU overlap with disk I/O), and with multi-CPU machines it's
even more noticeable.

Is a make of the kernel and/or the modules designed to be parallel safe?  Is it 
something I should 'rely' on?  If it isn't, should it be?

-l

-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: seti@home and es1371

2001-01-31 Thread LA Walsh

Try "freeamp".  It uses darn close to 0 CPU and may not be affected by setiathome.
2nd -- renice setiathome to '19' -- you only want it to use up 'background' cputime
anyway



Rainer Wiener wrote:
> 
> Hi,
> 
> I hope you can help me. I have a problem with my on board soundcard and
> seti. I have a Gigabyte GA-7ZX Creative 5880 sound chip. I use the kernel
> driver es1371 and it works goot. But when I run seti@home I got some noise
> in my sound when I play mp3 and other sound. But it is not every time 10s
> play good than for 2 s bad and than 10s good 2s bad and so on. When I kill
> seti@home every thing is ok. So what can I do?
> 
> I have a Athlon 800 Mhz and 128 MB RAM

-- 
Linda A Walsh| Trust Technology, Core Linux, SGI
[EMAIL PROTECTED]  | Voice: (650) 933-5338
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/


