from:"Rogier Wolff"

Re: [PATCH] tty: amba-pl011: Make TX optimisation conditional

2019-07-12 Thread Rogier Wolff

On Fri, Jul 12, 2019 at 12:21:05PM +0100, Dave Martin wrote:
> diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
> index 89ade21..1902071 100644
> --- a/drivers/tty/serial/amba-pl011.c
> +++ b/drivers/tty/serial/amba-pl011.c
> @@ -1307,6 +1307,13 @@ static bool pl011_tx_chars(struct uart_amba_port *uap, bool from_irq);
>  /* Start TX with programmed I/O only (no DMA) */
>  static void pl011_start_tx_pio(struct uart_amba_port *uap)
>  {
> +	/*
> +	 * Avoid FIFO overfills if the TX IRQ is active:
> +	 * pl011_int() will consume chars waiting in the xmit queue anyway.
> +	 */
> +	if (uap->im & UART011_TXIM)
> +		return;
> +

I'm no expert on PL011, have no knowledge of the current bug, but have
programmed serial drivers in the past.

This looks "dangerous" to me.

The normal situation is that you push the first few characters into
the FIFO with PIO, the interrupt triggers once the FIFO empties, and
then you refill the FIFO until the buffer empties.

The danger in THIS fix is that you might have a race that causes those
first few PIO-ed characters not to be put in the hardware, resulting in
the interrupt never triggering. If you can software-trigger the
interrupt just before the "return" here, that'd be a way to fix things.
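
To make the worry concrete, here is a toy model in C. It is NOT the
PL011 driver: every name in it (toy_uart, toy_start_tx, ...) is made up
for illustration, and it assumes the TX interrupt only fires when the
FIFO goes from non-empty to empty. Under that assumption, an early
return with an empty FIFO leaves the queue stuck, and a
software-triggered interrupt gets it moving again.

```c
#include <stdbool.h>

#define FIFO_DEPTH 16

/* Toy model only: none of these names come from amba-pl011.c. */
struct toy_uart {
    int fifo_level;       /* chars sitting in the hardware FIFO */
    int queued;           /* chars waiting in the software xmit queue */
    bool tx_irq_enabled;  /* stands in for (uap->im & UART011_TXIM) */
    bool irq_pending;     /* raised when the FIFO drains to empty */
};

/* Move as many queued chars as fit into the FIFO. */
static void toy_fill_fifo(struct toy_uart *u)
{
    while (u->queued > 0 && u->fifo_level < FIFO_DEPTH) {
        u->queued--;
        u->fifo_level++;
    }
}

/* Hardware sends everything; the IRQ fires on the non-empty -> empty edge. */
static void toy_hw_drain(struct toy_uart *u)
{
    if (u->fifo_level > 0) {
        u->fifo_level = 0;
        if (u->tx_irq_enabled)
            u->irq_pending = true;
    }
}

/* Interrupt handler: refill, and mask the IRQ once the queue is empty. */
static void toy_irq(struct toy_uart *u)
{
    if (!u->irq_pending)
        return;
    u->irq_pending = false;
    toy_fill_fifo(u);
    if (u->queued == 0)
        u->tx_irq_enabled = false;
}

/* start_tx with the early return; kick models a software-triggered IRQ. */
static void toy_start_tx(struct toy_uart *u, bool kick)
{
    if (u->tx_irq_enabled) {
        if (kick)
            u->irq_pending = true;
        return;
    }
    u->tx_irq_enabled = true;
    toy_fill_fifo(u);
}

/* Run the model for a while; returns chars still stuck in the queue. */
static int toy_run(struct toy_uart *u, bool kick)
{
    toy_start_tx(u, kick);
    for (int i = 0; i < 100; i++) {
        toy_hw_drain(u);
        toy_irq(u);
    }
    return u->queued;
}
```

Starting from "IRQ enabled, FIFO empty, 40 chars queued", toy_run()
without the kick leaves all 40 chars queued, and with the kick it
drains them. Whether the real hardware can ever get into that starting
state is exactly the question for someone who knows the PL011.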

I'm ok with a reaction like "I've thought about this, it's not a
problem, now shut up".

Roger. 

-- 
** r.e.wo...@bitwizard.nl ** https://www.BitWizard.nl/ ** +31-15-2049110 **
**Delftechpark 11 2628 XJ  Delft, The Netherlands.  KVK: 27239233**
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: POSIX violation by writeback error

2018-09-27 Thread Rogier Wolff

On Wed, Sep 26, 2018 at 07:10:55PM +0100, Alan Cox wrote:
> > And I think that's fine.  The only way we can make any guarantees is
> > if we do what Alan suggested, which is to imply that a read on a dirty
> > page *block* until the page is successfully written back.  This
> > would destroy performance.
> 
> In almost all cases you don't care so you wouldn't use it. In those cases
> where it might matter it's almost always the case that a reader won't
> consume it before it hits the media.

Wait! Source code builds (*) nowadays are quite fast because
everything happens without hitting the disk. This means my compile has
finished linking the executable by the time the kernel starts thinking
about writing the objects to disk.

Roger. 

(*) Of projects smaller than the Linux kernel. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: POSIX violation by writeback error

2018-09-25 Thread Rogier Wolff

On Tue, Sep 25, 2018 at 11:46:27AM -0400, Theodore Y. Ts'o wrote:
> (Especially since you can get most of the functionality by
> using some naming convention for files that are in the process of being
> written, and then teach some program that is regularly scanning the
> entire file system, such as updatedb(2) to nuke the files from a cron
> job.  It won't be as efficient, but it would be much easier to
> implement.)

It is MUCH easier to have a per-application cleanup job. You can run
that at boot-time. 

  # /etc/init.d/myname startup script
  # -f: don't complain when there is nothing to clean up
  rm -f /var/run/myname/unfinished.*

Simple things should be kept simple.
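
For the application side of that convention, a minimal sketch in C,
assuming POSIX and a directory the application owns. The function names
and the "unfinished." prefix handling are illustrative only, not taken
from any existing program: data is written under the temporary name,
synced, and only then renamed into place, so anything still called
unfinished.* at startup is known to be garbage.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define UNFINISHED_PREFIX "unfinished."

/* Write data under a temporary "unfinished." name, then rename into place. */
static int write_then_rename(const char *dir, const char *name,
                             const void *data, size_t len)
{
    char tmp[4096], final[4096];

    snprintf(tmp, sizeof(tmp), "%s/%s%s", dir, UNFINISHED_PREFIX, name);
    snprintf(final, sizeof(final), "%s/%s", dir, name);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    /* A short write is treated as a failure to keep the sketch small. */
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp, final) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}

/* Startup cleanup: remove leftovers from a previous, interrupted run. */
static int remove_unfinished(const char *dir)
{
    DIR *d = opendir(dir);
    if (!d)
        return -1;

    int removed = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strncmp(e->d_name, UNFINISHED_PREFIX,
                    strlen(UNFINISHED_PREFIX)) != 0)
            continue;
        char path[4096];
        snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
        if (unlink(path) == 0)
            removed++;
    }
    closedir(d);
    return removed;
}
```

remove_unfinished() does the same job as the rm line in the startup
script; which of the two you use is a matter of taste.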

Roger. 

Re: [PATCH 1/2] sc16is7xx: Fix for multi-channel stall

2018-09-18 Thread Rogier Wolff

On Tue, Sep 18, 2018 at 02:13:15PM +0100, Phil Elwell wrote:
> I could add a limit on the number of iterations, but if the limit is ever hit,
> leading to an early exit, the port is basically dead because it will never
> receive another interrupt.

Especially if you print something like ": Too many iterations with
work-to-do bailing out" while bailing out, hanging just one
driver/piece of hardware, as opposed to the whole system, when the
hardware somehow never indicates "all work done" would be preferable.

Under normal circumstances you never expect to hit that number of
iterations. But if the card keeps hitting the driver with "more work
to do" then you'd hang the system. Better try and recover, and provide
debug info for the user who knows where to look.

Best would be to ignore the driver for say a second and start handling
interrupts again a while later. Should the system be overloaded with
work, (and a slow CPU?) then you can recover and just make things slow
down a bit without hanging the system. 
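
A rough sketch in C of such a bounded loop. This is not the sc16is7xx
code: the device structure and function names are hypothetical, and the
"stuck" flag merely models hardware that never reports "all work done".

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical device: pending_work counts events still to be serviced. */
struct toy_dev {
    const char *name;
    int pending_work;
    bool stuck;   /* models hardware that never reports "all work done" */
};

static bool toy_has_work(const struct toy_dev *dev)
{
    return dev->stuck || dev->pending_work > 0;
}

static void toy_service_one(struct toy_dev *dev)
{
    if (dev->pending_work > 0)
        dev->pending_work--;
}

/*
 * Service the device until it is idle, but give up after max_iter rounds.
 * Returns true when the device went idle, false when we bailed out.
 */
static bool toy_handle_irq(struct toy_dev *dev, int max_iter)
{
    for (int i = 0; i < max_iter; i++) {
        if (!toy_has_work(dev))
            return true;
        toy_service_one(dev);
    }
    if (!toy_has_work(dev))
        return true;
    fprintf(stderr, "%s: Too many iterations with work-to-do bailing out\n",
            dev->name);
    return false;
}
```

On a false return the caller would leave the device alone for a while
and call the handler again later (from a timer, say) instead of
declaring the port dead.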

Getting edge-triggered interrupts right is VERY difficult. In general
I'd advise against them. It looks like a nice solution, but in reality
the chance of difficult-to-debug race conditions is enormous.  In
those race conditions the card will get "new work to do" and
(re-)assert the interrupt line when the driver is already on the "no
more work to do" path.

Roger. 

Re: POSIX violation by writeback error

2018-09-06 Thread Rogier Wolff

On Thu, Sep 06, 2018 at 12:57:09PM +1000, Dave Chinner wrote:
> On Wed, Sep 05, 2018 at 02:07:46PM +0200, Rogier Wolff wrote:

> > And this has worked for years because
> > the kernel caches stuff from inodes and data-blocks. If you suddenly
> > write stuff to harddisk at 10ms for each seek between inode area and
> > data-area..
> 
> You're assuming an awful lot about filesystem implementation here.
> Neither ext4, btrfs or XFS issue physical IO like this when flushing
> data.

My thinking is: When fsync (implicit or explicit) needs to know
the result of the underlying IO, it needs to wait for it to have
happened.

My thinking is: You can either log the data in the logfile or just the
metadata. By default, most people will choose the latter. In the "make
sure it hits storage" case, you have three areas:
* The logfile
* the inode area
* the data area. 

When you allow the application to continue past a close, you can gather
up say a few megabytes of updates to each area and do say 50 seeks per
second. (achieving maybe about 50% of the throughput performance of
your drive)
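
The 50% figure is back-of-the-envelope arithmetic. A sketch of it in C,
assuming the 10ms per seek mentioned earlier in the thread (the
function name is made up for this example):

```c
/* Fraction of each second left for data transfer after seeking. */
static double transfer_fraction(double seeks_per_sec, double seek_ms)
{
    double seeking = seeks_per_sec * seek_ms / 1000.0;

    /* If seeking alone eats the whole second, nothing is left. */
    return seeking >= 1.0 ? 0.0 : 1.0 - seeking;
}
```

With 50 seeks per second at 10ms each, half of every second goes to
seeking and the other half is left for moving data.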

If you don't store the /data/, you can stay in the inode or logfile
area and get a high throughput on your drive. But when a crash has the
filesystem in a defined state, what use is that if your application is
in a bad state because it is getting bad data?

Of course the application can be rewritten to have multiple threads so
that while one thread is waiting for a close to finish another one can
open/write/close another file. But there are existing applications run
by users who do not have the knowledge or option to delve into the
source and rewrite the application to be multithreaded.

Your 100k files per second is close to mine. In real life we
are not going to see such extreme numbers, but in some cases the
benchmark does predict a part of the performance of an application.
In practice, an application may spend 50% of the time on thinking
about the next file to make, and then 50k times per second actually
making the file.

Roger. 

Re: POSIX violation by writeback error

2018-09-05 Thread Rogier Wolff

On Wed, Sep 05, 2018 at 08:07:25AM -0400, Austin S. Hemmelgarn wrote:
> On 2018-09-05 04:37, 焦晓冬 wrote:

> >At this point, the /bin/sh may be partially old and partially new. Executing
> >this corrupted binary is also dangerous though.

> But the system may still be usable in that state, while returning an
> error there guarantees it isn't.  This is, in general, not the best
> example though, because no sane package manager directly overwrites
> _anything_, they all do some variation on replace-by-rename and call
> fsync _before_ renaming, so this situation is not realistically
> going to happen on any real system.

Again, the "returning an error guarantees it isn't" is what's
important here. A lot of scenarios exist where it is a slightly less
important file than "/bin/sh" that would trigger such a complete
system failure. But there are a whole lot of files that can be pretty
critical for a system where "old value" is better than "all programs
get an error now". So when you propose "reads should now return error", 
you really need to think things through. 

It is not enough to say "But I encountered a situation where returning
an error was preferable". You need to think through the counter-cases
that others might run into that would make the "new" situation worse
than before.

Roger. 

Re: POSIX violation by writeback error

2018-09-05 Thread Rogier Wolff

On Wed, Sep 05, 2018 at 06:55:15AM -0400, Jeff Layton wrote:
> There is no requirement for a filesystem to flush data on close().

And you can't start doing things like that. In some weird cases, you
might have an application open-write-close files at a much higher rate
than what a harddisk can handle. And this has worked for years because
the kernel caches stuff from inodes and data-blocks. If you suddenly
write stuff to harddisk at 10ms for each seek between inode area and
data-area... You end up limited to about 50 of these open-write-close
cycles per second.
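
The "about 50" is simple arithmetic. A sketch in C, assuming each
open-write-close cycle costs two seeks (one to the inode area and one
to the data area); the function name is made up for this example:

```c
/* Upper bound on open-write-close cycles per second when seek-bound. */
static double cycles_per_second(double seek_ms, int seeks_per_cycle)
{
    return 1000.0 / (seek_ms * seeks_per_cycle);
}
```

With 10ms per seek and two seeks per cycle that gives 50 cycles per
second.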

My home system is now able to make/write/close about 10 files per
second.

assurancetourix:~/testfiles> time ../a.out 10 000
0.103u 0.999s 0:01.10 99.0% 0+0k 0+80io 0pf+0w

(The test program was accessing arguments beyond the end-of-arguments.
An extra argument for this one-time program was easier than
open/fix/recompile.)

Roger. 

Re: POSIX violation by writeback error

2018-09-05 Thread Rogier Wolff

On Wed, Sep 05, 2018 at 09:39:58AM +0200, Martin Steigerwald wrote:
> Rogier Wolff - 05.09.18, 09:08:
> > So when a mail queuer puts mail in the mailq files and the mail processor
> > can get them out of there intact, nobody is going to notice.  (I know
> > mail queuers should call fsync and report errors when that fails, but
> > there are bound to be applications where calling fsync is not
> > appropriate (*))
> 
> AFAIK at least Postfix MDA only reports mail as being accepted over SMTP
> once fsync() on the mail file completed successfully. And I'd expect
> every sensible MDA to do this. I don't know how Dovecot MDA which I
> currently use for sieve support does this though.

Yes. That's why I added the remark that mailers will call fsync and know
about it on the write side. In the last few days I encountered a
situation that, had a developer run into it while developing, would
have caused him to write:
  /* Calling this fsync causes unacceptable performance */
  // fsync (fd); 

I know of an application somewhere that does realtime-gathering of
call-records (number X called Y for Z seconds). They come in from a
variety of sources, get de-duplicated, standardized and written to
files. Then different output modules push the data to the different
consumers within the company. Billing among them. 

Now getting old data there would be pretty bad. And calling fsync
all the time might have performance issues.
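
One possible middle ground, sketched in C under the assumption that
losing the last few records in a crash is acceptable: call fsync once
per batch instead of once per record, and still report every failure.
The function name and the sync_every parameter are invented for this
example; they do not come from the application described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Append count copies of rec to path, calling fsync() after every
 * sync_every records and once more for a trailing partial batch.
 * Returns the number of fsync() calls made, or -1 on any open, write,
 * fsync or close error.
 */
static int append_records(const char *path, const char *rec, int count,
                          int sync_every)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(rec);
    int syncs = 0, unsynced = 0;

    for (int i = 0; i < count; i++) {
        /* A short write is treated as an error to keep this small. */
        if (write(fd, rec, len) != (ssize_t)len)
            goto fail;
        if (++unsynced == sync_every) {
            if (fsync(fd) != 0)
                goto fail;
            syncs++;
            unsynced = 0;
        }
    }
    if (unsynced > 0) {
        if (fsync(fd) != 0)
            goto fail;
        syncs++;
    }
    return close(fd) == 0 ? syncs : -1;

fail:
    close(fd);
    return -1;
}
```

Five records with sync_every set to 2 cost three fsync calls instead
of five. How large a batch can be is a question for whoever has to
explain the lost records afterwards.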

That's the situation where "old data is really bad". 

But when apt-get upgrade replaces your /bin/sh and gets a write error,
returning an error on subsequent reads is really bad.

It is more difficult than you think. 

Roger. 

Re: POSIX violation by writeback error

2018-09-05 Thread Rogier Wolff

On Tue, Sep 04, 2018 at 11:44:20AM -0400, Jeff Layton wrote:
> On Tue, 2018-09-04 at 22:56 +0800, 焦晓冬 wrote:
> > On Tue, Sep 4, 2018 at 7:09 PM Jeff Layton wrote:
> > > 
> > > On Tue, 2018-09-04 at 16:58 +0800, Trol wrote:
> > > > That is certainly not possible to be done. But at least, shall we report
> > > > error on read()? Silently returning wrong data may cause further damage,
> > > > such as removing wrong files since it was marked as garbage in the old file.
> > > > 
> > > 
> > > Is the data wrong though? You tried to write and then that failed.
> > > Eventually we want to be able to get at the data that's actually in the
> > > file -- what is that point?
> > 
> > The point is silent data corruption is dangerous. I would prefer getting an
> > error back to receiving wrong data.
> > 
> 
> Well, _you_ might like that, but there are whole piles of applications
> that may fall over completely in this situation. Legacy usage matters
> here.

Can I make a suggestion here?

First imagine a spherical cow in a vacuum. 

What I mean is: In the absence of boundary conditions (the real world)
what would ideally happen?

I'd say: 

* When you've written data to a file, you would want to read that
  written data back. Even in the presence of errors on the backing
  media.

But already this is controversial: I've seen time-and-time again that
people with raid-5 setups continue to work until the second drive
fails: They ignored the signals the system was giving: "Please replace
a drive".

So when a mail queuer puts mail in the mailq files and the mail processor
can get them out of there intact, nobody is going to notice.  (I know
mail queuers should call fsync and report errors when that fails, but
there are bound to be applications where calling fsync is not
appropriate (*))

So maybe when the write fails, the reads on that file should fail?

Then it means the data required to keep in memory is much reduced: you
only have to keep the metadata.

In both cases, semantics change when a reboot happens before the
read. Should we care? If we can't fix it when a reboot has happened,
does it make sense to do something different when a reboot has NOT
happened?

Roger. 

(*) I have 800GB of data I need to give to a client. The
truck-of-tapes solution of today is a 1TB USB-3 drive. Writing that
data onto the drive runs at 30MB/sec (USB2 speed: USB3 didn't work for
some reason) for 5-10 seconds and then slows down to 200k/sec for
minutes at a time. One of the reasons might be that fuse-ntfs is
calling fsync on the MFT and directory files to keep stuff consistent
just in case things crash. Well... In this case this means that
copying the data took 3 full days instead of 3 hours. Too much calling
fsync is not good either.

Re: POSIX violation by writeback error

2018-09-04 Thread Rogier Wolff

On Tue, Sep 04, 2018 at 12:12:03PM -0400, J. Bruce Fields wrote:

> 
> Well, I think the point was that in the above examples you'd prefer that
> the read just fail--no need to keep the data.  A bit marking the file
> (or even the entire filesystem) unreadable would satisfy posix, I guess.
> Whether that's practical, I don't know.

If you did it like that (marking the whole filesystem as "in
error"), things would go from bad to worse even faster. The Linux kernel
tries to keep the system up even in the face of errors.

With that suggestion, having one application run into a writeback
error would effectively crash the whole system because the filesystem
may be the root filesystem and stuff like "sshd" that you need to
diagnose the problem needs to be read from the disk.

Roger. 

Re: POSIX violation by writeback error

2018-09-04 Thread Rogier Wolff

On Tue, Sep 04, 2018 at 04:58:59PM +0800, 焦晓冬 wrote:

> As for suggestion, maybe the error flag of inode/mapping, or the entire inode
> should not be evicted if there was an error. That hopefully won't take much
> memory. On extreme conditions, where too much error inode requires staying
> in memory, maybe we should panic rather then spread the error.

Again you are hoping it will fit in memory. In an extreme case it
won't fit in memory. Trying to come up with heuristics about when to
remember and when to forget such things from the past is very
difficult.

Think of my comments as: "it's harder than you think", not as "can't
be done".

Roger.

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: POSIX violation by writeback error

2018-09-04 Thread Rogier Wolff

On Tue, Sep 04, 2018 at 02:32:28PM +0800, 焦晓冬 wrote:
> Hi,
> 
> After reading several writeback error handling articles from LWN, I
> begin to be upset about writeback error handling.
> 
> Jlayton's patch is simple but wonderful idea towards correct error
> reporting. It seems one crucial thing is still here to be fixed. Does
> anyone have some idea?
> 
> The crucial thing may be that a read() after a successful
> open()-write()-close() may return old data.
> 
> That may happen where an async writeback error occurs after close()
> and the inode/mapping get evicted before read().

Suppose I have 1GB of RAM. Suppose I open a file, write 0.5GB to it
and then close it. Then I repeat this 9 times.

Now, when writing those files to storage fails, there is 5GB of data
to remember and only 1GB of RAM.

I can choose any part of that 5GB and try to read it.

Please suggest where we should store that data.

In the easy case, where the data easily fits in RAM, you COULD write a
solution. But when the hardware fails, the SYSTEM will not be able to
follow the POSIX rules.

Roger. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: IRQ number question.

2018-09-04 Thread Rogier Wolff

On Mon, Sep 03, 2018 at 07:09:03PM +0100, Alan Cox wrote:

> The IRQ number in the PCI configuration space is just a label really for
> legacy OS stuff. Nothing actually routes interrupts according to it (*).
> If it's coming up as 14 that looks more like the BIOS mislabelled it.
> Legacy PCI interrupts care about lines and pins not irq numbers.

Yeah, on ISA there are lines marked IRQ3, IRQ4 and you pull on such a
line and the CPU gets interrupted.

On PCI there are lines marked IRQA, IRQB... IRQD and you yank one line
and the CPU gets interrupted.

Usually IRQA on the first slot gets wired to IRQB on the second slot
and so on. This is because cards are expected to require only one IRQ,
but /can/ have up to four.
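
The slot-to-slot rotation can be written down as a small function. This
is only a sketch, assuming the conventional rotate-by-one-per-slot
wiring; the board designer decides the real wiring, and the name
swizzled_pin is made up for illustration. Pins are numbered 1..4 for
the four lines (INTA to INTD in the PCI specification's naming).

```python
PIN_NAMES = {1: "INTA", 2: "INTB", 3: "INTC", 4: "INTD"}


def swizzled_pin(pin, slot):
    """Return the upstream line (1..4) that a card's pin ends up on.

    Assumption: every slot rotates the four lines by one position
    relative to the previous slot. Real boards may be wired differently.
    """
    if pin not in PIN_NAMES:
        raise ValueError("pin must be 1 (INTA) .. 4 (INTD)")
    return ((pin - 1 + slot) % 4) + 1


# A card that only uses its first line, plugged into slots 0..3,
# lands on a different upstream line in each slot.
for slot in range(4):
    print(slot, PIN_NAMES[swizzled_pin(1, slot)])
```

With this rotation, four single-interrupt cards in four adjacent slots
each get a line of their own instead of all sharing the first one.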

The IRQ register in the PCI configuration space is just an 8-bit
register that is required to be able to remember those 8 bits, nothing
else.

The BIOS is expected to know which SLOTx INTy line goes where in the
hardware. And then it should store the resulting interrupt number in
that 8-bit register.
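
To look at that register from userspace, something like the following
should do. It is a sketch, not tested on the machine in question: it
assumes the standard header layout (Interrupt Line at offset 0x3C,
Interrupt Pin at 0x3D) and the sysfs "config" file; the function names
and the example address are mine.

```python
INTERRUPT_LINE_OFFSET = 0x3C  # 8-bit scratch value stored by the firmware
INTERRUPT_PIN_OFFSET = 0x3D   # 0 = no pin, 1..4 = the four interrupt pins


def interrupt_fields(config_bytes):
    """Return (interrupt_line, interrupt_pin) from raw config space bytes."""
    if len(config_bytes) < 0x40:
        raise ValueError("need at least the 64-byte standard header")
    return (config_bytes[INTERRUPT_LINE_OFFSET],
            config_bytes[INTERRUPT_PIN_OFFSET])


def read_config(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read the first 64 bytes of config space, e.g. bdf='0000:03:01.0'.

    The '0000:' domain prefix in the example is an assumption.
    """
    with open(f"{sysfs_root}/{bdf}/config", "rb") as f:
        return f.read(64)
```

Calling interrupt_fields(read_config("0000:03:01.0")) returns whatever
the firmware stored there. The hardware does not route anything based
on that byte.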

Now when Linux knows enough about the hardware (interrupt controller)
to reroute interrupts on an interrupt controller, I would have liked
it to also note the result in the interrupt register in the affected
card. Apparently it doesn't. 

Anyway. Yesterday I was: "OK, I'll have to make it into a real
PCI-device-driver", and today I'm thinking: "Now I know how this comes
about, I can make it nicer if I have time left", on to other stuff for
the time being.

Roger. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: IRQ number question.

2018-09-03 Thread Rogier Wolff

On Mon, Sep 03, 2018 at 07:09:03PM +0100, Alan Cox wrote:
> On Mon, 3 Sep 2018 19:16:39 +0200

> > irq 18: nobody cared (try booting with the "irqpoll" option)
> > 
> > I've been writing device drivers in the past, but in the past
> > when the lspci listed "IRQ 14" then I'd have to request_irq (14, ...
> 
> The IRQ number in the PCI configuration space is just a label really for
> legacy OS stuff. Nothing actually routes interrupts according to it (*).
> If it's coming up as 14 that looks more like the BIOS mislabelled it.
> Legacy PCI interrupts care about lines and pins not irq numbers.
> 
> Are you looking at values after things like pci_enable_device were called
> or before ? Are you also looking at what is in pcidev->irq after the
> enable ?

The driver used to be for an ISA card. But as the ISA hardware is
becoming less and less available, things were in need of an upgrade.

So... So far I was just doing
  insmod mydriver.ko pci=1 irq=14 io=0xae00 mem=0xfda0

keeping most of the ISA driver. (For testing I was able to run the ISA
card with the upgraded driver that does the PCI card as well...)

So io= is the address I got from lspci, mem= and irq= the
same. Apparently all of them are accurate except for the IRQ?

So the answer is: No I wasn't doing pci_enable_device. I guess I'll 
have to make a proper PCI driver then. Hmm. OK. I'll look into it. 
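
Until the driver is a proper PCI driver, the number the kernel itself
holds for the device can be read from sysfs. A minimal sketch, assuming
the per-device "irq" attribute under /sys/bus/pci/devices; the function
name and the example address are mine. As far as I understand it, the
value may only reflect the real routing once some driver has enabled
the device, so treat it as a hint.

```python
import os


def kernel_irq(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return the IRQ number the kernel reports for a PCI device.

    bdf is the full address, e.g. '0000:03:01.0' (the domain prefix
    '0000:' is an assumption about this machine).
    """
    path = os.path.join(sysfs_root, bdf, "irq")
    with open(path) as f:
        return int(f.read().strip())
```

Comparing kernel_irq("0000:03:01.0") with the irq= module parameter
would at least flag a mismatch like 14 versus 18 before the "nobody
cared" message shows up.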

Roger. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

IRQ number question.

2018-09-03 Thread Rogier Wolff

Hi, 

I'm writing a kernel driver. It is not going to be widely used, so I'm
not motivated to make things nice enough for inclusion in the standard
kernel.

But lspci shows my device: 

03:01.0 Serial bus controller [0c80]: Phoenix Contact GmbH & Co. Device 0002 (rev b7)
        Flags: bus master, stepping, medium devsel, latency 32, IRQ 14
        I/O ports at e070 [size=16]
        Memory at f7d0 (32-bit, non-prefetchable) [size=256K]

Now when I start my module and prod the device a bit, it will generate
an interrupt. (in this case the monitor program needs to start sending
messages to the card.)

Then the kernel reports: 

irq 18: nobody cared (try booting with the "irqpoll" option)

I've written device drivers in the past, and back then, when lspci
listed "IRQ 14", I'd have to request_irq (14, ...

Has this changed? Or is this hardware "odd"/"bad"/"broken" in that it
initializes the PCI devices wrong? (*)

My driver now works with the interrupts coming in nicely on IRQ18...

I have this card where I'm writing my own driver, and another PCI card
that uses an "included-in-the-kernel" driver, and it too behaves as if
it doesn't get any interrupts.

Roger. 

(*) Obviously according to everybody "windows works", so could it be
that modern windows simply activates an irq and polls to see what
driver handles it? Or something like that? Ah! That would be somewhat
similar to what "irqpoll" does on Linux!

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: [PATCH v2 3/4] irqchip: Add BCM2835 AUX interrupt controller

2017-06-12 Thread Rogier Wolff

On Mon, Jun 12, 2017 at 05:19:03PM +0100, Marc Zyngier wrote:

> > Does Linux not notice when one calls generic_handle_irq with the number
> > of an interrupt without a handler?
> 
> It is not so much that the interrupt doesn't have a handler, but that
> the device (or one of the devices) is in some sort of interrupt frenzy,
> and the driver is not able to handle this interrupt.
> 
> In such a case, Linux tries to mask this interrupt, which in your case
> does exactly nothing. At this point, the system is dead.

In the old days, you had edge-triggered interrupts. That always led to
race conditions: If you handled the interrupt, the hardware might fire
an interrupt again AFTER you've checked: "nothing more to do?" and
before you have told the hardware "I've seen that interrupt". So then
you end up with hardware thinking the interrupt has been handled while
it has in fact not been handled. You can be very careful in what order
you do things, and get it almost right.

So nowadays interrupts are level triggered. That means that a device
that wants attention but, for SOME reason, thinks that it was not
properly handled will keep the interrupt line asserted, and interrupts
will keep firing.
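
The difference between the two schemes can be shown with a toy model.
This is a sketch under heavy simplification: ToyDevice and service are
invented names, "latched" stands in for an edge latch and "pending" for
the work the device still wants done. Real interrupt controllers are
more involved.

```python
class ToyDevice:
    """Toy device: 'pending' counts unserviced events, 'latched' is the edge latch."""

    def __init__(self):
        self.pending = 0
        self.latched = False

    def raise_event(self):
        self.pending += 1
        self.latched = True

    def line_level(self):
        # A level-triggered line stays asserted while work is outstanding.
        return self.pending > 0


def service(dev, racing_event=False):
    """One handler pass: drain the work, optionally lose the race, then ack."""
    handled = dev.pending
    dev.pending = 0          # "nothing more to do?"
    if racing_event:
        dev.raise_event()    # a new event sneaks into the window
    dev.latched = False      # "I've seen that interrupt"
    return handled


dev = ToyDevice()
dev.raise_event()
service(dev, racing_event=True)
# Edge view: the latch is clear although one event is still pending.
# Level view: line_level() is still True, so the handler runs again.
print(dev.latched, dev.pending, dev.line_level())
```

In the edge-triggered view the racing event is lost until something
else pokes the device. In the level-triggered view the line is still
asserted, so the handler is simply entered again.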

When this happens (it's common when you're writing the device driver,
but it sometimes happens in the field when something unexpected
occurs), an interrupt storm starts. As soon as the generic handler
returns from interrupt, the hardware reenters the interrupt handler.

Without any countermeasures, the system would lock up without many
debugging options. Nowadays (for the last two decades or so), the Linux
kernel can disable the interrupt, print an error message and try to
continue. It won't work if other important interrupts for the system
are on the same IRQ line, but often enough, you just get a message
that an interrupt was disabled and that one peripheral will stop
working. That is a good opportunity for debugging the situation.
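
The countermeasure boils down to counting how many interrupts in a
window went unhandled. The sketch below shows the idea only: the class
name, the window size and the threshold are made up and are not the
kernel's actual values or code.

```python
class StormDetector:
    """Disable an interrupt line when almost every interrupt goes unhandled.

    window and max_unhandled are illustrative numbers, not the kernel's.
    """

    def __init__(self, window=1000, max_unhandled=990):
        self.window = window
        self.max_unhandled = max_unhandled
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled):
        """Record one interrupt; 'handled' says whether any driver claimed it."""
        if self.disabled:
            return
        self.count += 1
        if not handled:
            self.unhandled += 1
        if self.count < self.window:
            return
        if self.unhandled > self.max_unhandled:
            self.disabled = True
            print("irq: nobody cared, disabling the line")
        # Start a fresh window either way.
        self.count = 0
        self.unhandled = 0
```

A line shared with a working device mostly sees handled interrupts and
stays enabled. A stuck device that no driver claims trips the threshold
within one window.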

Roger. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: [PATCH] dmaengine: bcm2835: Add slave dma support

2015-04-16 Thread Rogier Wolff

On Wed, Apr 15, 2015 at 08:53:07PM +0200, Noralf Trønnes wrote:

> A 16-bit register can't hold a value of 65536.
> Either the max value is 65535 or the register is 17-bits wide.

It is common for hardware registers to have the value "0" mean 65536
in case of a 16-bit register.

The hardware would then FIRST decrement the register and THEN check
for zero. This results in the behaviour that "1" requires one cycle to
complete, "10" requires ten cycles, and "0" means the same as the
total number of bitpatterns possible in the register. (256 for an
8-bit register, 65536 for a 16-bit register).

Another way to implement such a register in hardware would "check for
zero" first, and not do anything if the register equals zero. This
results in different behaviour for the "0" value.
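
The two hardware behaviours can be modelled in a few lines. This is a
sketch of a generic down-counter, not of the bcm2835 DMA engine; the
function names are made up.

```python
def cycles_decrement_first(count, width=16):
    """Hardware that decrements FIRST and THEN checks for zero."""
    mask = (1 << width) - 1
    value = count & mask
    cycles = 0
    while True:
        value = (value - 1) & mask   # 0 wraps around to all-ones
        cycles += 1
        if value == 0:
            return cycles


def cycles_check_first(count, width=16):
    """Hardware that checks for zero BEFORE doing any work."""
    mask = (1 << width) - 1
    value = count & mask
    cycles = 0
    while value != 0:
        value = (value - 1) & mask
        cycles += 1
    return cycles


print(cycles_decrement_first(0))   # full wrap of the 16-bit counter
print(cycles_check_first(0))       # nothing happens at all
```

For every nonzero value the two variants agree. Only the "0" value
tells them apart, which is why it needs testing on the real hardware.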

That said: IMHO, the overhead of setting up 2 transfers for each 64k
block as opposed to only one results in such a small performance
penalty that I'd prefer to play it safe unless you're very sure you
can adequately test it. (Another option would be to set the maximum
transfer size to 0xf000: 60kbytes. That costs less than 10% extra
transfers in the long run compared to aiming for the edge...)
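
To put a number on that overhead, here is a sketch that splits a
buffer into chunks no larger than a given maximum. split_transfer is a
made-up helper for illustration, not code from the driver.

```python
def split_transfer(total_len, max_len):
    """Split total_len bytes into (offset, length) chunks of at most max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    chunks = []
    offset = 0
    while offset < total_len:
        length = min(max_len, total_len - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


total = 15 * 0x10000                       # fifteen 64k blocks
edge = split_transfer(total, 0x10000)      # aiming for the edge
safe = split_transfer(total, 0xF000)       # playing it safe
print(len(edge), len(safe))
```

For fifteen 64k blocks that is 15 transfers against 16, so roughly 7%
more transfers with the 0xf000 limit.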

Roger. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Buffer cache not working.

2014-05-14 Thread Rogier Wolff


Hi Guys, 

I have a file-server that has plenty of memory to cache most of the
things that I use. But after I've upgraded the machine it seems to
expire things from the cache even when there is zero memory pressure.

The case that irritates me most is my mailbox. Ok, I should clean it
out, but also my linux-kernel mailbox has 20k messages (= 1 month).

When I open such a mailbox in mutt, it takes 2 minutes for it to read
all the messages. If I quit and restart, it takes less than a second. 
I've resorted to a workaround: every 30 seconds: head Mailbox/cur/* . 
That worked. 
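
For anyone who wants to reproduce the workaround without a shell loop,
here is a rough equivalent. It is a sketch: touch_files is a made-up
name, and it reads the first kilobyte of each file where head reads the
first ten lines.

```python
import os


def touch_files(directory, nbytes=1024):
    """Read the first nbytes of every regular file in directory.

    Returns the number of files read. Reading is what keeps the pages
    in the cache; the data itself is thrown away.
    """
    count = 0
    for entry in os.scandir(directory):
        if entry.is_file():
            with open(entry.path, "rb") as f:
                f.read(nbytes)
            count += 1
    return count
```

Calling touch_files("Mailbox/cur") every 30 seconds from a loop or a
timer mimics what I did with head.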

Oh, before a recent upgrade all this worked as I expected: Once a day,
after the backup had run, scanning the mailbox takes a while and then
the rest of the day things are quick. 

Then I gathered: Other people must have the same problem. It is
probably a "tuning" setting which works for desktop machines, but not
for my server. So I've googled a few "tuning your Linux system" pages
and all I could think of that might relate to my problem is
"swappiness".

I've decreased the value from 100 (in a few steps) to 5 now, and this
seems to help somewhat. But still after a few minutes, the buffer
cache drops my maildir, and starting mutt takes a long time again.

All this with the machine reporting 11Gb or more of RAM "Free"
(empty). 

With the default swappiness, I've seen that within a minute some
of the maildir already gets ejected from the cache, while accessing all
files every 30 seconds seems to keep them cached.

Kernel: from Ubuntu 14.04, 3.13.0-19-generic
32-bit userspace
16Gb RAM installed. 
Disks: 3x 3T in RAID5

ganglia shows the server memory graph as follows: 
http://prive.bitwizard.nl/server_mem.png

All other computers have a "nice" filled graph there.

So, is there a bug in my kernel or am I missing an important tuning
parameter?


Roger. 

-- 
** r.e.wo...@bitwizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
The plan was simple, like my brother-in-law Phil. But unlike
Phil, this plan just might work.

Re: [PATCH -v5 2/2] Updating ctime and mtime at syncing

2008-01-17 Thread Rogier Wolff

On Thu, Jan 17, 2008 at 04:16:47PM +0300, Anton Salikhmetov wrote:
> 2008/1/17, Miklos Szeredi <[EMAIL PROTECTED]>:
> > > > 4. Recording the time was the file data changed
> > > >
> > > > Finally, I noticed yet another issue with the previous version of my
> > > > patch.
> > > > Specifically, the time stamps were set to the current time of the moment
> > > > when syncing but not the write reference was being done. This led to the
> > > > following adverse effect on my development system:
> > > >
> > > > 1) a text file A was updated by process B;
> > > > 2) process B exits without calling any of the *sync() functions;
> > > > 3) vi editor opens the file A;
> > > > 4) file data synced, file times updated;
> > > > 5) vi is confused by "thinking" that the file was changed after 3).
> >
> > Updating the time in remove_vma() would fix this, no?
> 
> We need to save modification time. Otherwise, updating time stamps
> will be confusing the vi editor.

If process B exits before vi opens the file, the timestamp should at
the latest be the time that process B exits. There is no excuse for
setting the timestamp later than the time that B exits.

If process B no longer modifies the file, but still keeps it mapped
until after vi starts, then the system can't help the
situation. Whether or not B accesses those pages is unknown to the
system. So you get what you deserve.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH -v5 2/2] Updating ctime and mtime at syncing

2008-01-17 Thread Rogier Wolff

On Thu, Jan 17, 2008 at 01:45:43PM +0100, Miklos Szeredi wrote:
> > > 4. Recording the time was the file data changed
> > >
> > > Finally, I noticed yet another issue with the previous version of my 
> > > patch.
> > > Specifically, the time stamps were set to the current time of the moment
> > > when syncing but not the write reference was being done. This led to the
> > > following adverse effect on my development system:
> > >
> > > 1) a text file A was updated by process B;
> > > 2) process B exits without calling any of the *sync() functions;
> > > 3) vi editor opens the file A;
> > > 4) file data synced, file times updated;
> > > 5) vi is confused by "thinking" that the file was changed after 3).
> 
> Updating the time in remove_vma() would fix this, no?

That sounds to me like the right thing to do. Although not explicitly
mentioned in the standard, it is the logical (latest allowable)
timestamp to put on the modifications by process B. 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: mtime updates for mmapped files.

2008-01-16 Thread Rogier Wolff

On Wed, Jan 16, 2008 at 02:22:49PM +0300, Anton Salikhmetov wrote:
> Unfortunately, this issue has not been fully fixed yet.
> 
> My last attempt (http://lkml.org/lkml/2008/1/15/202) to solve
> this problem has a couple of drawbacks:
> 
> 1) calling a possibly sleeping function from atomic context -
> I've already corrected this;
> 
> 2) there's a very special case with retouching the memory-mapped data
> after a call to msync() with MS_ASYNC.
 
> I'm still working on the latter case, but I guess that I have found
> a solution.
 
> If you badly need a quick fix, I can send my working unreleased
> patch to you.  I reckon that your particular problem will be fixed
> by this patch.  Let me know if you want that.

No need. Thanks for the offer. 

> However, if your application calls msync() with the MS_ASYNC flag,
> it's better to wait a little bit more - I'll release the next version
> of my solution
> shortly.

My application calls "exit" (*). My workaround was to type

touch <destfile>

to get things to work. In my situation, I was worried about the more
general case, and not really about my personal situation.

I could integrate the "touch <destfile>" into my application, but I
think it is the OS's duty to do this for me. This would be the easiest
non-manual fix for me.

Roger.

(*) So please check that: 
- mmap file
- change mmapped area
- exit
also works (modifies the timestamp). 
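
The check above can be sketched as a small C helper. This is only a
sketch, not code from the thread: the name mmap_touch_check is made up
here, and it calls munmap() instead of really exiting, so that a single
process can read the mtime before and after the change. Whether the two
values differ depends on the kernel version and on the timestamp
granularity of the filesystem.

```c
#define _POSIX_C_SOURCE 200809L   /* for st_mtim in struct stat */
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `path` shared and read-write, flip its first
 * byte through the mapping, and unmap again without calling msync().
 * The mtime seen before and after is stored in *before and *after.
 * Returns 0 on success, -1 on any failure (including an empty file).
 */
int mmap_touch_check(const char *path, struct timespec *before,
                     struct timespec *after)
{
    struct stat st;
    unsigned char *p;
    size_t len;
    int fd;

    assert(path && before && after);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    *before = st.st_mtim;
    len = (size_t)st.st_size;

    /* "mmap file" */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* "change mmapped area" */
    p[0] ^= 0xff;

    /* Stand-in for "exit": process exit would drop the mapping too. */
    if (munmap(p, len) < 0 || fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    *after = st.st_mtim;
    close(fd);
    return 0;
}
```

Comparing *before and *after (or running stat on the file from another
shell) shows whether the kernel noticed the modification.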

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

mtime updates for mmapped files.

2008-01-16 Thread Rogier Wolff


Hi,

I wrote a small app yesterday that updates a file by mmapping the 
file (RW), changing the thing around, and then exiting. 

This did not trigger a change in the mtime of the file. Thus rsync
didn't pick up that the file had changed.

I understand that tracking every change to a RW mmapped file is
costly, and thus infeasible, but shouldn't the close then cause an
mtime update?

The server where this happened is running 2.6.21, so my apologies if
this has already been corrected.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: OOM killer gripe (was Re: What still uses the block layer?)

2007-10-19 Thread Rogier Wolff

On Fri, Oct 19, 2007 at 01:49:31AM -0500, Rob Landley wrote:
> On Thursday 18 October 2007 8:00:49 am Rogier Wolff wrote:
> > So... IMHO, it would be useful to implement something that pages out
> > chunks of memory larger than a single hardware page. This would reduce
> > the size of the memory management tables (*), as well as improve disk
> > throughput if things DO come to paging
> 
> I believe that was more or less the topic of this paper:
>   http://kernel.org/doc/ols/2006/ols2006v2-pages-73-78.pdf

Not really. They are talking about doing this for the page
cache. That's where filesystem files are cached in memory. I'm talking
about the memory that programs use while they are running.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: OOM killer gripe (was Re: What still uses the block layer?)

2007-10-18 Thread Rogier Wolff

On Tue, Oct 16, 2007 at 05:34:15PM +1000, Nick Piggin wrote:
> > It's a hard call.  The I/O time for 1MB of contiguous disk data
> > is about the I/O time of 512 bytes of contiguous disk data.
> 
> And if you're thrashing, then by definition you need to throw
> out 1MB of your working set in order to read it in.

Right. But you need a differential hit rate of only a few percent on
that 1020 extra kb of data you swapped in versus the 1Mb of data you
swapped out for this to be advantageous.

By "differential hit rate" I mean the chances of getting a hit on
the 1Mb of data just paged in, minus the chances of getting a hit on
the 1Mb of data just paged out. 

With a little luck that 1Mb that is paged out didn't get used for
quite a while, while there is a hint that the 1Mb you're paging in
is active, as one of its sub-pages just got a hit.
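
To put a number on this, here is a minimal sketch of the arithmetic in
C. The model is deliberately crude and is my own assumption, not
something measured: a cluster of `pages` hardware pages is swapped in
where only one page was needed, the same number of pages is swapped
out, and every extra page has the same chance of being hit. The
function name is invented for illustration.

```c
#include <assert.h>

/*
 * Expected number of page faults avoided by swapping in a whole cluster
 * instead of a single page, under a simple model:
 *   pages - number of hardware pages in the cluster (one was needed anyway)
 *   p_in  - chance that an extra paged-in page gets used before eviction
 *   p_out - chance that an extra paged-out page is needed again
 * The difference p_in - p_out is the "differential hit rate".
 */
double net_faults_avoided(unsigned pages, double p_in, double p_out)
{
    assert(pages >= 1);
    return (double)(pages - 1) * (p_in - p_out);
}
```

With 4k hardware pages a 1Mb cluster has 256 pages, so 255 extra ones
(the 1020 kb). A differential hit rate of 2% then works out to about 5
avoided faults per cluster. If the cluster transfer really costs about
the same as a single-page transfer, anything above zero is a gain.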

So... IMHO, it would be useful to implement something that pages out
chunks of memory larger than a single hardware page. This would reduce
the size of the memory management tables (*), as well as improve disk
throughput if things DO come to paging.

This should of course be configurable. Some workloads are better off
with a virtual page size of 8k, some with 128k, some with 1M.

As far as I can see, the "page-cluster" parameter defines how many
pages are selected for page-out at a time. This increases
the page-out efficiency. Improving the page-in efficiency is also
useful: it is the other half of the equation.

Roger. 

(*) If the kernel starts working with a 1Mb virtual page size, you
need a 256 times smaller mapping table between processes and memory or
swap. Of course, the hardware doesn't support this (actually, it does
for 1Mb virtual pages), so you'll have to create 256 page table
entries for the hardware instead of just one.
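
The factor of 256 is just the ratio of the two page sizes, assuming a
4k hardware page. A small sketch in C, with a made-up function name,
that counts the mapping entries needed to cover a given amount of
memory:

```c
#include <assert.h>
#include <stddef.h>

/* Number of mapping entries needed to cover `bytes` with pages of `page_size`. */
size_t mapping_entries(size_t bytes, size_t page_size)
{
    assert(page_size > 0);
    return (bytes + page_size - 1) / page_size;   /* round up */
}
```

For 1Gb of address space this gives 262144 entries with 4k pages and
1024 entries with 1Mb virtual pages, which is the 256 times smaller
table mentioned above.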

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: raid5:md3: read error corrected , followed by , Machine Check Exception: .

2007-07-16 Thread Rogier Wolff

On Sat, Jul 14, 2007 at 07:32:38PM -0700, Mr. James W. Laferriere wrote:
> >So your disk throws a fit
> 
>   Actually it's brand new .  Infant mortallity ?  I at least have a 
>   cold spare available .  So Yes I am replacing that puppy .  I'll drop 
> it 
> into another system & give it the format command & see how much the user 
> bad block table grows .  I'll bet I'll get a table full overflow on it .

Manually keep both tables under supervision. I'd guess that if you
send it a format, it will update the factory table (and move the user
bad block table there). 

But most drives will retry writing to the bad sectors. And with a fully
fresh copy of the data, it will still be readable, and the blocks will be
marked as ready-for-use, because that's better for performance.

> >And at some point at least 18 minutes after the raid incident you log
> >CPU problems.
> 
>   I didn't notice the 18 Minute differance .  Drats .

I'm not sure how this happened, but the disk error messages seem to have
been logged by syslog, and the MCEs seem to have been copied from the 
console: They don't have the date attached?

Roger.
-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH 1/3] Make the IDE DMA timeout modifiable

2007-07-16 Thread Rogier Wolff

On Fri, Jul 13, 2007 at 05:00:18PM -0400, Mark Lord wrote:
> >   Ah, that makes sense -- during PIO interrupts happen a lot more often.
> >20 secs still seem to be too much.
> 
> I don't think so, even for modern drives.
> Figure 8-10 seconds max for spin-up,
> plus 6-9 seconds to do a sector re-assignment
> or retries on a bad block (a measured *real-life* value).
> 
> That adds up to 14-19 seconds, so 20 seconds is probably good.
> 
> Still, this does need to be adjustable for faster (CF) devices,
> and slower (optical/tape) devices, rather than just a single
> set of fixed timeout values.

In real life, with real bad blocks on real harddrives, some harddrives
take more than the DMA TIMEOUT time to read a single block, even without
having to spin up. 

The current code then resets the drive, on which the drive reports
"busy, not ready for command", and things go downhill from there. 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

RAID performance is not too well....

2007-06-29 Thread Rogier Wolff


Hi,

I have an application that creates some 228 thousand files,
spread over about 4000 directories. Total is not more than 
1.3Gb.  (I'm not sure, and I don't care if it's 10% or 90% of
that number)

Anyway, I've loaded all of the 1.3Gb into the cache (the machine
has 8Gb of RAM), so that only writes need to take place. 

After a while the machine goes into a routine of writing
about 500 to 1000kbytes per second. 

Sync seems to take a long time: 

zebigbos:/recover7/bd4256_jense/tree> time sync 
0.004u 0.136s 5:44.66 0.0%  0+0k 0+0io 0pf+0w
zebigbos:/recover7/bd4256_jense/tree> 

The machine normally reads up to about 150 Mbytes per second without
trouble. 

I'm suspecting that the writes to the inodes and files all end
up "fragmented" such that reads to complete the RAID stripes 
need to be performed: 

Iostat shows: 

Device:        tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda          75.25       277.23       126.73        280        128
sdb          91.09       400.00       134.65        404        136
sdc          71.29       253.47        95.05        256         96
sdd         100.99       221.78       304.95        224        308

However, I would say that all those new files should be "clustered" 
such that the chances of writing a full stripe become reasonable. 
Moreover, clustering should, even with reading other parts of the
stripe, result in a performance on the order of 10 to 50 times better. 
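
To make the full-stripe argument concrete, here is a sketch in C that
classifies the stripes touched by one write. It rests on assumptions
that are not stated above: a parity RAID where one disk per stripe
holds parity (so 3 data disks out of the 4 shown by iostat), and the
64k figure taken as the per-disk chunk size. The struct and function
names are invented for illustration.

```c
#include <assert.h>

struct stripe_span {
    unsigned long long full;     /* stripes completely covered by the write */
    unsigned long long partial;  /* stripes only partly covered (need reads) */
};

/*
 * Classify the stripes touched by a write of `len` bytes at byte `offset`.
 * `chunk` is the per-disk chunk size and `data_disks` the number of
 * non-parity disks, so one stripe holds chunk * data_disks bytes of data.
 */
struct stripe_span classify_write(unsigned long long offset,
                                  unsigned long long len,
                                  unsigned long long chunk,
                                  unsigned data_disks)
{
    struct stripe_span s = {0, 0};
    unsigned long long stripe = chunk * data_disks;
    unsigned long long end, first, last, first_full, end_full;

    assert(stripe > 0);
    if (len == 0)
        return s;

    end = offset + len;                           /* one past the last byte */
    first = offset / stripe;                      /* first stripe touched */
    last = (end - 1) / stripe;                    /* last stripe touched */
    first_full = (offset + stripe - 1) / stripe;  /* first stripe starting at or after offset */
    end_full = end / stripe;                      /* stripes [first_full, end_full) are fully covered */

    if (end_full > first_full)
        s.full = end_full - first_full;
    s.partial = (last - first + 1) - s.full;
    return s;
}
```

Under these assumptions a lone 4k write always lands in a partial
stripe, while a contiguous 1Mb write starting 64k into the array covers
4 full stripes and only 2 partial ones. That is the effect clustering
is supposed to give.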

Raid block (stripe) size is 64k.  (Next time I format a partition, 
I will choose 512k, causing the read performance to increase from 150Mb
per second to about 200Mb per second). 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Nbd problem now oopses.

2007-05-13 Thread Rogier Wolff


Hi,


After turning on the debugging for allocations and locks, I 
now get a kernel oops. 


[ 5628.608000] BUG: unable to handle kernel NULL pointer dereference at virtual 
address 
[ 5628.608000]  printing eip:
[ 5628.608000] c0293210
[ 5628.608000] *pde = 
[ 5628.608000] Oops: 0002 [#1]
[ 5628.608000] Modules linked in: nbd
[ 5628.608000] CPU:0
[ 5628.608000] EIP:0060:[]Not tainted VLI
[ 5628.608000] EFLAGS: 00010246   (2.6.21 #8)
[ 5628.608000] EIP is at tcp_sendmsg+0x726/0xab3
[ 5628.608000] eax:    ebx: c24576b8   ecx:    edx: 
[ 5628.608000] esi: c30a006c   edi: 0840   ebp:    esp: c3f8fc5c
[ 5628.608000] ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
[ 5628.608000] Process kblockd/0 (pid: 34, ti=c3f8e000 task=c3f89550 
task.ti=c3f8e000)
[ 5628.608000] Stack: 0002 012b 0001 0046 000a c011ad49 
 c100dea0 
[ 5628.608000]0001   c2b2c7c0 0840 07c0 
baa8 05a8 
[ 5628.608000]c000  c3f8fdf8 c3f8e000 1000 7fff 
c03a3820 c30a006c 
[ 5628.608000] Call Trace:
[ 5628.608000]  [] __do_softirq+0x35/0x73
[ 5628.608000]  [] inet_sendmsg+0x39/0x43
[ 5628.608000]  [] sock_sendmsg+0xbc/0xd4
...
[ 5628.608000] Code: d2 89 d5 74 26 83 be 80 01 00 00 00 0f 85 7b 03 00 00 c7 
86 88 01 00 00 00 00 00 00 8b 5c 24 1c 89 9e 80 01 00 00 e9 62 03 00 00 [ 
5628.608000] EIP: [] tcp_sendmsg+0x726/0xab3 SS:ESP 0068:c3f8fc5c


which seems to be: 

0xc02931f1 :  jne    0xc0293572 
0xc02931f7 :  movl   $0x0,0x188(%esi)
0xc0293201 :  mov    0x1c(%esp),%ebx
0xc0293205 :  mov    %ebx,0x180(%esi)
0xc029320b :  jmp    0xc0293572 

EIP points here: 
0xc0293210 :  cmpl   $0x0,0x24(%esp)
0xc0293215 :  mov    0x98(%ebx),%edx
0xc029321b :  je     0xc0293232 
0xc029321d :  mov    0x20(%esp),%ecx
0xc0293221 :  movzwl 0x12(%edx,%ecx,8),%eax
0xc0293226 :  add    %edi,%eax
0xc0293228 :  mov    %ax,0x12(%edx,%ecx,8)
0xc029322d :  jmp    0xc02932b4 


which is 

790 if (err) {
791 /* If this page was new, give it to 
the
792  * socket so it does not get leaked.
793  */
794 if (!TCP_PAGE(sk)) {
795 TCP_PAGE(sk) = page;
796 TCP_OFF(sk) = 0;
797 }
798 goto do_error;
799 }
800 
801 /* Update the skb. */

EIP Points here. 
802 if (merge) {
803 skb_shinfo(skb)->frags[i - 1].size 
+=
804 
copy;


and now the question is: How can the 
cmpl   $0x0,0x24(%esp)
trap at address 0? 

How can "if (merge)" cause a segmentation fault?

If EIP is a bit off, it could be a line erarlier or further. So, could it
crash on the jmp tcp_sendmsg+2696? I dont' thinks so. 

How about "mov0x98(%ebx),%edx"? If ebx is invalid, this should 
crash. (ebx apparently holds skb if I understand things correctly). 
But from the dump, ebx holds c24576b8, and if that's invalid it would
not say 
  BUG: unable to handle kernel NULL pointer dereference at virtual address 

right?

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Nbd problem now oopses.

2007-05-13 Thread Rogier Wolff


Hi,


After turning on the debugging for allocations and locks, I 
now get a kernel oops. 


[ 5628.608000] BUG: unable to handle kernel NULL pointer dereference at virtual address
[ 5628.608000]  printing eip:
[ 5628.608000] c0293210
[ 5628.608000] *pde =
[ 5628.608000] Oops: 0002 [#1]
[ 5628.608000] Modules linked in: nbd
[ 5628.608000] CPU:    0
[ 5628.608000] EIP:    0060:[<c0293210>]    Not tainted VLI
[ 5628.608000] EFLAGS: 00010246   (2.6.21 #8)
[ 5628.608000] EIP is at tcp_sendmsg+0x726/0xab3
[ 5628.608000] eax:    ebx: c24576b8   ecx:    edx:
[ 5628.608000] esi: c30a006c   edi: 0840   ebp:    esp: c3f8fc5c
[ 5628.608000] ds: 007b   es: 007b   fs: 00d8  gs:   ss: 0068
[ 5628.608000] Process kblockd/0 (pid: 34, ti=c3f8e000 task=c3f89550 task.ti=c3f8e000)
[ 5628.608000] Stack: 0002 012b 0001 0046 000a c011ad49  c100dea0
[ 5628.608000]        0001   c2b2c7c0 0840 07c0 baa8 05a8
[ 5628.608000]        c000  c3f8fdf8 c3f8e000 1000 7fff c03a3820 c30a006c
[ 5628.608000] Call Trace:
[ 5628.608000]  [<c011ad49>] __do_softirq+0x35/0x73
[ 5628.608000]  [<c02ab61c>] inet_sendmsg+0x39/0x43
[ 5628.608000]  [<c026c744>] sock_sendmsg+0xbc/0xd4
...
[ 5628.608000] Code: d2 89 d5 74 26 83 be 80 01 00 00 00 0f 85 7b 03 00 00 c7 86 88 01 00 00 00 00 00 00 8b 5c 24 1c 89 9e 80 01 00 00 e9 62 03 00 00
[ 5628.608000] EIP: [<c0293210>] tcp_sendmsg+0x726/0xab3 SS:ESP 0068:c3f8fc5c


which seems to be: 

0xc02931f1 <tcp_sendmsg+1799>:  jne    0xc0293572 <tcp_sendmsg+2696>
0xc02931f7 <tcp_sendmsg+1805>:  movl   $0x0,0x188(%esi)
0xc0293201 <tcp_sendmsg+1815>:  mov    0x1c(%esp),%ebx
0xc0293205 <tcp_sendmsg+1819>:  mov    %ebx,0x180(%esi)
0xc029320b <tcp_sendmsg+1825>:  jmp    0xc0293572 <tcp_sendmsg+2696>

EIP points here: 
0xc0293210 <tcp_sendmsg+1830>:  cmpl   $0x0,0x24(%esp)
0xc0293215 <tcp_sendmsg+1835>:  mov    0x98(%ebx),%edx
0xc029321b <tcp_sendmsg+1841>:  je     0xc0293232 <tcp_sendmsg+1864>
0xc029321d <tcp_sendmsg+1843>:  mov    0x20(%esp),%ecx
0xc0293221 <tcp_sendmsg+1847>:  movzwl 0x12(%edx,%ecx,8),%eax
0xc0293226 <tcp_sendmsg+1852>:  add    %edi,%eax
0xc0293228 <tcp_sendmsg+1854>:  mov    %ax,0x12(%edx,%ecx,8)
0xc029322d <tcp_sendmsg+1859>:  jmp    0xc02932b4 <tcp_sendmsg+1994>


which is 

        if (err) {
                /* If this page was new, give it to the
                 * socket so it does not get leaked.
                 */
                if (!TCP_PAGE(sk)) {
                        TCP_PAGE(sk) = page;
                        TCP_OFF(sk) = 0;
                }
                goto do_error;
        }

        /* Update the skb. */

EIP points here. 
        if (merge) {
                skb_shinfo(skb)->frags[i - 1].size +=
                                                copy;


and now the question is: How can the 
        cmpl   $0x0,0x24(%esp)
trap at address 0? 

How can "if (merge)" cause a segmentation fault?

If EIP is a bit off, it could be a line earlier or later. So, could it
crash on the jmp tcp_sendmsg+2696? I don't think so. 

How about "mov 0x98(%ebx),%edx"? If ebx is invalid, this should 
crash. (ebx apparently holds skb, if I understand things correctly.) 
But from the dump, ebx holds c24576b8, and if that's invalid it would
not say 
  BUG: unable to handle kernel NULL pointer dereference at virtual address 

right?
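
Two details in the oops header can be decoded mechanically, which may help when comparing it with the disassembly. The sketch below is a small Python helper, not anything from the kernel tree, and its function names are made up for illustration. It converts the `tcp_sendmsg+0x726` offset into the decimal form gdb prints, and decodes the `Oops: 0002` value using the i386 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode).

```python
def symbol_offset_decimal(location):
    """Split 'tcp_sendmsg+0x726/0xab3' into ('tcp_sendmsg', 1830)."""
    symbol, rest = location.split("+", 1)
    offset_hex = rest.split("/", 1)[0]
    return symbol, int(offset_hex, 16)


def decode_page_fault_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


print(symbol_offset_decimal("tcp_sendmsg+0x726/0xab3"))
print(decode_page_fault_code(0x0002))
```

Read this way, 0x726 is the `tcp_sendmsg+1830` shown by gdb, and 0002 describes a kernel-mode write to a page that is not present. Neither the cmpl nor the mov load that follows it writes to memory, which may be one more hint that the reported EIP and the faulting access do not line up.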

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: nbd problem.

2007-05-09 Thread Rogier Wolff

On Wed, May 09, 2007 at 01:10:49PM +0200, Rogier Wolff wrote:
> On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> > On 5/8/07, Rogier Wolff <[EMAIL PROTECTED]> wrote:
> > >
> > >Hi,
> > >
> > >The nbd client still reliably hangs when I use it.
> 
> Someone suggested to use 
> 
> http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary
> 
> and that fixed it.  (i.e. there is something in there that should
> be merged)

Cancel the party! It got MUCH further than before, but crashed
eventually. 

ozon:~> ps auxww | grep D
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root       110  0.4  0.0     0    0 ?        D    10:28   0:31 [pdflush]
root       112  0.0  0.0     0    0 ?        D<   10:28   0:05 [kswapd0]
root      1649  0.0  0.1  1604  108 pts/0    D+   11:17   0:03 nbd-client petisuix 1234 /dev/nd0
root      1654  0.9  4.5  4648 2816 pts/0    D+   11:17   0:44 rsync /usr/src/linux-2.6.21.ozon /mnt/test1 -av --progress
wolff     1716  0.0  0.9  1648  560 pts/1    R+   12:33   0:00 grep D
ozon:~> 

Can anybody help me figure out what these processes are waiting for?
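
One way to see where a D-state task is sleeping is the kernel's wait-channel name. The following is a minimal sketch, assuming a Linux /proc that provides `/proc/<pid>/stat` and `/proc/<pid>/wchan` (the latter needs kallsyms support and may only show 0 on some newer kernels). The function names are illustrative.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from a /proc/<pid>/stat line.

    comm is wrapped in parentheses and may itself contain spaces or ')',
    so split on the last ')' rather than on whitespace.
    """
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def blocked_tasks(proc="/proc"):
    """List (pid, comm, wchan) for tasks in uninterruptible sleep (state D)."""
    try:
        entries = os.listdir(proc)
    except OSError:
        return []  # no procfs available at this path
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
            if state != "D":
                continue
            with open(os.path.join(proc, entry, "wchan")) as f:
                wchan = f.read().strip() or "?"
        except OSError:
            continue  # task exited, or the file is not readable
        found.append((pid, comm, wchan))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(pid, comm, wchan)
```

A heavier alternative is SysRq-t (`echo t > /proc/sysrq-trigger`), which dumps a kernel stack for every task.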

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: nbd problem.

2007-05-09 Thread Rogier Wolff

On Tue, May 08, 2007 at 01:33:52PM -0700, Satyam Sharma wrote:
> On 5/8/07, Rogier Wolff <[EMAIL PROTECTED]> wrote:
> >
> >Hi,
> >
> >The nbd client still reliably hangs when I use it.

Someone suggested using 

http://git.kernel.org/?p=linux/kernel/git/axboe/linux-2.6-block.git;a=summary

and that fixed it.  (i.e. there is something in there that should
be merged)

Jens, thanks for pointing out that there were different locks 
involved.

Roger. 

(I seem to have lost all other emails in this thread. Apparently
my delete-old-list-emails is too aggressive today...)

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

nbd problem.

2007-05-08 Thread Rogier Wolff


Hi,

The nbd client still reliably hangs when I use it. 

While looking into this, I found:


        req->errors = 0;
        spin_unlock_irq(q->queue_lock);

        mutex_lock(&lo->tx_lock);
        if (unlikely(!lo->sock)) {
                mutex_unlock(&lo->tx_lock);
                printk(KERN_ERR "%s: Attempted send on closed socket\n",
                       lo->disk->disk_name);
                req->errors++;
                nbd_end_request(req);
                spin_lock_irq(q->queue_lock);
                continue;
        }

        lo->active_req = req;

        if (nbd_send_req(lo, req) != 0) {
                printk(KERN_ERR "%s: Request send failed\n",
                                lo->disk->disk_name);
                req->errors++;
                nbd_end_request(req);
        } else {
                spin_lock(&lo->queue_lock);
                ^^^^^^^^^
                list_add(&req->queuelist, &lo->queue_head);
                spin_unlock(&lo->queue_lock);
        }

        lo->active_req = NULL;


As far as I read things, the function is called with the lock
held and interrupts disabled. The lock can then be released and 
retaken without disabling interrupts again. 

Should this be fixed?

(it doesn't fix my hang though)

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233**
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues

2007-04-29 Thread Rogier Wolff

On Tue, Apr 17, 2007 at 10:37:38PM -0700, Andrew Morton wrote:
> Florin, can we please see /proc/meminfo as well?
> 
> Also the result of `echo m > /proc/sysrq-trigger'

Hi,

It's been a while since this thread died out, but maybe I'm 
having the same problem: networking is involved, and a large part 
of memory is buffering writes. 

In my case I'm using NBD. 

Oh, 

/sys/block/nbd0/stat gives:
 636   88 5353 1700  99119554   16227263156   43  1452000 61802352
I put some debugging stuff in nbd, and it DOES NOT KNOW about the
43 requests that the io scheduler claims are in flight at the
driver. 
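
For reference, the in-flight request count is the ninth field of the block-device stat file, which is where the 43 sits in the output above. Below is a minimal sketch of naming the fields, assuming the 11-field layout described in the kernel's Documentation/block/stat.txt (newer kernels append further fields, which this ignores). The field names are shorthand and the sample line is invented, not the output above.

```python
# Shorthand names for the 11 fields, in file order.
STAT_FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks_ms",
    "write_ios", "write_merges", "write_sectors", "write_ticks_ms",
    "in_flight", "io_ticks_ms", "time_in_queue_ms",
)


def parse_block_stat(text):
    """Map the whitespace-separated counters of a stat file to field names."""
    values = [int(v) for v in text.split()]
    if len(values) < len(STAT_FIELDS):
        raise ValueError("expected at least %d fields" % len(STAT_FIELDS))
    # zip() stops at the shorter sequence, so extra trailing fields are dropped.
    return dict(zip(STAT_FIELDS, values))


# Made-up sample line for illustration only.
sample = "10 2 80 50 20 4 160 90 3 1000 5000"
stats = parse_block_stat(sample)
print("requests in flight:", stats["in_flight"])
```

On a live system the input would come from something like `open("/sys/block/nbd0/stat").read()`.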

Those requests start a couple of seconds AFTER the whole thing
grinds to a halt.

I switched from crashing my 512Mb-ram-workstation to my development
machine, which has only 64M of RAM. (I got the development machine
back up and running after some effort). 

My rsync (and also "sync" if I call it, or reboot without -n) also
gets stuck in D state: 

<4>[  622.364000] rsync D 0019C170 0  2456   2455 (NOTLB)
<4>[  622.364000]c04d7c80 0086 c1f61ba8 0019c170  0008 
c31048a0 0003382e 
<4>[  622.364000] c24c9908 0286 c1092740 c3e12590 c2330a50 
c2330b5c 00061a80 
<4>[  622.364000]8639a400 0078 c04d7cd0  c10801b8 c04d7c88 
c03082af c0176c48 
<4>[  622.364000] Call Trace:
<4>[  622.364000]  [] io_schedule+0xe/0x16
<4>[  622.364000]  [] sync_buffer+0x0/0x2e
<4>[  622.364000]  [] sync_buffer+0x2b/0x2e
<4>[  622.364000]  [] __wait_on_bit+0x2c/0x51
<4>[  622.364000]  [] sync_buffer+0x0/0x2e
<4>[  622.364000]  [] out_of_line_wait_on_bit+0x73/0x7b
<4>[  622.364000]  [] wake_bit_function+0x0/0x3c
<4>[  622.364000]  [] wake_bit_function+0x0/0x3c
<4>[  622.364000]  [] __wait_on_buffer+0x22/0x25
<4>[  622.364000]  [] ext3_find_entry+0x1aa/0x36f
<4>[  622.364000]  [] journal_dirty_metadata+0x1b6/0x1d3
<4>[  622.364000]  [] ext3_lookup+0x28/0xc6
<4>[  622.364000]  [] real_lookup+0x53/0xc2
<4>[  622.364000]  [] do_lookup+0x57/0x9d
<4>[  622.364000]  [] __link_path_walk+0x7ae/0xb81
<4>[  622.364000]  [] __do_softirq+0x57/0x83
<4>[  622.364000]  [] link_path_walk+0x3d/0xa0
<4>[  622.364000]  [] sys_lchown+0x3c/0x44
<4>[  622.364000]  [] get_unused_fd+0xa0/0xbc
<4>[  622.364000]  [] do_path_lookup+0x1b7/0x200
<4>[  622.364000]  [] __path_lookup_intent_open+0x42/0x72
<4>[  622.364000]  [] path_lookup_open+0x20/0x25
<4>[  622.364000]  [] open_namei+0x8c/0x532
<4>[  622.364000]  [] sys_fchmodat+0xac/0xb9
<4>[  622.364000]  [] do_filp_open+0x25/0x39
<4>[  622.364000]  [] sys_lchown+0x3c/0x44
<4>[  622.364000]  [] get_unused_fd+0xa0/0xbc
<4>[  622.364000]  [] do_sys_open+0x42/0xbe
<4>[  622.364000]  [] sys_open+0x1a/0x1c
<4>[  622.364000]  [] syscall_call+0x7/0xb
<4>[  622.364000]  ===

--
<6>[  871.52] SysRq : Show Memory
<6>[  871.52] Mem-info:
<4>[  871.52] DMA per-cpu:
<4>[  871.52] CPU0: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1 usd:   0
<4>[  871.52] Normal per-cpu:
<4>[  871.52] CPU0: Hot: hi:6, btch:   1 usd:   0   Cold: hi:2, btch:   1 usd:   0
<4>[  871.52] Active:5632 inactive:6764 dirty:0 writeback:302 unstable:0
<4>[  871.52]  free:717 slab:2024 mapped:926 pagetables:135 bounce:0
<4>[  871.52] DMA free:1104kB min:252kB low:312kB high:376kB active:3600kB inactive:6820kB present:16256kB pages_scanned:0 all_unreclaimable? no
<4>[  871.52] lowmem_reserve[]: 0 47
<4>[  871.52] Normal free:1764kB min:760kB low:948kB high:1140kB active:18928kB inactive:20236kB present:48708kB pages_scanned:0 all_unreclaimable? no
<4>[  871.52] lowmem_reserve[]: 0 0
<4>[  871.52] DMA: 118*4kB 19*8kB 2*16kB 2*32kB 0*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1104kB
<4>[  871.52] Normal: 171*4kB 23*8kB 0*16kB 0*32kB 4*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1764kB
<4>[  871.52] Swap cache: add 0, delete 0, find 0/0, race 0+0
<4>[  871.52] Free swap  = 0kB
<4>[  871.52] Total swap = 0kB
<6>[  871.52] Free swap:0kB
<6>[  871.52] 16368 pages of RAM
<6>[  871.52] 0 pages of HIGHMEM
<6>[  871.52] 1044 reserved pages
<6>[  871.52] 13456 pages shared
<6>[  871.52] 0 pages swap cached
<6>[  871.52] 0 pages dirty
<6>[  871.52] 302 pages writeback
<6>[  871.52] 926 pages mapped
<6>[  871.52] 2024 pages slab
<6>[  871.52] 135 pages pagetables

--
ozon:/home/wolff# cat /proc/meminfo 
MemTotal:        61296 kB
MemFree:          2752 kB
Buffers:          2228 kB
Cached:          29968 kB
SwapCached:          0 kB
Active:          22632 kB
Inactive:        27056 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:               0 kB
Writeback:        1208 kB
AnonPages:       17512 kB
Mapped:           3704 kB
Slab:             8088 kB
SReclaimable:     3656 kB
SUnreclaim:       4432 kB
PageTables:        552 kB
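
The two dumps can be cross-checked against each other. The sketch below assumes the 4 kB page size of i386. It re-adds the buddy-allocator free lists from the SysRq output and converts the page counts to kilobytes. The helper name is made up for illustration.

```python
import re

PAGE_KB = 4  # i386 page size in kB (assumption stated above)


def buddy_total_kb(line):
    """Sum the 'count*sizekB' terms, ignoring the '= total' the kernel prints."""
    terms = line.split("=")[0]
    return sum(int(n) * int(size) for n, size in re.findall(r"(\d+)\*(\d+)kB", terms))


dma = ("118*4kB 19*8kB 2*16kB 2*32kB 0*64kB 1*128kB 1*256kB "
       "0*512kB 0*1024kB 0*2048kB 0*4096kB = 1104kB")
normal = ("171*4kB 23*8kB 0*16kB 0*32kB 4*64kB 1*128kB 0*256kB "
          "1*512kB 0*1024kB 0*2048kB 0*4096kB = 1764kB")

free_kb = buddy_total_kb(dma) + buddy_total_kb(normal)
print("free lists:", buddy_total_kb(dma), "+", buddy_total_kb(normal), "kB")
print("free pages:", free_kb // PAGE_KB)   # compare with free:717
print("writeback: ", 302 * PAGE_KB, "kB")  # compare with Writeback in meminfo
```

The per-zone sums agree with the totals the kernel printed, and 302 writeback pages correspond to the 1208 kB reported in /proc/meminfo, so both dumps describe the same stuck writeback.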

nbd hangs in 2.6.21

2007-04-28 Thread Rogier Wolff


Hi,

I've been doing some work with nbd-servers, but it seems they are
a bit unreliable right now. It seems to be the kernel side that
is locking up. 

Doing things like 

dd if=/dev/zero of=filesys bs=1k count=1 seek=1024000
nbd-server 1234 `pwd`/filesys 

and then 

nbd-client othersystem 1234 /dev/nd0 
mke2fs /dev/nd0
mount /dev/nd0 /mnt
cp -r /usr/src/linux/ /mnt/test1
cp -r /usr/src/linux/ /mnt/test2
sync

will usually do the trick: The sync will hang in disk-wait and never
come out of it. 

In my case "othersystem" is running 2.6.20. I don't think it is
causing the problems: the nbd-server is simply waiting for the next
request. I also tried a different codebase: nbdsvr. Same thing. 

Anybody else see this?

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: KLive: Linux Kernel Live Usage Monitor

2005-08-30 Thread Rogier Wolff

On Tue, Aug 30, 2005 at 10:53:13AM +0200, Sven Ladegast wrote:
> >A trick to use would be to send a UDP packet at boot (after 1 minute
> >or so), and then randomly say "once a month" (i.e. about 1/30 chance of
> >sending a packet on the first day) The number of these random packets
> >received is a measure of the number of CPU-months that the kernel
> >runs.
> 
> This could be a solution but like you know UDP packets may or may not 
> arrive at the destination address. So the packet loss with this method could 
> be very high, especially if you send only one packet. Using a 
> TCP-connection for this is a lot more stable and the payload can be 
> encrypted too.

The "load" that a UDP packet puts on a system is much lower than
that of a TCP connection. The fact that UDP packets sometimes get lost
is not much of an issue: those packets simply wouldn't get logged.
So what?

In 90% (my guess; 90% of statistics are made up) of the cases 
where the first packet doesn't reach the destination, any subsequent
packets wouldn't either. So if delivery is as unimportant as it is here,
why bother with the extra overhead of a TCP connection?
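
To make the cost argument concrete: a fire-and-forget UDP report is a single datagram, with no handshake and no connection state to keep. The following is a userspace sketch, not the in-kernel module discussed here. The collector address, the `klive` prefix and the payload layout are all hypothetical.

```python
import platform
import socket


def send_report(host, port, payload=None):
    """Send one datagram and return the number of bytes sent; no reply is awaited."""
    if payload is None:
        # Hypothetical payload: a tag plus the running kernel release.
        payload = ("klive " + platform.release()).encode("ascii", "replace")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Demo against a local receiver so the example is self-contained.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        send_report("127.0.0.1", receiver.getsockname()[1])
        data, _ = receiver.recvfrom(1024)
        print(data.decode("ascii"))
```

If the datagram is lost, the sender never notices and nothing is retried, which is exactly the trade-off argued for above.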

The "in kernel module" that might send this, could put some easily
gathered information into the packet. The goal of logging kernels-
that-get-run would then be met. Installing a userspace program is
something that most testers won't be bothered to do.

A kernel option with clear documentation of exactly what info is logged
would IMHO work better. (A userspace program is technically the better
solution; the social aspect of getting a bigger user base is the main
reason for me to suggest the in-kernel approach.)

(The people who go upgrading kernels tend to be different people from
those who go installing programs for fun.)

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: KLive: Linux Kernel Live Usage Monitor

2005-08-30 Thread Rogier Wolff

On Tue, Aug 30, 2005 at 10:01:21AM +0200, Sven Ladegast wrote:
> The idea isn't bad but lots of people could think that this is some kind 
> of home-phoning or spy software. I guess lots of people would turn this 
> feature off...and of course you can't enable it by default. But combined 
> with an automatic oops/panic/bug-report this would be _very_ useful I think.

It IS some "home phoning" and "spy software". However, when the 
goal is to sign you up for more direct marketing, people tend to 
object. When the goal is to keep track of running kernels, I'm
hopeful that people will recognise that this is different. 

A trick to use would be to send a UDP packet at boot (after 1 minute 
or so), and then randomly, say, "once a month" (i.e. about a 1/30 chance 
of sending a packet on the first day). The number of these random packets
received is a measure of the number of CPU-months that the kernel
runs. 
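
The estimate works because a 1/30 daily chance averages out to one packet per machine per 30 days, so the packet count approximates CPU-months. The simulation sketch below assumes the same 1/30 chance applies independently on every day of uptime and that no packets are lost. The function name and parameters are illustrative.

```python
import random


def simulate_packets(machines, days, daily_chance=1 / 30, seed=0):
    """Count random reports sent by `machines` hosts that each run for `days` days."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same count
    return sum(
        1
        for _ in range(machines)
        for _ in range(days)
        if rng.random() < daily_chance
    )


machines, days = 1000, 90
packets = simulate_packets(machines, days)
print("CPU-months actually run:", machines * days / 30)
print("random packets received:", packets)
```

Lost packets scale the count down by the loss rate, so the figure is best read as a lower bound on usage.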

Roger. 

--
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: KLive: Linux Kernel Live Usage Monitor

2005-08-30 Thread Rogier Wolff

On Tue, Aug 30, 2005 at 10:53:13AM +0200, Sven Ladegast wrote:
> A trick to use would be to send a UDP packet at boot (after 1 minute
> or so), and then randomly, say, "once a month" (i.e. about a 1/30 chance of
> sending a packet on the first day). The number of these random packets
> received is a measure of the number of CPU-months that the kernel
> runs.
> 
> This could be a solution but, as you know, UDP packets may or may not 
> arrive at the destination address. So the packet loss with this method could 
> be very high, especially if you send only one packet. Using a 
> TCP connection for this is a lot more stable and the payload can be 
> encrypted too.

The load that a UDP packet puts on a system is much lower than
that of a TCP connection. The fact that UDP packets sometimes get lost
is not much of an issue: those packets simply wouldn't get logged.
So what?

In 90% (my guess; 90% of statistics are made up) of the cases 
where the first packet doesn't reach the destination, any subsequent
packets wouldn't either. So if it is as unimportant as it is here, why bother
with the extra overhead of a TCP connection?

The in-kernel module that might send this could put some easily
gathered information into the packet. The goal of logging
kernels-that-get-run would then be met. Installing a userspace program is
something that most testers won't be bothered to do.

A kernel option with clear documentation of exactly what info is logged
would IMHO work better. (A userspace program is technically a better
solution; the social aspect of getting a bigger user base is the main
reason for me to suggest the in-kernel approach.)

(the people who go upgrading kernels tend to be different people from
those who go installing programs for fun.)

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: Linux tty layer hackery: Heads up and RFC

2005-07-22 Thread Rogier Wolff

On Thu, Jul 21, 2005 at 06:46:32PM +0100, Alan Cox wrote:
> int tty_prepare_flip_string(tty, strptr, len)
> 
> Adjust the buffer to allow len characters to be added. Returns a buffer
> pointer in strptr and the length available. This allows for hardware
> that needs to use functions like insl or memcpy_fromio.

OK, so then I start copying characters into the flipstring, but how do
I say I'm done?

Or is there a race between my call to tty_prepare_flip_string and
other processes starting to pull my not-yet-filled string from the
buffer? (Surely not!)
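
One way an interface like this can avoid such a race is a two-step
reserve/commit scheme: the producer gets a pointer into the buffer, fills it,
and only then publishes the new length, so a consumer never sees unfilled
bytes. The sketch below is a hypothetical userspace model of that idea;
flipbuf_prepare, flipbuf_commit and flipbuf_read are invented names and are
not the tty layer's actual API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not the kernel tty API. */
#define FLIPBUF_SIZE 64

struct flipbuf {
	char   data[FLIPBUF_SIZE];
	size_t committed;	/* bytes visible to the consumer */
	size_t reserved;	/* bytes handed out to the producer */
};

/*
 * Reserve up to `len` bytes. Stores the write pointer in *strptr and
 * returns how many bytes may actually be written (possibly fewer).
 */
static size_t flipbuf_prepare(struct flipbuf *fb, char **strptr, size_t len)
{
	size_t space = FLIPBUF_SIZE - fb->committed;

	if (len > space)
		len = space;
	*strptr = fb->data + fb->committed;
	fb->reserved = len;
	return len;
}

/* Publish `len` filled bytes (at most what was reserved). */
static void flipbuf_commit(struct flipbuf *fb, size_t len)
{
	if (len > fb->reserved)
		len = fb->reserved;
	fb->committed += len;
	fb->reserved = 0;
}

/* Consumer side: copies out only committed bytes. */
static size_t flipbuf_read(const struct flipbuf *fb, char *out, size_t max)
{
	size_t n = fb->committed < max ? fb->committed : max;

	memcpy(out, fb->data, n);
	return n;
}
```

In this model the answer to "how do I say I'm done?" is the explicit commit
call; until it happens, a reader gets zero bytes.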

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH] pci_find_device --> pci_get_device

2005-07-19 Thread Rogier Wolff

On Tue, Jul 19, 2005 at 12:53:38PM +0200, Jiri Slaby wrote:
> Rogier Wolff wrote:

> I don't know, if you think it global, or if am I here with other
> fellows (no, I'm not).  I don't know what kind of comment you have
> on your mind. Could you, please, specify it more.  I only changed
> names of called functions and added some pci_dev_put, what should I
> comment?

I meant that IF I'm right that more pci_dev_put calls are needed, 
that needs to be noted in the source. 

If you know that there needs to be a put, but you have decided you
don't know where to put it exactly, then that needs to be in a comment
so that someone looking at the code after you will be able to put it
in. Otherwise the new person will think: "He might have had a smarter
plan and will do the pci_dev_put somewhere else." 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH] pci_find_device --> pci_get_device

2005-07-18 Thread Rogier Wolff

On Tue, Jul 19, 2005 at 02:25:23AM +0200, Jiri Slaby wrote:
> The patch is for mixed files from all over the tree.
> 
> Kernel version: 2.6.13-rc3-git4
> 
> * This patch removes from kernel tree pci_find_device and changes
> it with pci_get_device. Next, it adds pci_dev_put, to decrease reference
> count of the variable.
> * Next, there are some (about 10 or so) gcc warning problems (i. e.
> variable may be uninitialized) solutions, which were around code with old
> pci_find_device.
> * Some code was unpretty, or ugly, so the patch provides more readable
> code, in some cases.
> * Marks the function as deprecated in pci.h

Hi Jiri, 

The patch grabs reference counts to pdev structures, but almost never
decreases the reference counts. 

If you are working in a team, and want others to be able to continue
where you left off, you should add a comment, even if it is repetitive,
to state what needs to be done. 

As far as I can see, you grab a reference to the "pdev" on 
initialization, and never release it. Or you only release it in 
certain error conditions. Would this make the driver unable to 
be unloaded and reloaded? That would not be good. 
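
The get/put pairing in question can be modelled in a few lines. This is a
hypothetical userspace sketch: mock_get and mock_put merely stand in for
pci_get_device and pci_dev_put, and the structures are invented for the
illustration. It shows a reference taken at initialization and the matching
release that has to happen at cleanup.

```c
#include <stddef.h>

/* Hypothetical stand-ins for pci_get_device()/pci_dev_put(); not kernel code. */
struct mock_dev {
	int refcount;
};

/* "Find and take a reference", like the get-style lookup in the patch. */
static struct mock_dev *mock_get(struct mock_dev *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

/* Drop a reference taken by mock_get(). */
static void mock_put(struct mock_dev *dev)
{
	if (dev)
		dev->refcount--;
}

struct mock_driver {
	struct mock_dev *pdev;
};

/* Initialization grabs a reference and keeps it for the driver's lifetime. */
static int mock_driver_init(struct mock_driver *drv, struct mock_dev *dev)
{
	drv->pdev = mock_get(dev);
	return drv->pdev ? 0 : -1;
}

/* Cleanup must release it, or the count never returns to its old value. */
static void mock_driver_exit(struct mock_driver *drv)
{
	mock_put(drv->pdev);
	drv->pdev = NULL;
}
```

If mock_driver_exit skipped the put, every load/unload cycle would leave the
count one higher than before, which is the kind of leak the question above is
about.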

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH 0/82] changing CONFIG_LOCALVERSION rebuilds too much, for no good reason.

2005-07-17 Thread Rogier Wolff

On Sun, Jul 10, 2005 at 09:18:15PM -0700, Nish Aravamudan wrote:
> one for the SCSI subsystem. If those individual driver maintainers'
> files are being modified, should they be CC'ed, or is the big patch
> just sent to the SCSI maintainer (in this example)? I just want to
> make sure the correct patch-chain is respected.

As a patch maintainer, CCed on ONE of the 82 patches, I wouldn't have
minded getting CCed on a patch that included the change to my driver. 
(as chunk xx/82 in the diff).

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH][2.6.11] generic_serial.h gcc4 fix

2005-03-15 Thread Rogier Wolff

On Tue, Mar 15, 2005 at 06:39:46PM +0100, Adrian Bunk wrote:
> > @@ -91,6 +91,4 @@ int  gs_setserial(struct gs_port *port, 
> >  int  gs_getserial(struct gs_port *port, struct serial_struct __user *sp);
> >  void gs_got_break(struct gs_port *port);
> >  
> > -extern int gs_debug;
> > -
> >  #endif
> 
> This patch is already in -mm for ages.
> 
> When doing such patches, -mm is usually a better basis than Linus' tree.

Note that the original reason for doing "extern int gs_debug" was that
sx.c used to have an ioctl to fiddle with it "live". Apparently
someone removed that piece of useful, but(t) ugly code, as it is no
longer there.

Roger. 


-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: [PATCH] partitions/msdos.c

2005-02-28 Thread Rogier Wolff

On Sun, Feb 27, 2005 at 12:40:53AM +0100, Andries Brouwer wrote:
> (Concerning the "size" version: it occurred to me that there is one
> very minor objection: For extended partitions so far the size did
> not normally play a role. Only the starting sector was significant.
> If, at some moment we decide also to check the size, then a weaker
> check, namely only checking for non-extended partitions, might be
> better at first.)

I recently encountered a disk that had clipping enabled. If you go
for the size implementation, be careful that people can still run a 
program to unclip the disk after the disk has been detected and the
partition rejected. 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ

Re: Bug when using custom baud rates....

2005-01-20 Thread Rogier Wolff

On Thu, Jan 20, 2005 at 07:08:58AM -0800, Greg KH wrote:
> On Thu, Jan 20, 2005 at 03:54:22PM +0100, Rogier Wolff wrote:
> > Hi,
> > 
> > When using custom baud rates, the code does: 
> > 
> > 
> >if ((new_serial.baud_base != priv->baud_base) ||
> > (new_serial.baud_base < 9600))
> > return -EINVAL;
> > 
> > Which translates to english as: 
> > 
> > If you changed the baud-base, OR the new one is
> > invalid, return invalid. 
> > 
> > but it should be:
> > 
> > If you changed the baud-base, OR the new one is
> > invalid, return invalid. 
> 
> You mean AND, not OR here, right?  :)

:-) Sorry. Too noisy here. 

> > Patch attached. 
> 
> Have a 2.6 patch?

Patch told me: 
   patching file drivers/usb/serial/ftdi_sio.c
   Hunk #1 succeeded at 1137 (offset 156 lines).

but the resulting patch is attached. 

Roger. 

-- 
+-- Rogier Wolff -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Files foetsie, bestanden kwijt, alle data weg?!
| Blijf kalm en neem contact op met Harddisk-recovery.nl!
diff -ur linux-2.6.11-r1-clean/drivers/usb/serial/ftdi_sio.c linux-2.6.11-r1-ftdio_fix/drivers/usb/serial/ftdi_sio.c
--- linux-2.6.11-r1-panoramix/drivers/usb/serial/ftdi_sio.c	Wed Jan 12 09:19:32 2005
+++ linux-2.6.11-r1-ftdio_fix/drivers/usb/serial/ftdi_sio.c	Thu Jan 20 16:20:24 2005
@@ -1137,7 +1137,7 @@
 		goto check_and_exit;
 	}
 
-	if ((new_serial.baud_base != priv->baud_base) ||
+	if ((new_serial.baud_base != priv->baud_base) &&
 	    (new_serial.baud_base < 9600))
 		return -EINVAL;

PATCH: nbd fix.

2005-01-20 Thread Rogier Wolff


The NBD driver seems to require the CAP_SYS_ADMIN capability for 
innocent things like asking what the capacity is. 

Patch attached. 

Roger. 


-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ
diff -ur linux-2.4.28.clean/drivers/block/nbd.c linux-2.4.28.nbd-fix/drivers/block/nbd.c
--- linux-2.4.28.clean/drivers/block/nbd.c	Wed Jan 19 18:14:01 2005
+++ linux-2.4.28.nbd-fix/drivers/block/nbd.c	Wed Jan 19 16:36:59 2005
@@ -408,10 +408,7 @@
 	int dev, error, temp;
 	struct request sreq ;
 
-	/* Anyone capable of this syscall can do *real bad* things */
 
-	if (!capable(CAP_SYS_ADMIN))
-		return -EPERM;
 	if (!inode)
 		return -EINVAL;
 	dev = MINOR(inode->i_rdev);
@@ -419,6 +416,20 @@
 		return -ENODEV;
 
 	lo = &nbd_dev[dev];
+
+	/* these are innocent, but */
+	switch (cmd) {
+	case BLKGETSIZE:
+		return put_user(nbd_bytesizes[dev] >> 9, (unsigned long *) arg);
+	case BLKGETSIZE64:
+		return put_user((u64)nbd_bytesizes[dev], (u64 *) arg);
+	}
+
+	/* ... anyone capable of any of the below ioctls can do *real bad* 
+	   things */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
 	switch (cmd) {
 	case NBD_DISCONNECT:
 		printk("NBD_DISCONNECT\n");
@@ -524,10 +535,6 @@
 		       dev, lo->queue_head.next, lo->queue_head.prev, requests_in, requests_out);
 		return 0;
 #endif
-	case BLKGETSIZE:
-		return put_user(nbd_bytesizes[dev] >> 9, (unsigned long *) arg);
-	case BLKGETSIZE64:
-		return put_user((u64)nbd_bytesizes[dev], (u64 *) arg);
 	}
 	return -EINVAL;
 }
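
The reordering in the patch (answer the harmless size queries first, then check
the capability, then handle everything else) can be sketched as a small
userspace model. The command names, the is_admin flag and mock_ioctl are all
hypothetical and only mirror the shape of the change; this is not the NBD
driver's code.

```c
#include <errno.h>

/* Hypothetical command numbers and helper; a userspace model only. */
enum mock_cmd { MOCK_GETSIZE, MOCK_GETSIZE64, MOCK_DISCONNECT };

/*
 * Harmless queries are answered before the privilege check;
 * everything else still requires the privilege.
 */
static int mock_ioctl(enum mock_cmd cmd, int is_admin,
		      unsigned long long bytesize, unsigned long long *result)
{
	switch (cmd) {
	case MOCK_GETSIZE:
		*result = bytesize >> 9;	/* size in 512-byte sectors */
		return 0;
	case MOCK_GETSIZE64:
		*result = bytesize;		/* size in bytes */
		return 0;
	default:
		break;
	}

	if (!is_admin)
		return -EPERM;

	switch (cmd) {
	case MOCK_DISCONNECT:
		return 0;
	default:
		return -EINVAL;
	}
}
```

An unprivileged caller can then ask for the size, while a disconnect request
from the same caller is still refused with -EPERM.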

Bug when using custom baud rates....

2005-01-20 Thread Rogier Wolff

Hi,

When using custom baud rates, the code does: 


   if ((new_serial.baud_base != priv->baud_base) ||
       (new_serial.baud_base < 9600))
           return -EINVAL;

Which translates to English as: 

If you changed the baud-base, OR the new one is
invalid, return invalid. 

but it should be:

If you changed the baud-base, AND the new one is
invalid, return invalid. 
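
The difference between the `||` in the existing check and the `&&` in the
patch is easiest to see side by side. A minimal sketch, with hypothetical
function names and plain ints in place of the driver's structures:

```c
#include <stdbool.h>

/* The check as it stands: rejects any change of baud_base at all. */
static bool rejected_with_or(int new_base, int old_base)
{
	return (new_base != old_base) || (new_base < 9600);
}

/* The patched check: rejects only a changed value that is too low. */
static bool rejected_with_and(int new_base, int old_base)
{
	return (new_base != old_base) && (new_base < 9600);
}
```

For example, changing the base from 24000000 to 48000000 is rejected by the
first version and accepted by the second, while a change to 4800 is rejected
by both.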

Patch attached. 

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. - Adapted from lxrbot FAQ
diff -ur linux-2.4.28.clean/drivers/usb/serial/ftdi_sio.c linux-2.4.28.ftdi_fix/drivers/usb/serial/ftdi_sio.c
--- linux-2.4.28.clean/drivers/usb/serial/ftdi_sio.c	Wed Jan 19 16:31:03 2005
+++ linux-2.4.28.ftdi_fix/drivers/usb/serial/ftdi_sio.c	Thu Jan 20 15:47:49 2005
@@ -981,7 +981,7 @@
 		goto check_and_exit;
 	}
 
-	if ((new_serial.baud_base != priv->baud_base) ||
+	if ((new_serial.baud_base != priv->baud_base) &&
 	    (new_serial.baud_base < 9600))
 		return -EINVAL;
 
Only in linux-2.4.28.ftdi_fix/drivers/usb/serial: ftdi_sio.c~

Re: loop device broken in 2.4.6-pre5

2001-06-26 Thread Rogier Wolff


[EMAIL PROTECTED] wrote:
> From [EMAIL PROTECTED] Tue Jun 26 10:20:51 2001
> 
> This patch fixes the problem. Please consider applying.
> 
> --- linux-2.4.6-pre5/drivers/block/loop.c	Sat Jun 23 07:52:39 2001
> +++ linux/drivers/block/loop.c	Tue Jun 26 09:21:47 2001
> @@ -653,7 +653,7 @@
>  bs = 0;
>  if (blksize_size[MAJOR(lo_device)])
>  bs = blksize_size[MAJOR(lo_device)][MINOR(lo_device)];
> -if (!bs)
> +if (!bs || S_ISREG(inode->i_mode))
>  bs = BLOCK_SIZE;
>  
>  set_blocksize(dev, bs);
> 
> But why 1024? Next week your neighbour comes and has a file-backed
> loop device with an odd number of 512-byte sectors.
> If you want a guarantee, then I suppose one should pick 512.
> (Or make the set blocksize ioctl also work on loop devices.)

I thought the change was a "quick hack" that would make stuff work
(page cache?) near the end of the file. That would mean that this kind
of "quick hack" won't work. 

But if it does anyway, then indeed 512 would be a more appropriate
choice.

I thought I had to convince people, so I chose a situation that I hoped
people would understand to be likely/possible, to prevent reactions like:
"No filesystem will use the last odd-numbered 512 bytes of a
partition".
Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

Re: loop device broken in 2.4.6-pre5

2001-06-25 Thread Rogier Wolff


[EMAIL PROTECTED] wrote:
> From: Jari Ruusu <[EMAIL PROTECTED]>
> 
> File backed loop device on 4k block size ext2 filesystem:
> 
> # dd if=/dev/zero of=file1 bs=1024 count=10
> 10+0 records in
> 10+0 records out
> # losetup /dev/loop0 file1
> # dd if=/dev/zero of=/dev/loop0 bs=1024 count=10 conv=notrunc
> dd: /dev/loop0: No space left on device
> 9+0 records in
> 8+0 records out
> # tune2fs -l /dev/hda1 2>&1| grep "Block size"
> Block size:   4096
> # uname -a
> Linux debian 2.4.6-pre5 #1 Thu Jun 21 14:27:25 EEST 2001 i686 unknown
> 
> Stock 2.4.5 and 2.4.5-ac15 don't have this problem.
> 
> I am not sure there is an error here.

How about:

dd if=/dev/hda1 of=disk.img bs=1k 
mount disk.img /mnt/d1 -o loop


If the filesystem on hda1 happens to use the last 2k of the partition,
and the partition size is 2k mod 4k, then I get a non-working disk.img
if I don't pad the disk.img file with another 2k. And then I might
trip up the "how big is this partition" code in the fs-driver

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

Re: obsolete code must die

2001-06-15 Thread Rogier Wolff


Alan Cox wrote:
> > Would it make sense to create some sort of 'make config' script that
> > determines what you want in your kernel and then downloads only those
> > components? After all, with the constant release of new hardware, isn't a
> > 50MB kernel release not too far away? 100MB?
> 
> This should be a FAQ entry.

It is. 7.7.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

Re: rtl8139too in 2.4.5

2001-06-04 Thread Rogier Wolff


[EMAIL PROTECTED] wrote:
> My RTL8139 (Identified 8139 chip type 'RTL-8139A')
> was fine in 2.4.3 and doesnt work in 2.4.5.
> Copying the 2.4.3 version of 8139too.c makes things work again.
> 
> Since lots of people complained about this, I have not tried to
> debug - maybe a fixed version already exists?

We upgraded to 2.4.5-ac2 for some tests, noted that the ethernet card
was again in 6-packets-per-second mode (i.e. very slow) and then
continued to 2.4.5-ac4, where the driver was reverted to the one in
2.4.3. That worked.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

Re: ECN is on!

2001-05-22 Thread Rogier Wolff


Richard Gooch wrote:
> Dave sent a message out a week or two ago saying he was going to do it
> soon. And back in January he said he'd be doing it in February. The
> kernel list FAQ has stated this right at the top, in big, bright red
> letters. Yesterday, after I saw Dave's announcement, I updated the FAQ
> to reflect that we're now running ECN.
> 
> People have had plenty of warning. Think of it as a bonus that it
> didn't happen back in February. They've had an extra 3 months to sort
> something out.

The "we'll turn it on in February" warning is worth NOTHING in this
situation: February comes and goes. March comes and goes. Everybody
who read the warning will think: Ok, so I must be fine.

A warning of the form: "ECN will go on as soon as this message clears
the queues" would've been useful, as thousands (hundreds?) suddenly get
nothing anymore.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

Re: Getting FS access events

2001-05-18 Thread Rogier Wolff

Linus Torvalds wrote:
> I'm really serious about doing "resume from disk". If you want a fast
> boot, I will bet you a dollar that you cannot do it faster than by loading
> a contiguous image of several megabytes contiguously into memory. There is
> NO overhead, you're pretty much guaranteed platter speeds, and there are
> no issues about trying to order accesses etc. There are also no issues
> about messing up any run-time data structures.

Linus, 

The "boot quickly" was an example. "Load netscape quickly" on some
systems is done by dd-ing the binary to /dev/null. 

Now, you're going to say again that this won't work because of
buffer-cache/page-cache incoherency.  That is NOT the point. The point
is that the fun about a cache is that it's just a cache. It speeds
things up transparently. 

If I need a new "prime-the-cache" program to mmap the files, and
trigger a page-in in the right order, then that's fine with me.
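
A minimal sketch of what such a "prime-the-cache" helper might look like, assuming a POSIX system. The function name prime_file() is made up for illustration, and the sketch simply walks one file front to back; choosing "the right order" across several files would need a recorded access trace, which is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only and touch one byte in every
 * page so that the kernel pages the file in.
 * Returns the number of pages touched, or -1 on error. */
long prime_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length maps */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const volatile unsigned char *p =
        mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping survives the close */
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    long touched = 0;
    unsigned char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        sink ^= p[off];             /* volatile read forces the page-in */
        touched++;
    }
    (void)sink;

    munmap((void *)p, len);
    return touched;
}
```

If the pages are already cached, the loop just runs quickly; if the kernel drops them again later, nothing breaks.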

The fun about doing these tricks is that it works, and keeps on
working (functionally) even if it stops working (fast).

Yes, there is a way to boot even faster: preloading memory. Fine. But
this doesn't allow me to load netscape quicker.

Roger. 

-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

Re: [PATCH] RIO, SX, driver update.

2001-05-14 Thread Rogier Wolff



Retry, this time with the patch.

Roger. 



Rogier Wolff wrote:
> 
> Hi Linus, Alan, 
> 
> The patch below implements breaks (correctly) for the RIO and SX
> cards.
> 
> We started out trying to fix one thing, but found that the 2.4.4 rio
> driver was behind on several patches.
> 
>   Roger Wolff. 
>   Patrick van de Lageweg. 
> 
> ---


diff -u -r linux-2.4.4.clean/drivers/char/generic_serial.c linux-2.4.4.rio_close/drivers/char/generic_serial.c
--- linux-2.4.4.clean/drivers/char/generic_serial.c Fri Dec 29 23:35:47 2000
+++ linux-2.4.4.rio_close/drivers/char/generic_serial.c Mon May 14 16:36:39 2001
@@ -344,7 +344,7 @@
struct gs_port *port = ptr;
long end_jiffies;
int jiffies_to_transmit, charsleft = 0, rv = 0;
-   int to, rcib;
+   int rcib;
 
func_enter();
 
@@ -368,6 +368,7 @@
return rv;
}
/* stop trying: now + twice the time it would normally take +  seconds */
+   if (timeout == 0) timeout = MAX_SCHEDULE_TIMEOUT;
end_jiffies  = jiffies; 
if (timeout !=  MAX_SCHEDULE_TIMEOUT)
end_jiffies += port->baud?(2 * rcib * 10 * HZ / port->baud):0;
@@ -376,11 +377,9 @@
gs_dprintk (GS_DEBUG_FLUSH, "now=%lx, end=%lx (%ld).\n", 
jiffies, end_jiffies, end_jiffies-jiffies); 
 
-   to = 100;
/* the expression is actually jiffies < end_jiffies, but that won't
   work around the wraparound. Tricky eh? */
-   while (to-- &&
-  (charsleft = gs_real_chars_in_buffer (port->tty)) &&
+   while ((charsleft = gs_real_chars_in_buffer (port->tty)) &&
time_after (end_jiffies, jiffies)) {
/* Units check: 
   chars * (bits/char) * (jiffies /sec) / (bits/sec) = jiffies!
@@ -1059,6 +1058,19 @@
copy_to_user(sp, , sizeof(struct serial_struct));
 }
 
+
+void gs_got_break(struct gs_port *port)
+{
+   if (port->flags & ASYNC_SAK) {
+   do_SAK (port->tty);
+   }
+   *(port->tty->flip.flag_buf_ptr) = TTY_BREAK;
+   port->tty->flip.flag_buf_ptr++;
+   port->tty->flip.char_buf_ptr++;
+   port->tty->flip.count++;
+}
+
+
 EXPORT_SYMBOL(gs_put_char);
 EXPORT_SYMBOL(gs_write);
 EXPORT_SYMBOL(gs_write_room);
@@ -1075,4 +1087,4 @@
 EXPORT_SYMBOL(gs_init_port);
 EXPORT_SYMBOL(gs_setserial);
 EXPORT_SYMBOL(gs_getserial);
-
+EXPORT_SYMBOL(gs_got_break);
diff -u -r linux-2.4.4.clean/drivers/char/rio/linux_compat.h linux-2.4.4.rio_close/drivers/char/rio/linux_compat.h
--- linux-2.4.4.clean/drivers/char/rio/linux_compat.h   Fri Aug 11 23:51:33 2000
+++ linux-2.4.4.rio_close/drivers/char/rio/linux_compat.h   Mon May 14 15:11:09 2001
@@ -16,11 +16,13 @@
  *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+#include 
+
 
 #define disable(oldspl) save_flags (oldspl)
 #define restore(oldspl) restore_flags (oldspl)
 
-#define sysbrk(x) kmalloc ((x), GFP_KERNEL)
+#define sysbrk(x) kmalloc ((x),in_interrupt()? GFP_ATOMIC : GFP_KERNEL)
 #define sysfree(p,size) kfree ((p))
 
 #define WBYTE(p,v) writeb(v, )
diff -u -r linux-2.4.4.clean/drivers/char/rio/rio_linux.c linux-2.4.4.rio_close/drivers/char/rio/rio_linux.c
--- linux-2.4.4.clean/drivers/char/rio/rio_linux.c  Wed May  2 18:29:56 2001
+++ linux-2.4.4.rio_close/drivers/char/rio/rio_linux.c  Mon May 14 16:03:39 2001
@@ -58,6 +58,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -165,7 +166,7 @@
   /* startuptime */ HZ*2,   /* how long to wait for card to run */
   /* slowcook */0,  /* TRUE -> always use line disc. */
   /* intrpolltime */1,  /* The frequency of OUR polls */
-  /* breakinterval */   25, /* x10 mS */
+  /* breakinterval */   25, /* x10 mS XXX: units seem to be 1ms not 10! -- REW*/
   /* timer */   10, /* mS */
   /* RtaLoadBase */ 0x7000,
   /* HostLoadBase */0x7C00,
@@ -203,11 +204,8 @@
 unsigned int cmd, unsigned long arg);
 static int rio_init_drivers(void);
 
-
 void my_hd (void *addr, int len);
 
-
-
 static struct tty_driver rio_driver, rio_callout_driver;
 static struct tty_driver rio_driver2, rio_callout_driver2;
 
@@ -248,14 +246,12 @@
 long rio_irqmask = -1;
 
 #ifndef TWO_ZERO
-#ifdef MODULE
 MODULE_AUTHOR("Rogier Wolff <[EMAIL PROTECTED]>, Patrick van de Lageweg <[EMAIL PROTECTED]>");
 MODULE_DESCRIPTION("RIO driver");
 MODULE_PARM(rio_poll, "i");
 MODULE_PARM(rio_debug, "i");
 MODULE_PARM(rio_irqmask, "i");
 #endif
-#endif
 
 static struct real_driver rio_real_driver = {
   rio_disable_tx_interrupts,
@@ -383,8 +379,8 @@
 
 int rio_ismodem (kdev_t device)
 {
-  return (MAJOR (device) != RIO_NOR
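
(The patch text is cut off at this point in the archive.)

The generic_serial.c hunk above relies on two bits of arithmetic: a comparison of jiffies values that survives counter wraparound, and the "units check" that turns characters into jiffies. The following userspace sketch illustrates both. The names ticks_after() and ticks_to_transmit() are made up here as stand-ins; the kernel itself uses the time_after() macro on jiffies, and the 10 bits per character is the factor that appears in the patch.

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-in for the kernel's time_after(a, b): nonzero when a is
 * later than b, even if the counter wrapped in between. A plain `a > b`
 * gives the wrong answer across the wrap. The unsigned-to-long cast
 * assumes the usual two's-complement behaviour. */
int ticks_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* Units check from the patch comment:
 *   chars * (bits/char) * (ticks/sec) / (bits/sec) = ticks
 * assuming 10 bits per character on the wire. */
unsigned long ticks_to_transmit(unsigned long chars, unsigned long baud,
                                unsigned long hz)
{
    assert(baud != 0);
    return chars * 10 * hz / baud;
}
```

For example, 960 characters at 9600 baud with 100 ticks per second take 100 ticks, i.e. one second; the patch doubles this figure when it computes end_jiffies.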

[PATCH] RIO, SX, driver update.

2001-05-14 Thread Rogier Wolff



Hi Linus, Alan, 

The patch below implements breaks (correctly) for the RIO and SX
cards.

We started out trying to fix one thing, but found that the 2.4.4 rio
driver was behind on several patches.

Roger Wolff. 
Patrick van de Lageweg. 

---


-- 
** [EMAIL PROTECTED] ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 
