Re: 2.6.16.32 stuck in generic_file_aio_write()

2007-02-19 Thread Igmar Palsenberg

Hi,

> I can not make sure it is hardware problem, but I have interest in this 
> case's reproducing.
> If you tell me your platform's construction, I will try it and give you good 
> solution.
> Does your RAID adapter's firmware version work on 1.42?
> Areca firmware had fix some hardware bugs and rare sg length handle in this 
> version.

I've hacked up the sysrq code to add another command, j, which dumps the 
current IRQ status to the console:

SysRq : Show IRQ status
..
Showing info for IRQ 14
status :
depth  : 0
wake_depth : 0
irq_count  : 38717
irqs_unhandled : 0

Showing info for IRQ 15
status : DISABLED
depth  : 1
wake_depth : 0
irq_count  : 22
irqs_unhandled : 0

which is the (incomplete) result on my machine after loading a module 
that calls disable_irq(15) on module load.
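
For anyone who wants to scan a captured serial console log for this kind of dump, here is a minimal sketch in Python. It assumes the log contains blocks in exactly the format shown above; the function names are made up for illustration and are not part of the patch.

```python
import re

# Sample copied from the console output above.
SAMPLE = """\
Showing info for IRQ 14
status :
depth  : 0
wake_depth : 0
irq_count  : 38717
irqs_unhandled : 0

Showing info for IRQ 15
status : DISABLED
depth  : 1
wake_depth : 0
irq_count  : 22
irqs_unhandled : 0
"""


def parse_irq_dump(text):
    """Parse 'Showing info for IRQ N' blocks into {irq: {field: value}}."""
    irqs = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"Showing info for IRQ (\d+)", line)
        if header:
            current = irqs.setdefault(int(header.group(1)), {})
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        # Numeric fields become ints; 'status' stays a (possibly empty) string.
        current[key.strip()] = int(value) if value.isdigit() else value
    return irqs


def disabled_irqs(irqs):
    """Return IRQ numbers marked DISABLED or with a non-zero disable depth."""
    found = []
    for number, fields in irqs.items():
        depth = fields.get("depth", 0)
        if "DISABLED" in str(fields.get("status", "")) or (
            isinstance(depth, int) and depth > 0
        ):
            found.append(number)
    return sorted(found)


if __name__ == "__main__":
    print(disabled_irqs(parse_irq_dump(SAMPLE)))
```

Run against the sample above, this flags only IRQ 15, the one the test module disabled.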

I've put the patch at http://www.jdi-ict.nl/areca/sysrq-j.patch
I'll do a follow-up when anything useful comes out.


Regards,


Igmar

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.16.32 stuck in generic_file_aio_write()

2007-02-12 Thread Igmar Palsenberg

Hi,

> I can not make sure it is hardware problem, but I have interest in this 
> case's reproducing.
> If you tell me your platform's construction, I will try it and give you good 
> solution.

The machines giving problems are almost identical when it comes to 
hardware specs :

Intel SE7520BD2 mainboard (SE7520 chipset)
Dual Intel Xeon 2.8 GHz (other machine : Dual Xeon 3.2 GHz)
4 GB PC3200 ECC (400 MHz) Corsair (other machine : 2GB PC3200 ECC)

> Does your RAID adapter's firmware version work on 1.42?
> Areca firmware had fix some hardware bugs and rare sg length handle in this 
> version.

It's currently at 1.41. I'll see if I can upgrade it to 1.42. For now, 
I've put all the stack traces I captured during the hangs at 
http://www.jdi-ict.nl/areca, together with the output of lspci -v -v and a 
copy of the kernel's .config.

Please let me know if you need anything else.



Regards,


    Igmar




-- 
Igmar Palsenberg
JDI ICT

Zutphensestraatweg 85
6953 CJ Dieren
Tel: +31 (0)313 - 496741
Fax: +31 (0)313 - 420996
The Netherlands

mailto: [EMAIL PROTECTED]


Re: 2.6.16.32 stuck in generic_file_aio_write()

2007-02-05 Thread Igmar Palsenberg

> Does the other machine have the same problems?

It does. It seems to depend on the interrupt frequency : with CONFIG_HZ=250 
it only appears once a month or so, with CONFIG_HZ=1000 it will 
occur within a week. It happens a lot less on the other machine, 
which isn't under as much disk load as the first one.
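
As a back-of-the-envelope check on the two HZ settings, the sketch below computes the number of timer ticks per CPU per day. It assumes a periodic (non-tickless) timer, so the tick rate equals the HZ value. It only illustrates the 4x difference in tick rate and does not establish a cause.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def timer_ticks_per_day(hz):
    """Timer interrupts per CPU per day, assuming a periodic tick at `hz`."""
    return hz * SECONDS_PER_DAY


if __name__ == "__main__":
    for hz in (250, 1000):
        print(f"HZ={hz}: {timer_ticks_per_day(hz):,} ticks per CPU per day")
    ratio = timer_ticks_per_day(1000) / timer_ticks_per_day(250)
    print(f"ratio: {ratio}")
```

The 4x ratio is at least in the same ballpark as "within a week" versus "once a month or so", which fits the idea that the hang rate scales with interrupt rate.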
 
> Are you able to rule out a hardware failure?

Well.. It's too much of a coincidence that 2 (almost identical) machines show 
the same weird behaviour. What strikes me is that only *disk* interrupts 
stop being handled after a while. The machine itself is alive, just all 
disk IO is blocked, which makes it pretty much useless. 

Erich, could this be some sort of hardware problem ? I know it's a PITA to 
reproduce, but setting CONFIG_HZ to 1000 and bashing the machine with 
disk activity seems to help :)


Regards,


Igmar

-- 
Igmar Palsenberg
JDI ICT

Zutphensestraatweg 85
6953 CJ Dieren
Tel: +31 (0)313 - 496741
Fax: +31 (0)313 - 420996
The Netherlands

mailto: [EMAIL PROTECTED]


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Igmar Palsenberg

> > See below. The other machine is mostly identifical, except for i8042 
> > missing (probably due to running an older kernel, or small differences in 
> > the kernel config).
> > 
> 
> Does the other machine have the same problems?

No, but that machine has a lot less disk and network activity.
 
> Are you able to rule out a hardware failure?

100% ? No, but the hardware is relatively new (about a year old), and of 
good quality. It's hard to reproduce, so looking at it when it starts to 
fault isn't possible either :(

> The disk interrupt is unshared, which rules out a few software problems, I
> guess.

Indeed. Bah, I hate these kinds of things :(



Regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Igmar Palsenberg

> > Hmm.. Switching CONFIG_HZ from 1000 to 250 seems to 'fix' the problem. 
> > I haven't seen the issue in nearly a week now. This makes Andrew's theory 
> > about missing interrupts very likely.
> > 
> > Andrew / others : Is there a way to find out if it *is* missing 
> > interrupts ?
> > 
> 
> umm, nasty.  What's in /proc/interrupts?

See below. The other machine is mostly identical, except for i8042 
missing (probably due to running an older kernel, or small differences in 
the kernel config).

Regards,


Igmar

[EMAIL PROTECTED] ~]$ cat /proc/interrupts
   CPU0   CPU1
  0:   73702693   74509271   IO-APIC-edge  timer
  1:  1  1   IO-APIC-edge  i8042
  4:   2289   8389   IO-APIC-edge  serial
  8:  0  1   IO-APIC-edge  rtc
  9:  0  0   IO-APIC-fasteoi   acpi
 12:  3  1   IO-APIC-edge  i8042
 16:  203127788  0   IO-APIC-fasteoi   uhci_hcd:usb2, eth0
 17:525492   IO-APIC-fasteoi   uhci_hcd:usb4
 18:   1370   67584889   IO-APIC-fasteoi   arcmsr
 19:  0  0   IO-APIC-fasteoi   ehci_hcd:usb1
 20:  0  0   IO-APIC-fasteoi   uhci_hcd:usb3
NMI:  0  0
LOC:  148127756  148133476
ERR:  0
MIS:  0


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-14 Thread Igmar Palsenberg

> > I'll put a .config and a dmesg of the machine booting at 
> > http://www.jdi-ict.nl/plain/ for those who want to look at it.
> 
> dmesg : http://www.jdi-ict.nl/plain/lnx01.dmesg
> Kernel config : http://www.jdi-ict.nl/plain/lnx01.config

Hmm.. Switching CONFIG_HZ from 1000 to 250 seems to 'fix' the problem. 
I haven't seen the issue in nearly a week now. This makes Andrew's theory 
about missing interrupts very likely.

Andrew / others : Is there a way to find out if it *is* missing 
interrupts ?
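
One rough user-space check, offered as a sketch and not a definitive answer: take two snapshots of /proc/interrupts while I/O is known to be outstanding and see whether the controller's counter advanced. The helper names below are hypothetical. A counter that stands still only means something if requests were pending during the interval, since an idle disk raises no interrupts either.

```python
import time


def parse_interrupts(text):
    """Map IRQ label -> total count across CPUs from /proc/interrupts text."""
    totals = {}
    lines = text.splitlines()
    if not lines:
        return totals
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if counts:
            totals[label.strip()] = sum(counts)
    return totals


def stalled(before, after, watch):
    """Return the watched IRQ labels whose total did not change."""
    return [irq for irq in watch
            if irq in before and after.get(irq) == before[irq]]


def read_snapshot(path="/proc/interrupts"):
    with open(path) as handle:
        return parse_interrupts(handle.read())


if __name__ == "__main__":
    first = read_snapshot()
    time.sleep(10)  # keep disk I/O running during this window
    second = read_snapshot()
    # "18" is the arcmsr line on the machine discussed in this thread.
    print("stalled IRQs:", stalled(first, second, ["18"]))
```

Running this from cron or a loop and logging the result gives a timestamp for when the counter stopped moving, which can then be matched against the sysrq-t dumps.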


Regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-07 Thread Igmar Palsenberg

> I've enabled most debugging now, I'll see of i can run both a disk and VM 
> stresstest.

Running stress now :

stress -c 2 -i 2 -m 8 -d 8 --vm-bytes 20M --vm-hang 5 --hdd-bytes 20M

I'll see what this results in.
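
For reference, the flags as described in stress's usage text: -c is CPU workers, -i is sync() workers, -m is malloc workers, -d is write/unlink workers, --vm-bytes and --hdd-bytes are per-worker sizes, and --vm-hang is the number of seconds to sleep before freeing. A small sketch of the aggregate footprint this command asks for (the helper name is made up for illustration):

```python
def stress_footprint(vm_workers, vm_mb, hdd_workers, hdd_mb):
    """Aggregate memory and disk footprint, in MB, of a stress invocation."""
    return {
        "vm_mb": vm_workers * vm_mb,     # total memory touched by -m workers
        "hdd_mb": hdd_workers * hdd_mb,  # total data written per pass by -d workers
    }


if __name__ == "__main__":
    # Values taken from: -m 8 --vm-bytes 20M -d 8 --hdd-bytes 20M
    print(stress_footprint(vm_workers=8, vm_mb=20, hdd_workers=8, hdd_mb=20))
```

So the run keeps roughly 160 MB of memory and 160 MB of file data in play at a time, which is modest for a 4 GB machine but generates a steady stream of writes.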
 
> I'll put a .config and a dmesg of the machine booting at 
> http://www.jdi-ict.nl/plain/ for those who want to look at it.

dmesg : http://www.jdi-ict.nl/plain/lnx01.dmesg
Kernel config : http://www.jdi-ict.nl/plain/lnx01.config



regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-07 Thread Igmar Palsenberg

> I thought it was, but from my look through yout 8-billion-task backtrace,
> no task was stuck in D-state with the appropriate call trace.

I was afraid of that... Where is the lock on the i_mutex supposed 
to be released ? I can't follow the code path from within an interrupt back 
to the fs layer.
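
A user-space analogy of the pattern under discussion, as a sketch only: it assumes the general model that the writer who takes the mutex is also the one who releases it, and that the completion interrupt merely wakes that writer. The Lock stands in for i_mutex and the Event for the I/O completion. This is not kernel code and the names are illustrative.

```python
import threading


def simulate(deliver_completion, timeout=0.5):
    """Return True if a second writer manages to take the mutex in time."""
    i_mutex = threading.Lock()           # stands in for the inode mutex
    io_done = threading.Event()          # stands in for the completion interrupt
    holder_has_lock = threading.Event()

    def first_writer():
        with i_mutex:
            holder_has_lock.set()
            # Sleep until the "interrupt" arrives; bounded so the demo ends.
            io_done.wait(timeout=timeout * 5)
        # The mutex is released here, by the writer itself.

    worker = threading.Thread(target=first_writer)
    worker.start()
    holder_has_lock.wait()

    if deliver_completion:
        io_done.set()                    # completion arrives, writer wakes up

    acquired = i_mutex.acquire(timeout=timeout)
    if acquired:
        i_mutex.release()

    io_done.set()                        # always let the first writer finish
    worker.join()
    return acquired


if __name__ == "__main__":
    print("completion delivered:", simulate(True))
    print("completion lost:     ", simulate(False))
```

If the completion never arrives, the first writer sleeps while holding the mutex and every later writer queues up behind it, which would look like many tasks stuck on i_mutex with only a few waiting on I/O.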
 
> So I don't know what's causing this.  In the first trace you have at least
> four D-state kjournalds and a lot of processes stuck on an i_mutex.  I
> guess it's consistent with an IO system which is losing completion
> interrupts.  AFAICT in the second trace all you have is a lot of processes
> stuck on i_mutex for no obvious reason - I don't know why that would
> happen.

Is there any way to see if it is missing interrupts ? Enabling the 
debugging in the areca driver isn't a good idea on this machine, it's a
heavily IO loaded machine, and the problem seems to take some time to occur.

It *does* happen less often with a 2.6.19 kernel however. 

The task dump takes > 10 seconds, which causes the softlockup detector to 
trigger. Is there any objection to a patch which disables the lockup 
detector during the dump ? It isn't a big issue, since all it does is dump 
a stacktrace.

I've enabled most debugging now, I'll see if I can run both a disk and VM 
stress test.

I'll put a .config and a dmesg of the machine booting at 
http://www.jdi-ict.nl/plain/ for those who want to look at it.


Regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-06 Thread Igmar Palsenberg

> > Done some more digging : isn't http://lkml.org/lkml/2006/10/13/139 somehow 
> > related ? I do see pagefaults, and inode locks and mmap_locks. 
> > 
> 
> I thought it was, but from my look through yout 8-billion-task backtrace,
> no task was stuck in D-state with the appropriate call trace.
> 
> So I don't know what's causing this.  In the first trace you have at least
> four D-state kjournalds and a lot of processes stuck on an i_mutex.  I
> guess it's consistent with an IO system which is losing completion
> interrupts. 

Hmm.. Is there any way to make sure ? I've got a second machine (almost 
identical), which doesn't show this.

The main difference is the running kernel. I've had them on the same 
kernel, and the bad machine still crashed.

/proc/interrupts

Bad machine  : 18:   11160637   11235698   IO-APIC-fasteoi   arcmsr
Good machine : 18:   61658630   79352227   IO-APIC-level  arcmsr

Bad machine is running 2.6.19, good is running 2.6.14.7-grsec, which 
probably accounts for these changes.

> AFAICT in the second trace all you have is a lot of processes
> stuck on i_mutex for no obvious reason - I don't know why that would
> happen.

It's consistent, and so are the traces.
 
> How long does it take for this to happen?

Days to a week tops. It happens less frequently with 2.6.19; 
2.6.16.32 triggered it almost daily.

> Yes, lockdep might find something.

I've enabled most debug options. I'll boot the other kernel tomorrow.



Regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-06 Thread Igmar Palsenberg

> > It's rather large, but for those who want to look at it : 
> > http://www.jdi-ict.nl/plain/serial-28112006.txt
> 
> The same problem, this time with 2.6.19. I've done a show tasks, a show 
> locks, a show regs, and after that, a sync + reboot :)
> 
> Log is at http://www.jdi-ict.nl/plain/serial-04122006.txt .
> 
> If anyone needs more info : please tell me.

Done some more digging : isn't http://lkml.org/lkml/2006/10/13/139 somehow 
related ? I do see pagefaults, and inode locks and mmap_locks. 

Regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-04 Thread Igmar Palsenberg

> It's rather large, but for those who want to look at it : 
> http://www.jdi-ict.nl/plain/serial-28112006.txt

The same problem, this time with 2.6.19. I've done a show tasks, a show 
locks, a show regs, and after that, a sync + reboot :)

Log is at http://www.jdi-ict.nl/plain/serial-04122006.txt .

If anyone needs more info : please tell me.

Regards,


Igmar


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-12-01 Thread Igmar Palsenberg


Hi,

> > I've got a machine which occasionally locks up. I can still sysrq it from 
> > a serial console, so it's not entirely dead.
> > 
> > A sysrq-t learns me that it's got a large number of httpd processes stuck 
> > in D state :
> 
> There are known deadlocks in generic_file_write() in kernels up to and
> including 2.6.17.  Pagefaults are involved and I'd need to see the entire
> sysrq-T output to determine if you're hitting that bug.

It's rather large, but for those who want to look at it : 
http://www.jdi-ict.nl/plain/serial-28112006.txt

There is also a dump from a day later, but halfway through the Areca 
controller decided to kick out the array, to which a lot of unwritten data 
still needed to be written :)

That dump is at http://www.jdi-ict.nl/plain/serial-29112006.txt


Regards,


Igmar



Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-11-30 Thread Igmar Palsenberg

Hi,

> If you are working on arcmsr 1.20.00.13 for official kernel version.
> This is the last version.

I'm already on that version. I'll see if I can upgrade to 2.6.19 today.

> Could you check your RAID controller event and tell someting to me?
> You can check "MBIOS"=>"Physical Drive Information"=>"View Drive 
> Information"=>"Select The Drive"=>"Timeout Count"..
> It could tell you which disk had bad behavior cause your RAID volume 
> offline.

I need to be in the BIOS, right ? I couldn't find anything useful with the 
cli32 tool.

> About the message dump from arcmsr, it said that your RAID volume had 
> something wrong and kicked out from the system.
> How about your RAID config?

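CLI> disk info
Ch   ModelName        Serial#          FirmRev   Capacity  State
=============================================================================
 1   HDT722516DLA380  VDK71BTCDB90KE   V43OA91A  164.7GB   RaidSet Member(1)
 2   HDT722516DLA380  VDN71BTCDEPH7G   V43OA91A  164.7GB   RaidSet Member(1)
 3   HDT722516DLA380  VDN71BTCDES96G   V43OA91A  164.7GB   RaidSet Member(1)
 4   HDT722516DLA380  VDN71BTCDE15KG   V43OA91A  164.7GB   RaidSet Member(1)
=============================================================================

CLI> rsf info
Num  Name           Disks  TotalCap  FreeCap  DiskChannels  State
=============================================================================
 1   Raid Set # 00  4      640.0GB   0.0GB    1234          Normal
=============================================================================

CLI> vsf info
 #   Name             Raid#  Level  Capacity  Ch/Id/Lun  State
=============================================================================
 1   ARC-1110-VOL#00  1      Raid5  480.0GB   00/00/00   Normal
=============================================================================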

A plain RAID 5 config with 4 disks. 
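
As a sanity check on those numbers, a minimal sketch of the usual RAID 5 capacity rule (one member's worth of space goes to parity). It assumes the 640.0GB raid set total is split evenly over the 4 members.

```python
def raid5_usable_gb(disks, per_disk_gb):
    """Usable capacity of a RAID 5 set: (n - 1) members' worth of space."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * per_disk_gb


if __name__ == "__main__":
    total_cap_gb = 640.0   # TotalCap reported by 'rsf info'
    disks = 4
    per_disk_gb = total_cap_gb / disks
    print(per_disk_gb, raid5_usable_gb(disks, per_disk_gb))
```

That gives 160GB per member and 480GB usable, which matches the volume capacity reported by vsf info.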


> Areca had new firmware released (1.42).
> If you are working on "sg" device with scsi passthrough ioctl method to feed 
> data into Areca's RAID volume.
> You need to limit your data under 512 blocks (256K) each transfer.
> The new firmware will enlarge it into 4096 blocks (2M) each transfer.
> The firmware version 1.42 is on releasing procedure but not yet put it on 
> Areca ftp site.

I don't use the sg driver at all. Is the upgrade worth it ? I usually 
don't mess with firmware unless told to do so.
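
For readers wondering how the quoted block counts map to sizes, a short sketch. The 512-byte block size is an assumption on my part, but it is the value that makes the quoted figures line up.

```python
BLOCK_SIZE = 512  # bytes per block; assumed, not stated in the quoted text


def transfer_bytes(blocks, block_size=BLOCK_SIZE):
    """Size in bytes of a transfer of `blocks` blocks."""
    return blocks * block_size


if __name__ == "__main__":
    for blocks in (512, 4096):
        size = transfer_bytes(blocks)
        print(f"{blocks} blocks = {size} bytes = {size // 1024} KiB")
```

With 512-byte blocks, 512 blocks comes to 256 KiB and 4096 blocks to 2 MiB, matching the limits quoted for firmware 1.41 and 1.42.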

> If you need it, please tell me again.

Can you send it to me ? Installing it won't hurt I guess :)


Regards,


Igmar


-- 
Igmar Palsenberg
JDI ICT

Zutphensestraatweg 85
6953 CJ Dieren
Tel: +31 (0)313 - 496741
Fax: +31 (0)313 - 420996
The Netherlands

mailto: [EMAIL PROTECTED]


Re: 2.6.16.32 stuck in generic_file_aio_write()

2006-11-29 Thread Igmar Palsenberg

Hi,

A followup. It crashed again, giving me :

arcmsr0: scsi id=0 lun=0 ccb='0xf7c984e0' poll command abort successfully
end_request: I/O error, dev sda, sector 3724719

and

sd 0:0:0:0: rejecting I/O to offline device
about 15k times.

I'll see if I can upgrade the RAID driver.



Igmar


-- 
Igmar Palsenberg
JDI ICT

Zutphensestraatweg 85
6953 CJ Dieren
Tel: +31 (0)313 - 496741
Fax: +31 (0)313 - 420996
The Netherlands

mailto: [EMAIL PROTECTED]


2.6.16.32 stuck in generic_file_aio_write()

2006-11-29 Thread Igmar Palsenberg

Hi,

I've got a machine which occasionally locks up. I can still sysrq it from 
a serial console, so it's not entirely dead.

A sysrq-t tells me that it's got a large number of httpd processes stuck 
in D state :

httpd D F7619440  2160 11635   2057 11636   (NOTLB)
dbb7ae14 cc9b0550 c33224a0 f7619440 de187604  00b3 0001
   00b3   d374a550 c33224a0 0005b8d8 f04af800 
000f75e7
   d374a550 cc9b0550 cc9b0678 ef7d33ec ef7d33e8 cc9b0550 ef7d33fc 
c041bf70
Call Trace:
 [c041bf70] __mutex_lock_slowpath+0x92/0x43e
 [c0148f29] generic_file_aio_write+0x5c/0xfa
 [c0148f29] generic_file_aio_write+0x5c/0xfa
 [c0148f29] generic_file_aio_write+0x5c/0xfa
 [c01746c9] permission+0xad/0xcb
 [c01d9c4a] ext3_file_write+0x3b/0xb0
 [c0166777] do_sync_write+0xd5/0x130
 [c041d1bf] _spin_unlock+0xb/0xf
 [c0135c13] autoremove_wake_function+0x0/0x4b
 [c0166975] vfs_write+0x1a3/0x1a8
 [c0166a39] sys_write+0x4b/0x74
 [c0102c03] sysenter_past_esp+0x54/0x75

After this, the machine is rendered useless (probably due to the fact that 
disk IO isn't working anymore).

The lock debugging gives me this :

D   httpd:11635 [cc9b0550, 116] blocked on mutex: [ef7d33e8] 
{inode_init_once}
.. held by: httpd:  506 [d67e1000, 121]
... acquired at:   generic_file_aio_write+0x5c/0xfa 


I see similar things as mentioned in http://lkml.org/lkml/2006/1/10/64, 
with the difference that I'm not running software RAID or SATA (it's an 
Areca ARC-1110).

I haven't been able to reproduce it so far, it 'just' happens. Can someone 
give me a pointer on where to start looking ?

Erich, I've CC-ed you since the machine is running an Areca RAID config. 
It's also the only used disk subsystem in this machine.


Regards,


Igmar



2.4.x deadlocking

2001-06-21 Thread Igmar Palsenberg


Hi,

My colleague has a Lifetec 9888 laptop (PII) that suffers from IRQ
deadlocking :

His TI PCMCIA bridge puts itself on IRQ 12, and so does his mouse.
The system boots fine, but as soon as he types anything or moves his mouse,
the keyboard locks.
2.2.x doesn't seem to allocate an IRQ to the bridge, so no problems there.

Remote access is still possible, and the system works fine except that
his keyboard / mouse are dead :)
Not starting GPM prevents the lock.

Is there any way to tell the PCI subsystem to leave IRQ 12 alone ? The
pci_setup() routine seems to be pretty much a no-op.



Regards,

Igmar Palsenberg

-- 

Igmar Palsenberg
JDI Media Solutions

Boulevard Heuvelink 102
6828 KT Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




2.4.5 strange behaviour

2001-06-20 Thread Igmar Palsenberg


Hi,

2.4.5 keeps thinking I can change a CDROM 20 times a second or so.

System :

Compaq Armada 7360 DMT

Relevant stuff from dmesg :

Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
PCI_IDE: unknown IDE controller on PCI bus 00 device 71, VID=0e11,
DID=ae33
PCI: Device 00:0e.1 not available because of resource collisions
PCI_IDE: chipset revision 3
PCI_IDE: not 100% native mode: will probe irqs later
hda: IBM-DTCA-23240, ATA DISK drive
hdb: COMPAQ CRD-S88P, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: 6354432 sectors (3253 MB) w/468KiB Cache, CHS=788/128/63
hdb: ATAPI 8X CD-ROM drive, 256kB Cache
Uniform CD-ROM driver Revision: 3.12

It starts spitting out messages of the form

VFS: Disk change detected on device ide0(3,64)

The behaviour seems to be triggered by some event, and once that happens the
only way to resolve it is a reboot.


The second thing is that the IrDA subsystem seems to have stopped working. I'm
looking into that (irattach goes fine, detection also, but no contact with
the device itself).

I'm going back to 2.4.4 to see if things go better.


Igmar


-- 

Igmar Palsenberg
JDI Media Solutions

Boulevard Heuvelink 102
6828 KT Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




Netmos patch

2001-02-15 Thread Igmar Palsenberg


Hi,

Wrong patch. Attached is the (hopefully) correct one. Alternatively, replace
PCI_VENDOR_ID_NETMOS_9705 with PCI_DEVICE_ID_NETMOS_9705 in the previous patch.


Regards,


Igmar

-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar


--- linux/include/linux/pci.h.orig  Thu Feb 15 11:18:43 2001
+++ linux/include/linux/pci.h   Thu Feb 15 11:52:27 2001
@@ -1268,6 +1268,9 @@
 #define PCI_DEVICE_ID_INTERPHASE_5526  0x0004
 #define PCI_DEVICE_ID_INTERPHASE_55x6  0x0005
 
+#define PCI_VENDOR_ID_NETMOS   0x9710
+#define PCI_DEVICE_ID_NETMOS_9705  0x9705
+
 /*
  * The PCI interface treats multi-function devices as independent
  * devices.  The slot/function address of each device is encoded
--- linux/drivers/misc/parport_pc.c.orig  Thu Feb 15 11:49:00 2001
+++ linux/drivers/misc/parport_pc.c Thu Feb 15 11:53:21 2001
@@ -910,6 +910,8 @@
  { { 0, -1 }, } },
{ PCI_VENDOR_ID_OXSEMI, PCI_DEVICE_ID_OXSEMI_12PCI840, 1,
  { { 0, 1 }, } },
+   { PCI_VENDOR_ID_NETMOS, PCI_DEVICE_ID_NETMOS_9705, 1, 
+ { { 0, -1 }, } },
{ 0, }
};
 
--- linux/drivers/pci/oldproc.c.orig  Thu Feb 15 11:30:36 2001
+++ linux/drivers/pci/oldproc.c Thu Feb 15 11:30:06 2001
@@ -947,6 +947,7 @@
  case PCI_VENDOR_ID_TIGERJET:  return "TigerJet";
  case PCI_VENDOR_ID_ARK:   return "ARK Logic";
  case PCI_VENDOR_ID_SYSKONNECT: return "SysKonnect";
+ case PCI_VENDOR_ID_NETMOS: return "Netmos";
  default:  return "Unknown vendor";
}
 }
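
For readers unfamiliar with how such ID tables are used: the parport_pc.c
hunk adds a vendor/device pair to a table that ends in an all-zero entry.
What follows is only a simplified, stand-alone sketch of that kind of lookup,
not the actual parport_pc code. The names struct card_id, known_cards and
find_card are made up for illustration; only the two ID values come from the
patch.

```c
#include <stddef.h>
#include <stdint.h>

/* ID values as in the patch; everything else here is hypothetical. */
#define PCI_VENDOR_ID_NETMOS       0x9710
#define PCI_DEVICE_ID_NETMOS_9705  0x9705

struct card_id {
    uint16_t vendor;
    uint16_t device;
    int      numports;
};

/* Table terminated by an all-zero entry, like the "{ 0, }" line in the patch. */
static const struct card_id known_cards[] = {
    { PCI_VENDOR_ID_NETMOS, PCI_DEVICE_ID_NETMOS_9705, 1 },
    { 0, 0, 0 }
};

/* Return the matching entry, or NULL if the vendor/device pair is unknown. */
static const struct card_id *find_card(uint16_t vendor, uint16_t device)
{
    const struct card_id *c;

    for (c = known_cards; c->vendor != 0; c++) {
        if (c->vendor == vendor && c->device == device)
            return c;
    }
    return NULL;
}
```

The table entry refers to PCI_DEVICE_ID_NETMOS_9705, so a header that only
defines PCI_VENDOR_ID_NETMOS_9705 leaves that name undefined, which is why
the define had to be renamed.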



Netmos PCI parallel card

2001-02-15 Thread Igmar Palsenberg


Hi,

Attached is a patch to make a Netmos PCI parallel port card work.

Card is a PCI card with a Netmos 9705 controller and an Atmel serial
eeprom.



Regards,


Igmar


-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar


00:00.0 Class 0600: 8086:7122 (rev 03)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
Latency: 0 set

00:01.0 Class 0300: 8086:7123 (rev 03)
Subsystem: 8086:0200
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0 set
Interrupt: pin A routed to IRQ 11
Region 0: Memory at ec00 (32-bit, prefetchable)
Region 1: Memory at ffe8 (32-bit, non-prefetchable)
Capabilities: [dc] Power Management version 1
Flags: PMEClk- AuxPwr- DSI+ D1- D2- PME-
Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:1e.0 Class 0604: 8086:2418 (rev 01)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0 set
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: d000-dfff
Memory behind bridge: ffc0-ffcf
Prefetchable memory behind bridge: e7e0-e7ef
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-

00:1f.0 Class 0601: 8086:2410 (rev 01)
Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0 set

00:1f.1 Class 0101: 8086:2411 (rev 01) (prog-if 80 [Master])
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0 set
Region 4: I/O ports at ffa0

00:1f.2 Class 0c03: 8086:2412 (rev 01)
Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0 set
Interrupt: pin D routed to IRQ 5
Region 4: I/O ports at ef80

00:1f.3 Class 0c05: 8086:2413 (rev 01)
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Interrupt: pin B routed to IRQ 10
Region 4: I/O ports at efa0

01:05.0 Class 0200: 10ec:8029
Subsystem: 10ec:8029
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Interrupt: pin A routed to IRQ 10
Region 0: I/O ports at df80

01:0b.0 Class 0701: 9710:9705 (rev 01) (prog-if 02)
Subsystem: 1000:0010
Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Interrupt: pin A routed to IRQ 5
Region 0: I/O ports at df00



--- linux/include/linux/pci.h.orig  Thu Feb 15 11:18:43 2001
+++ linux/include/linux/pci.h   Thu Feb 15 11:52:27 2001
@@ -1268,6 +1268,9 @@
 #define PCI_DEVICE_ID_INTERPHASE_5526  0x0004
 #define PCI_DEVICE_ID_INTERPHASE_55x6  0x0005
 
+#define PCI_VENDOR_ID_NETMOS   0x9710
+#define PCI_VENDOR_ID_NETMOS_9705  0x9705
+
 /*
  * The PCI interface treats multi-function devices as independent
  * devices.  The slot/function address of each device is encoded
--- linux/drivers/misc/parport_pc.c.orig  Thu Feb 15 11:49:00 2001
+++ linux/drivers/misc/parport_pc.c Thu Feb 15 11:53:21 2001
@@ -910,6 +910,8 @@
  { { 0, -1 }, } },
{ PCI_VENDOR_ID_OXSEMI, PCI_DEVICE_ID_OXSEMI_12PCI840, 1,
  { { 0, 1 }, } },
+   { PCI_VENDOR_ID_NETMOS, PCI_DEVICE_ID_NETMOS_9705, 1, 
+ { { 0, -1 }, } },
{ 0, }
};
 
--- linux/drivers/pci/oldproc.c.orig  Thu Feb 15 11:30:36 2001
+++ linux/drivers/pci/oldproc.c Thu Feb 15 11:30:06 2001
@@ -947,6 +947,7 @@
  case PCI_VENDOR_ID_TIGERJET:  return "TigerJet";
  case PCI_VENDOR_ID_ARK:   return "ARK Logic";
  case PCI_VENDOR_ID_SYSKONNECT: return "SysKonnect";
+ case PCI_VENDOR_ID_NETMOS: return "Netmos";
  default:  return "Unknown vendor";
}
 }




Re: Vanilla 2.4.0 ext2fs error

2001-02-01 Thread Igmar Palsenberg

On Wed, 31 Jan 2001, bert hubert wrote:

> On Wed, Jan 31, 2001 at 06:21:04PM +0100, Igmar Palsenberg wrote:
> 
> > Jan 31 18:01:57 base kernel: EXT2-fs error (device ide0(3,71)):
> > ext2_new_inode:
> > reserved inode or inode > inodes count - block_group = 0,inode=1
> 
> does fsck run on this fs find any errors?

Haven't tried, but highly unlikely.. The FS was formatted about 20 seconds
before the error :)

I'll give it a try.

> Huge .sig!

I like them :)


Igmar




Re: Why isn't init PID 1?

2001-01-31 Thread Igmar Palsenberg

On Wed, 31 Jan 2001, Paul Powell wrote:

> Hello,
> 
> I have a bootable linux CD that runs a custom init. 
> Under most versions of linux init runs as process ID
> one.  Under my bootable CD, it runs as process ID 15. 
> I need it to run as PID 1 so that I can execute a
> kill(-1,15) without killing init.
> 
> The boot CD uses and initrd image to load drivers. 
> The linuxrc file looks like:
> 
> #!/bin/sash
> 
> aliasall
> 
> echo "Loading aic7xxx module"
> insmod /lib/aic7xxx.o
> echo "Loading ips module"
> insmod /lib/ips.o ips=ioctlsize:512000
> echo "Loading sg module"
> insmod /lib/sg.o
> echo "Loading FAT modules"
> insmod /lib/fat.o
> insmod /lib/vfat.o
> 
> echo "Mounting /proc"
> mount -t proc /proc /proc
> init
> umount /proc
> 
> Does it run as PID 15 because I execute insmod and
> mount before running init?

Yes. The first program to run gets PID 1. 

Solution : fork() in init and load the modules in the child.
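
A minimal sketch of that suggestion, assuming the custom init is a C program
started directly as the first process: it forks a child for each helper (for
example an insmod call), waits for it, and keeps running as PID 1 itself. The
helper name run_in_child is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run one helper command in a child process and wait for it to finish.
 * The caller (init) keeps its own PID; only the child gets a new one.
 * Returns the child's exit status, or -1 if it could not be run or
 * did not exit normally.
 */
static int run_in_child(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Init would call it once per module with an argument vector such as
{ "/sbin/insmod", "/lib/sg.o", NULL } (the insmod path is an assumption; the
module path is the one from the linuxrc above). According to the Linux
kill(2) man page, kill(-1, sig) does not signal process 1, which is why init
has to be PID 1 for the kill(-1,15) to spare it.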



Igmar




Vanilla 2.4.0 ext2fs errors

2001-01-31 Thread Igmar Palsenberg


Hi,

Can someone 'translate' this for me ?


Jan 31 18:01:57 base kernel: EXT2-fs error (device ide0(3,71)):
ext2_new_inode:
reserved inode or inode > inodes count - block_group = 0,inode=1

It's reproducible, but doesn't seem to cause any problems. It happens when,
on an (almost) empty FS, a link is attempted to a directory that doesn't
exist at that point.

I'll try 2.4.1 when I get back and see what that does..


Regards,


Igmar






Vanilla 2.4.0 ext2fs error

2001-01-31 Thread Igmar Palsenberg


Hi,

Can someone 'translate' this for me ?


Jan 31 18:01:57 base kernel: EXT2-fs error (device ide0(3,71)):
ext2_new_inode:
reserved inode or inode > inodes count - block_group = 0,inode=1

It's reproducible, but doesn't seem to cause any problems.
It happens when, on an (almost) empty FS, a link is attempted to a directory
that doesn't exist at that point.

I'll try 2.4.1 when I get back and see what that does..


Regards,


Igmar


-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




Re: winsock 2

2001-01-29 Thread Igmar Palsenberg

On Tue, 23 Jan 2001, Rajiv Majumdar wrote:

> 
> 
> does winsock support raw sockets?if not, how do we implement an "ip spoof"
> in winsock?

This is a LINUX mailing list, not a windblows one. Please post to a
windows list, not this one.

> rajiv


Igmar




Re: Documenting stat(2)

2001-01-20 Thread Igmar Palsenberg

On Thu, 18 Jan 2001, Mike Castle wrote:

> On Thu, Jan 18, 2001 at 09:52:02PM +0100, Igmar Palsenberg wrote:
> > I use lstat to check if a config file is a symlink, and if it is, it
> > refuses to open it. 
> 
> Nice race condition.

Agreed, but still better than opening things that are actually a
symlink. Someone will probably say : use the O_NOFOLLOW option, but
since I do other checks as well, that wouldn't be an option.
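
For comparison, one possible way to avoid the race window is to open first
and check afterwards, on the descriptor instead of the path. This is only a
sketch under assumptions: open_config is a made-up name, and the single
"other check" shown (regular file) stands in for whatever checks the real
code does.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open a config file without following a symlink in the last path
 * component, then run further checks on the descriptor itself.
 * Returns an open fd, or -1 on any failure.
 */
static int open_config(const char *path)
{
    struct stat st;
    int fd = open(path, O_RDONLY | O_NOFOLLOW);

    if (fd < 0)
        return -1;              /* includes the case where path is a symlink */

    /* fstat() describes the file that was actually opened */
    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because the checks run on the open descriptor, the file cannot be swapped
between the check and the open. O_NOFOLLOW only covers the final path
component, so symlinks in parent directories are still followed.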

> mrc


Igmar




Re: Documenting stat(2)

2001-01-18 Thread Igmar Palsenberg


> Nope stat should return the details of the symlink
> whereas lstat should return the details of the symlink target.

It's the other way around according to the manpage, and my code also says
it's the other way around.

It's logical the way it is.. 

I use lstat to check if a config file is a symlink, and if it is, it
refuses to open it. 
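
A small sketch of the difference, matching what the manpage says: lstat()
reports on the link itself, stat() on whatever the link points to. The
function names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if path itself is a symlink, 0 if not, -1 on error. */
static int is_symlink(const char *path)
{
    struct stat st;

    if (lstat(path, &st) < 0)   /* lstat() does not follow the link */
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}

/* Returns 1 if path resolves to a regular file, 0 if not, -1 on error. */
static int target_is_regular(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)    /* stat() follows the link */
        return -1;
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

For a symlink that points to a regular file, is_symlink() returns 1 while
target_is_regular() also returns 1, since stat() never sees the link itself.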



Igmar




Re: 2.4.0 + iproute2

2001-01-17 Thread Igmar Palsenberg


> The underlying problem is of course that all those sanity checks should
> be done in user space, not in the kernel.
> 
> (See also ftp://icaftp.epfl.ch/pub/people/almesber/slides/tmp-tc.ps.gz
> The bitching starts on slide 11, some ideas for fixing the problem on
> slide 16, but heed the warning on slide 15.)
> 
> Besides that, I agree that we have far too many EINVALs in the kernel.
> Maybe we should just record file name and line number of the EINVAL
> in *current and add an eh?(2) system call ;-)

I don't mind getting an error, but EINVAL gives very confusing
ones.. Like looking for your glasses when you already have them on.

I like the h_errno solution, but that's another glibc change.

> - Werner


Igmar




Re: detecting bounced mails

2001-01-17 Thread Igmar Palsenberg

On Wed, 17 Jan 2001, Rajiv Majumdar wrote:

> 
> 
> Sorry..the topic does not fit here. But wanted to know, how can we check
> validity of an email id "in advance"

You can't. The only thing you can check is whether the domain is valid and
will in theory accept mail; there is no way to check if it really delivers.

> so that we can skip "bounce".
> 
> Thanks!
> Rajiv


Igmar




2.4.0ac9

2001-01-15 Thread Igmar Palsenberg


Hi,

2.4.0ac9 still kills the mouse on this machine. dmesg is attached.
Something I find interesting is that the PCMCIA bridge is on IRQ12.

We can't change the mouse's or the PCMCIA bridge's interrupt.

I'll be happy to provide additional info.


Regards,

Igmar


Jan 16 08:33:06 mars kernel: Linux version 2.4.0-ac9 ([EMAIL PROTECTED]) 
(gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #6 Mon Jan 15 19:03:29 
CET 2001 
Jan 16 08:33:06 mars kernel: BIOS-provided physical RAM map: 
Jan 16 08:33:06 mars kernel:  BIOS-e820: 0009f800 @  (usable) 
Jan 16 08:33:06 mars kernel:  BIOS-e820: 0800 @ 0009f800 
(reserved) 
Jan 16 08:33:06 mars kernel:  BIOS-e820: 00015800 @ 000ea800 
(reserved) 
Jan 16 08:33:06 mars kernel:  BIOS-e820: 05f0 @ 0010 (usable) 
Jan 16 08:33:06 mars kernel:  BIOS-e820: 00015800 @ fffea800 
(reserved) 
Jan 16 08:33:06 mars kernel: On node 0 totalpages: 24576 
Jan 16 08:33:06 mars kernel: zone(0): 4096 pages. 
Jan 16 08:33:06 mars kernel: zone(1): 20480 pages. 
Jan 16 08:33:06 mars atd: atd startup succeeded
Jan 16 08:33:06 mars kernel: zone(2): 0 pages. 
Jan 16 08:33:07 mars kernel: Kernel command line: BOOT_IMAGE=test1 ro root=307 
sb=0x220,5,1 
Jan 16 08:33:07 mars kernel: Initializing CPU#0 
Jan 16 08:33:07 mars kernel: Detected 232.110 MHz processor. 
Jan 16 08:33:07 mars kernel: Console: colour VGA+ 80x25 
Jan 16 08:33:07 mars kernel: Calibrating delay loop... 462.02 BogoMIPS 
Jan 16 08:33:07 mars kernel: Memory: 94084k/98304k available (1252k kernel code, 3832k 
reserved, 508k data, 196k init, 0k highmem) 
Jan 16 08:33:07 mars kernel: Dentry-cache hash table entries: 16384 (order: 5, 131072 
bytes) 
Jan 16 08:33:07 mars kernel: Buffer-cache hash table entries: 4096 (order: 2, 16384 
bytes) 
Jan 16 08:33:07 mars kernel: Page-cache hash table entries: 32768 (order: 5, 131072 
bytes) 
Jan 16 08:33:07 mars kernel: Inode-cache hash table entries: 8192 (order: 4, 65536 
bytes) 
Jan 16 08:33:07 mars kernel: VFS: Diskquotas version dquot_6.5.0 initialized 
Jan 16 08:33:00 mars rc.sysinit: Mounting proc filesystem succeeded 
Jan 16 08:33:07 mars crond: crond startup succeeded
Jan 16 08:33:07 mars kernel: CPU: Before vendor init, caps: 0183f9ff  
, vendor = 0 
Jan 16 08:33:00 mars sysctl: net.ipv4.ip_forward = 0 
Jan 16 08:33:07 mars kernel: CPU: L1 I cache: 16K, L1 D cache: 16K 
Jan 16 08:33:00 mars sysctl: net.ipv4.conf.all.rp_filter = 1 
Jan 16 08:33:07 mars kernel: CPU: L2 cache: 512K 
Jan 16 08:33:00 mars sysctl: kernel.sysrq = 0 
Jan 16 08:33:08 mars kernel: Intel machine check architecture supported. 
Jan 16 08:33:08 mars inet: inetd startup succeeded
Jan 16 08:33:00 mars sysctl: error: 'net.ipv4.ip_always_defrag' is an unknown key 
Jan 16 08:33:08 mars kernel: Intel machine check reporting enabled on CPU#0. 
Jan 16 08:33:00 mars rc.sysinit: Configuring kernel parameters succeeded 
Jan 16 08:33:08 mars kernel: CPU: After vendor init, caps: 0183f9ff   
 
Jan 16 08:33:08 mars sshd: Starting sshd: 
Jan 16 08:33:00 mars date: Tue Jan 16 08:32:59 CET 2001 
Jan 16 08:33:08 mars kernel: CPU: After generic, caps: 0183f9ff   
 
Jan 16 08:33:00 mars rc.sysinit: Setting clock : Tue Jan 16 08:32:59 CET 2001 
succeeded 
Jan 16 08:33:08 mars kernel: CPU: Common caps: 0183f9ff    
Jan 16 08:33:00 mars rc.sysinit: Loading default keymap succeeded 
Jan 16 08:33:09 mars kernel: CPU: Intel Pentium II (Deschutes) stepping 02 
Jan 16 08:33:00 mars rc.sysinit: Activating swap partitions succeeded 
Jan 16 08:33:09 mars kernel: Enabling fast FPU save and restore... done. 
Jan 16 08:33:00 mars rc.sysinit: Setting hostname mars.office.jdimedia.nl succeeded 
Jan 16 08:33:09 mars kernel: Checking 'hlt' instruction... OK. 
Jan 16 08:33:00 mars fsck: /dev/hda7: clean, 50558/114016 files, 196128/227800 blocks 
Jan 16 08:33:10 mars sshd: sshd startup succeeded
Jan 16 08:33:10 mars sshd[557]: Server listening on 0.0.0.0 port 22.
Jan 16 08:33:10 mars kernel: POSIX conformance testing by UNIFIX 
Jan 16 08:33:00 mars rc.sysinit: Checking root filesystem succeeded 
Jan 16 08:33:10 mars sshd: 
Jan 16 08:33:10 mars sshd[557]: Generating 768 bit RSA key.
Jan 16 08:33:10 mars kernel: PCI: PCI BIOS revision 2.10 entry at 0xfd9e3, last bus=0 
Jan 16 08:33:00 mars rc.sysinit: Remounting root filesystem in read-write mode 
succeeded 
Jan 16 08:33:10 mars rc: Starting sshd succeeded
Jan 16 08:33:10 mars kernel: PCI: Using configuration type 1 
Jan 16 08:33:00 mars rc.sysinit: Finding module dependencies failed 
Jan 16 08:33:10 mars kernel: PCI: Probing PCI hardware 
Jan 16 08:33:00 mars depmod: depmod: Can't open /lib/modules/2.4.0-ac9/modules.dep for 
writing 
Jan 16 08:33:11 mars kernel: PCI: Using IRQ router PIIX [8086/7110] at 00:03.0 
Jan 16 

Re: 2.4.0 + iproute2

2001-01-14 Thread Igmar Palsenberg


> > Using textual strings means you can't use standard functions. An option
> > would be to extend the call so that if the userspace app wants to know
> > what really went wrong he can ask the kernel.
> 
> That will not work. Consider an application that has multiple rtnetlink
> sockets open, which each have own errors.

errno is only valid until the next syscall is made, so I don't see the
problem with multiple sockets: you can only perform one call at a time.
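
To illustrate the point about errno being short-lived, here is a small sketch. The function name open_reporting_error is invented for this example; the idea is only that the value gets copied right after the failing call, since any later library call may overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: open `path` read-only.
 * On failure, return -1 and store the errno value in *saved_err
 * before doing anything else that might change errno.
 * On success, return the file descriptor and set *saved_err to 0.
 */
int open_reporting_error(const char *path, int *saved_err)
{
    int fd;

    assert(path != NULL && saved_err != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        *saved_err = errno;                         /* copy first ... */
        fprintf(stderr, "open(%s) failed\n", path); /* ... this may clobber errno */
        return -1;
    }
    *saved_err = 0;
    return fd;
}
```

The same pattern applies per socket: as long as the error is saved right after each failing call, several open sockets do not get in each other's way.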


> rtnetlink is such a radical interface for unix, adding a few more changes
> for a different error reporting system probably does not make much difference.
> 
> my problem with keeping the textual error messages out of kernel is that
> it means that three entities (kernel module, number table in kernel and 
> external string table) need to be kept in sync. In practice that's usually
> not the case.

I wonder if glibc keeps its own copy of the sys_errlist[]. If it does,
that means that we indeed have a problem..
Maybe the kernel could provide an errno -> textual mapping, but that sounds
like bloat to me..
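
For reference, a sketch of how userspace normally gets the text for an error number: it asks the C library through strerror(). The helper name format_errno is made up here, and the exact message text depends on the C library in use.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "errno <number>: <text>" into buf.
 * The text comes from the C library via strerror(), not from the kernel.
 * Returns what snprintf() returns (the length the full string needs).
 */
int format_errno(int err, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "errno %d: %s", err, strerror(err));
}
```

So an error number that the C library does not know about gets a generic message, which is the sync problem being discussed.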

Another way is to have some kind of extended error.

> David's /proc/errno_strings would only require keeping kernel table and
> module in sync. 
> Text errors for rtnetlink would localize it to the module itself. 
> I could probably live with David's solution, although I would prefer the full
> way. 

The disadvantage of textual stuff is that you can't do more than print it.
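
What I mean is that a program can branch on a number, which it can't sensibly do on free-form text. A rough sketch, with an invented enum and function name, and with my own guess at which codes belong in which group:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical set of reactions a program might have to an error. */
enum next_step {
    STEP_RETRY,            /* transient, try the call again     */
    STEP_FIX_ARGUMENTS,    /* the caller passed something wrong */
    STEP_FEATURE_MISSING,  /* not available in this kernel      */
    STEP_GIVE_UP           /* anything else                     */
};

/*
 * Hypothetical helper: map a positive errno value (as seen in
 * userspace, not the kernel's negated -EINVAL form) to a reaction.
 */
enum next_step next_step_for(int err)
{
    assert(err > 0);
    switch (err) {
    case EINTR:
    case EAGAIN:
        return STEP_RETRY;
    case EINVAL:
        return STEP_FIX_ARGUMENTS;
    case ENOSYS:
    case EOPNOTSUPP:
        return STEP_FEATURE_MISSING;
    default:
        return STEP_GIVE_UP;
    }
}
```

With a string coming back from the kernel, the only portable thing left to do is print it.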

> -Andi


Igmar




Re: 2.4.0 + iproute2

2001-01-14 Thread Igmar Palsenberg


> People must be really suffering right now, and we ought to get
> /proc/errno_strings implemented as soon as possible... :-)

First the help describing large tables should be changed. It's wrong.
String errors don't belong in kernel space IMHO.

Igmar


-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




Re: 2.4.0 + iproute2

2001-01-14 Thread Igmar Palsenberg

On Sun, 14 Jan 2001, Andi Kleen wrote:

> On Sun, Jan 14, 2001 at 03:36:55AM -0800, David S. Miller wrote:
> >
> > Andi Kleen writes:
> >  > How would you pass the extended errors? As strings or as to be
> >  > defined new numbers? I would prefer strings, because the number
> >  > namespace could turn out to be as nasty to maintain as the current
> >  > sysctl one.
> >
> > Textual error messages for system calls never belong in the kernel.
> > Put it in glibc or wherever.
>
> This just means that a table needs to be kept in sync between glibc and
> netlink, and if someone e.g. gets a new CBQ module he would need to update
> glibc. It's also bad for maintainers, because patches for tables of number
> tend to always reject ;)

Agreed, but textual strings are bad. I want to be able to say :

if (error) {
    perror("RTNETLINK");
    return -1;
}

Using textual strings means you can't use standard functions. An option
would be to extend the call so that if the userspace app wants to know
what really went wrong he can ask the kernel.

In that case you can keep the -EINVAL, the namespace won't be polluted,
and you can see what goes wrong. The argument against this is that you need
another interface, which isn't portable.

>
> Textual error messages are e.g. used by plan9 and would be somewhat similar
> to /proc. It would probably waste a few bytes in the kernel, but that's not
> too bad, given the work it saves. e.g. rusty's code usually has a debug option
> that you can set and where each EINVAL outputs a error message; i always found
> that very useful and sometimes hacked that into other subsystems in my
> private tree.

Still means that all standard functions won't work.

> -Andi


Igmar

-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




Re: 2.4.0 + iproute2

2001-01-14 Thread Igmar Palsenberg

> Igmar Palsenberg writes:
>
>  > we might want to consider changing the error the call gives in case
>  > MULTIPLE_TABLES isn't set. -EINVAL is ugly, -ENOSYS should make the error
>  > more clear..
>
> How do I tell the difference between using the wrong system call
> number to invoke an ioctl or socket option change, and making a
> call for a feature I haven't configured into my kernel?

The large tables option is rather strange : looking at the name, I start
thinking that the feature is actually already there, and that this option
only enlarges the table.

When the kernel returns -EINVAL I start thinking that the call is actually
supported, but that userspace sends garbage. In this case, it sends
valid data, but the call isn't there.

I haven't had a really good look at the code, but we might change the
behaviour so that the call fails (same as when NETLINK isn't compiled in :
you get an error when creating the socket).

If this isn't possible (if we don't know what userspace wants when
creating the socket), it's a good idea to print an additional hint saying
'you might want to compile in the LARGE TABLES option'.

> I think ENOSYS is just a bad a choice.

Maybe time for an ENOTSUPPORTED or so ?

The config option says :

'If you have routing zones that grow to more than about 64 entries, you
may want to say Y here to speed up the routing process'

From which I assume that it just enlarges the table.

-ENOSYS is bad in this case indeed, but -EINVAL is also bad IMHO.



Regards,

    Igmar
-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




Re: 2.4.0 + iproute2

2001-01-14 Thread Igmar Palsenberg


>  > > You forgot to set CONFIG_IP_ADVANCED_ROUTER
>  >
>  > Nope. Still the same error after that one is set :
>  >
>  > CONFIG_IP_ADVANCED_ROUTER=y
>
> Try CONFIG_IP_MULTIPLE_TABLES.

Yep, that was the one..

We might want to consider changing the error the call gives in case
MULTIPLE_TABLES isn't set. -EINVAL is ugly, -ENOSYS should make the error
clearer..

> Later,
> David S. Miller
> [EMAIL PROTECTED]


Thanx,

Igmar


-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




Re: 2.4.0 + iproute2

2001-01-14 Thread Igmar Palsenberg

> On Sat, Jan 13, 2001 at 05:37:01PM +0100, Igmar Palsenberg wrote:
> > Hi,
> >
> > kernel : 2.4.0 vanilla
> > iproute2 version : ss001007
> >
> > After building I've got a few problems :
> >
> > ./ip rule list
> > RTNETLINK answers: Invalid argument
> > Dump terminated
>
> You forgot to set CONFIG_IP_ADVANCED_ROUTER

Nope. Still the same error after that one is set :

CONFIG_IP_ADVANCED_ROUTER=y

[root@base root]# ip rule list
RTNETLINK answers: Invalid argument
Dump terminated

According to net/ipv4/Config.in :

if [ "$CONFIG_IP_ADVANCED_ROUTER" = "y" ]; then
   define_bool CONFIG_RTNETLINK y
   define_bool CONFIG_NETLINK y

CONFIG_IP_ADVANCED_ROUTER just sets those two values, and adapts the
questions. To make sure I just recompiled with Advanced Router turned on,
and still the same error.

I tested the other subcommands of the ip command, and this one is the only
one that gives problems, the others are fine.




Regards,


Igmar



-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar




2.4.0 + iproute2

2001-01-13 Thread Igmar Palsenberg

Hi,

kernel : 2.4.0 vanilla
iproute2 version : ss001007

After building I've got a few problems :

./ip rule list
RTNETLINK answers: Invalid argument
Dump terminated

Version should be OK according to the Changes file.

config is attached


Regards,


Igmar

-- 

--
Igmar Palsenberg
JDI Media Solutions

Jansplaats 11
6811 GB Arnhem
The Netherlands

mailto: [EMAIL PROTECTED]
PGP/GPG key : http://www.jdimedia.nl/formulier/pgp/igmar


#
# Automatically generated by make menuconfig: don't edit
#
CONFIG_X86=y
CONFIG_ISA=y
CONFIG_UID16=y

#
# Processor type and features
#
CONFIG_M586TSC=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_X86_USE_STRING_486=y
CONFIG_X86_ALIGNMENT_16=y
CONFIG_X86_TSC=y
CONFIG_NOHIGHMEM=y

#
# General setup
#
CONFIG_NET=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_KCORE_ELF=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y

#
# Plug and Play configuration
#
CONFIG_PNP=y
CONFIG_ISAPNP=y

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_NBD=y

#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK=y
CONFIG_RTNETLINK=y
CONFIG_NETFILTER=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_SYN_COOKIES=y

#
#   IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=y
CONFIG_IP_NF_FTP=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=y
CONFIG_IP_NF_MATCH_MAC=y
CONFIG_IP_NF_MATCH_MARK=y
CONFIG_IP_NF_MATCH_MULTIPORT=y
CONFIG_IP_NF_MATCH_TOS=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=y
CONFIG_IP_NF_TARGET_REDIRECT=y
CONFIG_IP_NF_MANGLE=y
CONFIG_IP_NF_TARGET_TOS=y
CONFIG_IP_NF_TARGET_MARK=y
CONFIG_IP_NF_TARGET_LOG=y

#
# ATA/IDE/MFM/RLL support
#
CONFIG_IDE=y

#
# IDE, ATA and ATAPI Block devices
#
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_CMD640=y
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_IDE_MODES=y

#
# Network device support
#
CONFIG_NETDEVICES=y

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_NET_VENDOR_3COM=y
CONFIG_VORTEX=y
CONFIG_NET_PCI=y
CONFIG_8139TOO=y

#
# Ethernet (1000 Mbit)
#
CONFIG_PPP=y
CONFIG_PPP_ASYNC=y
CONFIG_PPP_DEFLATE=y

#
# ISDN subsystem
#
CONFIG_ISDN=y

#
# Passive ISDN cards
#
CONFIG_ISDN_DRV_HISAX=y
CONFIG_HISAX_EURO=y
CONFIG_HISAX_16_3=y

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_SERIAL=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256

#
# Mice
#
CONFIG_MOUSE=y
CONFIG_PSMOUSE=y

#
# File systems
#
CONFIG_QUOTA=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_PROC_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_EXT2_FS=y

#
# Network File Systems
#
# CONFIG_CODA_FS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_ROOT_NFS is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_SUNRPC=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y

#
# Partition Types
#
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y

#
# Native Language Support
#
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_CODEPAGE_850=y
CONFIG_NLS_ISO8859_1=y

#
# Console drivers
#
CONFIG_VGA_CONSOLE=y
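
A possible angle on the "ip rule list" failure reported with this config, offered as an assumption and not something the thread confirms: policy routing rules are generally understood to need CONFIG_IP_ADVANCED_ROUTER and CONFIG_IP_MULTIPLE_TABLES, and neither shows up in the config above. A minimal Python sketch that checks a .config for them (the helper names are made up for illustration):

```python
# Sketch: check a kernel .config for the options policy routing is assumed
# to need. The option list is an assumption, not taken from the thread.
REQUIRED = ("CONFIG_IP_ADVANCED_ROUTER", "CONFIG_IP_MULTIPLE_TABLES")


def enabled_options(config_text):
    """Return the set of options set to y or m in a .config file."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and "# CONFIG_FOO is not set" comments
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled, in the given order."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]


# A few lines copied from the networking section of the config above.
sample = """\
CONFIG_NETLINK=y
CONFIG_RTNETLINK=y
CONFIG_NETFILTER=y
CONFIG_INET=y
# CONFIG_ROOT_NFS is not set
"""

print(missing_options(sample))
```

If both names are reported as missing, enabling them and rebuilding would be the first thing to try; whether that is the actual cause here is untested.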



PS/2 mouse access kills keyboard

2001-01-13 Thread Igmar Palsenberg


Hi,

On plain vanilla 2.4.0, any mouse access kills the keyboard. The only way to
restore functionality is to kill gpm.

gpm writes 'protocol error' to syslog. I have access to this machine on
Monday, so I can post details then.

Changing the IRQ makes no difference; the machine works in 2.2.x with the
same config.


regards,


Igmar





Re: Message from new kernel

2001-01-11 Thread Igmar Palsenberg

On Thu, 11 Jan 2001, Nguyen Truong Sinh wrote:

> I am using Redhat 7.0 for my system. After install new kernel (2.4.0). My system
> always inform
> NET: 3 messages suppressed
> 
> What does it mean ? and how to fix it, I don't want it appears on the console at all.

man syslog

"Messages suppressed" means it didn't write all three messages, but a line
saying that the three messages were the same as the previous one in the
logs.

> 
> Thanks.


Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-10 Thread Igmar Palsenberg


> Probably you confused the proper way to use ibmsetmax with
> the proper way to use setmax. For setmax, and a Maxtor disk,
> you do not use a different machine, put the jumper to clip,
> now the boot succeeds, and you let Linux unclip.
> Either with a patched kernel that knows about these things
> or with a utility run from a boot script.
> (It is most convenient to have a partition boundary where
> the jumper clips, so that with old kernels and without running
> the utility you also have a valid filesystem.)

I found out yesterday after searching the mailing list... My bad.
Thanx for the info.

2.2.18 with the IDE patch has this support; 2.4.0 doesn't.


Regards,

Igmar




Re: Delay in authentication.

2001-01-09 Thread Igmar Palsenberg

On Mon, 8 Jan 2001, Scott Laird wrote:

> 
> Is syslog running correctly?  When syslog screws up, it very frequently
> results in this sort of problem.

Indeed, or missing DNS, when we are talking about remote logins.


Igmar




Re: Shared memory not enabled in 2.4.0?

2001-01-09 Thread Igmar Palsenberg


>  # cat /proc/meminfo
>           total:     used:    free:  shared: buffers:  cached:
>  Mem:  130293760 123133952  7159808        0 30371840 15179776
                                            ^^

It means shared process memory, not shm. 

One thing to watch : PowerTweak. Seems to set the max shm segments to 0
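
To make the column reading concrete, here is a small Python sketch that parses the 2.4-style "Mem:" summary line. It assumes the free and shared columns read 7159808 and 0 (total minus used gives exactly 7159808, which supports that split); the function name is made up for illustration.

```python
# Sketch: parse the 2.4-style summary line of /proc/meminfo into a dict.
HEADER = "total: used: free: shared: buffers: cached:"
# Assumption: free=7159808 and shared=0, see the note above.
MEM_LINE = "Mem:  130293760 123133952  7159808        0 30371840 15179776"


def parse_mem_line(header, mem_line):
    """Map each column name in the header to its integer value."""
    names = [h.rstrip(":") for h in header.split()]
    values = [int(v) for v in mem_line.split()[1:]]  # drop the "Mem:" label
    if len(names) != len(values):
        raise ValueError("column count mismatch")
    return dict(zip(names, values))


mem = parse_mem_line(HEADER, MEM_LINE)
print(mem["shared"])                              # 0
print(mem["total"] - mem["used"] == mem["free"])  # True
```

A zero in this column says nothing about whether SysV shm is available; that would have to be checked separately.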



Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-09 Thread Igmar Palsenberg


> > 2.2.18 sometimes sees 61 GB, sometimes 32 GB.
> > I don't call that hard to understand.
> 
> The same kernel has varying behaviour?
> Maybe not hard to understand, but rather surprising.
> You are the first to report nondeterministic behaviour.

You're not the only one who is surprised:

1) Put disk in my machine (target machine hangs itself with the disk)
2) Use setmax to set the limit to 32 GB
3) Put the disk in the target machine
4) System boots, Linux sees 64GB
5) Rebooted the system; it hangs because the soft clipping has 'vanished'

I even had occurrences of the kernel setmax message showing up, and after a
plain reboot it didn't.

It beats me.. I can't explain it, and the machine is rock solid now.



Igmar




Re: Delay in authentication.

2001-01-08 Thread Igmar Palsenberg

On Mon, 8 Jan 2001, Chris Meadors wrote:

> On Mon, 8 Jan 2001, David S. Miller wrote:
> 
> > This definitely seems like the classic "/etc/nsswitch.conf is told to
> > look for YP servers and you are not using YP", so have a look and fix
> > nsswitch.conf if this is in fact the problem.
> 
> What I have never gotten, is why on my machines (no specific distro, just
> everything built from source and installed by me) login takes a long time,
> unless I have portmap running.
> 
> My /etc/nsswitch.conf would seem to be right:
> 
> What else could affect that?

Check /etc/pam.d/login

Could be Kerberos that is biting you, although that doesn't explain the
portmap story.
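
For the nsswitch.conf check suggested in the quoted text, a rough Python sketch that lists which databases name NIS/YP as a source. The sample file content and the function name are hypothetical, and the "compat" source (which can also pull in NIS) is not handled.

```python
# Sketch: find nsswitch.conf databases that list nis or nisplus as a source.
def nis_databases(nsswitch_text):
    """Return database names whose source list includes nis or nisplus."""
    hits = []
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        database, sources = line.split(":", 1)
        # Skip tokens that open a bracketed action such as [NOTFOUND=return].
        names = [s for s in sources.split() if not s.startswith("[")]
        if "nis" in names or "nisplus" in names:
            hits.append(database.strip())
    return hits


# Hypothetical file content, not taken from the thread.
sample = """\
# /etc/nsswitch.conf
passwd: files nis
group:  files
hosts:  files dns
"""

print(nis_databases(sample))  # ['passwd']
```

On a real machine the text would come from reading /etc/nsswitch.conf; any database reported here is a candidate for the login delay when no YP server is reachable.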



Igmar







Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-08 Thread Igmar Palsenberg

On Sat, 6 Jan 2001 [EMAIL PROTECTED] wrote:

> > It's not that simple.. The Maxtor comes clipped, but Linux can't kill the
> > clip. So it sticks with 32 GB
> 
> > ibmsetmax.c does a software clip, but that bugs a bit. Sometimes even
> > Linux doesn't see 61 GB, but only 32, sometimes the full capacity.
> 
> Please don't talk vague useless garbage.
> There is no entity called "Linux". If you mean "the 2.4.0 kernel
> boot messages report 61 GB, fdisk 2.9s sees 32 GB, fdisk 2.10r sees 61 GB"
> then say so. If you mean something else, say what you mean.
> Precisely, with versions and everything.

2.2.18 sometimes sees 61 GB, sometimes 32 GB. I don't call that hard to
understand. And I don't use 2.4 on that machine, see previous posting. I
also mentioned that I use 2.2.18 with Andre's IDE patches.

> Since you have a Maxtor, my old setmax should suffice for you, it can kill
> the clip, and there is no reason to use ibmsetmax.c, that is a version for
> IBM disks. There should not be any need to use other machines.
> 
> If something changed for recent Maxtor disks, we would like to know,
> but only reliable, detailed reports are of any use.

It was probably the BIOS of this newer machine that somehow killed the
software clip. I can't explain it otherwise.

The setmax program initially gave errors, so that's why I switched to
ibmsetmax.

If the vague behaviour starts appearing again I'll debug the thing. For
now I blame the Award BIOS :)
 
> Andries


Igmar 




Re: Console logging

2001-01-06 Thread Igmar Palsenberg

On Sat, 6 Jan 2001, Mike wrote:

> Hi!
> 
> I am getting "/var/log/messages" on my console. It doesn't save
> it in /var/log.
> I have checked entries in /etc/syslog.conf file. It's correct.
> Can someone help me?

Syslog isn't running

> 
> Regards,
> Mike



Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-06 Thread Igmar Palsenberg


> > Sven, how did you kill the clipping ??
> > Or in generic, how do I kill the clipping ?
> 
> Go set the jumpers right. (anyhow, IBM drives are delivered unclipped,
> not sure why Maxtors seem to be)

It's not that simple.. The Maxtor comes clipped, but Linux can't kill the
clip. So it sticks with 32 GB

ibmsetmax.c does a software clip, but that is a bit buggy. Sometimes even
Linux doesn't see 61 GB, but only 32, sometimes the full capacity.
(I'm talking without the hardware jumper).

The machine I used to set the limit (the target machine doesn't boot without
the hardware clip) seems to reset the software clip. Probably the BIOS
does that.

It seems stable now, machine boots OK, and Linux sees 61GB. Let's hope it
will stay that way.


Regards,

Igmar






Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-05 Thread Igmar Palsenberg


> I had a similar situation except I was more interested in the performance
> difference. Went from ~4MB/s with the 430HX controller to ~12.5MB/s with
> the promise. This on an old Pentium system.

The network is 10 Mbit, so 4 MB/sec is no good in this case.
I've got the thing running, with (ibm)setmax. Don't put the disk in a
machine that does handle > 32 GB, because it will screw up the limit that
setmax just set.

> > > I solved the problem by getting a Promise Ultra 100 controller
> > > and putting the drive on that. Works perfectly under Linux 
> > > Mandrake 2.2.17-mdk-21 - it shows up as /dev/hde.  They are
> > > cheap controllers if you don't get the RAID version.
> > 
> > Thanx.. Will try that. New machine costs more.
> >  
> 
> Vanilla 2.2 kernels don't have this support (at least not as of 2.2.18).
> If you're not running Mandrake, grab Andre Hedrick's excellent ide patch.

Already installed :)

> One thing you may like to know. If you want the drives attached to the new
> controller to be /dev/hda..., then edit lilo.conf and add
>   append="pci=reverse"
> to your patched kernel entry. Oh, and if you ever need to bootstrap one of
> these puppies with a kernel that doesn't have the drivers, you can use
>   append="ide0=0xe000,0xd802 ide1=0xd400,0xd002"
> to be able to access the drive attached to the Promise controller using the
> standard ide driver.
> 
> Hope this helps.

Thanx. If I get any more problems I'll switch to a Promise controller. 2
days to set up a plain Linux box is a bit much..

The main problem I've had is that the software clipping is buggy, or that
the BIOS in the newer machine screws things up.

> Tim

Regards,


Igmar





Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-05 Thread Igmar Palsenberg


> No. 2.2.* handles large drives since 2.2.14.
> This looks more like you used the jumper to clip the drive to 32GB.
> Don't use it and get full capacity.
> If your BIOS hangs when it sees such a large drive so that you
> cannot avoid using the jumper, use setmax in your boot scripts,
> or use a kernel patch that does the same at kernel boot time.
> 
> >> Looks like some short int (2 bytes) overflowing. I'll try the ide patches.
> 
> The overflow is in certain BIOSes, not in Linux.
> (You see in the above: 65531 is not an overflow value.)

The number after clipping was actual - 2^16. Was the reason I was thinking
the kernel was playing games. After applying IDE patches the idesetmax
message showed up :) 
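
A quick check of the wraparound arithmetic in Python, using only the two cylinder counts posted at the start of this thread (the function name is made up). It suggests that a 16-bit truncation of 119112 would give 53576 and not the reported 65531, which fits the quoted remark that 65531 is not an overflow value.

```python
# Sketch: what a 16-bit wraparound of the real cylinder count would look like.
ACTUAL_CYLINDERS = 119112    # correct value quoted in the thread
REPORTED_CYLINDERS = 65531   # value from the 2.2.18 boot message


def wrap_16bit(value):
    """Value as it would appear after truncation to an unsigned 16-bit field."""
    return value & 0xFFFF


print(wrap_16bit(ACTUAL_CYLINDERS))                        # 53576
print(ACTUAL_CYLINDERS - 2**16)                            # 53576
print(wrap_16bit(ACTUAL_CYLINDERS) == REPORTED_CYLINDERS)  # False
```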

> > I had to recompile fdisk as my old suse 6.4 version got the same
> > 2byte-wraparound problem.
> 
> In the good old days the HDIO_GETGEOM ioctl would give you the disk
> geometry. It has a short for cylinders and hence overflows when C
> gets above 65535. Since geometry is on its way out - indeed, there has
> not been any such thing for many, many years - it would have been
> nonsense to introduce new ioctls that report meaningless 32-bit numbers
> instead of the present meaningless 16-bit number.
> So, instead, the "cylinder" field in the output of this ioctl has been
> declared obsolete, and is not used anymore. Programs that want to print
> some value, just because they always did and users expect something,
> now use BLKGETSIZE to get total size and divide by heads*sectors
> to get a cylinder value.
> (But again: this cylinder value is not used anywhere, the computed value
> is just for the user's eyes.)

Everything is block addressed indeed.. I need to look at fdisk, because it is
doing things wrong.
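
The size-divided-by-heads*sectors calculation described in the quoted text can be sketched in Python as follows. The total sector count is derived here from the CHS=119112/16/63 figure posted earlier, not read from a device, and the function name is made up.

```python
# Sketch: derive a display-only cylinder count from the total sector count,
# as the quoted text describes (total size divided by heads * sectors).
USHORT_MAX = 65535  # largest value an unsigned 16-bit cylinder field can hold


def display_cylinders(total_sectors, heads, sectors_per_track):
    """Cylinder count computed from total size; for the user's eyes only."""
    return total_sectors // (heads * sectors_per_track)


# Assumption: total size derived from the geometry quoted in the thread.
total_sectors = 119112 * 16 * 63

cylinders = display_cylinders(total_sectors, heads=16, sectors_per_track=63)
print(cylinders)               # 119112
print(cylinders > USHORT_MAX)  # True: would not fit in a 16-bit field
```

An fdisk built against the old 16-bit cylinder field cannot represent this value, which would explain why a recompile or newer version is needed.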

The other machine's BIOS can handle 64 GB without problems, so I can run
without clipping in that machine.

Linux sees the correct size, but fdisk still sees 32 GB. It probably needs a
recompile / upgrade.
 
> Andries


Regards,

Igmar




RE: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-04 Thread Igmar Palsenberg

On Thu, 4 Jan 2001, Torrey Hoffman wrote:

> I had exactly this problem with the Maxtor 61 GB drive on my 
> Pentium based server.  Theoretically a BIOS upgrade could fix it,
> but ASUS quit making BIOS upgrades for my motherboard two years
> ago.

Ah well, join the club in my case :)

> I solved the problem by getting a Promise Ultra 100 controller
> and putting the drive on that. Works perfectly under Linux 
> Mandrake 2.2.17-mdk-21 - it shows up as /dev/hde.  They are
> cheap controllers if you don't get the RAID version.

Thanx.. Will try that. New machine costs more.
 
> Best wishes.
> 
> Torrey Hoffman


Regards,

Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-04 Thread Igmar Palsenberg


> I didn't know something like that even existed :)
> 
> Just plugged the drive into the ide controller (single drive on a
> promise ata100 in a dec alpha) and it worked.

Ah.. This is a i386 machine, UDMA33 capable, and the bloody thing won't
boot with the clipping removed, and with clipping I can use only 32 GB :((

> But I'm booting from SCSI as the machine does not support IDE-drives in
> the "bios".

This machine the other way around :)
 

Regards,

Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-04 Thread Igmar Palsenberg

On Thu, 4 Jan 2001, Andre Hedrick wrote:

> 
> You have a hard destroke clipping on the drive.
> Go look at your logs.

Yeah.. I removed the clipping, and the machine won't boot. It halts after
PnP init. Any way to use full capacity with the clipping enabled ?


Regards,

Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-04 Thread Igmar Palsenberg



Hi,

Forget the question on killing the drive clipping. I forgot to RTFM :)


Regards,

Igmar




Re: 2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-04 Thread Igmar Palsenberg


> You have a hard destroke clipping on the drive.
> Go look at your logs.

Yep, logs indicate that.. 

Sven, how did you kill the clipping ??
Or in generic, how do I kill the clipping ?



Regards,

Igmar




2.2.18 and Maxtor 96147H6 (61 GB) part #2

2001-01-04 Thread Igmar Palsenberg


Hi,

Just tried the ide patches for 2.2.18, and same result :((

Any patches / suggestions I can try ??


Regards,


Igmar




2.2.18 and Maxtor 96147H6 (61 GB)

2001-01-04 Thread Igmar Palsenberg


Hi,

kernel 2.2.18 hates my Maxtor drive :

ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: Maxtor 96147H6, 32253MB w/2048kB Cache, CHS=65531/16/63, (U)DMA

Actual (correct) parameters : CHS=119112/16/63

Looks like some short int (2 bytes) overflowing. I'll try the ide patches.
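
To put numbers on the two geometries, a short Python sketch that converts CHS to capacity. It assumes 512-byte sectors and a made-up function name; the clipped geometry comes out at the 32253 MB from the boot message and the correct one at about 61 GB.

```python
# Sketch: capacity implied by a CHS geometry, assuming 512-byte sectors.
SECTOR_BYTES = 512  # assumption; not stated in the boot message


def chs_capacity_bytes(cylinders, heads, sectors_per_track):
    """Total bytes addressed by the given cylinders/heads/sectors geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES


clipped = chs_capacity_bytes(65531, 16, 63)   # geometry reported by 2.2.18
full = chs_capacity_bytes(119112, 16, 63)     # correct geometry

print(clipped // 2**20)  # 32253 (MB in units of 2^20 bytes, as in the log)
print(full // 10**9)     # 61 (GB in units of 10^9 bytes)
```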



Regards,


Igmar
