Re: khubd taking 100% CPU after improperly removing USB webcam

2007-01-16 Thread Jerome Lacoste

On 1/16/07, Oliver Neukum <[EMAIL PROTECTED]> wrote:

On Tuesday, 16 January 2007 10:10, Jerome Lacoste wrote:
> Hi,
>
> I unplugged my (second) webcam, forgetting to stop ekiga, and khubd is
> now taking 100% CPU.
>
> - lsusb doesn't return
> - /etc/init.d/udev restart didn't resolve the problem.
>
> Is that a problem one may want to investigate or should I just forget
> about it (the problem being caused by a user error)?

If you are using this driver
http://mxhaard.free.fr/download.html

then it is most likely hanging here:

        for (n = 0; n < SPCA50X_NUMFRAMES; n++)
                if (waitqueue_active(&spca50x->frame[n].wq))
                        wake_up_interruptible(&spca50x->frame[n].wq);
        if (waitqueue_active(&spca50x->wq))
                wake_up_interruptible(&spca50x->wq);
        gspca_kill_transfert(spca50x);
        PDEBUG(3, "Disconnect Kill isoc done");
        up(&spca50x->lock);
        while (spca50x->user)
                schedule();

This driver's disconnect handling is buggy. As this is an out of tree
driver, please contact the original author.


OK thanks for your answer.

I also found out that ekiga was still running. I killed it and that
stopped the hang.

Jerome
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


khubd taking 100% CPU after improperly removing USB webcam

2007-01-16 Thread Jerome Lacoste

Hi,

I unplugged my (second) webcam, forgetting to stop ekiga, and khubd is
now taking 100% CPU.

- lsusb doesn't return
- /etc/init.d/udev restart didn't resolve the problem.

Is that a problem one may want to investigate or should I just forget
about it (the problem being caused by a user error)?

Is there a way for me to:
- get more information about the problem? I cannot attach strace or gdb
to the kernel process.
- solve the issue without restarting the computer?

Also, should I do something particular before unplugging a USB device? On
Windows there's a tray program to stop the USB device; should I
just run some sort of lsof on the device to check whether it is safe to
remove it?

Cheers,

Jerome

[Please keep me CC'ed]


DETAILS


uname -a

Linux dolcevita 2.6.17-10-generic #2 SMP Tue Dec 5 22:28:26 UTC 2006
i686 GNU/Linux

lspci

00:00.0 Host bridge: Intel Corporation 82865G/PE/P DRAM
Controller/Host-Hub Interface (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82865G Integrated
Graphics Controller (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB
UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2
EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC
Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE
Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus
Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER
(ICH5/ICH5R) AC'97 Audio Controller (rev 02)
01:07.0 Serial controller: Rockwell International HCF 56k
Data/Fax/Voice/Spkp (w/Handset) Modem (rev 01)
01:0c.0 Ethernet controller: Intel Corporation 82540EM Gigabit
Ethernet Controller (rev 02)


dmesg

...
[17238199.732000] usb 3-1: new full speed USB device using uhci_hcd
and address 3
[17238199.924000] usb 3-1: configuration #1 chosen from 1 choice
[17238200.568000] drivers/media/video/spca5xx/spca5xx-main.c: USB SPCA5XX
camera found. Type Flexcam 100 (SPCA561A)
[17238200.576000] usbcore: registered new driver spca5xx
[17238200.576000] drivers/media/video/spca5xx/spca5xx-main.c: spca5xx
driver 00.57.08 registered
[17238225.18 ] pwc Dumping frame 269.
[17238225.28] pwc Dumping frame 270.
[17238225.38] pwc Dumping frame 271.
[17238225.48] pwc Dumping frame 272.
[17238225.58] pwc Dumping frame 273.
[17238225.68] pwc Dumping frame 274.
[17238225.78] pwc Dumping frame 275.
[17238225.88] pwc Dumping frame 276.
[17238225.98] pwc Dumping frame 277.
[17238226.08] pwc Dumping frame 278.
[17238226.18] pwc Dumping frame 279.
[17238226.28] pwc Dumping frame 280.
[17238230.68] pwc Closing video device: 325 frames received,
dumped 12 frames, 0 frames with errors.
[17238231.996000] pwc type = 720
[17238232.00] pwc type = 720
[17238232.00] pwc set_video_mode(160x120 @ 10, palette 15).
[17238232.00] pwc decode_size = 1.
[17238232.00] pwc Using alternate setting 1.
[17238232.888000] pwc type = 720
[17238232.892000] pwc type = 720
[17238232.892000] pwc set_video_mode(160x120 @ 10, palette 15).
[17238232.892000] pwc decode_size = 1.
[17238232.892000] pwc Using alternate setting 1.
[17238233.468000] pwc type = 720
[17238233.472000] pwc type = 720
[17238233.472000] pwc set_video_mode(160x120 @ 10, palette 15).
[17238233.472000] pwc decode_size = 1.
[17238233.472000] pwc Using alternate setting 1.
[17238233.632000] pwc type = 720
[17238233.632000] pwc type = 720
[17238233.632000] pwc set_video_mode(160x120 @ 10, palette 15).
[17238233.632000] pwc decode_size = 1.
[17238233.632000] pwc Using alternate setting 1.
[17238233.992000] pwc type = 720
[17238233.996000] pwc type = 720
[17238233.996000] pwc set_video_mode(160x120 @ 10, palette 15).
[17238233.996000] pwc decode_size = 1.
[17238233.996000] pwc Using alternate setting 1.
[17238234.736000] pwc type = 720
[17238234.736000] pwc type = 720
[17238234.736000] pwc set_video_mode(160x120 @ 10, palette 15).
[17238234.736000] pwc decode_size = 1.
[17238234.736000] pwc Using alternate setting 1.
[17238446.704000] usb 3-1: USB disconnect, address 3
[17238446.704000] drivers/media/video/spca5xx/spca5xx-main.c:
usb_submit_urb() ret -19
[17238446.704000] drivers/media/video/spca5xx/spca5xx-main.c:
usb_submit_urb() ret -19

(note that the pwc camera is still plugged in)
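
The "ret -19" in the last two spca5xx lines is a negated errno value. On Linux, 19 is ENODEV ("No such device"), which fits a URB being submitted after the disconnect. A small user-space sketch for decoding such values follows; kernel_ret_to_text is a hypothetical helper name, not a kernel or libc function.

```c
#include <errno.h>
#include <string.h>

/* Map a kernel-style return code (negative errno) to its message text.
 * Positive values are accepted too and treated as plain errno numbers. */
const char *kernel_ret_to_text(int ret)
{
    return strerror(ret < 0 ? -ret : ret);
}
```

Passing -19 to this helper yields the same text as strerror(ENODEV).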

Re: cache regressions with 2.6.1x ?

2005-08-24 Thread jerome lacoste
On 8/23/05, Andrew Morton <[EMAIL PROTECTED]> wrote:
> jerome lacoste <[EMAIL PROTECTED]> wrote:
> >
> > I am on a Dell Inspiron 8100 laptop with 512 M and 1G disk cache. I
> >  usually have at least 4 big applications running simultaneously: a
> >  Java IDE, firefox, firefox and X. All that under the Gnome desktop.
> >
> >  I've sometimes seen periods where my laptop goes kind of nuts. While
> >  the cpu is still at 0%, the workload goes to 100% (as shown in the
> >  gnome process monitor) (I haven't checked in other means, e.g. top or
> >  /proc info as my machine is unusable).
> >
> >  But with my latest upgrade to 2.6.12 from 2.6.10, the hanging happens
> >  much more often. It lasts for over 30 seconds.
> >
> >  Could this hanging be related to swapping?
> >  Have there been any VM regressions lately that would make a kernel less
> >  appropriate for desktop use?
> >  How can I investigate that further?
> 
> 10-20 lines of `vmstat 1' output while it's happening would help.

Here it goes. Maybe just some bad swapping?

[EMAIL PROTECTED]> vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  7 588164   7424  18612 106908   1373444   1012  8  2 85  5
 2  4 587996   6152  18624 108092  404  664   540  2892 1201  2631 70  9  0 21
 0 12 588276   5160  18620 109188  664 1244   860  1244 1195   615 46  5  0 50
 0 13 588140   4912  18628 109188  2160   216 8 1156   245  0  0  0 100
 0 17 588536   4892  18628 109972  132  576   132   576 1172   353 32  4  0 64
 0 16 589096   5016  18628 1101920  608 4   628 1169   247  7  2  0 91
 0 16 589780   5636  18632 1101360  716 0   808 1181   261  1  0  0 99
 0 11 590272   5388  18532 111548  168  820   176   820 1192   457 52  5  0 43
 0 11 590260   5140  18540 1115840  32036   332 1159   743 12  2  0 86
 4 10 590232   9756  17456 109304  100  240  1064   420 1333  2297 47 16  0 38
 1  6 590240  17440  16908 105680   72   80   460  2108 1266  2052 65 24  0 11
 1  5 590004  13596  16988 109060  5800  3380 0 1356  1743 13  8  0 79
 0  6 589776  10372  17032 110936  9680  1800  2924 1172  1057 19  4  0 77
 0  7 589368   6544  17104 112468 16520  2496   100 1202  1109  8  4  0 88
 1  4 589212   5676  17132 112232  4880  1204 0 1160  1092  6  3  0 91
 0  6 589032  12724  16772 107000  5880   844 0 1339  1444 36 12  0 52
 0  6 588664   8012  16792 108440 14480  2068 0 1637  1222 12  5  0 83
 0  7 588252   6464  16460 108900 18404  2036   236 1629  1156  8 13  0 79
 6  4 588040  13180  14696 107352  460  124   476  4608 1554  1644 70  9  0 21
 1  3 587792  11812  14412 108348  8480  109632 1404  2733 27  8  0 65
 0  5 587464   9332  1 109596 13800  1572 0 1159  1030 21  3  0 76
 0  6 586976   8836  14244 110488 1556   24  1960   684 1210  1562 16  7  0 77
 0  4 586684   6728  14288 111536  7480  1068   344 1175  1216 12  3  0 85
 0  9 586676   6232  14308 111908   960   33648 1185  1544  7  2  0 91
 0  6 586500  10860  13384 112364  7920  1108  4516 1163  1588 24  8  0 68
 0  5 586200   9024  13440 113272 13920  152812 1176  1019  8  4  0 88
 1  6 585848   5596  13456 114888 19680  204452 1171  1118 11  5  0 84
 0  6 585384   5968  12000 115972 14520  1484 0 1156   952 13  5  0 82
 0  6 584984   5224  11880 115800 19160  2276 0 1167   780  4  3  0 93
 0  6 584744   5148   9396 118836 11040  3352 0 1159   988 12  8  0 80
 0  6 584560   5996   8664 119776  9604  1492 4 1148   893 17  7  0 76
 0  7 584204   5396   8536 120912 10480  1716 0 1186  1118 12  3  0 85
 0  5 583964   5752   8036 121468  772   40   96440 1154  5811 19 12  0 69
 0  5 583608   5272   7532 121268 15000  1552   300 1156   784  3  2  0 95
 0  5 583496   5840   7344 120712  3960   948  8892 1175  1137 16  6  0 78
 0  8 583448   5004   6016 124748  1720  2616 4 1154  1027  9  6  0 85
 0  9 583396   4880   4156 130880   96   20  4604   812 1176  1077  4  5  0 91
 0  9 582780   4896   3836 130808 20120  2076 0 1184  1103  1  0  0 99
 0 10 582520   5020   3792 130328  9920  1120 4 1156   783  3  1  0 96
 0 10 581924   4896   3308 130360 15280  1712  1756 1183   762  6  2  0 92
 0  9 581916   5196   3256 130632   96   44   364   112 1159  1119 11  3  0 86
 0 13 581704   4896   3264 130796  8200  1672  3128 1171  1143  4  2  0 94
 0 11 581704   6384   3088 12952000 456 1159   706  4  6  0 90
 0 12 581628   5640   3128 130148  3280   716   612 1140   786  1  1  0 98
 0  7 581492   4880   2336 135580  5640  368492 1176  1788 16  7  0 77
 
> If lots of system time is being consumed then the next step i

Re: mass "tulip_stop_rxtx() failed", network stops

2005-08-23 Thread jerome lacoste
On 8/23/05, Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
> kernel, equipped with a onboard card that uses a tulip module:
> 
> 02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
> Ethernet 10/100 (rev 11)
> 
> No problem with those.
> 
> 
> We are running four more machines like that, the only difference is the
> kernel they are running (2.6.11.4).
> 
> On some of them, there are serious problems with a network, and they
> usually happen when the traffic is bigger than usual (i.e., some big
> software deployment to several workstations, remote backup, etc.).
> 
> The syslog is then full of entries like that:
> 
> Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
> timed out
> Aug 21 04:04:30 SERVER-B-HS kernel: :00:06.0: tulip_stop_rxtx() failed

I am seeing thousands of tulip_stop_rxtx() failed messages as well
with 2.6.11. No regular network failure though.

See http://kerneltrap.org/mailarchive/1/message/110291/flat

Cheers,

Jerome


cache regressions with 2.6.1x ?

2005-08-22 Thread jerome lacoste
Hi,

I am on a Dell Inspiron 8100 laptop with 512 M and 1G disk cache. I
usually have at least 4 big applications running simultaneously: a
Java IDE, firefox, firefox and X. All that under the Gnome desktop.

I've sometimes seen periods where my laptop goes kind of nuts. While
the cpu is still at 0%, the workload goes to 100% (as shown in the
gnome process monitor) (I haven't checked in other means, e.g. top or
/proc info as my machine is unusable).

But with my latest upgrade to 2.6.12 from 2.6.10, the hanging happens
much more often. It lasts for over 30 seconds.

Could this hanging be related to swapping?
Have there been any VM regressions lately that would make a kernel less
appropriate for desktop use?
How can I investigate that further?

Thanks

> cat /proc/meminfo 
MemTotal:   516220 kB
MemFree: 17720 kB
Buffers:  9412 kB
Cached:  67404 kB
SwapCached: 149584 kB
Active: 423072 kB
Inactive:37860 kB
HighTotal:   0 kB
HighFree:0 kB
LowTotal:   516220 kB
LowFree: 17720 kB
SwapTotal:  976712 kB
SwapFree:   487432 kB
Dirty: 520 kB
Writeback:   0 kB
Mapped: 405256 kB
Slab:22600 kB
CommitLimit:   1234820 kB
Committed_AS:  1793068 kB
PageTables:   3564 kB
VmallocTotal:   507896 kB
VmallocUsed: 26472 kB
VmallocChunk:   481268 kB
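
To put numbers on the swapping question, a minimal sketch that pulls values out of /proc/meminfo-style text and derives the swap in use. The function names meminfo_kb and swap_used_kb are illustrative only. With the figures above, SwapTotal minus SwapFree gives 489280 kB of swap in use on a machine with 516220 kB of RAM, and Committed_AS exceeds CommitLimit by 558248 kB.

```c
#include <stdio.h>
#include <string.h>

/* Return the kB value of `key` (e.g. "SwapTotal") in /proc/meminfo-style
 * text, or -1 if the key is missing or has no number after the colon.
 * Keys are matched at the start of a line only. */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;
    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');       /* advance to the next line */
        if (line)
            line++;
    }
    return -1;
}

/* Swap in use: SwapTotal - SwapFree, or -1 if either field is missing. */
long swap_used_kb(const char *text)
{
    long total = meminfo_kb(text, "SwapTotal");
    long free_kb = meminfo_kb(text, "SwapFree");
    return (total < 0 || free_kb < 0) ? -1 : total - free_kb;
}
```

Sampling this once a second while the hang is happening would show whether swap usage is actually moving, which complements the si/so columns of vmstat.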


> fdisk -l /dev/hda

Disk /dev/hda: 60.0 GB, 60011642880 bytes
16 heads, 63 sectors/track, 116280 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1       19376     9765472+   7  HPFS/NTFS
Partition 1 does not end on cylinder boundary.
/dev/hda2           19377      116280    48839616    5  Extended
Partition 2 does not end on cylinder boundary.
/dev/hda5           19377       21314      976720+  82  Linux swap / Solaris
/dev/hda6           21315       29064     3905968+  83  Linux
/dev/hda7           29065       36814     3905968+  83  Linux
/dev/hda8           36815      116280    40050832+  83  Linux
expresso:/home/jerome/Dev/CruiseControl/cruisecontrol#


thousands of "tulip_stop_rxtx() failed" errors

2005-08-22 Thread jerome lacoste
Using kernel 2.6.11 on Mandriva LE 2005, I am seeing a lot of tulip
errors in my logs:
   :02:09.0: tulip_stop_rxtx() failed

It doesn't seem to impact the performance, although it fills up my logs. 

The card is connected to the ADSL modem. Any idea as to what could be
causing this?
Bad cable?

The box is remote so I cannot do much with hardware right now.

Cheers,

Jerome


> ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:80:AD:20:1D:66
          inet6 addr: fe80::280:adff:fe20:1d66/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:242073 errors:4010 dropped:0 overruns:0 frame:0
          TX packets:231702 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:71151194 (67.8 Mb)  TX bytes:48721547 (46.4 Mb)
          Interrupt:3 Base address:0xd800

> uname -a
Linux localhost 2.6.11-12mdk-i586-up-1GB #1 Mon Jun 27 21:49:58 MDT
2005 i686 Pentium III (Coppermine) unknown GNU/Linux

> cat /proc/interrupts 
   CPU0   
  0:9415381  XT-PIC  timer
  1:   6557  XT-PIC  i8042
  2:  0  XT-PIC  cascade
  3: 602974  XT-PIC  eth0
  8:  1  XT-PIC  rtc
  9:3727677  XT-PIC  Ensoniq AudioPCI
 10: 66  XT-PIC  uhci_hcd
 11: 790910  XT-PIC  [EMAIL PROTECTED]::01:00.0
 12: 482453  XT-PIC  i8042
 14: 133760  XT-PIC  ide0
 15:  11898  XT-PIC  ide1
NMI:  0 
LOC:  0 
ERR:  0
MIS:  0

> lspci  
00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge
and Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82815 815 Chipset AGP Bridge (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (LPC) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82801BA/BAM USB (Hub #1) (rev 02)
00:1f.3 SMBus: Intel Corporation 82801BA/BAM SMBus (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128
PF/PRO AGP 4x TMDS
02:09.0 Ethernet controller: Davicom Semiconductor, Inc. 21x4x
DEC-Tulip compatible 10/100 Ethernet (rev 31)
02:0a.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host
Controller (rev 46)
02:0c.0 Multimedia audio controller: Ensoniq ES1371 [AudioPCI-97] (rev 09)



02:09.0 Ethernet controller: Davicom Semiconductor, Inc. 21x4x
DEC-Tulip compatible 10/100 Ethernet (rev 31)
 Subsystem: Unknown device 4554:434e
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR+ FastB2B-
 Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
SERR-

> mii-tool -v
eth0: negotiated 100baseTx-FD, link ok
  product info: vendor 00:60:6e, model 4 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control


Re: Environment variables inside the kernel?

2005-08-18 Thread jerome lacoste
[I doubt this is the right list to ask this question.]

On 8/18/05, Guillermo López Alejos <[EMAIL PROTECTED]> wrote:
> Hi,
> 
> I have a piece of code which uses environment variables. I have been
> told that it is not going to work in kernel space because the concept
> of environment is not applicable inside the kernel.
>
> I believe that, but I need to demonstrate it.

Is it me or does that sound like a school assignment? :)

> I do not know how to
> prove this, perhaps by referring to a solid reference about Linux design
> that points to the idea that it makes no sense to use environment
> variables in kernel space.
> 
> Does anyone know about the existence of such a document?

No.

But you should be able to answer your question by asking yourself:
- where do environment variables come from? See "man sh" or "man bash"
(in particular the ENVIRONMENT section)
- how are processes started? See "man init" (in particular the BOOTING section)
- where is your kernel space in all this?
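
The hints above can be made concrete with a small user-space sketch. The environment is an array of NAME=value strings handed to a process when it is exec'ed; it lives in that process's own memory and is read through libc (environ, getenv, setenv). The function names env_count and env_roundtrip are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

extern char **environ;   /* this process's environment block, set up at exec */

/* Count the NAME=value strings currently in this process's environment. */
int env_count(void)
{
    int n = 0;
    for (char **e = environ; e && *e; e++)
        n++;
    return n;
}

/* Set a variable and read it back through libc.
 * Returns 1 if the value read back matches, 0 otherwise. */
int env_roundtrip(const char *name, const char *value)
{
    if (setenv(name, value, 1) != 0)
        return 0;
    const char *got = getenv(name);
    return got != NULL && strcmp(got, value) == 0;
}
```

Everything here is per-process and goes through the C library. Code running in kernel space is not started by a shell via exec and links against no libc, so it has no environ to consult.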

Cheers,

Jerome


Re: 2.6.12.3 clock drifting twice too fast (amd64)

2005-08-16 Thread jerome lacoste
On 8/16/05, john stultz <[EMAIL PROTECTED]> wrote:
> On Tue, 2005-08-16 at 12:10 +0200, jerome lacoste wrote:
> > Installed stock 2.6.12.3 on a brand new amd64 box with an Asus extreme
> > AX 300 SE/t mainboard.
> >
> > I remember seeing a message in the boot saying something along the lines of:
> >
> >   "cannot connect to hardware clock."
> >
> > And now I see that the time is changing too fast (about 2 seconds each 
> > second).
> [snip]
> > :00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5951
> 
> Looks like the AMD/ATI bug.
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=3927

Sounds like it. I will have to try the patch.

Good catch John!

Jerome


Re: 2.6.12.3 clock drifting twice too fast (amd64)

2005-08-16 Thread jerome lacoste
On 8/16/05, Christoph Lameter <[EMAIL PROTECTED]> wrote:
> On Tue, 16 Aug 2005, jerome lacoste wrote:
> 
> > Installed stock 2.6.12.3 on a brand new amd64 box with an Asus extreme
> > AX 300 SE/t mainboard.

Oops, the mainboard is a Sapphire Axion XP200PA-A58SL. The
aforementioned name is the video card's...

> > I remember seeing a message in the boot saying something along the lines of:
> >
> >   "cannot connect to hardware clock."
> >
> > And now I see that the time is changing too fast (about 2 seconds each 
> > second).
> 
> The timer interrupt is probably called twice for some reason and therefore
> time runs twice as fast. Try using HPET for interrupt timing.

Sorry to sound stupid but how do you use HPET?

My latest kernel config has:

> grep HPET /usr/src/linux-2.6.12.3/.config
CONFIG_HPET_TIMER=y
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y

So it should be enabled, right?

The kernel config doc talks about a 'miscdevice' named /dev/hpec/

I have this
[EMAIL PROTECTED]:/usr/src/linux-2.6.12.3 # ls -la /dev/hpet 
crw-rw  1 root root 10, 228 Aug 16 15:17 /dev/hpet

I didn't find /usr/src/linux-2.6.12.3/Documentation/hpet.txt very
explicit on what to do. I managed to compile the example code after
tweaking it:

--- hpet.c.orig 2005-08-16 23:30:58.0 +0200
+++ hpet.c  2005-08-17 00:01:43.0 +0200
@@ -1,3 +1,6 @@
+/**
+ * Compile with  gcc -s -I/usr/src/linux-`uname -r`/include -Wall
-Wstrict-prototypes hpet.c -o hpet
+ */
 #include 
 #include 
 #include 
@@ -13,6 +16,8 @@
 #include 
 #include 
 #include 
+typedef u_int32_t u32;
+typedef u_int64_t u64;
 #include 
 
 
But as I don't know which device_name to specify (I tried /dev/hpet
and /dev/rtc), I am kinda stuck.


I've also tried rtctest.c (from the rtc documentation). When run, it
ends by displaying:
Typing "cat /proc/interrupts" will show 131 more events on IRQ 8.
But it only adds 117 interrupts for me.
And when run, this test program counts the seconds correctly, i.e.
its time does not run too fast.

I also see that the reported RTC time really differs from the time
returned by the date command. I am a little bit confused.

For example, with rtctestnew, a modified version of rtctest.c that displays
the current RTC time, registers an alarm 5 seconds ahead, waits for the
event, and then displays the new RTC time:

> date; ./rtctestnew ; date
Tue Aug 16 22:49:18 CEST 2005
Current RTC date/time is 17-8-2005, 03:03:20.
Alarm time now set to 03:03:25.
Waiting 5 seconds for alarm... okay. Alarm rang.
Current RTC date/time is 17-8-2005, 03:03:25.
 *** Test complete ***
Tue Aug 16 22:49:28 CEST 2005

The RTC time is now 7 hours ahead of the current time. The RTC time updates
correctly (+5 seconds) while date advances by 10 seconds. The timezone is
correct.
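
The comparison above boils down to a ratio of elapsed system time to elapsed RTC time. Here is a minimal sketch of that arithmetic, with made-up helper names (hms_to_seconds, clock_rate_ratio), assuming both intervals stay within one day. For the run above, date advanced from 22:49:18 to 22:49:28 (10 s) while the RTC advanced from 03:03:20 to 03:03:25 (5 s), giving a ratio of 2.0.

```c
#include <stdio.h>

/* Parse "HH:MM:SS" into seconds since midnight, or -1 on malformed input. */
long hms_to_seconds(const char *hms)
{
    int h, m, s;
    if (sscanf(hms, "%d:%d:%d", &h, &m, &s) != 3)
        return -1;
    return h * 3600L + m * 60L + s;
}

/* Ratio of system-clock time elapsed to RTC time elapsed over one interval.
 * 1.0 means the clocks agree; 2.0 means the system clock runs twice as fast.
 * Assumes neither interval crosses midnight. Returns -1.0 on bad input. */
double clock_rate_ratio(const char *sys_start, const char *sys_end,
                        const char *rtc_start, const char *rtc_end)
{
    long s0 = hms_to_seconds(sys_start), s1 = hms_to_seconds(sys_end);
    long r0 = hms_to_seconds(rtc_start), r1 = hms_to_seconds(rtc_end);
    if (s0 < 0 || s1 < 0 || r0 < 0 || r1 < 0 || r1 <= r0)
        return -1.0;
    return (double)(s1 - s0) / (double)(r1 - r0);
}
```

A longer alarm interval would make the measurement less sensitive to the one-second resolution of both timestamps.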

Any idea?

> > I don't have visual on the boot sequence anymore (only remote access).
> 
> Use serial console or netconsole. The boot information is logged. Try
> dmesg.

serial will be hard. The machine is 2500 km away with a non geek in front of it.

Can I use netconsole over ppp ?

Jerome


2.6.12.3 clock drifting twice too fast (amd64)

2005-08-16 Thread jerome lacoste
Installed stock 2.6.12.3 on a brand new amd64 box with an Asus extreme
AX 300 SE/t mainboard.

I remember seeing a message in the boot saying something along the lines of:

  "cannot connect to hardware clock."

And now I see that the time is changing too fast (about 2 seconds each second).

I don't have visual on the boot sequence anymore (only remote access).
Kernels earlier than 2.6.11 won't boot on the box (SATA chipset unsupported).

Some info below (probably irrelevant).

What should I try?

Linux manies 2.6.12.3 #1 Thu Jul 28 12:49:15 CEST 2005 x86_64 GNU/Linux

[EMAIL PROTECTED]:~$ lspci
:00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5951
:00:02.0 PCI bridge: ATI Technologies Inc: Unknown device 5a34
:00:11.0 RAID bus controller: ATI Technologies Inc: Unknown device 437a
:00:12.0 RAID bus controller: ATI Technologies Inc: Unknown device 4379
:00:13.0 USB Controller: ATI Technologies Inc: Unknown device 4374
:00:13.1 USB Controller: ATI Technologies Inc: Unknown device 4375
:00:13.2 USB Controller: ATI Technologies Inc: Unknown device 4373
:00:14.0 SMBus: ATI Technologies Inc: Unknown device 4372 (rev 04)
:00:14.1 IDE interface: ATI Technologies Inc: Unknown device 4376
:00:14.3 ISA bridge: ATI Technologies Inc: Unknown device 4377
:00:14.4 PCI bridge: ATI Technologies Inc: Unknown device 4371
:00:14.5 Multimedia audio controller: ATI Technologies Inc:
Unknown device 4370
:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
:01:00.0 VGA compatible controller: ATI Technologies Inc: Unknown
device 5b60
:01:00.1 Display controller: ATI Technologies Inc: Unknown device 5b70
:02:05.0 Multimedia video controller: Brooktree Corporation Bt878
Video Capture (rev 11)
:02:05.1 Multimedia controller: Brooktree Corporation Bt878 Audio
Capture (rev 11)
:02:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL-8139/8139C/8139C+ (rev 10)

[EMAIL PROTECTED]:~$ cat /proc/interrupts 
           CPU0
  0:   22456064    IO-APIC-edge  timer
  1:       6175    IO-APIC-edge  i8042
  7:          0    IO-APIC-edge  parport0
  8:          0    IO-APIC-edge  rtc
 12:     241004    IO-APIC-edge  i8042
 14:     201574    IO-APIC-edge  ide0
 16:         23   IO-APIC-level  bttv0
 17:     370224   IO-APIC-level  ATI IXP
 19:         33   IO-APIC-level  ohci_hcd:usb1, ohci_hcd:usb2, ehci_hcd:usb3
 20:     221385   IO-APIC-level  eth0
 21:          0   IO-APIC-level  acpi
 22:      65322   IO-APIC-level  libata
 23:          0   IO-APIC-level  libata
NMI:       9788
LOC:   11226651
ERR:         18


Re: [PATCH][1/2] SquashFS

2005-03-17 Thread jerome lacoste
On Tue, 15 Mar 2005 17:50:02 -0800, Junio C Hamano <[EMAIL PROTECTED]> wrote:
> > "PJ" == Paul Jackson <[EMAIL PROTECTED]> writes:
> 
> PJ> There is not a consensus (nor a King Penguin dictate) between the
> PJ> "while(1)" and "for(;;)" style to document.
> 
> FWIW, linux-0.01 has four uses of "while (1)" and two uses of
> "for (;;)" ;-).
> 
> ./fs/inode.c:   while (1) {
> ./fs/namei.c:   while (1) {
> ./fs/namei.c:   while (1) {
> ./kernel/sched.c:   while (1) {
> 
> ./init/main.c:  for(;;) pause();
> ./kernel/panic.c:   for(;;);
> 
> What is interesting here is that the King Penguin used these two
> constructs with consistency.  The "while (1)" form was used with
> normal exit routes with "if (...) break" inside; while the
> "for(;;)" form was used only in unusual "the thread of control
> should get stuck here forever" cases.
> 
> So, Phillip's decision to go back to his original while(1) style
> seems to be in line with the style used in the original Linux
> kernel ;-).
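
The convention described in the quoted text can be illustrated with a small sketch. This is not code from linux-0.01 or from SquashFS; both function names are invented. "while (1)" carries a loop with an ordinary exit via break, while "for (;;)" marks a spot where control is meant to stay forever.

```c
#include <stddef.h>

/* "while (1)" with a normal exit route: count entries up to a 0 terminator. */
size_t length_until_zero(const int *values)
{
    size_t n = 0;
    while (1) {
        if (values[n] == 0)
            break;          /* the expected way out of the loop */
        n++;
    }
    return n;
}

/* "for (;;)" as a dead end: control is meant to get stuck here forever,
 * as in a panic path. Shown for the idiom only; never returns if called. */
void hang_forever(void)
{
    for (;;)
        ;
}
```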

After the Penguin janitors, now come the Penguin archaeologists.

This is starting to get lemmingesque :)

J


Re: enabling IOAPIC on C3 processor?

2005-03-17 Thread jerome lacoste
On Wed, 16 Mar 2005 15:48:21 -0500, Lee Revell <[EMAIL PROTECTED]> wrote:
> On Wed, 2005-03-16 at 16:11 +0100, jerome lacoste wrote:
> > On Tue, 15 Mar 2005 15:22:36 -0500, Lee Revell <[EMAIL PROTECTED]> wrote:
> > > On Tue, 2005-03-15 at 13:09 +0100, jerome lacoste wrote:
> > > > I have a VIA Epia M1 board that crashes very badly (and pretty
> > > > often, especially when using DMA). I want to fix that.
> > > >
> > >
> > > Are the crashes associated with any particular workload or device?  My
> > > M6000 works perfectly.
> > >
> > > The one big problem I had with it is the VIA Unichrome XAA driver had a
> > > FIFO related bug that caused it to stall the PCI bus, delaying
> > > interrupts for tens of ms unless "Option NoAccel" was used.
> > >
> > > This bug was fixed over 6 months ago though.
> >
> > It crashes my box within minutes if not seconds when using mythtv
> > (tuner using ivtv driver) while using my network card. If I disable
> > DMA on the disk and don't use my card, it's much more stable (several
> > hours without problem).
> >
> 
> Well, you might have better luck capturing the Oops with kdb.  At the
> very least it might drop you into the debugger instead of locking up the
> machine.

It doesn't.
I patched and recompiled my kernel and made sure that
/proc/sys/kernel/kdb is set to 1.
The machine dies without kdb ever starting.

I guess I just need VIA to wake up now, right? No more bullets in my gun?

Jerome


Re: enabling IOAPIC on C3 processor?

2005-03-16 Thread jerome lacoste
On Tue, 15 Mar 2005 15:22:36 -0500, Lee Revell <[EMAIL PROTECTED]> wrote:
> On Tue, 2005-03-15 at 13:09 +0100, jerome lacoste wrote:
> > I have a VIA Epia M1 board that crashes very badly (and pretty
> > often, especially when using DMA). I want to fix that.
> >
> 
> Are the crashes associated with any particular workload or device?  My
> M6000 works perfectly.
> 
> The one big problem I had with it is the VIA Unichrome XAA driver had a
> FIFO related bug that caused it to stall the PCI bus, delaying
> interrupts for tens of ms unless "Option NoAccel" was used.
> 
> This bug was fixed over 6 months ago though.

It crashes my box within minutes if not seconds when using mythtv
(tuner using ivtv driver) while using my network card. If I disable
DMA on the disk and don't use my card, it's much more stable (several
hours without problem).

See this for more details:
http://forums.viaarena.com/messageview.aspx?catid=28&threadid=60131&enterthread=y

J


Re: enabling IOAPIC on C3 processor? (how to investigate hangs without nmi watchdog)

2005-03-15 Thread jerome lacoste
On Tue, 15 Mar 2005 13:34:55 +0100, Mikael Pettersson <[EMAIL PROTECTED]> wrote:
> jerome lacoste writes:
>  > I have a VIA Epia M1 board that crashes very badly (and pretty
>  > often, especially when using DMA). I want to fix that.
>  >
>  > Serial console + magic SysRQ didn't help so I am going the nmi
>  > watchdog way. But in order to have nmi watchdog I need APIC, right?
>  >
>  > The C3 processor seems to support IOAPIC.
>  > (http://www.via.com.tw/en/products/processors/c3/specs.jsp)
>  >
>  > But:
>  > - I don't see anything in the BIOS related to APIC.
>  > - grep APIC /lib/modules/`uname -r`/build/.config shows me that all
>  > APIC options are 'y'.
>  > - dmesg | grep APIC tells me "no local APIC present or hardware disabled".
>  > - adding lapic kernel parameter doesn't change that.
>  > - and of course, nmi_watchdog=1 or 2 gives me NMI count 0 in
>  > /proc/interrupts.
>  >
>  > Did I miss something when it comes to enabling IOAPIC support on C3
>  > processor?
> 
> Unless you have a pre-release engineering part for a future product,
> then your C3 has no local APIC, and hence no I/O APIC functionality.
> 
> I know some C3 specs pages list I/O APIC support, but if you look in
> the datasheets for current products you find zero APIC support.

My board is 2 years old (May 2003).

I've checked the specs [2] and they say (page 17 out of 83)
"APIC will be available in future steppings."  Yeah right...

Mine is stepping 1 according to /proc/cpuinfo.

So if I don't have APIC, that means I cannot use nmi_watchdog to
investigate the problem, right?
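
For illustration, one quick way to see what the kernel detected is to look for the "apic" flag in the "flags" line of /proc/cpuinfo. The sketch below is only that, a sketch: the helper name cpu_flags_has is made up here, and it assumes the caller has already read the flags line into a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `flag` appears as a whole
 * whitespace-separated word in `flags_line` (for example the "flags"
 * line read from /proc/cpuinfo), 0 otherwise.
 */
int cpu_flags_has(const char *flags_line, const char *flag)
{
    assert(flags_line != NULL && flag != NULL);

    size_t len = strlen(flag);
    const char *p = flags_line;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        /* The match must not be the tail of a longer word such as "x2apic". */
        int starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        char after = p[len];
        /* ...nor the head of a longer word. */
        int ends_word = after == '\0' || after == ' ' ||
                        after == '\t' || after == '\n';

        if (starts_word && ends_word)
            return 1;
        p++; /* keep searching after this partial match */
    }
    return 0;
}
```

A caller would read /proc/cpuinfo line by line and pass the line starting with "flags" to cpu_flags_has(line, "apic"); a result of 0 would be consistent with the "No local APIC present" message from dmesg.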

Do I have any alternative to investigate this hang or should I just
give up and smash my board?

Cheers,

Jerome

[2] http://www.via.com.tw/en/downloads/datasheets/processors/c3_nehemiah.zip


enabling IOAPIC on C3 processor?

2005-03-15 Thread jerome lacoste
I have a VIA Epia M1 board that crashes very badly (and pretty
often, especially when using DMA). I want to fix that.

Serial console + magic SysRQ didn't help so I am going the nmi
watchdog way. But in order to have nmi watchdog I need APIC, right?

The C3 processor seems to support IOAPIC.
(http://www.via.com.tw/en/products/processors/c3/specs.jsp)

But:
- I don't see anything in the BIOS related to APIC. 
- grep APIC /lib/modules/`uname -r`/build/.config shows me that all
APIC options are 'y'.
- dmesg | grep APIC tells me "no local APIC present or hardware disabled".
- adding lapic kernel parameter doesn't change that. 
- and of course, nmi_watchdog=1 or 2 gives me NMI count 0 in /proc/interrupts.

Did I miss something when it comes to enabling IOAPIC support on C3 processor?

Note: I also see an increasing ERR count in /proc/interrupts,
especially when I put my box under conditions that make it more
unstable (i.e. sending files over the network while using the PVR-350
with mythtv). Not sure if it is related.
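
To watch the NMI and ERR counters over time, the relevant lines of /proc/interrupts can be parsed and compared between two readings. The following is a minimal sketch under stated assumptions: the function name irq_line_total is hypothetical, and it assumes each line has the form "LABEL: count [count ...] [description]" with one count column per CPU.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` starts with "<label>:", return the sum of
 * the decimal counters that follow it (one per CPU). Return -1 if the line
 * does not carry that label.
 */
long irq_line_total(const char *line, const char *label)
{
    assert(line != NULL && label != NULL);

    size_t n = strlen(label);

    /* /proc/interrupts right-aligns labels, so skip leading blanks. */
    while (isspace((unsigned char)*line))
        line++;

    if (strncmp(line, label, n) != 0 || line[n] != ':')
        return -1;

    const char *p = line + n + 1;
    long total = 0;

    for (;;) {
        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break; /* end of line or start of the textual description */

        char *end;
        unsigned long v = strtoul(p, &end, 10);
        total += (long)v;
        p = end;
    }
    return total;
}
```

Calling this on the "NMI" and "ERR" lines at two points in time and comparing the totals shows whether the watchdog is ticking and how fast the error count grows. A description that begins with a digit would be miscounted, so the sketch is only suited to lines such as NMI and ERR.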

Jerome

[EMAIL PROTECTED]:~$ dmesg | grep APIC
No local APIC present or hardware disabled
mapped APIC to d000 (013c1000)

[EMAIL PROTECTED]:~$ grep APIC /lib/modules/`uname -r`/build/.config 
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y

Grub options
[...]
kernel  /vmlinuz-2.6.11-medios1 root=/dev/hda1 ro 
console=ttyS0,57600n8 console=tty0 nmi_watchdog=1 lapic
[...]


Re: oops / 2.6.11 / run_timer_softirq (mountvirtfs)

2005-03-11 Thread jerome lacoste
On Thu, 10 Mar 2005 20:59:43 -0800, Andrew Morton <[EMAIL PROTECTED]> wrote:
> jerome lacoste <[EMAIL PROTECTED]> wrote:
> >
> > On a VIA EPIA board, I got this single oops at boot. It wasn't stored in a
> > file so I had to take a screenshot with a digital camera. Basically it
> > goes along these lines:
> >
> > Process: S36mountvirtfs
> >
> > Call trace:
> >  run_timer_softirq+0x16f/0x200
> >  __do_softirq
> >  do_softirq
> >  irq_exit
> >  do_IRQ
> >  common_interrupt
> >
> > Process is found here on my system:
> >
> > lrwxr-xr-x  1 root root 21 Mar  1 00:29 /etc/rcS.d/S36mountvirtfs ->
> > ../init.d/mountvirtfs
> >
> > The exact screenshot (500k) can be found here:
> >
> > http://coffeebreaks.dyndns.org/~jerome/static/images/linux/oops_2.6.11_run_timer_softirq_boot.jpg
> >
> 
> An oops in cascade() is tricky.  Normally it means that some piece of code
> has done something bad with a kernel timer.  Later, a clock tick happens
> and the kernel falls over.  We're left with no hints as to which part of
> the kernel misbehaved.
> 
> Please try enabling CONFIG_DEBUG_SLAB and CONFIG_DEBUG_PAGEALLOC and see if
> that reveals any additional info.

A question: the oops happened once at boot time (out of hundreds of
boots), so it will probably be hard to reproduce.

As you may have seen in the picture, the screen was completely filled
with the oops information. How will the new CONFIG_ options help if
I don't get more information on the screen when it oopses?

> Apart from that, you have a lot of modules configured there.  Please try
> disabling them all, see if the oops goes away.  If it does then try
> re-enabling them, see if you can narrow it down to the one which is causing
> the timer list corruption.

If the problem reappears I will see what I can do.

Jerome

> Thanks.

Likewise.

J


oops / 2.6.11 / run_timer_softirq (mountvirtfs)

2005-03-10 Thread jerome lacoste
On a VIA EPIA board, I got this single oops at boot. It wasn't stored in a
file so I had to take a screenshot with a digital camera. Basically it
goes along these lines:

Process: S36mountvirtfs

Call trace:
 run_timer_softirq+0x16f/0x200
 __do_softirq
 do_softirq
 irq_exit
 do_IRQ
 common_interrupt

Process is found here on my system:

lrwxr-xr-x  1 root root 21 Mar  1 00:29 /etc/rcS.d/S36mountvirtfs ->
../init.d/mountvirtfs

The exact screenshot (500k) can be found here:

http://coffeebreaks.dyndns.org/~jerome/static/images/linux/oops_2.6.11_run_timer_softirq_boot.jpg

I can spend time copying the output into a file and running it through
ksymoops if someone wants it.
Otherwise I will go back to trying to fix the other problems, which happen
much more often...

J


Re: VIA Rhine ethernet driver bug

2005-03-09 Thread jerome lacoste
On Sat, 15 Jan 2005 12:43:33 +0100, Udo van den Heuvel <[EMAIL PROTECTED]> wrote:
> Hello,
> 
> On my firewall (VIA EPIA CL-6000 with VIA Rhine network chips running FC3
> and custom kernels) I see messages like:
> 
> Jan 13 19:35:46 epia kernel: eth1: Oversized Ethernet frame spanned multiple
> buffers, entry 0x4 length 0 status 0600!

This might be of interest to someone:

My VIA EPIA based machine was working well until a few minutes ago. I
accidentally unplugged the power supply and the machine rebooted. From
then on I had no network. The Ethernet card (VIA Rhine II, static
address, not DHCP) was not working and "Oversized" messages
were printed on the console.

I rebooted twice and the network still didn't come up. Pinging a
machine on my LAN, and pinging the box back, resulted in > 98% packet
loss.

So I shut the machine down, let it rest for a minute or so and booted
again. That solved the problem.

So if you see the same message as Udo, try powering your box off and
letting it rest for a minute.

Jerome


Re: Please open sysfs symbols to proprietary modules

2005-02-07 Thread jerome lacoste
On Mon, 7 Feb 2005 11:55:31 -0500 (EST), linux-os <[EMAIL PROTECTED]> wrote:
> On Mon, 7 Feb 2005, Chris Friesen wrote:
> 
> > Lee Revell wrote:
> >> On Wed, 2005-02-02 at 21:50 -0500, Kyle Moffett wrote:
> >>
> >>> It's not like somebody will have
> >>> some innate commercial advantage over you because they have your
> >>> driver source code.
> >>
> >>
> >> For a hardware vendor that's not a very compelling argument.  Especially
> >> compared to what their IP lawyers are telling them.
> >>
> >> Got anything to back it up?
> >
> > I have a friend who works for a company that does reverse-engineering of
> > ICs. Companies hire them to figure out how their competitor's chips work.
> > This is the real threat to hardware manufacturers, not publishing the chip
> > specs.
> >
> > Having driver code gives you the interface to the device.  That can be
> > reverse-engineered from watching bus traces or disassembling binary drivers
> > (which is how many linux drivers were originally written). Companies have
> > these kinds of resources.
> >
> > If you look at the big chip manufacturers (TI, Maxim, Analog Devices, etc.)
> > they publish specs on everything.  It would be nice if others did the same.
> >
> > Chris
> 
> I also have first-hand knowledge. Once there was a company called
> Data Precision. Just point your favorite search-engine to that
> name. They were a wholly owned subsidiary of Analogic. They
> no longer exist. Data Precision would take a year or more
> to develop a product. Six weeks after it was available on
> the market, it would have been cloned by Pacific-rim companies
> and dumped into the US at below US manufacturing cost.
> [..]

Shouldn't you be able to take legal action against companies that
provide such clones (at least in your country)? You could maybe even
sue the local resellers for participating in the fraud.

J


Re: Huge unreliability - does Linux have something to do with it?

2005-02-05 Thread jerome lacoste

On Fri, 4 Feb 2005 07:18:17 -0500, Wakko Warner <[EMAIL PROTECTED]> wrote:
> Please keep me CCd
> 
> jerome lacoste wrote:
> > particular hardware (Dell Inspiron 8100)? I run Linux on 3 other
> 
> I have this exact same laptop.  It works perfectly for me with linux.
> Originally started with a 2.4 kernel and recently went to 2.6.10.  The modem
> works well, the video card works well even with 3D accel.  I replaced the
> original 30gb hdd with a 40gb (for space reasons).  The only complaint about
> this thing I have is the fact they used an nvidia video chip.  I have seen
> more than 4 months uptime on it (I used to use it as a desktop)

I sometimes use it as a desktop. The thing is, as I never took the time
to get software suspend working, I'd rather leave it running all the
time than restart it every now and then.

While looking for a replacement disk, I've seen that some new disks
were "designed for continuous, 24/7 operation".

E.g. http://www6.tomshardware.com/storage/20030813/mini-harddisks-01.html

Not sure how good that is, but I will certainly look into it...

Thanks to all who answered. If you want to continue the discussion, it
may be better to take it off lkml now.

J


Re: Huge unreliability - does Linux have something to do with it?

2005-02-04 Thread jerome lacoste
>> Could a hardware failure look like bad sectors to fsck?
> 
> A failure of the bus or an earlier sporadic error can cause a defective fs,
> but normally you get a read error in fsck, not a structure error.
> 
> Are you using hdparm? is the system perhaps overheating or overclocked?

No overclocking.
hdparm is used, but I cannot tell you exactly what the config is (the
machine has now been running memtest for 1.5 hours). I don't think I use
any special options: probably the defaults in my config file (mult_sect 16,
dma on, write_cache off).

Overheating: perhaps. The machine is hot and runs many hours per
day (usually 12-16). The fans run very often, but it has always
been like that. I've tried to control the fans, but then the
temperature rises very quickly. So I let the fans run.


Huge unreliability - does Linux have something to do with it?

2005-02-04 Thread jerome lacoste
[Sorry for the sensational title]

I have had this laptop for three years. It ran Linux (Debian unstable)
from the start and its hardware has been very unreliable: I changed
hard disks twice and the motherboard thrice. My DVD drive started
failing some days ago (this one is 'original', 3 years old). But I
don't mind as I am not under warranty anymore... This morning the
machine booted with fsck errors on my hard disk. I am not sure if I
did the right thing, but I chose to clear the inodes, and I ended up
losing some programs(*) (du, dircolors, etc.). The day starts well,
doesn't it? Sounds like I will have to switch disks again...

I halted the machine correctly yesterday night. I never dropped the
box in 3 years. Am I just being unlucky? Or could the fact that I am
using Linux on the box affect the reliability in some ways on that
particular hardware (Dell Inspiron 8100)? I run Linux on 3 other
computers and never had a single problem with them.

How can the file system (ext3) be messed up the way it was this
morning after I stopped the machine correctly yesterday?
Could a hardware failure look like bad sectors to fsck?

Attached is the output of smartctl -a /dev/hda, in case it helps.

Jerome

(*) Tips on finding out which files have been removed from my system,
and maybe recovering them, are welcome.

smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     HITACHI_DK23FB-60
Serial Number:    1ZX822
Firmware Version: 00M0A0C1
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   5
ATA Standard is:  ATA/ATAPI-5 T13 1321D revision 3
Local Time is:    Fri Feb  4 09:53:50 2005 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (2150) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  37) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000d   091   090   050    Pre-fail  Offline  -           412316862542
  2 Throughput_Performance  0x0005   100   092   050    Pre-fail  Offline  -           3140
  3 Spin_Up_Time            0x0007   100   100   050    Pre-fail  Always   -           0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           388
  5 Reallocated_Sector_Ct   0x0033   095   095   010    Pre-fail  Always   -           142
  7 Seek_Error_Rate         0x000f   100   100   050    Pre-fail  Always   -           651
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline  -           1125
  9 Power_On_Minutes        0x0032   095   095   000    Old_age   Always   -           2512h+02m
 10 Spin_Retry_Count        0x0013   100   100   050    Pre-fail  Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           361
191 G-Sense_Error_Rate      0x000a   100   068   000    Old_age   Always   -           17536255
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always   -           25
193 Load_Cycle_Count        0x0032   092   092   000    Old_age   Always   -           53398/53372
194 Temperature_Celsius     0x0022   072   036   000    Old_age   Always   -           54 (Lifetime Min/Max 72/13)
195 Hardware_ECC_Recovered  0x001a   100   001   000    Old_age   Always   -           4189
196 Reallocated_Event_Count 0x0032   086   086   000    Old_age   Always   -           142
197 Current_Pending_Sector  0x0032   097   094   000    Old_age   Always   -           3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0013  
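
As a reading aid for the table above: smartctl considers an attribute failed when its normalized VALUE is less than or equal to THRESH, and reports in WHEN_FAILED whether that is the case now or was in the past (judged from WORST). The sketch below applies that rule to a few rows; the struct and function names are hypothetical, and the raw counts (such as 142 reallocated and 3 pending sectors) are deliberately outside what this check covers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record holding the normalized columns of one table row. */
struct smart_attr {
    int id;      /* ID# column */
    int value;   /* VALUE column (normalized) */
    int worst;   /* WORST column (normalized) */
    int thresh;  /* THRESH column */
};

/* 1 if the attribute is failing now: VALUE <= THRESH. */
int smart_failing_now(const struct smart_attr *a)
{
    assert(a != NULL);
    return a->value <= a->thresh;
}

/* 1 if the attribute has failed at some point: WORST <= THRESH. */
int smart_failed_in_past(const struct smart_attr *a)
{
    assert(a != NULL);
    return a->worst <= a->thresh;
}

/* Count attributes that are failing now or have failed in the past. */
size_t smart_count_failed(const struct smart_attr *attrs, size_t n)
{
    size_t failed = 0;

    for (size_t i = 0; i < n; i++)
        if (smart_failing_now(&attrs[i]) || smart_failed_in_past(&attrs[i]))
            failed++;
    return failed;
}
```

Fed with the rows listed above, none of the attributes crosses its threshold, which agrees with the "-" entries in WHEN_FAILED and the overall PASSED result. That does not say the disk is healthy; it only says no vendor threshold has been reached.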

Re: [patch 1/1] fix syscallN() macro errno value checking for i386

2005-01-30 Thread jerome lacoste
On Sun, 30 Jan 2005 18:00:22 +0100, Arnd Bergmann <[EMAIL PROTECTED]> wrote:
> On Saturday 29 January 2005 02:01, [EMAIL PROTECTED] wrote:
> >
> > From: Paolo 'Blaisorblade' Giarrusso <[EMAIL PROTECTED]>
> > Cc: David Howells <[EMAIL PROTECTED]>
> >
> > The errno values which are visible for userspace are actually in the range
> > -1 - -129, not until -128 (): this value was added:
> >
> > #define EKEYREJECTED 129 /* Key was rejected by service */
> >
> > And this would break ucLibc (for what I heard).
> >
> > This is just a quick-fix, because putting a macro inside errno.h instead of
> > having it copied in two places would be probably nicer.
> 
> Yes. Note that your patch only fixes the bug on i386. The code has been
> copied to many other architectures, and some of them have been updated
> less recently and are checking for values lower than 128. There should
> really be a way to keep them all in sync.


What about something along these lines?

#define EKEYNEXT 130 /* key counter */

and 

 if ((unsigned long)(res) >= (unsigned long)(-EKEYNEXT)) {
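
To make the range check concrete, here is a minimal userspace sketch of the unsigned comparison trick. The names SKETCH_MAX_ERRNO and sketch_syscall_return are hypothetical and this is not the kernel's macro; it only assumes, as stated in this thread, that 129 is the largest errno value visible to userspace.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical constant: largest errno value visible to userspace. */
#define SKETCH_MAX_ERRNO 129

static_assert(SKETCH_MAX_ERRNO > 0, "errno values are positive");

/*
 * Turn a raw syscall result into the usual userspace convention.
 * Cast to unsigned long, the values -1 .. -SKETCH_MAX_ERRNO become the
 * top SKETCH_MAX_ERRNO values of the range, so a single comparison
 * separates them from every valid result (including large addresses).
 */
long sketch_syscall_return(long res)
{
    if ((unsigned long)res >= (unsigned long)(-SKETCH_MAX_ERRNO)) {
        errno = (int)-res;
        return -1;
    }
    return res;
}
```

With a constant that holds the largest valid errno, ">=" is the right comparison. If the shared constant were instead defined as one past the largest value (130 here), the comparison would need to be a strict ">", otherwise -130 would also be treated as an error.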

JL