Bug#595927: closed by Guillem Jover guil...@debian.org (Bug#600075: fixed in dpkg 1.15.8.6)

2010-11-26 Thread stephen mulcahy
FYI - I just manually installed 1.15.8.6 on one of our systems that had
been exhibiting this problem and then ran a significant aptitude update
and upgrade and didn't see any occurrences of the problem - so this looks
positive.


It would be good to see this update migrate into squeeze asap.

Thanks for your efforts on this - much appreciated.

-stephen

On 25/11/10 07:03, Debian Bug Tracking System wrote:

This is an automatic notification regarding your Bug report
which was filed against the dpkg package:

#600075: linux-image-2.6.32-5-amd64: task dpkg blocked for more than 120 seconds

It has been closed by Guillem Jover guil...@debian.org.

Their explanation is attached below along with your original report.
If this explanation is unsatisfactory and you have not received a
better one in a separate message then please contact Guillem
Jover guil...@debian.org by
replying to this email.





--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com



--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#595927: Related kernel bug?

2010-09-08 Thread stephen mulcahy


Is this possibly related to http://lkml.org/lkml/2010/2/12/41 ?
If so, there seems to be a patch available.

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com






Bug#595927: linux-image-2.6.32-5-amd64: task dpkg blocked for more than 120 seconds

2010-09-07 Thread stephen mulcahy
Package: linux-2.6
Version: 2.6.32-21
Severity: important


I've seen this problem on a number of machines with different processor arch 
(intel and amd). I guess it can be triggered by more than just dpkg but this is 
the case I've seen most.

dmesg contains the following

[60600.580047] INFO: task dpkg:21101 blocked for more than 120 seconds.
[60600.580078] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[60600.580123] dpkg  D 0002 0 21101  20880 0x
[60600.580127]  88022f0754c0 0082 000e2c86ddb8 
88002c86de98
[60600.580132]    f9e0 
88002c86dfd8
[60600.580135]  00015780 00015780 88022ca99c40 
88022ca99f38
[60600.580138] Call Trace:
[60600.580148]  [81106667] ? bdi_sched_wait+0x0/0xe
[60600.580151]  [81106670] ? bdi_sched_wait+0x9/0xe
[60600.580155]  [812f81d6] ? __wait_on_bit+0x41/0x70
[60600.580158]  [81106667] ? bdi_sched_wait+0x0/0xe
[60600.580161]  [812f8270] ? out_of_line_wait_on_bit+0x6b/0x77
[60600.580165]  [81063910] ? wake_bit_function+0x0/0x23
[60600.580168]  [811066e8] ? sync_inodes_sb+0x73/0x12a
[60600.580171]  [8110a255] ? __sync_filesystem+0x4b/0x70
[60600.580174]  [8110a314] ? sync_filesystems+0x9a/0xe3
[60600.580176]  [8110a3a2] ? sys_sync+0x1c/0x2e
[60600.580181]  [81010b42] ? system_call_fastpath+0x16/0x1b

The operation seems to complete eventually, but it definitely stalls for a long
time. This problem occurs on both lightly loaded and heavily loaded systems.

Let me know what other diagnostics I can run.
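
For checking other machines for the same symptom, here is a minimal Python
sketch that pulls hung-task warnings out of dmesg-style text. It is not part
of the original report: the function name find_hung_tasks is made up for
illustration, and the pattern assumes the warning is worded exactly as in the
dmesg excerpt above.

```python
import re

# Matches kernel hung-task warnings of the form seen in the report, e.g.
# [60600.580047] INFO: task dpkg:21101 blocked for more than 120 seconds.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (timestamp, command, pid, seconds) tuples."""
    hits = []
    for m in HUNG_TASK_RE.finditer(log_text):
        hits.append((float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"])))
    return hits
```

Feeding it the output of dmesg (or the contents of a kernel log file) gives
one tuple per warning, which makes it easy to see which tasks stall and how
often.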

-- Package-specific info:
** Version:
Linux version 2.6.32-5-amd64 (Debian 2.6.32-21) (b...@decadent.org.uk) (gcc 
version 4.3.5 (Debian 4.3.5-2) ) #1 SMP Wed Aug 25 13:59:41 UTC 2010

** Command line:
BOOT_IMAGE=/boot/vmlinuz-2.6.32-5-amd64 
root=UUID=316c4383-192d-4a78-af15-844a0c92f1cf ro quiet

** Not tainted

** Kernel log:
[1.392558] usbhid: v2.6:USB HID core driver
[1.427468] EXT4-fs (sda1): mounted filesystem with ordered data mode
[2.062190] udev: starting version 160
[2.199893] processor LNXCPU:00: registered as cooling_device0
[2.200512] processor LNXCPU:01: registered as cooling_device1
[2.201043] processor LNXCPU:02: registered as cooling_device2
[2.201545] processor LNXCPU:03: registered as cooling_device3
[2.252882] parport_pc 00:0c: reported by Plug and Play ACPI
[2.252984] parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE,EPP]
[2.337007] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[2.343687] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[2.357725] input: PC Speaker as /devices/platform/pcspkr/input/input3
[2.389002] input: Power Button as 
/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/PNP0C0C:00/input/input4
[2.389006] ACPI: Power Button [PWRB]
[2.389054] input: Power Button as 
/devices/LNXSYSTM:00/LNXPWRBN:00/input/input5
[2.389058] ACPI: Power Button [PWRF]
[2.468557] [drm] Initialized drm 1.1.0 20060810
[2.469964] i801_smbus :00:1f.3: PCI INT B -> GSI 19 (level, low) -> IRQ 
19
[2.487561] EDAC MC: Ver: 2.1.0 Aug 25 2010
[2.500914] Error: Driver 'pcspkr' is already registered, aborting...
[2.512615] dca service started, version 1.12.1
[2.530391] EDAC MC0: Giving out device to 'i5000_edac.c' 'I5000': DEV 
:00:10.0
[2.530411] EDAC PCI0: Giving out device to module 'i5000_edac' controller 
'EDAC PCI controller': DEV ':00:10.0' (POLLED)
[2.531953] intel_rng: FWH not detected
[2.549459] ioatdma: Intel(R) QuickData Technology Driver 4.00
[2.549518] ioatdma :00:08.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[2.549541] ioatdma :00:08.0: setting latency timer to 64
[2.549569]   alloc irq_desc for 54 on node -1
[2.549571]   alloc kstat_irqs on node -1
[2.549580] ioatdma :00:08.0: irq 54 for MSI/MSI-X
[2.587480] [drm] radeon defaulting to userspace modesetting.
[2.587671] pci :07:01.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[2.589626] [drm] Initialized radeon 1.32.0 20080528 for :07:01.0 on 
minor 0
[3.173205] Adding 15624184k swap on /dev/sda5.  Priority:-1 extents:1 
across:15624184k 
[3.439599] loop: module loaded
[3.908196] kjournald starting.  Commit interval 5 seconds
[3.908493] EXT3 FS on sda6, internal journal
[3.908497] EXT3-fs: mounted filesystem with ordered data mode.
[3.948201] EXT4-fs (sdb1): mounted filesystem with ordered data mode
[4.724683] e1000e :04:00.0: irq 52 for MSI/MSI-X
[4.780055] e1000e :04:00.0: irq 52 for MSI/MSI-X
[4.781225] ADDRCONF(NETDEV_UP): eth0: link is not ready
[7.424986] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
None
[7.428289] ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   17.508003] eth0: no IPv6 routers present
[   63.392030] usb 2-1: USB 

Bug#572201: [PATCH] forcedeth: fix tx limit2 flag check

2010-04-14 Thread stephen mulcahy

Ayaz Abdulla wrote:
This patch fixes the TX_LIMIT feature flag. The previous logic check for 
TX_LIMIT2 also took into account a device that only had TX_LIMIT set.


Signed-off-by: Ayaz Abdulla aabdu...@nvidia.com

This is a fix for bug 572201 @ bugs.debian.org


Hi,

Thanks! I'll rebuild my Debian kernel with this and run a test today.

-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-14 Thread stephen mulcahy

Ayaz Abdulla wrote:

Attached fix has been submitted to netdev.


I've run my reproducer with this patch applied to the Debian 2.6.32
kernel and so far the problem with nodes becoming unresponsive hasn't
occurred.


NIC settings were left at the defaults, so this looks positive:

r...@node23:~# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
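
To compare these settings across many nodes, the listing can be turned into a
dictionary. A minimal Python sketch follows, assuming the output has the
"name: on/off" layout shown above; the function name parse_offload_settings is
hypothetical and not part of ethtool.

```python
def parse_offload_settings(text):
    """Map feature name -> True (on) / False (off) from `ethtool -k` output."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Skip the "Offload parameters for eth0:" header and anything that
        # is not a plain on/off line.
        if sep and words and words[0] in ("on", "off"):
            settings[name.strip()] = (words[0] == "on")
    return settings
```

The text passed in would be whatever ethtool -k printed on each node, saved to
a file or captured over ssh.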

Thanks!

-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

Eric Dumazet wrote:

OK it seems forcedeth has problem with checksums ?

Try to change ethtool -k eth0 settings ?

ethtool -K eth0 tso off tx off


Yes, that makes an unresponsive system responsive again immediately, nice!

Should the driver default to disabling this until the problem is corrected?

-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

Eric Dumazet wrote:

On Tuesday 13 April 2010 at 11:03 +0100, stephen mulcahy wrote:

Eric Dumazet wrote:

OK it seems forcedeth has problem with checksums ?

Try to change ethtool -k eth0 settings ?

ethtool -K eth0 tso off tx off

Yes, that makes an unresponsive system responsive again immediately, nice!

Should the driver default to disabling this until the problem is corrected?

-stephen


Both flags need to be disabled, or only one is OK ?


ethtool -K eth0 tx off

fixes the problem (without tso)

but running

ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: off
scatter-gather: off
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off

seems to indicate that tso is also disabled by this - does that sound 
correct?
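
To make the side effect explicit, the two listings in this thread (the
defaults, and the state after "ethtool -K eth0 tx off") can be compared
feature by feature. A minimal Python sketch, assuming each listing has already
been turned into a dictionary of feature name to on/off state; the function
name changed_features is hypothetical.

```python
def changed_features(before, after):
    """Return {feature: (old, new)} for features whose state differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# States copied from the two ethtool -k listings in this thread.
defaults = {
    "rx-checksumming": "on",
    "tx-checksumming": "on",
    "scatter-gather": "on",
    "tcp-segmentation-offload": "on",
    "udp-fragmentation-offload": "off",
    "generic-segmentation-offload": "on",
    "generic-receive-offload": "off",
    "large-receive-offload": "off",
}
after_tx_off = dict(defaults, **{
    "tx-checksumming": "off",
    "scatter-gather": "off",
    "tcp-segmentation-offload": "off",
})
```

With these inputs, changed_features(defaults, after_tx_off) lists
tx-checksumming, scatter-gather and tcp-segmentation-offload, matching what
the listing above shows after turning tx off.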


-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

Ok, I've tried both of the following with my reproducer

1. ethtool -K eth0 tso off

RESULT: reproducer causes multiple hosts to become unresponsive on
first run.


2. ethtool -K eth0 tx off

RESULT: reproducer runs three times without any hosts becoming unresponsive.

-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

Eric Dumazet wrote:

On Tuesday 13 April 2010 at 15:27 +0100, stephen mulcahy wrote:

Ok, I've tried both of the following with my reproducer

1. ethtool -K eth0 tso off

RESULT: reproducer causes multiple hosts to become unresponsive on
first run.


2. ethtool -K eth0 tx off

RESULT: reproducer runs three times without any hosts becoming unresponsive.

-stephen


Thanks Stephen !

Now some brave fouls to check the 6410 lines of this driver ? ;)

Question of the day : Why TSO is broken in forcedeth ?
Is it generically broken or is it broken for specific NICS ?



Actually, it is only when tx-checksumming is turned off that the problem 
 doesn't occur (so I'm not sure TSO is the problem).


Additionally, a google also turns up this existing Debian bug 
http://bugs.debian.org/506419 which seems to be related.


-stephen







Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

stephen mulcahy wrote:

Now some brave fouls to check the 6410 lines of this driver ? ;)

Question of the day : Why TSO is broken in forcedeth ?
Is it generically broken or is it broken for specific NICS ?



Actually, it is only when tx-checksumming is turned off that the problem 
 doesn't occur (so I'm not sure TSO is the problem).


Additionally, a google also turns up this existing Debian bug 
http://bugs.debian.org/506419 which seems to be related.


As mentioned in the original Debian bug - I can reproduce this by 
running Hadoop[1] TeraSort[2] but I haven't identified a simpler 
reproducer. I tried to recreate this with iperf and ping -f but neither 
helped - it may be that the problem only occurs when systems are passing 
large amounts of traffic and have very high cpu utilisation (when 
running the Hadoop TeraSort all 8 cores run at 70-100% utilisation as 
measured with htop - I plan to instrument the nodes with something like 
Zabbix or Ganglia but it hasn't happened yet).


-stephen

[1] http://hadoop.apache.org/
[2] 
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/examples/terasort/package-summary.html







Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

Eric Dumazet wrote:


I am scratching my head, but I thought you told me that

ethtool -K eth0 tso off
ethtool -K eth0 tx on 


was working ?


No, sorry for the confusion.

ethtool -K eth0 tx off

fixes the problem.


Setting only

ethtool -K eth0 tso off
ethtool -K eth0 tx on

still results in failures.

-stephen







Bug#506419: [Fwd: Re: forcedeth driver hangs under heavy load]

2010-04-13 Thread stephen mulcahy

Hi Martin,

Just came across a similar bug you logged a while back - thought you 
might be interested.


-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)
---BeginMessage---

Eric Dumazet wrote:

On Tuesday 13 April 2010 at 15:27 +0100, stephen mulcahy wrote:

Ok, I've tried both of the following with my reproducer

1. ethtool -K eth0 tso off

RESULT: reproducer causes multiple hosts to become unresponsive on
first run.


2. ethtool -K eth0 tx off

RESULT: reproducer runs three times without any hosts becoming unresponsive.

-stephen


Thanks Stephen !

Now some brave fouls to check the 6410 lines of this driver ? ;)

Question of the day : Why TSO is broken in forcedeth ?
Is it generically broken or is it broken for specific NICS ?



Actually, it is only when tx-checksumming is turned off that the problem 
 doesn't occur (so I'm not sure TSO is the problem).


Additionally, a google also turns up this existing Debian bug 
http://bugs.debian.org/506419 which seems to be related.


-stephen


---End Message---


Bug#572201: forcedeth driver hangs under heavy load

2010-04-13 Thread stephen mulcahy

Eric Dumazet wrote:

OK, thanks for clarification.

Last question, did you tried a vanilla kernel, aka 2.6.33.2 for
example ?


I built a Debian package from the vanilla 2.6.33.2 and installed that on 
all nodes and tried my reproducer with the same results - nodes becoming 
unresponsive.


I didn't try changing the tso and tx settings with the 2.6.33.2 kernel 
though. Let me know if that would be useful (and/or if there is another 
kernel that you would like me to test with) and I'll try to fit it in.


Thanks again for your help.

-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy

Ben Hutchings wrote:

Stephen Mulcahy reported a regression in forcedeth at
http://bugs.debian.org/572201.  The system information and some
diagnostic information can be found there.  Anyone able to help?


Incidentally, I also tried the 2.6.33.2 kernel with 
CONFIG_FORCEDETH_NAPI set to y to see if that made a difference.


It doesn't - further testing over the weekend saw 6 of 45 machines drop 
off the network with this problem. Nothing in dmesg or system logs. 
Happy to run more tests if someone can advise on what should be run.


-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)






Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy

stephen mulcahy wrote:
It doesn't - further testing over the weekend saw 6 of 45 machines drop 
off the network with this problem. Nothing in dmesg or system logs. 
Happy to run more tests if someone can advise on what should be run.


I also just tried using the 2.6.30-2-amd64 (Debian) forcedeth kernel 
module while running the 2.6.32-3-amd64 (Debian) kernel and experienced 
the same symptoms.


Not sure if that's any help.

-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy

Eric Dumazet wrote:

On Monday 12 April 2010 at 13:39 +0100, stephen mulcahy wrote:
I am not sure I understand. Are you saying that using 2.6.30-2-amd64
kernel also makes your forcedeth adapter being not functional ?


Hi Eric,

If I run my tests with the 2.6.30-2-amd64 kernel the network doesn't 
malfunction.


If I run my tests with the 2.6.32-3-amd64 kernel the network does 
malfunction.


If I take the forcedeth.ko module from the 2.6.30-2-amd64 kernel and 
drop that into /lib/modules/2.6.32-3-amd64/kernel/drivers/net/ and then 
reboot to 2.6.32-3-amd64 and rerun my tests - the network does malfunction.



Are both way non functional (RX and TX), or only one side ?


What's the best way of testing this? (tcpdump listening on both hosts and 
then running pings between the systems?)


-stephen






Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy

stephen mulcahy wrote:

Are both way non functional (RX and TX), or only one side ?


What's the best way of testing this? (tcpdump listening on both hosts and 
then running pings between the systems?)



On one of the nodes that is in the malfunctioning state (node05), I ran

ssh node20

and grabbed the following output from running tcpdump on node20

r...@node20:~# tcpdump host node20 and node05
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
14:12:59.612626 IP node05.webstar.cnet.36295 > node20.ssh: Flags [S],
seq 3677858646, win 5840, options [mss 1460,sackOK,TS val 1599534 ecr
0,nop,wscale 7], length 0
14:12:59.612656 IP node20.ssh > node05.webstar.cnet.36295: Flags [S.],
seq 3610575850, ack 3677858647, win 5792, options [mss 1460,sackOK,TS
val 1598775 ecr 1599534,nop,wscale 7], length 0
14:12:59.612718 IP node05.webstar.cnet.36295 > node20.ssh: Flags [.],
ack 1, win 46, options [nop,nop,TS val 1599534 ecr 1598775], length 0
14:12:59.617434 IP node20.ssh > node05.webstar.cnet.36295: Flags [P.],
seq 1:33, ack 1, win 46, options [nop,nop,TS val 1598776 ecr 1599534],
length 32
14:12:59.617522 IP node05.webstar.cnet.36295 > node20.ssh: Flags [.],
ack 33, win 46, options [nop,nop,TS val 1599535 ecr 1598776], length 0
14:12:59.617609 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 1:33, ack 33, win 46, options [nop,nop,TS val 1599535 ecr 1598776],
length 32
14:12:59.820434 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 4294936586:4294936618, ack 2620194849, win 46, options [nop,nop,TS
val 1599586 ecr 1598776], length 32
14:13:00.229069 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 4294961734:4294961766, ack 3928358945, win 46, options [nop,nop,TS
val 1599688 ecr 1598776], length 32
14:13:01.044396 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 4294964167:4294964199, ack 410320929, win 46, options [nop,nop,TS
val 1599892 ecr 1598776], length 32
14:13:02.676308 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 1:33, ack 33, win 46, options [nop,nop,TS val 1600300 ecr 1598776],
length 32
14:13:05.940804 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 17294:17326, ack 3045851169, win 46, options [nop,nop,TS val 1601116
ecr 1598776], length 32
14:13:12.468484 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 17294:17326, ack 3045851169, win 46, options [nop,nop,TS val 1602748
ecr 1598776], length 32
14:13:23.846891 IP node20.ssh > node05.webstar.cnet.36084: Flags [F.],
seq 2093054475, ack 2175389538, win 46, options [nop,nop,TS val 1604834
ecr 1575591], length 0
14:13:23.847278 IP node05.webstar.cnet.36084 > node20.ssh: Flags [R],
seq 2175389538, win 0, length 0
14:13:25.523850 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 1:33, ack 33, win 46, options [nop,nop,TS val 1606012 ecr 1598776],
length 32
14:13:50.127509 IP node20.ssh > node05.webstar.cnet.36143: Flags [F.],
seq 2526196657, ack 2590340885, win 46, options [nop,nop,TS val 1611404
ecr 1582161], length 0
14:13:50.127879 IP node05.webstar.cnet.36143 > node20.ssh: Flags [R],
seq 2590340885, win 0, length 0
14:13:51.633934 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 4294963190:4294963222, ack 9830433, win 46, options [nop,nop,TS val
1612540 ecr 1598776], length 32
14:13:55.125525 ARP, Request who-has node05.webstar.cnet tell node20,
length 28
14:13:55.125886 ARP, Reply node05.webstar.cnet is-at 00:30:48:ce:dc:02
(oui Unknown), length 46
14:14:43.855380 IP node05.webstar.cnet.36295 > node20.ssh: Flags [P.],
seq 1:33, ack 33, win 46, options [nop,nop,TS val 1625596 ecr 1598776],
length 32
14:14:48.855143 ARP, Request who-has node20 tell node05.webstar.cnet,
length 46
14:14:48.855469 ARP, Reply node20 is-at 00:30:48:ce:de:34 (oui Unknown),
length 28
14:14:59.617675 IP node20.ssh > node05.webstar.cnet.36295: Flags [F.],
seq 33, ack 1, win 46, options [nop,nop,TS val 1628777 ecr 1599535],
length 0
14:14:59.618202 IP node05.webstar.cnet.36295 > node20.ssh: Flags [FP.],
seq 4294959654:4294960446, ack 3930456098, win 46, options [nop,nop,TS
val 1629536 ecr 1628777], length 792
14:14:59.821527 IP node20.ssh > node05.webstar.cnet.36295: Flags [F.],
seq 33, ack 1, win 46, options [nop,nop,TS val 1628828 ecr 1599535],
length 0
14:14:59.821598 IP node05.webstar.cnet.36295 > node20.ssh: Flags [.],
ack 34, win 46, options [nop,nop,TS val 1629587 ecr 1628828,nop,nop,sack
1 {33:34}], length 0

^C^
27 packets captured
31 packets received by filter
0 packets dropped by kernel
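
The odd seq/ack values in the capture above are easier to spot when tallied.
Here is a minimal Python sketch that extracts the "seq START:END, ack N" fields
from tcpdump text and counts each distinct combination. The function names are
made up for illustration, and the pattern assumes tcpdump prints relative
sequence numbers in the format shown above.

```python
import re
from collections import Counter

# Matches the "seq START:END, ack N" part of a tcpdump data-segment line.
# \s+ is used rather than a single space because the pasted capture is
# line-wrapped, so a field can start on the following line.
SEGMENT_RE = re.compile(r"seq\s+(\d+):(\d+),\s+ack\s+(\d+)")


def segment_fields(capture_text):
    """Return a list of (seq_start, seq_end, ack) for every data segment."""
    return [
        tuple(int(g) for g in m.groups())
        for m in SEGMENT_RE.finditer(capture_text)
    ]


def summarise_segments(capture_text):
    """Count how often each (seq_start, seq_end, ack) triple appears."""
    return Counter(segment_fields(capture_text))
```

Run over the capture, the expected triple (1, 33, 33) shows up alongside
segments of the same length carrying unrelated seq and ack numbers.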


I then did ifdown and ifup on node05 and again ran

ssh node20

and grabbed the following output from running tcpdump on node20

r...@node20:~# tcpdump host node20 and node05

Bug#572201: forcedeth driver hangs under heavy load

2010-04-12 Thread stephen mulcahy

Eric Dumazet wrote:

On Monday 12 April 2010 at 14:19 +0100, stephen mulcahy wrote:

Do you have some netfilters rules ?



Hi Eric,

I don't have any netfilters rules:

r...@node34:~# for table in filter nat mangle raw; do iptables -t $table 
-L; done

Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination
Chain PREROUTING (policy ACCEPT)
target prot opt source   destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination
Chain PREROUTING (policy ACCEPT)
target prot opt source   destination

Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source   destination
Chain PREROUTING (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination


I re-ran this on the 2.6.32 kernel (with the 2.6.32 forcedeth module) 
just in case that was screwing something up.


node33 is in the unresponsive state this time. I'm running tcpdump on
node34. On node33 I try to ssh to node34 (using the IP address of node34).
Note that I can still ping between node33 and node34.


r...@node34:~# tcpdump -v host node34 and node33
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 
bytes
17:05:19.622384 IP (tos 0x0, ttl 64, id 21435, offset 0, flags [DF], 
proto TCP (6), length 60)
node33.webstar.cnet.43653 > node34.ssh: Flags [S], cksum 0xb994
(correct), seq 1675314077, win 5840, options [mss 1460,sackOK,TS val
331814 ecr 0,nop,wscale 7], length 0
17:05:19.622754 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
TCP (6), length 60)
node34.ssh > node33.webstar.cnet.43653: Flags [S.], cksum 0x9d81
(correct), seq 1669769379, ack 1675314078, win 5792, options [mss
1460,sackOK,TS val 331779 ecr 331814,nop,wscale 7], length 0
17:05:19.622813 IP (tos 0x0, ttl 64, id 21436, offset 0, flags [DF],
proto TCP (6), length 52)
node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xe2bf
(correct), ack 1, win 46, options [nop,nop,TS val 331814 ecr 331779],
length 0
17:05:19.627666 IP (tos 0x0, ttl 64, id 47271, offset 0, flags [DF],
proto TCP (6), length 84)
node34.ssh > node33.webstar.cnet.43653: Flags [P.], seq 1:33, ack
1, win 46, options [nop,nop,TS val 331780 ecr 331814], length 32
17:05:19.627748 IP (tos 0x0, ttl 64, id 21437, offset 0, flags [DF],
proto TCP (6), length 52)
node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xe29c
(correct), ack 33, win 46, options [nop,nop,TS val 331816 ecr 331780],
length 0
17:05:19.627833 IP (tos 0x0, ttl 64, id 21438, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum 1f8a (->d189)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
23413:23445, ack 2749038625, win 46, options [nop,nop,TS val 331816 ecr
331780], length 32
17:05:19.831634 IP (tos 0x0, ttl 64, id 21439, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum d189 (->d188)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack
33, win 46, options [nop,nop,TS val 331867 ecr 331780], length 32
17:05:20.239603 IP (tos 0x0, ttl 64, id 21440, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum 15c6 (->d187)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
30492:30524, ack 809893921, win 46, options [nop,nop,TS val 331969 ecr
331780], length 32
17:05:21.055534 IP (tos 0x0, ttl 64, id 21441, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum d187 (->d186)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack
33, win 46, options [nop,nop,TS val 332173 ecr 331780], length 32
17:05:22.687386 IP (tos 0x0, ttl 64, id 21442, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum d186 (->d185)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack
33, win 46, options [nop,nop,TS val 332581 ecr 331780], length 32
17:05:25.950935 IP (tos 0x0, ttl 64, id 21443, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum 15c4 (->d184)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
30492:30524, ack 809893921, win 46, options [nop,nop,TS val 97 ecr
331780], length 32
17:05:32.478527 IP (tos 0x0, ttl 64, id 21444, offset 0, flags [DF],
proto TCP (6), length 84, bad cksum c01 (->d183)!)
node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
43997:44029, ack 1311047713, win 46, options [nop,nop,TS val 335029 ecr
331780], length 32
17:05:45.533370 IP (tos 0x0, ttl 64, id 21445, offset 0

Bug#570499: same problem - 640k detected on system with 16GB

2010-04-09 Thread stephen mulcahy

Hi,

I ran into the same problem - when I booted with memtest86+ 4.0 on my 
Debian Squeeze box it only detected 640k (which made for a very quick 
and successful test :)


I suspect this is related to #540572 and the use of multiboot.

I downloaded the memtest86+ bin directly from their site and changed 
/etc/grub.d/20_memtest86+ to use that bin with linux16 instead of 
multiboot and it runs and correctly detects the full amount of memory (I 
tried the Debian memtest86+ binary image as well but it failed to load, 
presumably because it has been modified to work with multiboot).


-stephen

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie http://webstar.deri.ie http://sindice.com






Bug#556030: BUG: soft lockup - CPU#1 stuck for 61s! [java:7582]

2010-04-08 Thread stephen mulcahy

Just ran into the same problem on a 2.6.30-2-amd64 system

Lots of these

Apr  7 23:49:50 node28 kernel: [24133.424830] BUG: soft lockup - CPU#5 
stuck for 61s! [java:1255]
Apr  7 23:49:50 node28 kernel: [24133.424834] Modules linked in: loop 
snd_pcsp snd_pcm snd_timer snd soundcore i2c_nforce2 joydev button 
snd_page_alloc i2c_core evdev processor serio_raw ext4 mbcache jbd2 
crc16 sg sr_mod cdrom sd_mod crc_t10dif ide_pci_generic ata_generic 
usbhid hid sata_nv libata ohci_hcd forcedeth scsi_mod amd74xx ide_core 
ehci_hcd thermal fan thermal_sys [last unloaded: scsi_wait_scan]

Apr  7 23:49:50 node28 kernel: [24133.424834] CPU 5:
Apr  7 23:49:50 node28 kernel: [24133.424834] Modules linked in: loop 
snd_pcsp snd_pcm snd_timer snd soundcore i2c_nforce2 joydev button 
snd_page_alloc i2c_core evdev processor serio_raw ext4 mbcache jbd2 
crc16 sg sr_mod cdrom sd_mod crc_t10dif ide_pci_generic ata_generic 
usbhid hid sata_nv libata ohci_hcd forcedeth scsi_mod amd74xx ide_core 
ehci_hcd thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Apr  7 23:49:50 node28 kernel: [24133.424834] Pid: 1255, comm: java Not 
tainted 2.6.30-2-amd64 #1 H8DMU
Apr  7 23:49:50 node28 kernel: [24133.424834] RIP: 
0010:[804b5ae1]  [804b5ae1] _spin_lock+0x15/0x1b
Apr  7 23:49:50 node28 kernel: [24133.424834] RSP: 0018:880393cbdda0 
 EFLAGS: 0202
Apr  7 23:49:50 node28 kernel: [24133.424834] RAX: db2d RBX: 
807782c0 RCX: f470fb29
Apr  7 23:49:50 node28 kernel: [24133.424834] RDX: db2e RSI: 
104a544b RDI: 807782c0
Apr  7 23:49:50 node28 kernel: [24133.424834] RBP: 802105ce R08: 
601e6830 R09: 2403e8c3
Apr  7 23:49:50 node28 kernel: [24133.424834] R10:  R11: 
0206 R12: c207b0c0
Apr  7 23:49:50 node28 kernel: [24133.424834] R13: 880308dec8b0 R14: 
0005 R15: 880308dec8b0
Apr  7 23:49:50 node28 kernel: [24133.424834] FS: 
7fb223590910() GS:c2069000() knlGS:
Apr  7 23:49:50 node28 kernel: [24133.424834] CS:  0010 DS:  ES: 
 CR0: 8005003b
Apr  7 23:49:50 node28 kernel: [24133.424834] CR2: 7fb232369000 CR3: 
0002e9e53000 CR4: 06e0
Apr  7 23:49:50 node28 kernel: [24133.424834] DR0:  DR1: 
 DR2: 
Apr  7 23:49:50 node28 kernel: [24133.424834] DR3:  DR6: 
0ff0 DR7: 0400

Apr  7 23:49:50 node28 kernel: [24133.424834] Call Trace:
Apr  7 23:49:50 node28 kernel: [24133.424834]  [80260272] ? 
futex_wake+0x58/0xda
Apr  7 23:49:50 node28 kernel: [24133.424834]  [802613fc] ? 
do_futex+0xa9/0x8c8
Apr  7 23:49:50 node28 kernel: [24133.424834]  [8020e5fa] ? 
__switch_to+0xff/0x263
Apr  7 23:49:50 node28 kernel: [24133.424834]  [803503a2] ? 
rb_erase+0x1be/0x285
Apr  7 23:49:50 node28 kernel: [24133.424834]  [804b465c] ? 
thread_return+0x3e/0xb1
Apr  7 23:49:50 node28 kernel: [24133.424834]  [80261d19] ? 
sys_futex+0xfe/0x11c
Apr  7 23:49:50 node28 kernel: [24133.424834]  [80215d6a] ? 
native_sched_clock+0x2e/0x5b
Apr  7 23:49:50 node28 kernel: [24133.424834]  [80215d9c] ? 
sched_clock+0x5/0x8
Apr  7 23:49:50 node28 kernel: [24133.424834]  [8020fa42] ? 
system_call_fastpath+0x16/0x1b
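
Since there are lots of these in the logs, a tally per CPU and process helps
show whether the lockups cluster anywhere. A minimal Python sketch, assuming
the message is worded as in the excerpt above; the function name
count_soft_lockups is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#5 stuck for 61s! [java:1255]".
# \s+ tolerates the line wrap between "CPU#5" and "stuck" in pasted logs.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+)\s+stuck for (?P<secs>\d+)s!\s+"
    r"\[(?P<comm>\S+):(?P<pid>\d+)\]"
)


def count_soft_lockups(log_text):
    """Return a Counter keyed by (cpu, command) for each soft lockup report."""
    counts = Counter()
    for m in SOFT_LOCKUP_RE.finditer(log_text):
        counts[(int(m["cpu"]), m["comm"])] += 1
    return counts
```

The input would be the text of the syslog or kern.log file from the affected
node.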



Can't currently move to 2.6.32 because of #572201 - is it worth trying 
that patch? Will there be any further updates to 2.6.30?


-stephen
--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)






Bug#572201: Further queries

2010-04-07 Thread stephen mulcahy

Ben Hutchings wrote:

On Tue, 2010-03-16 at 10:33 +, stephen mulcahy wrote:
[...]

We will shortly update the official kernel packages to incorporate this
release, so you could just wait a day or two and update.  However I'm not
aware of any changes in 2.6.32.10 that would fix this sort of bug.

Again, I scanned the changelogs and nothing jumped out at me. I'll try 
the updated package when you release it to see if it makes a difference.

[...]

Have you done this and did it help?


Hi,

Just tried this now and it doesn't help. Still getting nodes dropping 
out after running the Hadoop Terasort (behaviour which doesn't happen 
with the 2.6.30 kernel).


Still no messages in the logs - but as usual, ifdown followed by ifup 
makes things right.
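
Since bouncing the interface restores connectivity, a stop-gap watchdog along these lines could keep nodes reachable while the driver problem is investigated. This is only a sketch: the function name, the peer address and the use of ping as the health check are assumptions, and it has not been run on these nodes.

```sh
#!/bin/sh
# Sketch of a stop-gap watchdog: if a known peer stops answering ping,
# bounce the interface with ifdown/ifup (the manual fix described above).
# bounce_if_unreachable and its arguments are hypothetical names.

bounce_if_unreachable() {
    iface=$1    # interface to restart, e.g. eth0
    peer=$2     # address that should always answer, e.g. the gateway

    # -c 3: send three probes; -W 2: wait at most 2 seconds for each reply
    if ping -c 3 -W 2 "$peer" >/dev/null 2>&1; then
        echo "$iface: $peer reachable"
        return 0
    fi

    echo "$iface: $peer unreachable, restarting interface"
    ifdown "$iface" && ifup "$iface"
}

# Example (as root, e.g. from cron): bounce_if_unreachable eth0 192.0.2.1
```

This only hides the symptom; it does nothing to explain why the interface stops passing traffic under 2.6.32.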


Anything else I can run for diagnostics?

-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)






Bug#576003: Acknowledgement (INFO: task bacula-sd blocked for more than 120 seconds.)

2010-03-31 Thread stephen mulcahy

Forgot to add - this seems related to the closed #516374







Bug#572201: Further queries

2010-03-16 Thread stephen mulcahy

Ben Hutchings wrote:

On Mon, Mar 15, 2010 at 05:20:32PM +, stephen mulcahy wrote:
All pause frames should be dropped, either by the hardware or the driver.
So it's not unexpected that these are equal.


Ok, thanks for the clarification.


It might be interesting to see what happens if you disable pause frame
handling with this command:

ethtool -A eth0 autoneg off rx off tx off


I tried this and re-ran my Hadoop test, and I'm seeing the same drop-outs 
from systems as with pause frame handling enabled. Running ethtool -S eth0 
on a dropped-out system gives the following output.


NIC statistics:
 tx_bytes: 45900034824
 tx_zero_rexmt: 40968086
 tx_one_rexmt: 0
 tx_many_rexmt: 0
 tx_late_collision: 0
 tx_fifo_errors: 0
 tx_carrier_errors: 0
 tx_excess_deferral: 0
 tx_retry_error: 0
 rx_frame_error: 0
 rx_extra_byte: 0
 rx_late_collision: 0
 rx_runt: 0
 rx_frame_too_long: 0
 rx_over_errors: 0
 rx_crc_errors: 0
 rx_frame_align_error: 0
 rx_length_error: 0
 rx_unicast: 42104294
 rx_multicast: 897
 rx_broadcast: 564
 rx_packets: 42105755
 rx_errors_total: 0
 tx_errors_total: 0
 tx_deferral: 0
 tx_packets: 40968086
 rx_bytes: 48159336484
 tx_pause: 0
 rx_pause: 0
 rx_drop_frame: 0
 tx_unicast: 3322
 tx_multicast: 4392
 tx_broadcast: 23998478524

and no messages in the system logs.
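
With this many counters it is easy to miss a non-zero one, so a small filter such as the following may help when comparing nodes. It is a sketch only: the helper name and the list of counter name patterns are assumptions based on the forcedeth counter names shown above. Run against the statistics above it should print nothing, since every error, drop and pause counter is zero.

```sh
#!/bin/sh
# Sketch: read `ethtool -S <iface>` output on stdin and print only the
# error, drop, collision, retry and pause counters that are non-zero.
# nonzero_problem_counters is a hypothetical helper name.

nonzero_problem_counters() {
    awk -F': *' '
        # Counter lines look like " rx_crc_errors: 0"; the header is skipped.
        NF == 2 && $1 ~ /(error|drop|collision|runt|too_long|deferral|pause|one_rexmt|many_rexmt|extra_byte)/ {
            name = $1
            sub(/^[ \t]+/, "", name)     # strip leading indentation
            if ($2 + 0 != 0)
                print name ": " $2
        }
    '
}

# Example: ethtool -S eth0 | nonzero_problem_counters
```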

These systems are running with DHCP (and have Avahi installed) - is it 
possible these are related to the problem? But again, why would it only 
show up when running the 2.6.32 kernel?



I can't see any major changes in the forcedeth driver since 2.6.30.


I scanned what changelogs I could find also and nothing jumped out at me 
that could be the cause of this.



We will shortly update the official kernel packages to incorporate this
release, so you could just wait a day or two and update.  However I'm not
aware of any changes in 2.6.32.10 that would fix this sort of bug.


Again, I scanned the changelogs and nothing jumped out at me. I'll try 
the updated package when you release it to see if it makes a difference.


Let me know if there's any further testing I can do before I roll the 
systems back to 2.6.30 and put them back into production.


Thanks,

-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)






Bug#572201: Further queries

2010-03-15 Thread stephen mulcahy

Hi,

Any further thoughts on this?

In the ethtool output, I notice the following

rx_pause: 46798
rx_drop_frame: 46798

I've checked some other machines and I don't see any of either stat - 
possibly because these are specific to some NIC drivers? Anyway, is it 
normal for those numbers to be the same?
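
To check other machines for the same pattern, the two counters can be pulled out of the ethtool output and compared. The helper below is a sketch; its name and its messages are assumptions, and it only reads text in the `name: value` layout that ethtool -S prints.

```sh
#!/bin/sh
# Sketch: read `ethtool -S <iface>` output on stdin and report whether
# rx_pause and rx_drop_frame carry the same value.
# compare_pause_and_drop is a hypothetical helper name.

compare_pause_and_drop() {
    awk -F': *' '
        { name = $1; sub(/^[ \t]+/, "", name) }
        name == "rx_pause"      { pause = $2; seen_pause = 1 }
        name == "rx_drop_frame" { drop = $2; seen_drop = 1 }
        END {
            if (!seen_pause || !seen_drop) {
                print "counters not reported by this driver"
                exit 2
            }
            if (pause == drop) {
                print "equal: " pause
            } else {
                print "differ: rx_pause=" pause " rx_drop_frame=" drop
                exit 1
            }
        }
    '
}

# Example: ethtool -S eth0 | compare_pause_and_drop
```

The exit status is 0 when the counters match, 1 when they differ and 2 when the driver does not report them, so the check can be looped over a list of hosts.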


As I said, I'm not seeing the behaviour with the 2.6.30 kernel - so 
wondering what has changed.


I see Linux 2.6.32.10 was just released, is it worth my while building 
that and seeing if I can reproduce the problem?


-stephen

--
Stephen Mulcahy Atlantic Linux http://www.atlanticlinux.ie
Registered in Ireland, no. 376591 (144 Ros Caoin, Roscam, Galway)






Bug#572201: linux-image-2.6.32-trunk-amd64: forcedeth driver hangs under heavy load

2010-03-09 Thread stephen mulcahy
: [   16.834895] forcedeth :00:08.0: 
irq 30 for MSI/MSI-X

Mar  9 08:31:21 node20 kernel: [   27.456017] eth0: no IPv6 routers present




The device statistics (output from ethtool -S eth0) might also be
informative.


NIC statistics:
 tx_bytes: 63756006188
 tx_zero_rexmt: 56365619
 tx_one_rexmt: 0
 tx_many_rexmt: 0
 tx_late_collision: 0
 tx_fifo_errors: 0
 tx_carrier_errors: 0
 tx_excess_deferral: 0
 tx_retry_error: 0
 rx_frame_error: 0
 rx_extra_byte: 0
 rx_late_collision: 0
 rx_runt: 0
 rx_frame_too_long: 0
 rx_over_errors: 0
 rx_crc_errors: 0
 rx_frame_align_error: 0
 rx_length_error: 0
 rx_unicast: 58975439
 rx_multicast: 933
 rx_broadcast: 1618
 rx_packets: 58977990
 rx_errors_total: 0
 tx_errors_total: 0
 tx_deferral: 0
 tx_packets: 56365619
 rx_bytes: 69269122814
 tx_pause: 0
 rx_pause: 46798
 rx_drop_frame: 46798
 tx_unicast: 2284
 tx_multicast: 3008
 tx_broadcast: 16510200339

If I ifdown eth0 and then ifup eth0, I can again connect to the system 
without problems.


Thanks,

-stephen

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.iehttp://webstar.deri.iehttp://sindice.com







Bug#572201: linux-image-2.6.32-trunk-amd64: forcedeth driver hangs under heavy load

2010-03-08 Thread stephen mulcahy

Ben Hutchings wrote:

What protocol(s) are you using when this occurs?


I was running the Hadoop application (http://hadoop.apache.org/) which 
uses TCP as far as I know.


I just tried to reproduce the problem using iperf but have sent GBs 
between two machines running 2.6.32 without seeing any problems. Will 
try running Hadoop across all my systems, with only 2 running 2.6.32 and 
see if I can replicate the problem. Otherwise will roll them all back to 
2.6.32 and work from there.


Happy to run further diagnostics to tie this down if you let me know 
what to run.


We'll want to see the kernel log (output from dmesg) after this happens,
even if you can't spot anything in it.

The device statistics (output from ethtool -S eth0) might also be
informative.


Ok, will post both of those when I manage to reproduce the problem again.

-stephen

--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.iehttp://webstar.deri.iehttp://sindice.com






Bug#572201: linux-image-2.6.32-trunk-amd64: forcedeth driver hangs under heavy load

2010-03-02 Thread stephen mulcahy
- 
Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- 
MAbort- SERR- PERR- INTx-

Capabilities: [f0] Secure device ?

00:19.4 Host bridge [0600]: Advanced Micro Devices [AMD] K10 [Opteron, Athlon64, Sempron] Link Control [1022:1204]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx-

01:05.0 VGA compatible controller [0300]: ATI Technologies Inc ES1000 [1002:515e] (rev 02) (prog-if 00 [VGA controller])
	Subsystem: Super Micro Computer Inc Device [15d9:1911]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx-
	Latency: 64 (2000ns min), Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 10
	Region 0: Memory at f000 (32-bit, prefetchable) [size=128M]
	Region 1: I/O ports at e000 [size=256]
	Region 2: Memory at febf (32-bit, non-prefetchable) [size=64K]
	Expansion ROM at feb0 [disabled] [size=128K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-


** USB devices:
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 001 Device 002: ID 14dd:0002 Raritan Computer, Inc.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.30-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_IE.UTF-8, LC_CTYPE=en_IE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-2.6.32-trunk-amd64 depends on:
ii  debconf [debconf-2.0]         1.5.28       Debian configuration management sy
ii  initramfs-tools [linux-initr  0.93.4       tools for generating an initramfs
ii  module-init-tools             3.12~pre1-1  tools for managing Linux kernel mo

Versions of packages linux-image-2.6.32-trunk-amd64 recommends:
ii  firmware-linux-free           2.6.32-5     Binary firmware for various driver

Versions of packages linux-image-2.6.32-trunk-amd64 suggests:
pn  grub | lilo                   none         (no description available)
pn  linux-doc-2.6.32              none         (no description available)

Versions of packages linux-image-2.6.32-trunk-amd64 is related to:
pn  firmware-bnx2                 none         (no description available)
pn  firmware-bnx2x                none         (no description available)
pn  firmware-ipw2x00              none         (no description available)
pn  firmware-ivtv                 none         (no description available)
pn  firmware-iwlwifi              none         (no description available)
pn  firmware-linux                none         (no description available)
pn  firmware-linux-nonfree        none         (no description available)
pn  firmware-qlogic               none         (no description available)
pn  firmware-ralink               none         (no description available)

-- debconf information:

  linux-image-2.6.32-trunk-amd64/postinst/bootloader-error-2.6.32-trunk-amd64:
  shared/kernel-image/really-run-bootloader: true
  linux-image-2.6.32-trunk-amd64/postinst/depmod-error-initrd-2.6.32-trunk-amd64: false
  linux-image-2.6.32-trunk-amd64/prerm/removing-running-kernel-2.6.32-trunk-amd64: true
  linux-image-2.6.32-trunk-amd64/postinst/bootloader-test-error-2.6.32-trunk-amd64:
  linux-image-2.6.32-trunk-amd64/postinst/missing-firmware-2.6.32-trunk-amd64:
  linux-image-2.6.32-trunk-amd64/prerm/would-invalidate-boot-loader-2.6.32-trunk-amd64: true


--
Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.iehttp://webstar.deri.iehttp://sindice.com






Bug#224530: (no subject)

2009-12-08 Thread stephen mulcahy

heirloom-mailx supports the -r option for doing this






Bug#515737: xserver-xorg-input-all: After upgrading from Etch to Lenny, mouse pointer is very slow and keys don't repeat

2009-02-17 Thread stephen mulcahy
Package: xserver-xorg-input-all
Version: 1:7.3+18
Severity: important

Hi, flagging this as important because it's an RSI issue for anyone
using X for any duration.

After upgrading from Etch to Lenny, I found 2 input problems.

1. My mouse pointer is very, very slow. I use GNOME and tried modifying 
the preferences in System / Preferences / Mouse but they seemed to have 
no effect.

After some digging, I found running 

xset m 2 2

restores the behaviour to what I think it was (fluid mouse movement 
rather than the treacly behaviour I had). Can this be the default 
behaviour? Is there a reason the GNOME settings don't apply? 

2. On a related note, I noticed that the keyboard wasn't repeating, 
particularly the arrow keys (I didn't realise how much I rely on this
behaviour until it was gone). Again, I tried modifying the GNOME 
settings but this seemed to give excessive repeating behaviour. After 
some digging, I found the following did just what I needed

xset r

Again, can this be the default?
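
Until the defaults change, both workarounds can be applied automatically at login. The sketch below assumes the X session startup sources a per-user file such as ~/.xsessionrc; that file name, and the helper name, are assumptions, and the desktop's own settings daemon may still override the values afterwards.

```sh
#!/bin/sh
# Sketch: apply the two xset workarounds at X session start.
# apply_input_workarounds is a hypothetical helper name.

apply_input_workarounds() {
    # Only act inside an X session where xset is available.
    [ -n "$DISPLAY" ] || return 0
    command -v xset >/dev/null 2>&1 || return 0

    xset m 2 2    # pointer acceleration 2, threshold 2
    xset r        # turn keyboard autorepeat on
}

# In the session startup file, follow the definition with a call:
#   apply_input_workarounds
```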

-- System Information:
Debian Release: 5.0
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.26-1-686 (SMP w/2 CPU cores)
Locale: LANG=en_IE.UTF-8, LC_CTYPE=en_IE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages xserver-xorg-input-all depends on:
ii  xserver-xorg-input-  1:2.0.8-1             X.Org X server -- evdev input driv
ii  xserver-xorg-input-  1:1.3.1-1             X.Org X server -- keyboard input d
ii  xserver-xorg-input-  1:1.3.0-1             X.Org X server -- mouse input driv
ii  xserver-xorg-input-  0.14.7~git20070706-3  Synaptics TouchPad driver for X.Or
ii  xserver-xorg-input-  0.7.9.3-2             X.Org X server -- Wacom input driv

xserver-xorg-input-all recommends no packages.

xserver-xorg-input-all suggests no packages.

-- no debconf information






Bug#412981: libapache2-mod-auth-pam: Workaround works for me

2007-07-04 Thread stephen mulcahy

Package: libapache2-mod-auth-pam
Version: 1.1.1-6.1
Followup-For: Bug #412981

I ran into this problem also but the workaround of disabling Basic 
Authentication works for me. So I'm using the following,


AuthBasicAuthoritative off
AuthPAM_Enabled on
AuthType Basic
AuthName PAM
require valid-user

I can confirm that the Apache error log still logs noise around this, though:

[Wed Jul 04 14:43:51 2007] [error] Internal error: pcfg_openfile() called with NULL filename
[Wed Jul 04 14:43:51 2007] [error] [client 10.0.5.55] (9)Bad file descriptor: Could not open password file: (null)
[Wed Jul 04 14:43:51 2007] [error] [client 10.0.5.55] Could not fetch resource information.  [301, #0]
[Wed Jul 04 14:43:51 2007] [error] [client 10.0.5.55] (84)Invalid or incomplete multibyte or wide character: Requests for a collection must have a trailing slash on the URI.  [301, #0]
[Wed Jul 04 14:43:51 2007] [error] Internal error: pcfg_openfile() called with NULL filename
[Wed Jul 04 14:43:51 2007] [error] [client 10.0.5.55] (9)Bad file descriptor: Could not open password file: (null)


-stephen

--
Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland.  +353.91.751262  http://www.aplpi.com
Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway)





Bug#386959: same problem on diskless nodes

2006-11-24 Thread stephen mulcahy
Hi,

I have the same problem reported previously - the nfs mount script times
out without anything ever getting mounted.

If I run

/etc/network/if-up.d/mountnfs

manually after the node has booted it works fine.

The nodes in question are diskless nodes which use netboot to start up so
I'm wondering if the check for a working network interface which you
mentioned might be failing on me.

Any suggestions on how to get /etc/network/if-up.d/mountnfs logging its
output somewhere? A quick look didn't show me which script starts it.
I guess my situation is pretty unusual, so feel free to disregard this
comment. For now I've added a quick hack to /etc/init.d/mountnfs.sh which
calls /etc/network/if-up.d/mountnfs directly, and this seems to do the
trick (although I'd prefer not to have to maintain such a hack going
forward).
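
For illustration, a hack of this kind that also captures the hook's output could look roughly like the sketch below. It is not the exact change made on these nodes: the function name, the log location and the use of sh -x for tracing are all assumptions.

```sh
#!/bin/sh
# Sketch of the workaround: run the if-up.d hook directly and keep a trace
# of what it did. MOUNTNFS_LOG and run_mountnfs_hook are hypothetical names.

MOUNTNFS_HOOK=${MOUNTNFS_HOOK:-/etc/network/if-up.d/mountnfs}
MOUNTNFS_LOG=${MOUNTNFS_LOG:-/var/log/mountnfs-debug.log}

run_mountnfs_hook() {
    if [ ! -f "$MOUNTNFS_HOOK" ]; then
        echo "missing hook: $MOUNTNFS_HOOK" >>"$MOUNTNFS_LOG"
        return 1
    fi
    {
        echo "=== $(date) running $MOUNTNFS_HOOK"
        # sh -x echoes each command as it runs, showing where the hook stops
        sh -x "$MOUNTNFS_HOOK"
        echo "=== exit status: $?"
    } >>"$MOUNTNFS_LOG" 2>&1
}

# Example: call run_mountnfs_hook from the init script once the network is up.
```

The trace in the log should show which test inside the hook gives up, which is the information missing when the hook runs at boot.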

Can you point me at where the check for a working interface is performed?
I'll try to see if that's what's failing here (again, I didn't see it
from a quick look around, so perhaps the solution is for me to take a
longer look around the startup scripts).

Thanks,

-stephen



-- 
   Stephen MulcahyApplepie Solutions Ltd   http://www.aplpi.com
email:[EMAIL PROTECTED] mobile:+353.87.2930252 office:+353.91.751262
  Unit 30, Industry Support Centre, GMIT, Dublin Rd, Galway, Ireland.




Bug#386959: same problem on diskless nodes

2006-11-24 Thread stephen mulcahy
Sorry, I see this is addressed in bug 388761. I'll follow up there if I
have anything useful to add.

-stephen






Bug#388114: installation-report: Installed on HP NC6400 went ok but had problem with grub install

2006-09-22 Thread stephen mulcahy
Hi Geert,

Thanks for your response - I hope the installation report is useful to
someone.

I'm replying off-bug because I'm not sure any of the following is
relevant/useful - feel free to add in if you wish.

See some comments below,

Geert Stappers wrote:
 On Mon, Sep 18, 2006 at 06:31:00PM +0100, stephen mulcahy wrote:
 Package: installation-report
 Severity: normal

 See blog at 
 http://blog.aplpi.com/index.php/2006/09/18/debian-gnulinux-on-a-hp-nc6400/ 
 for details. Seems to be related to bug 380351 but worked around it.
 
 Okay, fine. Thank you for reporting the succesfull install.

You may want to mention the above bug in the errata.

 
 Also, didn't get option to install SMP kernel - should I?  
 
 Recent kernels have SMP support (iow: there are no SMP-kernels any more)

This is news to me - is it documented anywhere? There was only one core
available in /proc/cpuinfo until I installed the SMP flavoured 2.6.16
kernel (is this a dummy image or are you talking about changes in
unstable/post-2.6.16?).

 
 
 Something else:
 
 Debian-installer can resize NTFS partitions ( no need for an extra
 piece of software )  Just select the partition, select the size 
 and enter the new desired size.

Thanks, I was aware of this but I've yet to have any success using it. I
suspect running chkdsk /f on the NTFS partition first would let the parted
in Debian work as well, but it was significantly older than the parted on
the gparted livecd the last time I tried it and failed, so I got in the
habit of using gparted. I'll give it another shot next time.

 
 
 Have fun with you Debian GNU/Linux computer system.

Thanks! And well done on the Debian etch installer so far - overall it
is working very well. Your efforts are appreciated.

-stephen

-- 
Stephen Mulcahy, Applepie Solutions Ltd, Innovation in Business Center,
   GMIT, Dublin Rd, Galway, Ireland.  mailto:[EMAIL PROTECTED]
  mobile:+353.87.2930252  office:+353.91.751262  http://www.aplpi.com





Bug#388114: installation-report: Installed on HP NC6400 went ok but had problem with grub install

2006-09-18 Thread stephen mulcahy
Package: installation-report
Severity: normal


See blog at 
http://blog.aplpi.com/index.php/2006/09/18/debian-gnulinux-on-a-hp-nc6400/ 
for details. Seems to be related to bug 380351 but worked around it.

Also, didn't get option to install SMP kernel - should I?  

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.16-2-686-smp
Locale: LANG=en_IE, LC_CTYPE=en_IE (charmap=ISO-8859-1)





Bug#363668: flashplugin-nonfree download is broken

2006-04-20 Thread stephen mulcahy
Package: flashplugin-nonfree
Version: 7.0.63.2
Severity: grave
Justification: renders package unusable

The download is currently broken because the link at 
http://macromedia.mplug.org/ is incorrect.

The Macromedia download URL seems to have changed from 
http://fpdownload.macromedia.com/get/flashplayer/current/install_flash_player_7_linux.tar.gz
to 
http://download.macromedia.com/get/flashplayer/current/install_flash_player_7_linux.tar.gz
but the USA and Europe mirrors for http://macromedia.mplug.org don't 
seem to have had their links updated.

This is easily fixed by manually downloading from 
http://download.macromedia.com/get/flashplayer/current/install_flash_player_7_linux.tar.gz
and running /usr/sbin/update-flashplugin -l <location downloaded to>, but 
you may want to fix the script or notify upstream of the broken link.

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.15-1-686
Locale: LANG=en_IE, LC_CTYPE=en_IE (charmap=ISO-8859-1)

Versions of packages flashplugin-nonfree depends on:
ii  debconf [debconf-2.0]      1.4.72  Debian configuration management sy
ii  gsfonts-x11                0.18    Make Ghostscript fonts available t

Versions of packages flashplugin-nonfree recommends:
pn  libstdc++2.10-glibc2.2     none    (no description available)

-- debconf information:
  flashplugin-nonfree/httpget: true
  flashplugin-nonfree/not_exist:
* flashplugin-nonfree/http_proxy:
  flashplugin-nonfree/failed:
  flashplugin-nonfree/local:
* flashplugin-nonfree/delete: false





Bug#261824: Successful Debian sarge install on SunFire 280R

2005-09-14 Thread stephen mulcahy

Hi,

I completed a successful install on a SunFire 280R using the latest 
Debian sarge installer from a CD-ROM without experiencing the "Fast Data 
Access MMU Miss" error on boot. The same system was giving this error 
with the latest version of the Gentoo installer. I'm not sure how the 
SILO versions differ between the two, or whether that is the source of the 
problem, but I thought it might be of help to someone.


Great work,

-stephen

--
[EMAIL PROTECTED] http://www.skynet.ie/~stephen/

