Re: [PATCH] scsi: target/sbp: remove firewire SBP target driver

2020-06-16 Thread Chris Boot
On 16/06/2020 16:34, James Bottomley wrote:
> On Tue, 2020-06-16 at 14:13 +0000, Johannes Thumshirn wrote:
>> On 16/06/2020 16:09, Bart Van Assche wrote:
>>> On 2020-06-16 02:42, Finn Thain wrote:
>>>> Martin said, "I'd appreciate a patch to remove it"
>>>>
>>>> And Bart said, "do you want to keep this driver in the kernel
>>>> tree?"
>>>>
>>>> AFAICT both comments are quite ambiguous. I don't see an
>>>> actionable request, just an expression of interest from people
>>>> doing their jobs.
>>>>
>>>> Note well: there is no pay check associated with having a
>>>> MAINTAINERS file entry.
>>>
>>> Hi Finn,
>>>
>>> As far as I know the sbp driver has only ever had one user, and that
>>> user is no longer using the sbp driver. So why keep it in the
>>> kernel tree? Restoring a kernel driver can be easy - the first step
>>> is a "git revert".
>>
>> Why not move the driver to drivers/staging for 2 or 3 kernel releases
>> and if no one steps up, delete it?
> 
> Because that's pretty much the worst of all worlds: If the driver is
> simply going orphaned it can stay where it is to avoid confusion.  If
> it's being removed, it's better to remove it from where it is because
> that makes the patch to restore it easy to find.
> 
> Chris, the thing is this: if this driver has just one user on a stable
> distro who complains about its removal six months to two years from
> now, Linus will descend on us from a great height (which won't matter
> to you, since you'll be long gone).  This makes everyone very wary of
> outright removal.  If you're really, really sure it has no users, it
> can be deleted, but if there's the slightest chance it has just one, it
> should get orphaned.

My patch to delete the driver was based on Martin's original request:
https://lore.kernel.org/lkml/yq1img99d4k@ca-mkp.ca.oracle.com/

I don't especially want it to be gone, nor can I be sure there are no
users of what is as far as I can tell a working piece of code. I can
tell you that I never hear about it (other than the odd patch), whereas
I do get emails out of the blue for some of my other (much smaller)
stuff which clearly has users. I'd be just as happy for this to be
orphaned or for nothing to happen to it.

Honestly, I am totally ambivalent as to what happens to this code.
Martin, however, clearly cares enough to have asked me to supply a patch
to remove it.

Cheers,
Chris

-- 
Chris Boot
bo...@boo.tc


Re: [PATCH] scsi: target/sbp: remove firewire SBP target driver

2020-06-15 Thread Chris Boot
On 15/06/2020 00:28, Finn Thain wrote:
> On Sun, 14 Jun 2020, Chris Boot wrote:
> 
>> I expect that if someone finds this useful it can stick around (but 
>> that's not my call).
> 
> Whose call is that? If the patch had said "From: Martin K. Petersen" and 
> "This driver is being removed because it has the following defects..." 
> that would be some indication of a good-faith willingness to accept users 
> as developers in the spirit of the GPL, which is what you seem to be 
> alluding to (?).

If you're asking me, I'd say it was Martin's call:

> SCSI TARGET SUBSYSTEM
> M:  "Martin K. Petersen"
[...]
> F:  drivers/target/
> F:  include/target/

>> I just don't have the time or inclination or hardware to be able to 
>> maintain it anymore, so someone else would have to pick it up.
>>
> 
> Which is why most drivers get orphaned, right?

Sure, but that's not what Martin asked me to do, hence this patch.

-- 
Chris Boot
bo...@boo.tc


Re: [PATCH] scsi: target/sbp: remove firewire SBP target driver

2020-06-14 Thread Chris Boot
On 14/06/2020 01:03, Finn Thain wrote:
> On Sat, 13 Jun 2020, Chris Boot wrote:
> 
>> I no longer have the time to maintain this subsystem nor the hardware to
>> test patches with. 
> 
> Then why not patch MAINTAINERS, and orphan it, as per usual practice?
> 
> $ git log --oneline MAINTAINERS | grep -i orphan

My patch to remove it was in response to:

https://lore.kernel.org/lkml/yq1img99d4k@ca-mkp.ca.oracle.com/

>> It also doesn't appear to have any active users so I doubt anyone will 
>> miss it.
>>
> 
> It's not unusual that any Linux driver written more than 5 years ago 
> "doesn't appear to have any active users".
> 
> If a driver has been orphaned and broken in the past, and no-one stepped 
> up to fix it within a reasonable period, removal would make sense. But 
> that's not the case here.
> 
> I haven't used this driver for a long time, but I still own PowerMacs with 
> firewire, and I know I'm not the only one.

I expect that if someone finds this useful it can stick around (but
that's not my call). I just don't have the time or inclination or
hardware to be able to maintain it anymore, so someone else would have
to pick it up.

Cheers,
Chris

-- 
Chris Boot
bo...@boo.tc


[PATCH] scsi: target/sbp: remove SBP target driver

2020-06-13 Thread Chris Boot
I no longer have the time to maintain this subsystem nor the hardware to
test patches with. It also doesn't appear to have any active users so I
doubt anyone will miss it.

Signed-off-by: Chris Boot 
---
 MAINTAINERS |9 -
 drivers/target/Kconfig  |1 -
 drivers/target/Makefile |1 -
 drivers/target/sbp/Kconfig  |   12 -
 drivers/target/sbp/Makefile |2 -
 drivers/target/sbp/sbp_target.c | 2350 ---
 drivers/target/sbp/sbp_target.h |  243 
 7 files changed, 2618 deletions(-)
 delete mode 100644 drivers/target/sbp/Kconfig
 delete mode 100644 drivers/target/sbp/Makefile
 delete mode 100644 drivers/target/sbp/sbp_target.c
 delete mode 100644 drivers/target/sbp/sbp_target.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 56d7d27fc114..81b7db7d68a8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6669,15 +6669,6 @@ S:   Maintained
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media.git
 F: drivers/media/firewire/
 
-FIREWIRE SBP-2 TARGET
-M: Chris Boot 
-L: linux-s...@vger.kernel.org
-L: target-de...@vger.kernel.org
-L: linux1394-de...@lists.sourceforge.net
-S: Maintained
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/nab/lio-core-2.6.git master
-F: drivers/target/sbp/
-
 FIREWIRE SUBSYSTEM
 M: Stefan Richter 
 L: linux1394-de...@lists.sourceforge.net
diff --git a/drivers/target/Kconfig b/drivers/target/Kconfig
index c163b14774d7..4a5682745ada 100644
--- a/drivers/target/Kconfig
+++ b/drivers/target/Kconfig
@@ -46,6 +46,5 @@ config TCM_USER2
 source "drivers/target/loopback/Kconfig"
 source "drivers/target/tcm_fc/Kconfig"
 source "drivers/target/iscsi/Kconfig"
-source "drivers/target/sbp/Kconfig"
 
 endif
diff --git a/drivers/target/Makefile b/drivers/target/Makefile
index 45634747377e..c13da05af2e2 100644
--- a/drivers/target/Makefile
+++ b/drivers/target/Makefile
@@ -29,4 +29,3 @@ obj-$(CONFIG_TCM_USER2)   += target_core_user.o
 obj-$(CONFIG_LOOPBACK_TARGET)  += loopback/
 obj-$(CONFIG_TCM_FC)   += tcm_fc/
 obj-$(CONFIG_ISCSI_TARGET) += iscsi/
-obj-$(CONFIG_SBP_TARGET)   += sbp/
diff --git a/drivers/target/sbp/Kconfig b/drivers/target/sbp/Kconfig
deleted file mode 100644
index 53a1c75f5660..
--- a/drivers/target/sbp/Kconfig
+++ /dev/null
@@ -1,12 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-config SBP_TARGET
-   tristate "FireWire SBP-2 fabric module"
-   depends on FIREWIRE
-   help
- Say Y or M here to enable SCSI target functionality over FireWire.
- This enables you to expose SCSI devices to other nodes on the FireWire
- bus, for example hard disks. Similar to FireWire Target Disk mode on
- many Apple computers.
-
- To compile this driver as a module, say M here: The module will be
- called sbp-target.
diff --git a/drivers/target/sbp/Makefile b/drivers/target/sbp/Makefile
deleted file mode 100644
index 766f23690013..
--- a/drivers/target/sbp/Makefile
+++ /dev/null
@@ -1,2 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_SBP_TARGET) += sbp_target.o
diff --git a/drivers/target/sbp/sbp_target.c b/drivers/target/sbp/sbp_target.c
deleted file mode 100644
index e4a9b9fe3dfb..
--- a/drivers/target/sbp/sbp_target.c
+++ /dev/null
@@ -1,2350 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * SBP2 target driver (SCSI over IEEE1394 in target mode)
- *
- * Copyright (C) 2011  Chris Boot 
- */
-
-#define KMSG_COMPONENT "sbp_target"
-#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "sbp_target.h"
-
-/* FireWire address region for management and command block address handlers */
-static const struct fw_address_region sbp_register_region = {
-   .start  = CSR_REGISTER_BASE + 0x1,
-   .end= 0x1ULL,
-};
-
-static const u32 sbp_unit_directory_template[] = {
-   0x1200609e, /* unit_specifier_id: NCITS/T10 */
-   0x13010483, /* unit_sw_version: 1155D Rev 4 */
-   0x3800609e, /* command_set_specifier_id: NCITS/T10 */
-   0x390104d8, /* command_set: SPC-2 */
-   0x3b00, /* command_set_revision: 0 */
-   0x3c01, /* firmware_revision: 1 */
-};
-
-#define SESSION_MAINTENANCE_INTERVAL HZ
-
-static atomic_t login_id = ATOMIC_INIT(0);
-
-static void session_maintenance_work(struct work_struct *);
-static int sbp_run_transaction(struct fw_card *, int, int, int, int,
-   unsigned long long, void *, size_t);
-
-static int read_peer_guid(u64 *guid, const struct sbp_management_request *req)
-{
-   int ret;
-   __be32 high, low;
-
-   ret = sbp_run_transaction(req->card, TCODE_READ_Q

Re: [PATCH] sbp-target: add the missed kfree() in an error path

2020-05-28 Thread Chris Boot
On 28/05/2020 15:53, Bart Van Assche wrote:
> On 2020-05-28 03:20, Chuhong Yuan wrote:
>> sbp_fetch_command() forgets to call kfree() in an error path.
>> Add the missed call to fix it.
> 
> Hi Chris,
> 
> The changelog of the code under drivers/target/sbp makes me wonder
> whether this driver has ever had any other users than its original
> author. Do you agree with this? If so, do you want to keep this driver
> in the kernel tree?

Hi Bart,

I think you might be right. I also don't have much time to maintain it
these days and the hardware I had is long dead. It probably should be
removed for everyone's sanity.

Best regards,
Chris

-- 
Chris Boot
bo...@bootc.net
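
For readers unfamiliar with the class of bug that Chuhong Yuan's patch
addresses (a buffer is allocated, a later step fails, and the function
returns without freeing it), here is a minimal userspace sketch of the usual
"goto cleanup" pattern. It is not the sbp_target code: fetch_command(),
struct request, reader_fn, read_ok() and read_fail() are hypothetical names
made up for illustration, and malloc()/free() stand in for kmalloc()/kfree().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request object; not the driver's own structure. */
struct request {
	size_t len;
	unsigned char *cmd_buf;
};

/* Hypothetical transfer step: fills dst, returns 0 or a negative errno. */
typedef int (*reader_fn)(void *dst, size_t len);

/* Example reader that always succeeds. */
int read_ok(void *dst, size_t len)
{
	memset(dst, 0, len);
	return 0;
}

/* Example reader that always fails, to exercise the error path. */
int read_fail(void *dst, size_t len)
{
	(void)dst;
	(void)len;
	return -EIO;
}

/*
 * Allocate a command buffer and fill it via reader.
 * Returns 0 on success or a negative errno value on failure.
 * Every failure after the allocation goes through err_free, so the
 * buffer is never leaked and req is left untouched on error.
 */
int fetch_command(struct request *req, size_t len, reader_fn reader)
{
	unsigned char *buf;
	int ret;

	buf = malloc(len ? len : 1);
	if (!buf)
		return -ENOMEM;

	ret = reader(buf, len);
	if (ret < 0)
		goto err_free;	/* the path where a free is easy to forget */

	req->cmd_buf = buf;
	req->len = len;
	return 0;

err_free:
	free(buf);
	return ret;
}
```

Calling fetch_command() with read_fail returns -EIO and leaves req->cmd_buf
unset; with read_ok the caller owns req->cmd_buf and must free() it.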


Re: [PATCH] sbp-target: Delete an error message for a failed memory allocation in three functions

2017-12-10 Thread Chris Boot
On 10/12/2017 19:10, SF Markus Elfring wrote:
> From: Markus Elfring <elfr...@users.sourceforge.net>
> Date: Sun, 10 Dec 2017 19:54:11 +0100
> 
> Omit an extra message for a memory allocation failure in these functions.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring <elfr...@users.sourceforge.net>
[snip]

Looks good to me.

Acked-by: Chris Boot <bo...@boo.tc>

Thanks,
Chris

-- 
Chris Boot
bo...@boo.tc


BUG/panic in ctnetlink_conntrack_event in 4.8.11

2016-12-21 Thread Chris Boot
0
[147966.128285]  [] ? apic_timer_interrupt+0x82/0x90
[147966.134557][] ? cpuidle_enter_state+0x126/0x2d0
[147966.141555]  [] ? cpuidle_enter_state+0x113/0x2d0
[147966.147916]  [] ? cpu_startup_entry+0x2a2/0x350
[147966.154103]  [] ? start_secondary+0x14d/0x190
[147966.160117] ---[ end trace d5725bb00a2f3d6c ]---

Regards,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: [patch] sbp-target: checking for NULL instead of IS_ERR

2016-03-10 Thread Chris Boot
On 10/03/16 21:52, Chris Boot wrote:
> On 10/03/16 20:56, Chris Boot wrote:
>> On 05/03/16 09:33, Nicholas A. Bellinger wrote:
>>> On Sat, 2016-03-05 at 08:45 +0000, Chris Boot wrote:
>>>> Are these in linux-next or another branch somewhere I can easily clone
>>>> them from?
>>>
>>> The patch series is in target-pending/for-next.
>>
>> Hi Nic,
>>
>> I've just managed to resurrect a test rig for this (the hardware I had
>> for it has stopped being usable, yay!), and my initial testing shows the
>> updated code panics on the first submitted IO.
> 
> So this isn't the first IO, it's exactly the 2nd IO. I'm hitting
> BUG_ON(se_cmd->se_tfo || se_cmd->se_sess) in target_submit_cmd_map_sgls().
> 
> I'm assuming the se_cmd is being reused due to percpu ida allocator, and
> the code must be missing something to clean up the se_cmd sufficiently
> once we're done with it.
> 
> At this point I'm out of my depth going through the target core, so I'd
> appreciate some pointers to get any further!

Replying to myself again... Worked it out after reading the thread about the 
usb gadget target. Here's the patch you want to squash into your existing 
series:

diff --git a/drivers/target/sbp/sbp_target.c b/drivers/target/sbp/sbp_target.c
index a04b0605f8d0..d021997cc837 100644
--- a/drivers/target/sbp/sbp_target.c
+++ b/drivers/target/sbp/sbp_target.c
@@ -933,6 +933,7 @@ static struct sbp_target_request *sbp_mgt_get_req(struct sbp_session *sess,
 		return ERR_PTR(-ENOMEM);
 
 	req = &((struct sbp_target_request *)se_sess->sess_cmd_map)[tag];
+	memset(req, 0, sizeof(*req));
 	req->se_cmd.map_tag = tag;
 	req->se_cmd.tag = next_orb;
 
@@ -1619,12 +1620,8 @@ static void sbp_mgt_agent_rw(struct fw_card *card,
 			rcode = RCODE_CONFLICT_ERROR;
 			goto out;
 		}
-		// XXX:
-#if 0
-		req = sbp_mgt_get_req(agent->login->sess, card);
-#else
+
 		req = kzalloc(sizeof(*req), GFP_ATOMIC);
-#endif
 		if (!req) {
 			rcode = RCODE_CONFLICT_ERROR;
 			goto out;

I hope Thunderbird hasn't mangled this too badly.

With this applied, please add this to the patch for sbp_target:

Acked-by: Chris Boot <bo...@bootc.net>

Thanks,
Chris

-- 
Chris Boot
bo...@bootc.net
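
The memset() in the patch above matters because, as described in the message,
a command descriptor taken from a preallocated map can come back with fields
still set by its previous user. The following is a minimal userspace sketch of
that effect, under stated assumptions: struct pool, struct cmd, slot_get() and
slot_put() are hypothetical names, and a plain in_use array stands in for the
kernel's percpu_ida tag allocator, which this sketch does not reproduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4

/* Hypothetical stand-in for a per-command descriptor. */
struct cmd {
	int tag;
	const void *session;	/* expected to be NULL on a fresh command */
};

/* Hypothetical preallocated pool; zero-initialise it before first use. */
struct pool {
	struct cmd slots[POOL_SIZE];
	bool in_use[POOL_SIZE];
};

/*
 * Hand out the first free slot. With clear_on_get set, the slot is
 * zeroed before reuse; without it, fields written by the previous
 * user survive. Returns NULL when the pool is exhausted.
 */
struct cmd *slot_get(struct pool *p, bool clear_on_get)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (p->in_use[i])
			continue;
		p->in_use[i] = true;
		if (clear_on_get)
			memset(&p->slots[i], 0, sizeof(p->slots[i]));
		p->slots[i].tag = i;
		return &p->slots[i];
	}
	return NULL;
}

/* Return a slot to the pool without touching its contents. */
void slot_put(struct pool *p, struct cmd *c)
{
	assert(c >= p->slots && c < p->slots + POOL_SIZE);
	p->in_use[c - p->slots] = false;
}
```

Getting a slot, setting its session field, putting it back and getting it
again without clearing yields a descriptor whose session is still set, which
is the kind of stale state a sanity check on submission would trip over.
Clearing on every get avoids that at the cost of one memset() per command.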


Re: [patch] sbp-target: checking for NULL instead of IS_ERR

2016-03-10 Thread Chris Boot
On 10/03/16 20:56, Chris Boot wrote:
> On 05/03/16 09:33, Nicholas A. Bellinger wrote:
>> On Sat, 2016-03-05 at 08:45 +0000, Chris Boot wrote:
>>> Are these in linux-next or another branch somewhere I can easily clone
>>> them from?
>>
>> The patch series is in target-pending/for-next.
> 
> Hi Nic,
> 
> I've just managed to resurrect a test rig for this (the hardware I had
> for it has stopped being usable, yay!), and my initial testing shows the
> updated code panics on the first submitted IO.

So this isn't the first IO, it's exactly the 2nd IO. I'm hitting
BUG_ON(se_cmd->se_tfo || se_cmd->se_sess) in target_submit_cmd_map_sgls().

I'm assuming the se_cmd is being reused due to percpu ida allocator, and
the code must be missing something to clean up the se_cmd sufficiently
once we're done with it.

At this point I'm out of my depth going through the target core, so I'd
appreciate some pointers to get any further!

Thanks,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: [patch] sbp-target: checking for NULL instead of IS_ERR

2016-03-10 Thread Chris Boot
On 05/03/16 09:33, Nicholas A. Bellinger wrote:
> On Sat, 2016-03-05 at 08:45 +0000, Chris Boot wrote:
>> Are these in linux-next or another branch somewhere I can easily clone
>> them from?
> 
> The patch series is in target-pending/for-next.

Hi Nic,

I've just managed to resurrect a test rig for this (the hardware I had
for it has stopped being usable, yay!), and my initial testing shows the
updated code panics on the first submitted IO.

I'll go and debug it now and see what I can get from it, but I thought
I'd let you know ASAP.

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: [patch] sbp-target: checking for NULL instead of IS_ERR

2016-03-05 Thread Chris Boot
On 5 Mar 2016, at 07:33, Nicholas A. Bellinger <n...@linux-iscsi.org> wrote:
> 
> Hi Dan + BootC,
> 
> On Wed, 2016-03-02 at 13:09 +0300, Dan Carpenter wrote:
>> We changed this from kzalloc to sbp_mgt_get_req() so we need to change
>> from checking for NULL to check for error pointers.
>> 
>> Fixes: c064b2a78989 ('sbp-target: Conversion to percpu_ida tag pre-allocation')
>> Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>
>> 
>> diff --git a/drivers/target/sbp/sbp_target.c b/drivers/target/sbp/sbp_target.c
>> index 251d532..a04b0605f 100644
>> --- a/drivers/target/sbp/sbp_target.c
>> +++ b/drivers/target/sbp/sbp_target.c
>> @@ -951,7 +951,7 @@ static void tgt_agent_fetch_work(struct work_struct *work)
>> 
>>  	while (next_orb && tgt_agent_check_active(agent)) {
>>  		req = sbp_mgt_get_req(sess, sess->card, next_orb);
>> -		if (!req) {
>> +		if (IS_ERR(req)) {
>>  			spin_lock_bh(&agent->lock);
>>  			agent->state = AGENT_STATE_DEAD;
>>  			spin_unlock_bh(&agent->lock);
> 
> Fixed + folded into the original patch.
> 
> Thanks Dan.
> 
> Chris, would you be so kind to review the original changes here:
> 
> sbp-target: Conversion to percpu_ida tag pre-allocation
> http://www.spinics.net/lists/target-devel/msg11778.html
> 
> sbp-target: Convert to TARGET_SCF_ACK_KREF I/O krefs
> http://www.spinics.net/lists/target-devel/msg11780.html
> 
> and verify on your local IEEE1394 target setup..?

Hi Nic, Dan,

I’m away this weekend so I can’t test these for a few days at least,
unfortunately. I must admit I only vaguely follow the changes here as I
haven’t been keeping up with the pace of change in target-devel lately,
but it generally looks OK I think.

Are these in linux-next or another branch somewhere I can easily clone
them from?

How soon do you need my ACK/NAK on these?

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net
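
Dan's fix above turns on the difference between a NULL return and an error
pointer: a function that reports failure with ERR_PTR() never returns NULL on
that path, so an "if (!req)" test silently lets the error value through. The
sketch below is a simplified userspace imitation of the kernel's ERR_PTR(),
PTR_ERR() and IS_ERR() helpers, written only to show the idea. It is not the
kernel header, and get_req() is a hypothetical allocator, not
sbp_mgt_get_req().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Largest errno value that can be encoded in a pointer (as in the kernel). */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer into the top of the address range. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr lies in the last MAX_ERRNO bytes of the address range. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical allocator that reports failure via an error pointer.
 * It never returns NULL, so callers must test with IS_ERR().
 */
int *get_req(int fail)
{
	int *req;

	if (fail)
		return ERR_PTR(-ENOMEM);

	req = calloc(1, sizeof(*req));
	return req ? req : ERR_PTR(-ENOMEM);
}
```

With get_req(1), a "!req" check is false because the returned value is
non-NULL, while IS_ERR(req) is true and PTR_ERR(req) gives back -ENOMEM. A
pointer from a successful get_req(0) fails the IS_ERR() test and should be
released with free().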


qla2xxx firmware crashes in target mode

2015-10-19 Thread Chris Boot
4 Gbps).
[484976.448002] qla2xxx [:05:00.0]-0121:9: Failed to enable
receiving of RSCN requests: 0x2.

HTH,
Chris

-- 
Chris Boot
bo...@bootc.net
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Panic on 3.10.18 in nf_conntrack_sip with IPv6

2013-11-10 Thread Chris Boot
 ? nf_hook_thresh.constprop.36+0x2e/0x33
> [  799.941898]  [] ? nf_hook_thresh.constprop.36+0x2e/0x33
> [  799.963582]  [] ? ip6_output+0x7a/0x83
> [  799.983090]  [] ? ip6_forward+0x5fd/0x69e
> [  800.001437]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.019612]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.036639]  [] ? __ipv6_conntrack_in+0xc4/0x13f [nf_conntrack_ipv6]
> [  800.057257]  [] ? nf_iterate+0x42/0x80
> [  800.075044]  [] ? nf_hook_slow+0x69/0x100
> [  800.092089]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.108860]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.353154]  [] ? nf_ct_frag6_output+0x9f/0xe8 [nf_defrag_ipv6]
> [  800.371387]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.387677]  [] ? ipv6_defrag+0xbb/0xcf [nf_defrag_ipv6]
> [  800.406280]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.424998]  [] ? nf_iterate+0x42/0x80
> [  800.441318]  [] ? nf_hook_slow+0x69/0x100
> [  800.457694]  [] ? pskb_may_pull+0x2d/0x2d
> [  800.474649]  [] ? nf_hook_thresh.constprop.13+0x34/0x39
> [  800.495046]  [] ? ipv6_rcv+0x2bb/0x30b
> [  800.511896]  [] ? __netif_receive_skb_core+0x437/0x4af
> [  800.532539]  [] ? netif_receive_skb+0x42/0x73
> [  800.551414]  [] ? napi_gro_receive+0x35/0x76
> [  800.568152]  [] ? e1000_clean_rx_irq+0x249/0x2cb [e1000e]
> [  800.589151]  [] ? e1000e_poll+0x65/0x203 [e1000e]
> [  800.606255]  [] ? ktime_get+0x5f/0x6b
> [  800.622019]  [] ? net_rx_action+0xa7/0x1d9
> [  800.640555]  [] ? _raw_spin_unlock_irqrestore+0xc/0xd
> [  800.658116]  [] ? add_interrupt_randomness+0x39/0x16f
> [  800.677242]  [] ? __do_softirq+0xe4/0x1f9
> [  800.696819]  [] ? call_softirq+0x1c/0x30
> [  800.713258]  [] ? do_softirq+0x3a/0x78
> [  800.731145]  [] ? irq_exit+0x3f/0x83
> [  800.747466]  [] ? do_IRQ+0x81/0x97
> [  800.763133]  [] ? common_interrupt+0x6d/0x6d
> [  800.780351]   
> [  800.782499]  [] ? clockevents_program_event+0x9a/0xb6
> [  800.813469]  [] ? arch_local_irq_enable+0x4/0x8
> [  800.831484]  [] ? cpuidle_enter_state+0x46/0xb1
> [  800.849729]  [] ? cpuidle_idle_call+0xcf/0x126
> [  800.869185]  [] ? arch_cpu_idle+0x6/0x1a
> [  800.885493]  [] ? cpu_startup_entry+0x106/0x169
> [  800.902532]  [] ? start_kernel+0x3d7/0x3e2
> [  800.922455]  [] ? repair_env_string+0x57/0x57
> [  800.939302]  [] ? x86_64_start_kernel+0xf2/0xfd
> [  800.956528] Code: c3 41 57 41 56 41 55 41 54 55 53 48 89 fb 55 8b 87 dc 00 
> 00 00 89 f5 01 f0 01 c2 85 f6 79 02 0f 0b 8b 87 f4 00 00 00 ff c8 74 02 <0f> 
> 0b 83 c2 3f 89 c8 41 89 cd 80 cc 20 83 e2 c0 f6 87 b2 00 00 
> [  801.015687] RIP  [] pskb_expand_head+0x2a/0x1e1
> [  801.034404]  RSP 
> [  801.049813] ---[ end trace a0ea98f51afb8cc0 ]---
> [  801.454124] Kernel panic - not syncing: Fatal exception in interrupt
> [  801.474385] Rebooting in 120 seconds..

The O taint is due to loading LinBIT's drbd module. The crash occurs
even without this, and also in a 3.7.10 kernel that I was using before.

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net


Panic on 3.10.18 in nf_conntrack_sip with IPv6

2013-11-10 Thread Chris Boot
]  [813419ac] ? nf_hook_thresh.constprop.36+0x2e/0x33
 [  799.963582]  [81344437] ? ip6_output+0x7a/0x83
 [  799.983090]  [81343a10] ? ip6_forward+0x5fd/0x69e
 [  800.001437]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.019612]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.036639]  [a04746ac] ? __ipv6_conntrack_in+0xc4/0x13f 
 [nf_conntrack_ipv6]
 [  800.057257]  [812f201a] ? nf_iterate+0x42/0x80
 [  800.075044]  [812f20c1] ? nf_hook_slow+0x69/0x100
 [  800.092089]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.108860]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.353154]  [a046bc5a] ? nf_ct_frag6_output+0x9f/0xe8 
 [nf_defrag_ipv6]
 [  800.371387]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.387677]  [a046b0bc] ? ipv6_defrag+0xbb/0xcf [nf_defrag_ipv6]
 [  800.406280]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.424998]  [812f201a] ? nf_iterate+0x42/0x80
 [  800.441318]  [812f20c1] ? nf_hook_slow+0x69/0x100
 [  800.457694]  [8134446d] ? pskb_may_pull+0x2d/0x2d
 [  800.474649]  [813445b9] ? nf_hook_thresh.constprop.13+0x34/0x39
 [  800.495046]  [81344b43] ? ipv6_rcv+0x2bb/0x30b
 [  800.511896]  [812cea5d] ? __netif_receive_skb_core+0x437/0x4af
 [  800.532539]  [812ceca1] ? netif_receive_skb+0x42/0x73
 [  800.551414]  [812cf419] ? napi_gro_receive+0x35/0x76
 [  800.568152]  [a012e20b] ? e1000_clean_rx_irq+0x249/0x2cb [e1000e]
 [  800.589151]  [a0131698] ? e1000e_poll+0x65/0x203 [e1000e]
 [  800.606255]  [810742f4] ? ktime_get+0x5f/0x6b
 [  800.622019]  [812cf1b8] ? net_rx_action+0xa7/0x1d9
 [  800.640555]  [8139238c] ? _raw_spin_unlock_irqrestore+0xc/0xd
 [  800.658116]  [812730de] ? add_interrupt_randomness+0x39/0x16f
 [  800.677242]  [8104244a] ? __do_softirq+0xe4/0x1f9
 [  800.696819]  [81398bdc] ? call_softirq+0x1c/0x30
 [  800.713258]  [8100e9ee] ? do_softirq+0x3a/0x78
 [  800.731145]  [8104262a] ? irq_exit+0x3f/0x83
 [  800.747466]  [8100e6ff] ? do_IRQ+0x81/0x97
 [  800.763133]  [8139262d] ? common_interrupt+0x6d/0x6d
 [  800.780351]  EOI 
 [  800.782499]  [81078ffb] ? clockevents_program_event+0x9a/0xb6
 [  800.813469]  [812a8110] ? arch_local_irq_enable+0x4/0x8
 [  800.831484]  [812a84db] ? cpuidle_enter_state+0x46/0xb1
 [  800.849729]  [812a8615] ? cpuidle_idle_call+0xcf/0x126
 [  800.869185]  [81013b3b] ? arch_cpu_idle+0x6/0x1a
 [  800.885493]  [81073255] ? cpu_startup_entry+0x106/0x169
 [  800.902532]  [816b5d40] ? start_kernel+0x3d7/0x3e2
 [  800.922455]  [816b577f] ? repair_env_string+0x57/0x57
 [  800.939302]  [816b559a] ? x86_64_start_kernel+0xf2/0xfd
 [  800.956528] Code: c3 41 57 41 56 41 55 41 54 55 53 48 89 fb 55 8b 87 dc 00 
 00 00 89 f5 01 f0 01 c2 85 f6 79 02 0f 0b 8b 87 f4 00 00 00 ff c8 74 02 0f 
 0b 83 c2 3f 89 c8 41 89 cd 80 cc 20 83 e2 c0 f6 87 b2 00 00 
 [  801.015687] RIP  [812c5b22] pskb_expand_head+0x2a/0x1e1
 [  801.034404]  RSP 88043fc037c0
 [  801.049813] ---[ end trace a0ea98f51afb8cc0 ]---
 [  801.454124] Kernel panic - not syncing: Fatal exception in interrupt
 [  801.474385] Rebooting in 120 seconds..

The O taint is due to loading LinBIT's drbd module. The crash occurs
even without this, and also in a 3.7.10 kernel that I was using before.

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: PANIC at net/xfrm/xfrm_output.c:125 (3.9.4)

2013-06-27 Thread Chris Boot
On 26/06/2013 23:17, David Miller wrote:
> From: Chris Boot 
> Date: Thu, 20 Jun 2013 21:36:44 +0100
> 
>> On 06/06/2013 09:38, Timo Teras wrote:
>>> On Thu, 06 Jun 2013 08:47:56 +0100
>>> Chris Boot  wrote:
>>> 
>>>> On 06/06/13 02:24, Fan Du wrote:
>>>>> Hello Chris/Jean
>>>>>
>>>>> This issue might have already been fixed by this:
>>>>> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/net/xfrm/xfrm_output.c?id=497574c72c9922cf20c12aed15313c389f722fa0
>>>>>
>>>>>
>>>>> Hope it helps.
>>>>
>>>> Hi Fan, Jean,
>>>>
>>>> Thanks, that looks like it's the patch for exactly my problem.
>>>> Unfortunately I can't test it until next week now. :-/
>>>>
>>>> Timo/Dave: are there any plans to push this into 3.10-rc and/or
>>>> stable? I seem to be able to hit the issue pretty reliably.
>>> 
>>> It is already present in 3.10-rc3 [1], and Dave has it queued for
>>> 3.9-stable [2].
>>> 
>>> - Timo
>>> 
>>> [1] http://lwn.net/Articles/551922/
>>> [2] http://patchwork.ozlabs.org/patch/245594/
>> 
>> I'm just wondering if this patch has got lost in the cracks; I reported
>> the issue in 3.9.4 and 3.9.7 is just out without any sign of it. Have I
>> missed something?
> 
> It got submitted to -stable last week.

Dave,

Thank you, I see it's in 3.9.8, which has just been released.

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: PANIC at net/xfrm/xfrm_output.c:125 (3.9.4)

2013-06-20 Thread Chris Boot
On 06/06/2013 09:38, Timo Teras wrote:
> On Thu, 06 Jun 2013 08:47:56 +0100
> Chris Boot  wrote:
> 
>> On 06/06/13 02:24, Fan Du wrote:
>>> Hello Chris/Jean
>>>
>>> This issue might have already been fixed by this:
>>> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/net/xfrm/xfrm_output.c?id=497574c72c9922cf20c12aed15313c389f722fa0
>>>
>>>
>>> Hope it helps.
>>
>> Hi Fan, Jean,
>>
>> Thanks, that looks like it's the patch for exactly my problem.
>> Unfortunately I can't test it until next week now. :-/
>>
>> Timo/Dave: are there any plans to push this into 3.10-rc and/or
>> stable? I seem to be able to hit the issue pretty reliably.
> 
> It is already present in 3.10-rc3 [1], and Dave has it queued for
> 3.9-stable [2].
> 
> - Timo
> 
> [1] http://lwn.net/Articles/551922/
> [2] http://patchwork.ozlabs.org/patch/245594/

Hi folks,

I'm just wondering if this patch has got lost in the cracks; I reported
the issue in 3.9.4 and 3.9.7 is just out without any sign of it. Have I
missed something?

Thanks,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: PANIC at net/xfrm/xfrm_output.c:125 (3.9.4)

2013-06-06 Thread Chris Boot
On 06/06/13 09:38, Timo Teras wrote:
> On Thu, 06 Jun 2013 08:47:56 +0100
> Chris Boot  wrote:
> 
>> On 06/06/13 02:24, Fan Du wrote:
>>> Hello Chris/Jean
>>>
>>> This issue might have already been fixed by this:
>>> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/net/xfrm/xfrm_output.c?id=497574c72c9922cf20c12aed15313c389f722fa0
>>>
>>>
>>> Hope it helps.
>>
>> Hi Fan, Jean,
>>
>> Thanks, that looks like it's the patch for exactly my problem.
>> Unfortunately I can't test it until next week now. :-/
>>
>> Timo/Dave: are there any plans to push this into 3.10-rc and/or
>> stable? I seem to be able to hit the issue pretty reliably.
> 
> It is already present in 3.10-rc3 [1], and Dave has it queued for
> 3.9-stable [2].
> 
> - Timo
> 
> [1] http://lwn.net/Articles/551922/
> [2] http://patchwork.ozlabs.org/patch/245594/

Thank you!

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net


Re: PANIC at net/xfrm/xfrm_output.c:125 (3.9.4)

2013-06-06 Thread Chris Boot
On 06/06/13 02:24, Fan Du wrote:
> Hello Chris/Jean
> 
> This issue might have already been fixed by this:
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/net/xfrm/xfrm_output.c?id=497574c72c9922cf20c12aed15313c389f722fa0
> 
> 
> Hope it helps.

Hi Fan, Jean,

Thanks, that looks like it's the patch for exactly my problem.
Unfortunately I can't test it until next week now. :-/

Timo/Dave: are there any plans to push this into 3.10-rc and/or stable?
I seem to be able to hit the issue pretty reliably.

Thanks,
Chris
> On 2013-06-06 09:04, Jean Sacren wrote:
>> From: Chris Boot
>> Date: Wed, 05 Jun 2013 22:47:48 +0100
>>>
>>> Hi folks,
>>>
>>> I have a re-purposed Watchguard Firebox running Debian GNU/Linux with a
>>> self-built vanilla 3.9.4 kernel. I have an IPsec tunnel up to a remote
>>> router through which I was passing a fair bit of traffic when I hit the
>>> following panic:
>>>
>>> [486832.949560] BUG: unable to handle kernel NULL pointer dereference at
>>> 0010
>>> [486832.953431] IP: [] xfrm_output_resume+0x61/0x29f
>>> [486832.953431] *pde = 
>>> [486832.953431] Oops:  [#1]
>>> [486832.953431] Modules linked in: xt_realm xt_nat authenc esp4
>>> xfrm4_mode_tunnel tun ip6table_nat nf_nat_ipv6 sch_fq_codel xt_statistic
>>> xt_CT xt_LOG xt_connlimit xt_recent xt_time xt_TCPMSS xt_sctp
>>> ip6t_REJECT pppoe deflate zlib_deflate pppox ctr twofish_generic
>>> twofish_i586 twofish_common camellia_generic serpent_sse2_i586 xts
>>> serpent_generic lrw gf128mul glue_helper ablk_helper cryptd
>>> blowfish_generic blowfish_common cast5_generic cast_common des_generic
>>> cbc xcbc rmd160 sha512_generic sha256_generic sha1_generic hmac
>>> crypto_null af_key xfrm_algo xt_comment xt_addrtype xt_policy
>>> ip_set_hash_ip ipt_ULOG ipt_REJECT ipt_MASQUERADE ipt_ECN ipt_CLUSTERIP
>>> ipt_ah act_police cls_basic cls_flow cls_fw cls_u32 sch_tbf sch_prio
>>> sch_htb sch_hfsc sch_ingress sch_sfq xt_set ip_set nf_nat_tftp
>>> nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp
>>> nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp
>>> nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip
>>> nf_conntrack_proto_udplite nf_conntrack_proto_sctp nf_conntrack_pptp
>>> nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns
>>> nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323
>>> nf_conntrack_ftp xt_TPROXY nf_tproxy_core xt_tcpmss xt_pkttype
>>> xt_physdev xt_owner xt_NFQUEUE xt_NFLOG nfnetlink_log xt_multiport
>>> xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit
>>> xt_DSCP xt_dscp xt_dccp xt_connmark xt_CLASSIFY xt_AUDIT xt_state
>>> nfnetlink bridge 8021q garp stp mrp llc ppp_generic slhc
>>> nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw
>>> ip6table_filter ip6_tables xt_tcpudp xt_conntrack iptable_mangle
>>> iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
>>> nf_conntrack iptable_raw iptable_filter ip_tables x_tables w83627hf
>>> hwmon_vid loop iTCO_wdt iTCO_vendor_support evdev snd_pcm snd_page_alloc
>>> snd_timer snd soundcore acpi_cpufreq mperf processor pcspkr serio_raw
>>> drm_kms_helper lpc_ich i2c_i801 of_i2c drm rng_core thermal_sys
>>> i2c_algo_bit ehci_pci i2c_core ext4 crc16 jbd2 mbcache dm_mod sg sd_mod
>>> crc_t10dif ata_generic ata_piix uhci_hcd ehci_hcd libata microcode
>>> scsi_mod skge sky2 usbcore usb_common
>>> [486832.953431] Pid: 0, comm: swapper Not tainted 3.9.4-1-bootc #1
>>> [486832.953431] EIP: 0060:[] EFLAGS: 00210246 CPU: 0
>>> [486832.953431] EIP is at xfrm_output_resume+0x61/0x29f
>>> [486832.953431] EAX:  EBX: f3fbc100 ECX: f77f1288 EDX: f6130200
>>> [486832.953431] ESI: 0016 EDI:  EBP: f70b3c00 ESP: c1407c44
>>> [486832.953431]  DS: 007b ES: 007b FS:  GS: 00e0 SS: 0068
>>> [486832.953431] CR0: 8005003b CR2: 0010 CR3: 37247000 CR4: 07d0
>>> [486832.953431] DR0:  DR1:  DR2:  DR3: 
>>> [486832.953431] DR6: 0ff0 DR7: 0400
>>> [486832.953431] Process swapper (pid: 0, ti=c1406000 task=c1413490
>>> task.ti=c1406000)
>>> [486832.953431] Stack:
>>> [486832.953431]  c129d44f 8000 0002 c1457254 f3fbc100 c129d44f
>>>  0008
>>> [486832.953431]  c129d49e  f4524000 c129d44f 8000 
>>> f3fbc100 c1268b49
>>> [486832.953431]  f3fbc100 f127604

Re: PANIC at net/xfrm/xfrm_output.c:125 (3.9.4)

2013-06-06 Thread Chris Boot
]
 [486832.953431]  [f8499310] ? NF_HOOK_THRESH+0x1d/0x4c [bridge]
 [486832.953431]  [f8495353] ? br_handle_local_finish+0x4d/0x4d
 [bridge]
 [486832.953431]  [f849983e] ? br_nf_pre_routing_finish+0x1c8/0x1d2
 [bridge]
 [486832.953431]  [f8495353] ? br_handle_local_finish+0x4d/0x4d
 [bridge]
 [486832.953431]  [c12635d1] ? nf_hook_slow+0x52/0xed
 [486832.953431]  [f8499676] ? nf_bridge_alloc.isra.18+0x32/0x32
 [bridge]
 [486832.953431]  [f8499676] ? nf_bridge_alloc.isra.18+0x32/0x32
 [bridge]
 [486832.953431]  [f8499310] ? NF_HOOK_THRESH+0x1d/0x4c [bridge]
 [486832.953431]  [f8499676] ? nf_bridge_alloc.isra.18+0x32/0x32
 [bridge]
 [486832.953431]  [f849a1c0] ? br_nf_pre_routing+0x32c/0x33f [bridge]
 [486832.953431]  [f8499676] ? nf_bridge_alloc.isra.18+0x32/0x32
 [bridge]
 [486832.953431]  [c1263552] ? nf_iterate+0x3c/0x69
 [486832.953431]  [f8495353] ? br_handle_local_finish+0x4d/0x4d
 [bridge]
 [486832.953431]  [c12635d1] ? nf_hook_slow+0x52/0xed
 [486832.953431]  [f8495353] ? br_handle_local_finish+0x4d/0x4d
 [bridge]
 [486832.953431]  [f84952fa] ? nf_hook_thresh.constprop.10+0x36/0x42
 [bridge]
 [486832.953431]  [f8495353] ? br_handle_local_finish+0x4d/0x4d
 [bridge]
 [486832.953431]  [f8495746] ? br_handle_frame+0x18f/0x1b5 [bridge]
 [486832.953431]  [f8495353] ? br_handle_local_finish+0x4d/0x4d
 [bridge]
 [486832.953431]  [f84955b7] ? br_handle_frame_finish+0x264/0x264
 [bridge]
 [486832.953431]  [c12467a8] ? __netif_receive_skb_core+0x2b5/0x406
 [486832.953431]  [c1051a58] ? __getnstimeofday+0x17/0x52
 [486832.953431]  [c1051a00] ? get_monotonic_boottime+0x73/0x92
 [486832.953431]  [c124704f] ? napi_gro_receive+0x2e/0x69
 [486832.953431]  [c10053d8] ?
 __stop_machine.isra.0.constprop.1+0x27/0x27
 [486832.953431]  [f80792d7] ? sky2_poll+0x6d8/0x8f3 [sky2]
 [486832.953431]  [c1006058] ? native_sched_clock+0x40/0x98
 [486832.953431]  [c1006058] ? native_sched_clock+0x40/0x98
 [486832.953431]  [c1005962] ? paravirt_sched_clock+0x8/0xb
 [486832.953431]  [c1006058] ? native_sched_clock+0x40/0x98
 [486832.953431]  [c1246bbf] ? net_rx_action+0x6e/0x180
 [486832.953431]  [c1005962] ? paravirt_sched_clock+0x8/0xb
 [486832.953431]  [c102ca5a] ? __do_softirq+0xa5/0x19e
 [486832.953431]  [c102cbfa] ? irq_exit+0x36/0x69
 [486832.953431]  [c100326b] ? do_IRQ+0x6e/0x81
 [486832.953431]  [c12e4cf3] ? common_interrupt+0x33/0x38
 [486832.953431]  [c101df1b] ? native_safe_halt+0x2/0x3
 [486832.953431]  [c1006b2f] ? default_idle+0x23/0x3e
 [486832.953431]  [c10070cd] ? cpu_idle+0x75/0x8f
 [486832.953431]  [c145996b] ? start_kernel+0x34e/0x353
 [486832.953431]  [c1459465] ? repair_env_string+0x4d/0x4d
 [486832.953431] Code: f9 ff 8b 43 74 c7 43 70 00 00 00 00 85 c0 74 0e ff
 08 0f 94 c2 84 d2 74 05 e8 c7 6b e1 ff 8b 43 48 c7 43 74 00 00 00 00 83
 e0 fe <8b> 50 10 89 d8 ff 52 34 83 f8 01 89 c7 0f 85 21 02 00 00 8b 53
 [486832.953431] EIP: [c12a4dd0] xfrm_output_resume+0x61/0x29f SS:ESP
 0068:c1407c44
 [486832.953431] CR2: 0010
 [486833.573872] ---[ end trace ed321ebdc197b3d7 ]---
 [486833.578576] Kernel panic - not syncing: Fatal exception in interrupt
 [486833.582572] Rebooting in 60 seconds..

 (gdb) list *xfrm_output_resume+0x61
 0xc12a4dd0 is in xfrm_output_resume (net/xfrm/xfrm_output.c:125).
 120 int xfrm_output_resume(struct sk_buff *skb, int err)
 121 {
 122 while (likely((err = xfrm_output_one(skb, err)) == 0)) {
 123 nf_reset(skb);
 124
 125 err = skb_dst(skb)->ops->local_out(skb);
 126 if (unlikely(err != 1))
 127 goto out;
 128
 129 if (!skb_dst(skb)->xfrm)

 Try this:

 diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
 index bcfda89..0cf003d 100644
 --- a/net/xfrm/xfrm_output.c
 +++ b/net/xfrm/xfrm_output.c
 @@ -64,6 +64,7 @@ static int xfrm_output_one(struct sk_buff *skb, int err)

   if (unlikely(x->km.state != XFRM_STATE_VALID)) {
   XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTSTATEINVALID);
 +err = -EINVAL;
   goto error;
   }


 


-- 
Chris Boot
bo...@bootc.net


PANIC at net/xfrm/xfrm_output.c:125 (3.9.4)

2013-06-05 Thread Chris Boot
]  [] ? br_handle_local_finish+0x4d/0x4d [bridge]
[486832.953431]  [] ? nf_hook_thresh.constprop.10+0x36/0x42
[bridge]
[486832.953431]  [] ? br_handle_local_finish+0x4d/0x4d [bridge]
[486832.953431]  [] ? br_handle_frame+0x18f/0x1b5 [bridge]
[486832.953431]  [] ? br_handle_local_finish+0x4d/0x4d [bridge]
[486832.953431]  [] ? br_handle_frame_finish+0x264/0x264 [bridge]
[486832.953431]  [] ? __netif_receive_skb_core+0x2b5/0x406
[486832.953431]  [] ? __getnstimeofday+0x17/0x52
[486832.953431]  [] ? get_monotonic_boottime+0x73/0x92
[486832.953431]  [] ? napi_gro_receive+0x2e/0x69
[486832.953431]  [] ? __stop_machine.isra.0.constprop.1+0x27/0x27
[486832.953431]  [] ? sky2_poll+0x6d8/0x8f3 [sky2]
[486832.953431]  [] ? native_sched_clock+0x40/0x98
[486832.953431]  [] ? native_sched_clock+0x40/0x98
[486832.953431]  [] ? paravirt_sched_clock+0x8/0xb
[486832.953431]  [] ? native_sched_clock+0x40/0x98
[486832.953431]  [] ? net_rx_action+0x6e/0x180
[486832.953431]  [] ? paravirt_sched_clock+0x8/0xb
[486832.953431]  [] ? __do_softirq+0xa5/0x19e
[486832.953431]  [] ? irq_exit+0x36/0x69
[486832.953431]  [] ? do_IRQ+0x6e/0x81
[486832.953431]  [] ? common_interrupt+0x33/0x38
[486832.953431]  [] ? native_safe_halt+0x2/0x3
[486832.953431]  [] ? default_idle+0x23/0x3e
[486832.953431]  [] ? cpu_idle+0x75/0x8f
[486832.953431]  [] ? start_kernel+0x34e/0x353
[486832.953431]  [] ? repair_env_string+0x4d/0x4d
[486832.953431] Code: f9 ff 8b 43 74 c7 43 70 00 00 00 00 85 c0 74 0e ff
08 0f 94 c2 84 d2 74 05 e8 c7 6b e1 ff 8b 43 48 c7 43 74 00 00 00 00 83
e0 fe <8b> 50 10 89 d8 ff 52 34 83 f8 01 89 c7 0f 85 21 02 00 00 8b 53
[486832.953431] EIP: [] xfrm_output_resume+0x61/0x29f SS:ESP
0068:c1407c44
[486832.953431] CR2: 0010
[486833.573872] ---[ end trace ed321ebdc197b3d7 ]---
[486833.578576] Kernel panic - not syncing: Fatal exception in interrupt
[486833.582572] Rebooting in 60 seconds..

(gdb) list *xfrm_output_resume+0x61
0xc12a4dd0 is in xfrm_output_resume (net/xfrm/xfrm_output.c:125).
120 int xfrm_output_resume(struct sk_buff *skb, int err)
121 {
122 while (likely((err = xfrm_output_one(skb, err)) == 0)) {
123 nf_reset(skb);
124
125 err = skb_dst(skb)->ops->local_out(skb);
126 if (unlikely(err != 1))
127 goto out;
128
129 if (!skb_dst(skb)->xfrm)

Not knowing anything much about networking in the kernel I can't go any
further, but I'm happy to try out patches and poke around with a little
guidance.

I should add that the box doesn't reboot after 60 seconds and the
watchdog doesn't seem to kick in either, but that's clearly not a
networking issue. It reboots fine with the 'reboot' command.

Cheers,
Chris

-- 
Chris Boot
bo...@bootc.net


drbd: kernels 3.7 => 3.8 broken userspace compatibility

2013-05-06 Thread Chris Boot
Hi all,

I upgraded from a 3.7.x kernel to a 3.8.x kernel on a test machine
running DRBD, and found myself unable to bring up my DRBD devices. I'm
using the 8.3.13 userspace tools as shipped in Debian Wheezy, which work
fine on the 3.7 kernel, but they appear to hang when using the 3.8
kernel and cannot set up the device.

The 3.8 kernel appears to introduce drbd 8.4.2 rather than the 8.3.13
available in 3.7.

The hang seems to be caused by lots of the following:

[pid  7631] socket(PF_NETLINK, SOCK_DGRAM, 11) = 8
[pid  7631] getpid()                    = 7631
[pid  7631] bind(8, {sa_family=AF_NETLINK, pid=7631, groups=}, 12) = 0
[pid  7631] sendto(8, "4\0\0\0\3\0\0\0\1\0\0\0\317\35\0\0\4\0\0\0\1\0\0\0\1\0\0\0\317\35\0\0"..., 52, 0, NULL, 0) = 52
[pid  7631] poll([{fd=8, events=POLLIN}], 1, 12 <unfinished ...>
[pid  7630] <... read resumed> 0x7fff6d011a30, 1024) = ? ERESTARTSYS (To be restarted)
[pid  7630] --- SIGALRM (Alarm clock) @ 0 (0) ---
[pid  7630] rt_sigreturn(0xe)           = -1 EINTR (Interrupted system call)
[pid  7630] close(8)                    = 0
[pid  7630] wait4(7631, Process 7630 suspended

I asked for help on the #drbd channel on FreeNode, and the only remark I
got there was that I should upgrade the userspace tools. Somehow, that
doesn't feel right to me - can a newer kernel require new userspace
tools to still be able to use a certain kernel functionality at all?
Doesn't this fall under not breaking userspace with new kernel versions?

Even if the kernel did require new userspace tools, should there not be
some better mechanism to notify the user they must upgrade them before
things will work? At the moment all I see without strace is:

# drbdadm attach r0
DRBD module version: 8.4.2
   userland version: 8.3.13
you should upgrade your drbd tools!
[hang]

There is nothing in dmesg during this time, either.

Cheers,
Chris

PS: Please ensure you CC me as I'm no longer an LKML subscriber.

-- 
Chris Boot
bo...@bootc.net



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-20 Thread Chris Boot

Chris Boot wrote:
>>> I'll probably just try and recompile the kernel with 8k stacks and see
>>> how it goes. Screw the support, we're unlikely to get it anyway. :-P
>>
>> Please report how this works out.
>
> I will. This will probably be on Monday now, since the machine isn't
> accepting SysRq requests over the serial console. :-(

OK, with the recompiled kernel this appears to work just fine now. I've
been pounding the box all day with rsyncs, VMware VMs, plenty of web
serving (inc. SVN) and so far it's holding up just fine. Cheers for the
diagnosis.


Many thanks,
Chris



Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

Måns Rullgård wrote:
> Chris Boot <[EMAIL PROTECTED]> writes:
>
>> Måns Rullgård wrote:
>>> Chris Boot <[EMAIL PROTECTED]> writes:
>>>
>>>> All,
>>>>
>>>> I've got a box running RHEL5 and haven't been impressed by ext3
>>>> performance on it (running off a 1.5TB HP MSA20 using the cciss
>>>> driver). I compiled XFS as a module and tried it out since I'm used to
>>>> using it on Debian, which runs much more efficiently. However, every
>>>> so often the kernel panics as below. Apologies for the tainted kernel,
>>>> but we run VMware Server on the box as well.
>>>>
>>>> Does anyone have any hints/tips for using XFS on Red Hat? What's
>>>> causing the panic below, and is there a way around this?
>>>>
>>>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>>>> printing eip:
>>>> c0415974
>>>> *pde = 
>>>> Oops:  [#1]
>>>> SMP last sysfs file: /block/loop7/dev
>>>
>>> [...]
>>>
>>>> [] xfsbufd_wakeup+0x28/0x49 [xfs]
>>>> [] shrink_slab+0x56/0x13c
>>>> [] try_to_free_pages+0x162/0x23e
>>>> [] __alloc_pages+0x18d/0x27e
>>>> [] find_or_create_page+0x53/0x8c
>>>> [] __getblk+0x162/0x270
>>>> [] do_lookup+0x53/0x157
>>>> [] ext3_getblk+0x7c/0x233 [ext3]
>>>> [] ext3_getblk+0xeb/0x233 [ext3]
>>>> [] mntput_no_expire+0x11/0x6a
>>>> [] ext3_bread+0x13/0x69 [ext3]
>>>> [] htree_dirblock_to_tree+0x22/0x113 [ext3]
>>>> [] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>>>> [] do_path_lookup+0x20e/0x25f
>>>> [] get_empty_filp+0x99/0x15e
>>>> [] ext3_permission+0x0/0xa [ext3]
>>>> [] ext3_readdir+0x1ce/0x59b [ext3]
>>>> [] filldir+0x0/0xb9
>>>> [] sys_fstat64+0x1e/0x23
>>>> [] vfs_readdir+0x63/0x8d
>>>> [] filldir+0x0/0xb9
>>>> [] sys_getdents+0x5f/0x9c
>>>> [] syscall_call+0x7/0xb
>>>> ===
>>>
>>> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>>> seems to be enough to overflow it.
>>
>> Thanks, that explains a lot. However, I don't have any XFS filesystems
>> mounted over loop devices on ext3. Earlier in the day I had iso9660 on
>> loop on xfs, could that have caused the issue? It was unmounted and
>> deleted when this panic occurred.
>
> The mention of /block/loop7/dev and the presence of both XFS and ext3
> functions in the call stack suggested to me that you might have an ext3
> filesystem in a loop device on XFS.  I see no other explanation for
> that call stack other than a stack overflow, but then we're still back
> at the same root cause.
>
> Are you using device-mapper and/or md?  They too are known to blow 4k
> stacks when used with XFS.

I am. The situation earlier on was iso9660 on loop on xfs on lvm on
cciss. I guess that might have smashed the stack undetectably and
induced corruption encountered later on? When I experienced this panic
the machine would have probably been performing a backup, which was
simply a load of ext3/xfs filesystems on lvm on the HP cciss controller.
None of the loop devices would have been mounted.

I have a few machines now with 4k stacks and using lvm + md + xfs and
have no trouble at all, but none are Red Hat (all Debian) and none use
cciss either. Maybe it's a deadly combination.

>> I'll probably just try and recompile the kernel with 8k stacks and see
>> how it goes. Screw the support, we're unlikely to get it anyway. :-P
>
> Please report how this works out.

I will. This will probably be on Monday now, since the machine isn't
accepting SysRq requests over the serial console. :-(


Many thanks,
Chris

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

Måns Rullgård wrote:

Chris Boot <[EMAIL PROTECTED]> writes:

  

All,

I've got a box running RHEL5 and haven't been impressed by ext3
performance on it (running off a 1.5TB HP MSA20 using the cciss
driver). I compiled XFS as a module and tried it out since I'm used to
using it on Debian, which runs much more efficiently. However, every
so often the kernel panics as below. Apologies for the tainted kernel,
but we run VMware Server on the box as well.

Does anyone have any hints/tips for using XFS on Red Hat? What's
causing the panic below, and is there a way around this?

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP
last sysfs file: /block/loop7/dev
Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U)
autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U)
vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi
ac lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac
i2c_i801 edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport
dm_snapshot dm_zero dm_mirror dm_mod cciss mptspi mptscsih
scsi_transport_spi sd_mod scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd
uhci_hcd
CPU:1
EIP:0060:[<c0415974>] Tainted: P  VLI
EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1)
EIP is at smp_send_reschedule+0x3/0x53
eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
ds: 007b   es: 007b   ss: 0068
Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500 
000f  0001 0001 c200c6e0 0100 
0069 0180 018fc500 c200d240 0003 0292 f601efc0
f6027e00  0050
Call Trace:
[<c041dc23>] try_to_wake_up+0x351/0x37b
[<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
[<c04572f9>] shrink_slab+0x56/0x13c
[<c0457c0c>] try_to_free_pages+0x162/0x23e
[<c0454064>] __alloc_pages+0x18d/0x27e
[<c045214e>] find_or_create_page+0x53/0x8c
[<c046c7b1>] __getblk+0x162/0x270
[<c0475be0>] do_lookup+0x53/0x157
[<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
[<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
[<c048215c>] mntput_no_expire+0x11/0x6a
[<f889226e>] ext3_bread+0x13/0x69 [ext3]
[<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
[<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[<c047828b>] do_path_lookup+0x20e/0x25f
[<c046b987>] get_empty_filp+0x99/0x15e
[<f889d611>] ext3_permission+0x0/0xa [ext3]
[<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
[<c047a0dd>] filldir+0x0/0xb9
[<c0472973>] sys_fstat64+0x1e/0x23
[<c047a1f9>] vfs_readdir+0x63/0x8d
[<c047a0dd>] filldir+0x0/0xb9
[<c047a447>] sys_getdents+0x5f/0x9c
[<c0403eff>] syscall_call+0x7/0xb
===



Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
seems to be enough to overflow it.
  
Thanks, that explains a lot. However, I don't have any XFS filesystems 
mounted over loop devices on ext3. Earlier in the day I had iso9660 on 
loop on xfs, could that have caused the issue? It was unmounted and 
deleted when this panic occurred.


I'll probably just try and recompile the kernel with 8k stacks and see 
how it goes. Screw the support, we're unlikely to get it anyway. :-P
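
Before rebuilding, it may be worth confirming which stack size the running kernel was actually built with. A minimal Python sketch, assuming an i386 2.6-era kernel where the option is named CONFIG_4KSTACKS; the function name and the sample config fragments are illustrative assumptions, not something stated in this thread:

```python
def kernel_stack_size(config_text):
    """Guess the i386 kernel stack size from the text of a kernel .config.

    Assumption: the option is spelled CONFIG_4KSTACKS, as on i386 2.6-era
    kernels; when it is absent or "not set", the stack is taken to be 8k.
    """
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_4KSTACKS=y":
            return "4k"
    return "8k"


# Hypothetical fragments of a .config, for illustration only.
print(kernel_stack_size("CONFIG_X86=y\nCONFIG_4KSTACKS=y\n"))
print(kernel_stack_size("# CONFIG_4KSTACKS is not set\n"))
```

Many distributions install the build configuration as a text file under /boot (the exact path is an assumption here); passing that file's contents to the function is one way to run the check.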


Many thanks,
Chris



Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

2007-08-18 Thread Chris Boot

All,

I've got a box running RHEL5 and haven't been impressed by ext3 
performance on it (running off a 1.5TB HP MSA20 using the cciss driver). 
I compiled XFS as a module and tried it out since I'm used to using it 
on Debian, which runs much more efficiently. However, every so often the 
kernel panics as below. Apologies for the tainted kernel, but we run 
VMware Server on the box as well.


Does anyone have any hints/tips for using XFS on Red Hat? What's causing 
the panic below, and is there a way around this?


Many thanks,
Chris Boot

BUG: unable to handle kernel paging request at virtual address b8af9d60
printing eip:
c0415974
*pde = 
Oops:  [#1]
SMP 
last sysfs file: /block/loop7/dev
Modules linked in: loop nfsd exportfs lockd nfs_acl iscsi_trgt(U) 
autofs4 hidp nls_utf8 cifs ppdev rfcomm l2cap bluetooth vmnet(U) 
vmmon(U) sunrpc ipv6 xfs(U) video sbs i2c_ec button battery asus_acpi ac 
lp st sg floppy serio_raw intel_rng pcspkr e100 mii e7xxx_edac i2c_i801 
edac_mc i2c_core e1000 r8169 ide_cd cdrom parport_pc parport dm_snapshot 
dm_zero dm_mirror dm_mod cciss mptspi mptscsih scsi_transport_spi sd_mod 
scsi_mod mptbase ext3 jbd ehci_hcd ohci_hcd uhci_hcd

CPU:1
EIP:0060:[<c0415974>] Tainted: P  VLI
EFLAGS: 00010046   (2.6.18-8.1.8.el5 #1) 
EIP is at smp_send_reschedule+0x3/0x53

eax: c213f000   ebx: c213f000   ecx: eef84000   edx: c213f000
esi: 1086   edi: f668c000   ebp: f4f2fce8   esp: f4f2fc8c
ds: 007b   es: 007b   ss: 0068
Process crond (pid: 3146, ti=f4f2f000 task=f51faaa0 task.ti=f4f2f000)
Stack: 66d66b89 c041dc23  a9afbb0e fea5 01904500  
000f 
   0001 0001 c200c6e0 0100  0069 
0180 
  018fc500 c200d240 0003 0292 f601efc0 f6027e00  
0050 
Call Trace:

[<c041dc23>] try_to_wake_up+0x351/0x37b
[<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
[<c04572f9>] shrink_slab+0x56/0x13c
[<c0457c0c>] try_to_free_pages+0x162/0x23e
[<c0454064>] __alloc_pages+0x18d/0x27e
[<c045214e>] find_or_create_page+0x53/0x8c
[<c046c7b1>] __getblk+0x162/0x270
[<c0475be0>] do_lookup+0x53/0x157
[<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
[<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
[<c048215c>] mntput_no_expire+0x11/0x6a
[<f889226e>] ext3_bread+0x13/0x69 [ext3]
[<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
[<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
[<c047828b>] do_path_lookup+0x20e/0x25f
[<c046b987>] get_empty_filp+0x99/0x15e
[<f889d611>] ext3_permission+0x0/0xa [ext3]
[<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
[<c047a0dd>] filldir+0x0/0xb9
[<c0472973>] sys_fstat64+0x1e/0x23
[<c047a1f9>] vfs_readdir+0x63/0x8d
[<c047a0dd>] filldir+0x0/0xb9
[<c047a447>] sys_getdents+0x5f/0x9c
[<c0403eff>] syscall_call+0x7/0xb
===
Code: 5d c3 b9 01 00 00 00 31 d2 6a 00 b8 f0 5a 41 c0 e8 2a ff ff ff fa 
e8 52 16 00 00 fb 58 c3 b8 54 3a 66 c0 e9 8e 6b 1e 00 53 89 c3 <0f> a3 
05 60 1f 6d c0 19 c0 85 c0 75 27 e8 bf db 00 00 50 68 55 
EIP: [<c0415974>] smp_send_reschedule+0x3/0x53 SS:ESP 0068:f4f2fc8c

<0>Kernel panic - not syncing: Fatal exception
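
The esp and ti values in the register dump above allow some rough stack arithmetic. A sketch in Python, assuming (as on i386 kernels of this era) that thread_info sits at the lowest address of the kernel stack and is aligned to the stack size; the function name is made up for illustration, and the figures describe only the instant of the oops, not peak usage beforehand:

```python
def stack_report(esp, ti, stack_size=4096):
    """Summarise kernel stack usage from the esp and ti values in an oops."""
    return {
        # An 8 KiB stack would need thread_info on an 8 KiB boundary.
        "ti_8k_aligned": ti % 8192 == 0,
        # The stack grows down from ti + stack_size towards ti.
        "bytes_used": ti + stack_size - esp,
        # Room left above ti; this still includes thread_info itself.
        "bytes_left": esp - ti,
    }


# Values taken from the register dump: esp: f4f2fc8c, ti=f4f2f000
report = stack_report(esp=0xF4F2FC8C, ti=0xF4F2F000)
print(report)
```

Because ti=f4f2f000 is not aligned to 8 KiB, the dump looks consistent with a kernel built with 4k stacks, under the alignment assumption stated above.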



Re: SiI 3112A + Seagate HDs = still no go? [SOLVED]

2005-08-17 Thread Chris Boot

Tejun Heo wrote:


Chris Boot wrote:


Some interesting developments!

I installed a fresh copy of Windows, and all the VIA and nVidia and
so on drivers. At some point during all this (a period of relatively
heavy disk IO), the computer seemed to crash and I rebooted it. It
then worked fine for a while, but during my perfmon testing it seemed
to do the same thing. This time I left it for a while and it did
eventually wake up again, so I'm guessing the controller is a bit
fubared. Perfmon did indeed show several dips down to or very close
to 0 during the write operation, with peaks up to 48 MB/sec, which is
pretty respectable. So, time to replace the brand-new controller I
guess.


Now, do you think this is just my one particular controller card and
a simple return would fix the problem, or is it more likely a problem
with the whole range? It's an Innovision EIO SATA controller:
http://www.ivmm.com/eio/products/index.htm


Would it be a safer bet to go for the Adaptec controller of the same
variety? How reliable are they?



 I frankly don't know.  Maybe it's just one faulty controller, 
connector or whatever.  Maybe the card manufacturer screwed up 
somewhere.  I mean, the only course I took in electronics is 
introductory digital circuits which used 74xx chips and push triggered 
clock on a breadboard.  What would I know about gigahertz signaling 
error.  :-p


 Though, one thing I can say is that the majority of 311x controllers
don't seem to suffer from this problem.  So, take your pick.


Right, I've replaced my previous controller with an Adaptec AHA-1205SA, 
and I'm rebuilding 2 RAID-1 arrays at 50MB/sec without a hitch.


Thanks for your help diagnosing my problem, it was much appreciated!

Many thanks,
Chris

--
Chris Boot
[EMAIL PROTECTED]
http://www.bootc.net/



Re: SiI 3112A + Seagate HDs = still no go?

2005-08-13 Thread Chris Boot


On 13 Aug 2005, at 2:13, Tejun Heo wrote:



 Hello, Chris.

Chris Boot wrote:


On 12 Aug 2005, at 15:08, Tejun Heo wrote:



[adding cc to Jeff Garzik. (Hi!)]

 Hi again, Chris.

 Unfortunately, I'm as lost as you are.  Can you please do the
following?


 * Verify if read is free from the problem, i.e. does
"dd if=/dev/sd? of=/dev/null" work?


Works like a treat at 30 MB/s. I do get a few errors in the log   
(repeated a couple of times), but they seem mostly harmless:

ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }



 This is IDE ABRT error and it indicates that something strange is  
going on.  You're not getting this kind of error on VIA controller,  
right?


I most certainly am not.



 * Turn on ATA_DEBUG and ATA_VERBOSE_DEBUG in
include/linux/libata.h (change #undef's to #define's) and make the
drive hang.  The log should show what was going on.



While untarring and compiling the new kernel I got lots of:
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x84 { DriveStatusError BadCRC }



 Wow, this is CRC error.  Something is wrong w/ your controller.
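
For reference, the status and error bytes in these messages can be decoded bit by bit. A small Python sketch using the standard ATA register bit names (libata prints its own labels, for example DriveReady for DRDY and BadCRC for the ICRC bit); the table and function names are illustrative and not taken from the kernel source:

```python
# ATA Status register bits, most significant first.
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]

# ATA Error register bits, most significant first.
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "TK0NF"), (0x01, "AMNF"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


print(decode(0x51, STATUS_BITS))  # status byte from the log
print(decode(0x04, ERROR_BITS))   # error byte seen during the read test
print(decode(0x84, ERROR_BITS))   # error byte seen while compiling
```

Read this way, 0x84 differs from 0x04 only in the top bit (the interface CRC flag), which is the bit the reply above is reacting to.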


Great...

Syslog seems to die long before I get anything useful, and setting
loglevel 9 with SysRq gives:

ata_fill_sg: PRD[126]: 0x1206A000, 0x1000)
ata_fill_sg: PRD[127]: 0x1206B000, 0x1000)
ata_dev_select: ENTER, ata1: device 0, wait 1
ATA: abnormal status 0xD9 on port 0xE0804087
ATA: abnormal status 0xD9 on port 0xE0804087
ata_tf_load_mmio: hob: feat 0x0 nsect 0x3, lba 0x1 0x0 0x0
ata_tf_load_mmio: feat 0x0 nsect 0xF8 lba 0x1A 0xEF 0x33
ata_tf_load_mmio: device 0xE0
ATA: abnormal status 0xD9 on port 0xE0804087
ata_exec_command_mmio: ata: cmd 0x35
ata_scsi_translate: EXIT
It then hangs for exactly 30 seconds, and more stuff flies by   
followed by much the same messages EXCEPT:
1. There seems to be one less ata_fill_sg line every time, since  
PRD [XXX] decrements by one every time.
2. The ata_tf_load_mmio lines give different nsect and lba, the   
device stays the same.




 30 secs is SCSI command timeout and retrying w/ one less chunk is  
sd driver's error recovery behavior.
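
The two ata_tf_load_mmio lines quoted above can be combined into the 48-bit LBA and sector count of the command that timed out (cmd 0x35 is WRITE DMA EXT). A hedged Python sketch: it assumes the three "lba" values are printed in low, mid, high byte order, which is an assumption about the debug output rather than something confirmed in this thread, and the function name is hypothetical:

```python
def lba48_from_taskfile(hob, low):
    """Combine LBA48 taskfile halves into (lba, sector_count).

    `hob` and `low` are (nsect, lbal, lbam, lbah) tuples holding the
    "high order byte" registers and the ordinary registers respectively.
    """
    hob_nsect, hob_lbal, hob_lbam, hob_lbah = hob
    nsect, lbal, lbam, lbah = low
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    # In LBA48 commands a sector count of 0 means 65536 sectors.
    count = ((hob_nsect << 8) | nsect) or 65536
    return lba, count


# hob: nsect 0x3, lba 0x1 0x0 0x0; then nsect 0xF8, lba 0x1A 0xEF 0x33
lba, count = lba48_from_taskfile((0x3, 0x1, 0x0, 0x0), (0xF8, 0x1A, 0xEF, 0x33))
print(lba, count, count * 512)
```

Under that byte-order assumption the request is a write of 1016 sectors (520192 bytes) starting at LBA 20180762.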


 It seems that a lot of errors occur while bits are going through
your SATA connection.  I don't know about Seagate drives, but my
Samsung drive sometimes locks up if it gets weird packets/commands.
This might also be your case.  PHY-resetting usually gets the drive
back online but currently libata doesn't do any such error recovery
actions.  To make sure that it's because of a faulty controller, can
you please try the following?


 * Monitor how IO goes on the drive in Windows.  You can do this by
 - Start->Run and enter perfmon.
 - After perfmon starts, right click on (heh heh, I guess this is
   one of those few times you read this on linux kernel mailing
   list) counter list and select add. Add DiskBytes/sec counter of
   PhysicalDisk object.
 - Adjust scale to 0.010.  Also, change color to black to make
   it stand out.
 - start dd.

 I think, if the errors are due to hardware error, the perfmon  
graph will show some stuttering when it hits command timeout.  So,  
write to disk, as writing seems to cause timeouts.  If the problem  
also happens on Windows, it's highly likely that you have a faulty  
controller.


Some interesting developments!

I installed a fresh copy of Windows, and all the VIA and nVidia and  
so on drivers. At some point during all this (a period of relatively  
heavy disk IO), the computer seemed to crash and I rebooted it. It  
then worked fine for a while, but during my perfmon testing it seemed  
to do the same thing. This time I left it for a while and it did  
eventually wake up again, so I'm guessing the controller is a bit  
fubared. Perfmon did indeed show several dips down to or very close  
to 0 during the write operation, with peaks up to 48 MB/sec, which is  
pretty respectable. So, time to replace the brand-new controller I  
guess.


Now, do you think this is just my one particular controller card and
a simple return would fix the problem, or is it more likely a problem
with the whole range? It's an Innovision EIO SATA controller:
http://www.ivmm.com/eio/products/index.htm


Would it be a safer bet to go for the Adaptec controller of the same  
variety? How reliable are they?


Many thanks,
Chris

--
Chris Boot
[EMAIL PROTECTED]
http://www.bootc.net/




Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

On 12 Aug 2005, at 15:08, Tejun Heo wrote:


Chris Boot wrote:


Hi Tejun,
On 12 Aug 2005, at 12:33, Chris Boot wrote:


Hi Tejun,

On 12 Aug 2005, at 12:28, Tejun Heo wrote:





 Hello, Chris.

Chris Boot wrote:




On 12 Aug 2005, at 4:24, Tejun Heo wrote:




Chris Boot wrote:





Hi all,
I just recently took the plunge and bought 4 250 GB Seagate drives
and a 2 port Silicon Image 3112A controller card for the 2 drives my
motherboard doesn't handle. No matter how hard I try, I can't get the
hard drives to work: they are detected correctly and work reasonably
well under _very_ light load, but anything like building a RAID array
is a bit much and the whole controller seems to lock up.
I've tried adding the drive to the blacklist in the sata_sil.c driver
and I still have the same trouble: as you can see the messages below
relate to my patched kernel with the blacklist fix. I've seen that
this was discussed just yesterday, but that seemed to give nothing:
http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html
Ready and willing to hack my kernel to pieces; this machine is no use
until I get all the drives working! Needless to say the drives
connected to the on-board VIA controller work fine, as do the drives
currently on the SiI controller if I swap them around.

Any ideas?
TIA
Chris






[added linux-ide to cc list]

 Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?
And  w/ one drive only?




I unplugged both drives from my on-board SATA controller and   
left  just one connected to the 3112A controller. Rebooted with  
a  fresh,  vanilla 2.6.13-rc6 and ran:






 You can leave drives on on-board SATA controller.  It wouldn't   
make any difference.






dd if=/dev/zero of=test.img bs=1M count=16384
After about 30 seconds I got the crash and the kernel started
repeating every 30 seconds (with different sector numbers):

ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
ata1: status=0xd9 { Busy }
SCSI error : <0 0 0 0> return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 14937602
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
dmesg:
Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005

...
libata version 1.11 loaded.
sata_sil version 0.9
ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f

ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: no device found (phy stat )
scsi1 : sata_sil
  Vendor: ATA   Model: ST3250823AS   Rev: 3.03
  Type:   Direct-Access  ANSI SCSI revision: 05
sata_via version 1.1
ACPI: PCI Interrupt :00:0f.0[B] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 169

PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
sata_via(:00:0f.0): routed to hard irq line 9
ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
ata3: no device found (phy stat )
scsi2 : sata_via
ata4: no device found (phy stat )
scsi3 : sata_via
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0

I forgot to mention previously but I even tried with "noapic nolapic
acpi=off pci=routeirq" and got the same trouble.






 This is weird as ST3250823AS (and all Seagate .8 drives) are known
to work without any problem with sii 3112/3114.  I currently don't
own such a drive but someone confirmed to me that ST3250823AS works w/
sii 3114 without any problem (including bonnie++ results and all).
So, I don't think it's the good old mod15write problem.


 I hope it's just bad hardware, a bad cable or something like that;
otherwise, you're hitting a new bug.  Can you verify if the drive
works under Windows?





Well, what piqued my interest is that the same drives work fine  
on  my on-board sata_via controller. All 4 drives were bought at  
the  same time and *seem* to be from the same batch, and all work  
fine  on the VIA controller and none work on the 3112A. I've also  
tried  different cables, all of which are Belkin which I thought  
were  decent qual

Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

Hi there,

I get very different symptoms indeed. My drive isn't in the  
blacklist, and adding it has little effect (status 0xd9 to 0xd8, no  
other differences). Once the controller hangs, I can't even kill dd  
or login at a different terminal, just a complete lockup. If I have 2  
drives plugged in, running the dd on one of them also hangs the  
other, thus I suspect the controller. Also, reading via dd is fine,  
only writing has trouble.


Chris

On 12 Aug 2005, at 16:19, Roger Heflin wrote:



With the Seagate SATA drives I worked with before, I had to
actually remove them from the blacklist. This was a couple
of months ago, with the native SATA Seagate disks.

With the drive in the blacklist, the drive worked right
under light conditions, but under a dd read from the boot
Seagate the entire machine appeared to block on any I/O
going to that disk. It did not stop (verified by vmstat),
but I could never get the expected 55-60 MiB/second and
was getting around 15 MiB/second, with enormous amounts
of interrupts. After removing it from the blacklist,
I got the 55-60 MiB/second rate, the interrupts were
much more reasonable, and the response of the system
was actually usable. When the lockup occurred, stopping
the dd resulted in everything unlocking and continuing;
I duplicated this several times with the latest kernel
at the time.

   Roger



-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chris Boot
Sent: Thursday, August 11, 2005 4:55 PM
To: linux-kernel@vger.kernel.org
Subject: SiI 3112A + Seagate HDs = still no go?

Hi all,

I just recently took the plunge and bought 4 250 GB Seagate
drives and a 2 port Silicon Image 3112A controller card for
the 2 drives my motherboard doesn't handle. No matter how
hard I try, I can't get the hard drives to work: they are
detected correctly and work reasonably well under _very_
light load, but anything like building a RAID array is a bit
much and the whole controller seems to lock up.

I've tried adding the drive to the blacklist in the
sata_sil.c driver and I still have the same trouble: as you
can see the messages below relate to my patched kernel with
the blacklist fix. I've seen that this was discussed just
yesterday, but that seemed to give nothing:
http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html

Ready and willing to hack my kernel to pieces; this machine
is no use until I get all the drives working! Needless to say
the drives connected to the on-board VIA controller work
fine, as do the drives currently on the SiI controller if I
swap them around.

Any ideas?

TIA
Chris

The following messages are sent to the log when everything goes mad:

ata1: command 0x35 timeout, stat 0xd8 host_stat 0x0
ata1: status=0xd8 { Busy }
SCSI error : <0 0 0 0> return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 2990370
ATA: abnormal status 0xD8 on port E0802087
ATA: abnormal status 0xD8 on port E0802087
ATA: abnormal status 0xD8 on port E0802087
[ the above is transcribed so may not be 100% accurate ]
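
The SCSI sense data in the messages above can be looked up against the standard tables. This is a small sketch, assuming the usual SPC sense key numbering and ASC/ASCQ assignments; only the one ASC/ASCQ pair seen in this log is listed, and the table and function names are made up for illustration.

```python
# Sense keys as numbered in the SCSI Primary Commands (SPC) standard.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0x9: "VENDOR SPECIFIC",
    0xA: "COPY ABORTED",
    0xB: "ABORTED COMMAND",
    0xD: "VOLUME OVERFLOW",
    0xE: "MISCOMPARE",
}

# Only the additional sense code pair seen in this log.
ASC_ASCQ = {
    (0x47, 0x00): "SCSI PARITY ERROR",
}


def describe_sense(key, asc, ascq):
    """Return a readable description of a sense key and ASC/ASCQ pair."""
    key_name = SENSE_KEYS.get(key, "unknown sense key")
    detail = ASC_ASCQ.get((asc, ascq), "not in this table")
    return "%s / %s" % (key_name, detail)


print(describe_sense(0xB, 0x47, 0x00))  # ABORTED COMMAND / SCSI PARITY ERROR
```

Since the drive is an ATA device behind libata's SCSI emulation, this sense data is presumably produced by the translation layer rather than by the drive itself, so it probably says little more than "the command was aborted".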

Dmesg log during boot (and detection):

Aug 11 21:47:05 arcadia Linux version 2.6.12-gentoo-r6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #2 Thu Aug 11 20:19:00 BST 2005
...
Aug 11 17:30:12 arcadia sata_sil version 0.9
Aug 11 17:30:12 arcadia ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
Aug 11 17:30:12 arcadia ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
Aug 11 17:30:12 arcadia ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
Aug 11 17:30:12 arcadia ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
Aug 11 17:30:12 arcadia ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
Aug 11 17:30:12 arcadia ata1(0): applying Seagate errata fix
Aug 11 17:30:12 arcadia ata1: dev 0 configured for UDMA/100
Aug 11 17:30:12 arcadia scsi0 : sata_sil
Aug 11 17:30:12 arcadia ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
Aug 11 17:30:12 arcadia ata2: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
Aug 11 17:30:12 arcadia ata2(0): applying Seagate errata fix
Aug 11 17:30:12 arcadia ata2: dev 0 configured for UDMA/100
Aug 11 17:30:12 arcadia scsi1 : sata_sil
Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS   Rev: 3.03
Aug 11 17:30:12 arcadia Type:   Direct-Access   ANSI SCSI revision: 05
Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS   Rev: 3.03
Aug 11 17:30:12 arcadia Type:   Direct-Access   ANSI SCSI revision: 05

lspci:

:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
:00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
:00:0a.0 Unknown mass storage controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
:00:0c.0 FireWir

Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

Hi Tejun,

On 12 Aug 2005, at 12:33, Chris Boot wrote:


Hi Tejun,

On 12 Aug 2005, at 12:28, Tejun Heo wrote:




 Hello, Chris.

Chris Boot wrote:



On 12 Aug 2005, at 4:24, Tejun Heo wrote:



Chris Boot wrote:




Hi all,
I just recently took the plunge and bought 4 250 GB Seagate drives
and a 2 port Silicon Image 3112A controller card for the 2 drives my
motherboard doesn't handle. No matter how hard I try, I can't get the
hard drives to work: they are detected correctly and work reasonably
well under _very_ light load, but anything like building a RAID array
is a bit much and the whole controller seems to lock up.

I've tried adding the drive to the blacklist in the sata_sil.c driver
and I still have the same trouble: as you can see the messages below
relate to my patched kernel with the blacklist fix. I've seen that
this was discussed just yesterday, but that seemed to give nothing:
http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html

Ready and willing to hack my kernel to pieces; this machine is no use
until I get all the drives working! Needless to say the drives
connected to the on-board VIA controller work fine, as do the drives
currently on the SiI controller if I swap them around.

Any ideas?
TIA
Chris





[added linux-ide to cc list]

 Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?   
And  w/ one drive only?



I unplugged both drives from my on-board SATA controller and  
left  just one connected to the 3112A controller. Rebooted with a  
fresh,  vanilla 2.6.13-rc6 and ran:





 You can leave drives on on-board SATA controller.  It wouldn't  
make any difference.





dd if=/dev/zero of=test.img bs=1M count=16384
After about 30 seconds I got the crash and the kernel started   
repeating every 30 seconds (with different sector numbers):

ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
ata1: status=0xd9 { Busy }
SCSI error : <0 0 0 0> return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 14937602
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
dmesg:
Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005

...
libata version 1.11 loaded.
sata_sil version 0.9
ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f

ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: no device found (phy stat )
scsi1 : sata_sil
  Vendor: ATA   Model: ST3250823AS   Rev: 3.03
  Type:   Direct-Access  ANSI SCSI revision: 05
sata_via version 1.1
ACPI: PCI Interrupt :00:0f.0[B] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 169

PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
sata_via(:00:0f.0): routed to hard irq line 9
ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
ata3: no device found (phy stat )
scsi2 : sata_via
ata4: no device found (phy stat )
scsi3 : sata_via
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
I forgot to mention previously but I even tried with "noapic  
nolapic  acpi=off pci=routeirq" and got the same trouble.





 This is weird as ST3250823AS (and all Seagate .8 drives) are known
to work without any problem with sii 3112/3114.  I currently don't
own such a drive but someone confirmed to me that ST3250823AS works w/
sii 3114 without any problem (including bonnie++ results and all).
So, I don't think it's the good old mod15write problem.


 I hope it's just bad hardware, a bad cable or something like that;
otherwise, you're hitting a new bug.  Can you verify if the drive
works under Windows?




Well, what piqued my interest is that the same drives work fine on  
my on-board sata_via controller. All 4 drives were bought at the  
same time and *seem* to be from the same batch, and all work fine  
on the VIA controller and none work on the 3112A. I've also tried  
different cables, all of which are Belkin which I thought were  
decent quality.


I'll just try installing Winblows and let you know.


I just installed Windows XP SP2 and Cygwin:

$ dd if=/dev/zero of

Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

Hi Tejun,

On 12 Aug 2005, at 12:28, Tejun Heo wrote:



 Hello, Chris.

Chris Boot wrote:


On 12 Aug 2005, at 4:24, Tejun Heo wrote:


Chris Boot wrote:



Hi all,
I just recently took the plunge and bought 4 250 GB Seagate drives
and a 2 port Silicon Image 3112A controller card for the 2 drives my
motherboard doesn't handle. No matter how hard I try, I can't get the
hard drives to work: they are detected correctly and work reasonably
well under _very_ light load, but anything like building a RAID array
is a bit much and the whole controller seems to lock up.

I've tried adding the drive to the blacklist in the sata_sil.c driver
and I still have the same trouble: as you can see the messages below
relate to my patched kernel with the blacklist fix. I've seen that
this was discussed just yesterday, but that seemed to give nothing:
http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html

Ready and willing to hack my kernel to pieces; this machine is no use
until I get all the drives working! Needless to say the drives
connected to the on-board VIA controller work fine, as do the drives
currently on the SiI controller if I swap them around.

Any ideas?
TIA
Chris




[added linux-ide to cc list]

 Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?   
And  w/ one drive only?


I unplugged both drives from my on-board SATA controller and left   
just one connected to the 3112A controller. Rebooted with a  
fresh,  vanilla 2.6.13-rc6 and ran:




 You can leave drives on on-board SATA controller.  It wouldn't  
make any difference.




dd if=/dev/zero of=test.img bs=1M count=16384
After about 30 seconds I got the crash and the kernel started   
repeating every 30 seconds (with different sector numbers):

ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
ata1: status=0xd9 { Busy }
SCSI error : <0 0 0 0> return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 14937602
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
dmesg:
Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005

...
libata version 1.11 loaded.
sata_sil version 0.9
ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f

ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: no device found (phy stat )
scsi1 : sata_sil
  Vendor: ATA   Model: ST3250823AS   Rev: 3.03
  Type:   Direct-Access  ANSI SCSI revision: 05
sata_via version 1.1
ACPI: PCI Interrupt :00:0f.0[B] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 169

PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
sata_via(:00:0f.0): routed to hard irq line 9
ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
ata3: no device found (phy stat )
scsi2 : sata_via
ata4: no device found (phy stat )
scsi3 : sata_via
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
I forgot to mention previously but I even tried with "noapic  
nolapic  acpi=off pci=routeirq" and got the same trouble.




 This is weird as ST3250823AS (and all Seagate .8 drives) are known
to work without any problem with sii 3112/3114.  I currently don't
own such a drive but someone confirmed to me that ST3250823AS works w/
sii 3114 without any problem (including bonnie++ results and all).
So, I don't think it's the good old mod15write problem.


 I hope it's just bad hardware, a bad cable or something like that;
otherwise, you're hitting a new bug.  Can you verify if the drive
works under Windows?


Well, what piqued my interest is that the same drives work fine on my  
on-board sata_via controller. All 4 drives were bought at the same  
time and *seem* to be from the same batch, and all work fine on the  
VIA controller and none work on the 3112A. I've also tried different  
cables, all of which are Belkin which I thought were decent quality.


I'll just try installing Winblows and let you know.

Many thanks,
Chris

--
Chris Boot
[EMAIL PROTECTED]
http://www.bootc.net/



Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot


On 12 Aug 2005, at 4:24, Tejun Heo wrote:


Chris Boot wrote:


Hi all,
I just recently took the plunge and bought 4 250 GB Seagate drives
and a 2 port Silicon Image 3112A controller card for the 2 drives my
motherboard doesn't handle. No matter how hard I try, I can't get the
hard drives to work: they are detected correctly and work reasonably
well under _very_ light load, but anything like building a RAID array
is a bit much and the whole controller seems to lock up.

I've tried adding the drive to the blacklist in the sata_sil.c driver
and I still have the same trouble: as you can see the messages below
relate to my patched kernel with the blacklist fix. I've seen that
this was discussed just yesterday, but that seemed to give nothing:
http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html

Ready and willing to hack my kernel to pieces; this machine is no use
until I get all the drives working! Needless to say the drives
connected to the on-board VIA controller work fine, as do the drives
currently on the SiI controller if I swap them around.

Any ideas?
TIA
Chris



[added linux-ide to cc list]

 Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?  And  
w/ one drive only?


I unplugged both drives from my on-board SATA controller and left  
just one connected to the 3112A controller. Rebooted with a fresh,  
vanilla 2.6.13-rc6 and ran:


dd if=/dev/zero of=test.img bs=1M count=16384

After about 30 seconds I got the crash and the kernel started  
repeating every 30 seconds (with different sector numbers):


ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
ata1: status=0xd9 { Busy }
SCSI error : <0 0 0 0> return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 14937602
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
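
The numbers in the timeout message above can be read by hand. This is a rough sketch, assuming the standard ATA status register layout and command opcodes; the table and function names are invented for illustration and are not taken from the kernel source.

```python
# Status register bits as named in the ATA/ATAPI specifications.
# Bits 4, 2 and 1 have been renamed or made obsolete in later revisions;
# the older names are used here.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # device seek complete
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]

# A few DMA opcodes; 0x35 is the one in the log above.
ATA_COMMANDS = {
    0x25: "READ DMA EXT",
    0x35: "WRITE DMA EXT",
    0xC8: "READ DMA",
    0xCA: "WRITE DMA",
}


def decode_status(stat):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if stat & mask]


def command_name(opcode):
    """Return the name of a known opcode, or a placeholder."""
    return ATA_COMMANDS.get(opcode, "unknown (0x%02x)" % opcode)


print(command_name(0x35))   # WRITE DMA EXT
print(decode_status(0xD9))  # ['BSY', 'DRDY', 'DSC', 'DRQ', 'ERR']
print(decode_status(0xD8))  # ['BSY', 'DRDY', 'DSC', 'DRQ']
```

Both 0xd9 and 0xd8 have BSY set. While BSY is set the remaining bits are not meaningful, which would explain why the kernel reports only { Busy }. Command 0x35 is a 48-bit DMA write, which fits a failure that only shows up when writing.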

dmesg:
Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005

...
libata version 1.11 loaded.
sata_sil version 0.9
ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f

ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: no device found (phy stat )
scsi1 : sata_sil
  Vendor: ATA   Model: ST3250823AS   Rev: 3.03
  Type:   Direct-Access  ANSI SCSI revision: 05
sata_via version 1.1
ACPI: PCI Interrupt :00:0f.0[B] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 169

PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
sata_via(:00:0f.0): routed to hard irq line 9
ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
ata3: no device found (phy stat )
scsi2 : sata_via
ata4: no device found (phy stat )
scsi3 : sata_via
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
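
As a sanity check on the dmesg output above, the sector count can be turned back into the reported capacity. This is a minimal sketch; the function and constant names are illustrative, and the LBA28 limit of 2**28 sectors is the standard 28-bit addressing limit rather than anything printed in the log.

```python
SECTOR_SIZE = 512            # bytes, from "512-byte hdwr sectors"
LBA28_MAX_SECTORS = 2 ** 28  # sectors addressable with 28-bit LBA


def describe_capacity(sectors, sector_size=SECTOR_SIZE):
    """Summarise a drive's size from its sector count."""
    size_bytes = sectors * sector_size
    return {
        "bytes": size_bytes,
        "decimal_mb": size_bytes // 10 ** 6,
        "needs_lba48": sectors > LBA28_MAX_SECTORS,
    }


info = describe_capacity(488397168)
print(info["bytes"])        # 250059350016
print(info["decimal_mb"])   # 250059, matching the "(250059 MB)" in the log
print(info["needs_lba48"])  # True
```

The drive is larger than 28-bit LBA can address, so the lba48 flag in the log is expected, and writes past that boundary have to use the EXT commands.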

I forgot to mention previously but I even tried with "noapic nolapic  
acpi=off pci=routeirq" and got the same trouble.


Thanks,
Chris

--
Chris Boot
[EMAIL PROTECTED]
http://www.bootc.net/




Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

Hi Tejun,

On 12 Aug 2005, at 12:28, Tejun Heo wrote:



 Hello, Chris.

Chris Boot wrote:


On 12 Aug 2005, at 4:24, Tejun Heo wrote:


Chris Boot wrote:



Hi all,
I just recently took the plunge and bought 4 250 GB Seagate   
drives  and a 2 port Silicon Image 3112A controller card for the  
2  drives my  motherboard doesn't handle. No matter how hard I  
try, I  can't get the  hard drives to work: they are detected  
correctly  and work reasonably  well under _very_ light load,  
but anything  like building a RAID array  is a bit much and the  
whole controller  seems to lock up.
I've tried adding the drive to the blacklist in the sata_sil.c   
driver  and I still have the same trouble: as you can see the   
messages below  relate to my patched kernel with the blacklist   
fix. I've seen that  this was discussed just yesterday, but  
that  seemed to give nothing:  http://www.ussg.iu.edu/hypermail/ 
linux/ kernel/0508.1/0310.html
Ready and willing to hack my kernel to pieces; this machine is  
no  use  until I get all the drives working! Needless to say  
the  drives  connected to the on-board VIA controller work fine,  
as do  the drives  currently on the SiI controller if I swap  
them around.

Any ideas?
TIA
Chris




[added linux-ide to cc list]

 Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?   
And  w/ one drive only?


I unplugged both drives from my on-board SATA controller and left   
just one connected to the 3112A controller. Rebooted with a  
fresh,  vanilla 2.6.13-rc6 and ran:




 You can leave drives on on-board SATA controller.  It wouldn't  
make any difference.




dd if=/dev/zero of=test.img bs=1M count=16384
After about 30 seconds I got the crash and the kernel started   
repeating every 30 seconds (with different sector numbers):

ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
ata1: status=0xd9 { Busy }
SCSI error : 0 0 0 0 return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 14937602
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
ATA: abnormal status 0xD9 on port E0802087
dmesg:
Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version   
3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1,   
pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005

...
libata version 1.11 loaded.
sata_sil version 0.9
ACPI: PCI Interrupt :00:0a.0[A] - GSI 18 (level, low) - IRQ 177
ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma   
0xE0802000 irq 177
ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma   
0xE0802008 irq 177
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01   
87:4023 88:207f

ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: no device found (phy stat )
scsi1 : sata_sil
  Vendor: ATA   Model: ST3250823AS   Rev: 3.03
  Type:   Direct-Access  ANSI SCSI revision: 05
sata_via version 1.1
ACPI: PCI Interrupt :00:0f.0[B] - Link [ALKA] - GSI 20  
(level,  low) - IRQ 169

PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
sata_via(:00:0f.0): routed to hard irq line 9
ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
ata3: no device found (phy stat )
scsi2 : sata_via
ata4: no device found (phy stat )
scsi3 : sata_via
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
I forgot to mention previously but I even tried with noapic  
nolapic  acpi=off pci=routeirq and got the same trouble.




 This is weird as ST3250823AS (and all Seagate .8 drives) are known  
to work without any problem with sii 3112/3114.  I currently don't  
own such a drive but someone confirmed me that ST3250823AS works w/  
sii 3114 without any problem (including bonnie++ results and all).   
So, I don't think it's the good old mod15write problem.


 I hope it's just a bad hardware, cable or something like that;  
otherwise, you're hitting a new bug.  Can you verify if the drive  
works under windows?


Well, what piqued my interest is that the same drives work fine on my  
on-board sata_via controller. All 4 drives were bought at the same  
time and *seem* to be from the same batch, and all work fine on the  
VIA controller and none work on the 3112A. I've also tried different  
cables, all of which are Belkin which I thought were decent quality.


I'll just try installing Winblows and let you know.

Many thanks,
Chris

--
Chris Boot
[EMAIL PROTECTED]
http://www.bootc.net/



Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

Hi Tejun,

On 12 Aug 2005, at 12:33, Chris Boot wrote:

> Hi Tejun,
>
> On 12 Aug 2005, at 12:28, Tejun Heo wrote:
>
>> Hello, Chris.
>>
>> Chris Boot wrote:
>>
>>> On 12 Aug 2005, at 4:24, Tejun Heo wrote:
>>>
>>>> Chris Boot wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I just recently took the plunge and bought 4 250 GB Seagate
>>>>> drives and a 2 port Silicon Image 3112A controller card for
>>>>> the 2 drives my motherboard doesn't handle. No matter how
>>>>> hard I try, I can't get the hard drives to work: they are
>>>>> detected correctly and work reasonably well under _very_
>>>>> light load, but anything like building a RAID array is a bit
>>>>> much and the whole controller seems to lock up.
>>>>>
>>>>> I've tried adding the drive to the blacklist in the
>>>>> sata_sil.c driver and I still have the same trouble: as you
>>>>> can see the messages below relate to my patched kernel with
>>>>> the blacklist fix. I've seen that this was discussed just
>>>>> yesterday, but that seemed to give nothing:
>>>>> http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html
>>>>>
>>>>> Ready and willing to hack my kernel to pieces; this machine
>>>>> is no use until I get all the drives working! Needless to say
>>>>> the drives connected to the on-board VIA controller work
>>>>> fine, as do the drives currently on the SiI controller if I
>>>>> swap them around.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> TIA
>>>>> Chris
>>>>
>>>> [added linux-ide to cc list]
>>>>
>>>> Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?
>>>> And w/ one drive only?
>>>
>>> I unplugged both drives from my on-board SATA controller and
>>> left just one connected to the 3112A controller. Rebooted with a
>>> fresh, vanilla 2.6.13-rc6 and ran:
>>
>> You can leave drives on on-board SATA controller.  It wouldn't
>> make any difference.
>>
>>> dd if=/dev/zero of=test.img bs=1M count=16384
>>>
>>> After about 30 seconds I got the crash and the kernel started
>>> repeating every 30 seconds (with different sector numbers):
>>>
>>> ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
>>> ata1: status=0xd9 { Busy }
>>> SCSI error : <0 0 0 0> return code = 0x8002
>>> sda: Current: sense key=0xb
>>> ASC=0x47 ASCQ=0x0
>>> end_request: I/O error, dev sda, sector 14937602
>>> ATA: abnormal status 0xD9 on port E0802087
>>> ATA: abnormal status 0xD9 on port E0802087
>>> ATA: abnormal status 0xD9 on port E0802087
>>>
>>> dmesg:
>>>
>>> Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005
>>> ...
>>> libata version 1.11 loaded.
>>> sata_sil version 0.9
>>> ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
>>> ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
>>> ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
>>> ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
>>> ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
>>> ata1: dev 0 configured for UDMA/100
>>> scsi0 : sata_sil
>>> ata2: no device found (phy stat )
>>> scsi1 : sata_sil
>>>   Vendor: ATA   Model: ST3250823AS   Rev: 3.03
>>>   Type:   Direct-Access  ANSI SCSI revision: 05
>>> sata_via version 1.1
>>> ACPI: PCI Interrupt :00:0f.0[B] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 169
>>> PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
>>> sata_via(:00:0f.0): routed to hard irq line 9
>>> ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
>>> ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
>>> ata3: no device found (phy stat )
>>> scsi2 : sata_via
>>> ata4: no device found (phy stat )
>>> scsi3 : sata_via
>>> SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
>>> SCSI device sda: drive cache: write back
>>> SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
>>> SCSI device sda: drive cache: write back
>>> sda: sda1 sda2 sda3
>>> Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
>>> Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
>>>
>>> I forgot to mention previously but I even tried with noapic
>>> nolapic acpi=off pci=routeirq and got the same trouble.
>>
>> This is weird as ST3250823AS (and all Seagate .8 drives) are
>> known to work without any problem with sii 3112/3114.  I currently
>> don't own such a drive but someone confirmed me that ST3250823AS
>> works w/ sii 3114 without any problem (including bonnie++ results
>> and all).  So, I don't think it's the good old mod15write problem.
>>
>> I hope it's just a bad hardware, cable or something like that;
>> otherwise, you're hitting a new bug.  Can you verify if the drive
>> works under windows?
>
> Well, what piqued my interest is that the same drives work fine on
> my on-board sata_via controller. All 4 drives were bought at the
> same time and *seem* to be from the same batch, and all work fine
> on the VIA controller and none work on the 3112A. I've also tried
> different cables, all of which are Belkin which I thought were
> decent quality.
>
> I'll just try installing Winblows and let you know.


I just installed Windows XP SP2 and Cygwin:

$ dd if=/dev/zero of=test.img bs=1M count=4096
4096+0

Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

Hi there,

I get very different symptoms indeed. My drive isn't in the  
blacklist, and adding it has little effect (status 0xd9 to 0xd8, no  
other differences). Once the controller hangs, I can't even kill dd  
or login at a different terminal, just a complete lockup. If I have 2  
drives plugged in, running the dd on one of them also hangs the  
other, thus I suspect the controller. Also, reading via dd is fine,  
only writing has trouble.
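
For anyone wanting to repeat the write test without dd, a rough Python equivalent of "dd if=/dev/zero of=test.img bs=1M count=N" is sketched below. It is not what was run in this thread: the function name, the fsync at the end and the throughput figure it returns are assumptions added purely for illustration.

```python
import os
import tempfile
import time


def write_zeros(path: str, count: int, block_size: int = 1024 * 1024) -> float:
    """Write `count` blocks of zeros to `path`; return throughput in MiB/s."""
    block = bytes(block_size)
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data to the disk, not just the page cache
    elapsed = time.monotonic() - start
    mib = count * block_size / (1024 * 1024)
    return mib / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # 16 blocks keeps the demo small; the thread used count=16384 (16 GiB).
    with tempfile.TemporaryDirectory() as tmp:
        rate = write_zeros(os.path.join(tmp, "test.img"), 16)
        print(f"{rate:.1f} MiB/s")
```

To exercise a particular controller, point the path at a filesystem on a disk attached to it and raise the count until the load resembles the failing case.
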


Chris

On 12 Aug 2005, at 16:19, Roger Heflin wrote:



> With the Seagate sata's I worked with before, I had to
> actually remove them from the blacklist, this was a couple
> of months ago with the native sata seagate disks.
>
> With the drive in the blacklist the drive worked right
> under light conditions, but under a dd read from the boot
> seagate the entire machine appeared to block on any io
> going to that disk, it did not stop (verified by vmstat),
> but I could never get the 55-60MiB/second expected, and
> was getting around 15MiB/second, with enormous amounts
> of interrupts, after removing it from the blacklist,
> I got the 55-60MiB/second rate, and the interrupts were
> much more reasonable, and the response of the system
> was actually usable. When the lockup occurred, stopping
> the dd resulted in all things unlocking and continuing
> on, I duplicated this several times with the latest kernel
> at the time.
>
>    Roger
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Chris Boot
> Sent: Thursday, August 11, 2005 4:55 PM
> To: linux-kernel@vger.kernel.org
> Subject: SiI 3112A + Seagate HDs = still no go?
>
> Hi all,
>
> I just recently took the plunge and bought 4 250 GB Seagate
> drives and a 2 port Silicon Image 3112A controller card for
> the 2 drives my motherboard doesn't handle. No matter how
> hard I try, I can't get the hard drives to work: they are
> detected correctly and work reasonably well under _very_
> light load, but anything like building a RAID array is a bit
> much and the whole controller seems to lock up.
>
> I've tried adding the drive to the blacklist in the
> sata_sil.c driver and I still have the same trouble: as you
> can see the messages below relate to my patched kernel with
> the blacklist fix. I've seen that this was discussed just
> yesterday, but that seemed to give nothing:
> http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html
>
> Ready and willing to hack my kernel to pieces; this machine
> is no use until I get all the drives working! Needless to say
> the drives connected to the on-board VIA controller work
> fine, as do the drives currently on the SiI controller if I
> swap them around.
>
> Any ideas?
>
> TIA
> Chris
>
> The following messages are sent to the log when everything goes mad:
>
> ata1: command 0x35 timeout, stat 0xd8 host_stat 0x0
> ata1: status=0xd8 { Busy }
> SCSI error : <0 0 0 0> return code = 0x8002
> sda: Current: sense key=0xb
> ASC=0x47 ASCQ=0x0
> end_request: I/O error, dev sda, sector 2990370
> ATA: abnormal status 0xD8 on port E0802087
> ATA: abnormal status 0xD8 on port E0802087
> ATA: abnormal status 0xD8 on port E0802087
> [ the above is transcribed so may not be 100% accurate ]
>
> Dmesg log during boot (and detection):
>
> Aug 11 21:47:05 arcadia Linux version 2.6.12-gentoo-r6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #2 Thu Aug 11 20:19:00 BST 2005
> ...
> Aug 11 17:30:12 arcadia sata_sil version 0.9
> Aug 11 17:30:12 arcadia ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
> Aug 11 17:30:12 arcadia ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
> Aug 11 17:30:12 arcadia ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
> Aug 11 17:30:12 arcadia ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
> Aug 11 17:30:12 arcadia ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
> Aug 11 17:30:12 arcadia ata1(0): applying Seagate errata fix
> Aug 11 17:30:12 arcadia ata1: dev 0 configured for UDMA/100
> Aug 11 17:30:12 arcadia scsi0 : sata_sil
> Aug 11 17:30:12 arcadia ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
> Aug 11 17:30:12 arcadia ata2: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
> Aug 11 17:30:12 arcadia ata2(0): applying Seagate errata fix
> Aug 11 17:30:12 arcadia ata2: dev 0 configured for UDMA/100
> Aug 11 17:30:12 arcadia scsi1 : sata_sil
> Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS   Rev: 3.03
> Aug 11 17:30:12 arcadia Type:   Direct-Access   ANSI SCSI revision: 05
> Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS   Rev: 3.03
> Aug 11 17:30:12 arcadia Type:   Direct-Access   ANSI SCSI revision: 05
>
> lspci:
>
> :00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
> :00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
> :00:0a.0 Unknown mass storage controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
> :00:0c.0 FireWire (IEEE 1394

Re: SiI 3112A + Seagate HDs = still no go?

2005-08-12 Thread Chris Boot

On 12 Aug 2005, at 15:08, Tejun Heo wrote:


> Chris Boot wrote:
>
>> Hi Tejun,
>>
>> On 12 Aug 2005, at 12:33, Chris Boot wrote:
>>
>>> Hi Tejun,
>>>
>>> On 12 Aug 2005, at 12:28, Tejun Heo wrote:
>>>
>>>> Hello, Chris.
>>>>
>>>> Chris Boot wrote:
>>>>
>>>>> On 12 Aug 2005, at 4:24, Tejun Heo wrote:
>>>>>
>>>>>> Chris Boot wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I just recently took the plunge and bought 4 250 GB Seagate
>>>>>>> drives and a 2 port Silicon Image 3112A controller card for
>>>>>>> the 2 drives my motherboard doesn't handle. No matter how
>>>>>>> hard I try, I can't get the hard drives to work: they are
>>>>>>> detected correctly and work reasonably well under _very_
>>>>>>> light load, but anything like building a RAID array is a
>>>>>>> bit much and the whole controller seems to lock up.
>>>>>>>
>>>>>>> I've tried adding the drive to the blacklist in the
>>>>>>> sata_sil.c driver and I still have the same trouble: as
>>>>>>> you can see the messages below relate to my patched kernel
>>>>>>> with the blacklist fix. I've seen that this was discussed
>>>>>>> just yesterday, but that seemed to give nothing:
>>>>>>> http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html
>>>>>>>
>>>>>>> Ready and willing to hack my kernel to pieces; this machine
>>>>>>> is no use until I get all the drives working! Needless to
>>>>>>> say the drives connected to the on-board VIA controller
>>>>>>> work fine, as do the drives currently on the SiI
>>>>>>> controller if I swap them around.
>>>>>>>
>>>>>>> Any ideas?
>>>>>>>
>>>>>>> TIA
>>>>>>> Chris
>>>>>>
>>>>>> [added linux-ide to cc list]
>>>>>>
>>>>>> Can you please try w/ vanilla kernel (2.6.12 or 2.6.13-rc)?
>>>>>> And w/ one drive only?
>>>>>
>>>>> I unplugged both drives from my on-board SATA controller and
>>>>> left just one connected to the 3112A controller. Rebooted with
>>>>> a fresh, vanilla 2.6.13-rc6 and ran:
>>>>
>>>> You can leave drives on on-board SATA controller.  It wouldn't
>>>> make any difference.
>>>>
>>>>> dd if=/dev/zero of=test.img bs=1M count=16384
>>>>>
>>>>> After about 30 seconds I got the crash and the kernel started
>>>>> repeating every 30 seconds (with different sector numbers):
>>>>>
>>>>> ata1: command 0x35 timeout, stat 0xd9 host_stat 0x1
>>>>> ata1: status=0xd9 { Busy }
>>>>> SCSI error : <0 0 0 0> return code = 0x8002
>>>>> sda: Current: sense key=0xb
>>>>> ASC=0x47 ASCQ=0x0
>>>>> end_request: I/O error, dev sda, sector 14937602
>>>>> ATA: abnormal status 0xD9 on port E0802087
>>>>> ATA: abnormal status 0xD9 on port E0802087
>>>>> ATA: abnormal status 0xD9 on port E0802087
>>>>>
>>>>> dmesg:
>>>>>
>>>>> Linux version 2.6.13-rc6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #1 Fri Aug 12 12:31:25 BST 2005
>>>>> ...
>>>>> libata version 1.11 loaded.
>>>>> sata_sil version 0.9
>>>>> ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
>>>>> ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
>>>>> ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
>>>>> ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
>>>>> ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
>>>>> ata1: dev 0 configured for UDMA/100
>>>>> scsi0 : sata_sil
>>>>> ata2: no device found (phy stat )
>>>>> scsi1 : sata_sil
>>>>>   Vendor: ATA   Model: ST3250823AS   Rev: 3.03
>>>>>   Type:   Direct-Access  ANSI SCSI revision: 05
>>>>> sata_via version 1.1
>>>>> ACPI: PCI Interrupt :00:0f.0[B] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 169
>>>>> PCI: Via IRQ fixup for :00:0f.0, from 11 to 9
>>>>> sata_via(:00:0f.0): routed to hard irq line 9
>>>>> ata3: SATA max UDMA/133 cmd 0xB400 ctl 0xB802 bmdma 0xC400 irq 169
>>>>> ata4: SATA max UDMA/133 cmd 0xBC00 ctl 0xC002 bmdma 0xC408 irq 169
>>>>> ata3: no device found (phy stat )
>>>>> scsi2 : sata_via
>>>>> ata4: no device found (phy stat )
>>>>> scsi3 : sata_via
>>>>> SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
>>>>> SCSI device sda: drive cache: write back
>>>>> SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
>>>>> SCSI device sda: drive cache: write back
>>>>> sda: sda1 sda2 sda3
>>>>> Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
>>>>> Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0,  type 0
>>>>>
>>>>> I forgot to mention previously but I even tried with noapic
>>>>> nolapic acpi=off pci=routeirq and got the same trouble.
>>>>
>>>> This is weird as ST3250823AS (and all Seagate .8 drives) are
>>>> known to work without any problem with sii 3112/3114.  I
>>>> currently don't own such a drive but someone confirmed me that
>>>> ST3250823AS works w/ sii 3114 without any problem (including
>>>> bonnie++ results and all).  So, I don't think it's the good old
>>>> mod15write problem.
>>>>
>>>> I hope it's just a bad hardware, cable or something like that;
>>>> otherwise, you're hitting a new bug.  Can you verify if the
>>>> drive works under windows?
>>>
>>> Well, what piqued my interest is that the same drives work fine
>>> on my on-board sata_via controller. All 4 drives were bought at
>>> the same time and *seem* to be from the same batch, and all work
>>> fine on the VIA controller and none work on the 3112A. I've also
>>> tried different cables, all of which are Belkin which I thought
>>> were decent quality.
>>>
>>> I'll just try installing Winblows

SiI 3112A + Seagate HDs = still no go?

2005-08-11 Thread Chris Boot

Hi all,

I just recently took the plunge and bought 4 250 GB Seagate drives  
and a 2 port Silicon Image 3112A controller card for the 2 drives my  
motherboard doesn't handle. No matter how hard I try, I can't get the  
hard drives to work: they are detected correctly and work reasonably  
well under _very_ light load, but anything like building a RAID array  
is a bit much and the whole controller seems to lock up.


I've tried adding the drive to the blacklist in the sata_sil.c driver  
and I still have the same trouble: as you can see the messages below  
relate to my patched kernel with the blacklist fix. I've seen that  
this was discussed just yesterday, but that seemed to give nothing:  
http://www.ussg.iu.edu/hypermail/linux/kernel/0508.1/0310.html


Ready and willing to hack my kernel to pieces; this machine is no use  
until I get all the drives working! Needless to say the drives  
connected to the on-board VIA controller work fine, as do the drives  
currently on the SiI controller if I swap them around.


Any ideas?

TIA
Chris

The following messages are sent to the log when everything goes mad:

ata1: command 0x35 timeout, stat 0xd8 host_stat 0x0
ata1: status=0xd8 { Busy }
SCSI error : <0 0 0 0> return code = 0x8002
sda: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
end_request: I/O error, dev sda, sector 2990370
ATA: abnormal status 0xD8 on port E0802087
ATA: abnormal status 0xD8 on port E0802087
ATA: abnormal status 0xD8 on port E0802087
[ the above is transcribed so may not be 100% accurate ]
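
The numbers in these messages can be read against the ATA and SCSI definitions. The sketch below (Python, with names invented for this note) decodes the status byte using the classic ATA status-register bit layout and looks up the other codes that appear in the log. The tables are deliberately tiny and cover only the values seen in this thread. Bear in mind that the remaining status bits are not meaningful while BSY is set, which is presumably why the kernel prints only { Busy }.

```python
# Classic ATA status-register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]

# Only the codes that show up in the log above; not complete tables.
ATA_COMMANDS = {0x35: "WRITE DMA EXT"}
SENSE_KEYS = {0x0B: "ABORTED COMMAND"}
ASC_ASCQ = {(0x47, 0x00): "SCSI PARITY ERROR"}


def decode_status(status: int) -> list[str]:
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    for stat in (0xD8, 0xD9):
        print(f"stat {stat:#04x}:", " ".join(decode_status(stat)))
    print("command 0x35:", ATA_COMMANDS[0x35])
    print("sense key 0xb:", SENSE_KEYS[0x0B])
    print("ASC=0x47 ASCQ=0x0:", ASC_ASCQ[(0x47, 0x00)])
```

Read this way, the log describes a 48-bit DMA write that never completed and was reported up the SCSI layer as an aborted command, which is consistent with reads working and only writes failing.
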

Dmesg log during boot (and detection):

Aug 11 21:47:05 arcadia Linux version 2.6.12-gentoo-r6 ([EMAIL PROTECTED]) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #2 Thu Aug 11 20:19:00 BST 2005
...
Aug 11 17:30:12 arcadia sata_sil version 0.9
Aug 11 17:30:12 arcadia ACPI: PCI Interrupt :00:0a.0[A] -> GSI 18 (level, low) -> IRQ 177
Aug 11 17:30:12 arcadia ata1: SATA max UDMA/100 cmd 0xE0802080 ctl 0xE080208A bmdma 0xE0802000 irq 177
Aug 11 17:30:12 arcadia ata2: SATA max UDMA/100 cmd 0xE08020C0 ctl 0xE08020CA bmdma 0xE0802008 irq 177
Aug 11 17:30:12 arcadia ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
Aug 11 17:30:12 arcadia ata1: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
Aug 11 17:30:12 arcadia ata1(0): applying Seagate errata fix
Aug 11 17:30:12 arcadia ata1: dev 0 configured for UDMA/100
Aug 11 17:30:12 arcadia scsi0 : sata_sil
Aug 11 17:30:12 arcadia ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:207f
Aug 11 17:30:12 arcadia ata2: dev 0 ATA, max UDMA/133, 488397168 sectors: lba48
Aug 11 17:30:12 arcadia ata2(0): applying Seagate errata fix
Aug 11 17:30:12 arcadia ata2: dev 0 configured for UDMA/100
Aug 11 17:30:12 arcadia scsi1 : sata_sil
Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS   Rev: 3.03
Aug 11 17:30:12 arcadia Type:   Direct-Access   ANSI SCSI revision: 05
Aug 11 17:30:12 arcadia Vendor: ATA   Model: ST3250823AS   Rev: 3.03
Aug 11 17:30:12 arcadia Type:   Direct-Access   ANSI SCSI revision: 05


lspci:

:00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
:00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
:00:0a.0 Unknown mass storage controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
:00:0c.0 FireWire (IEEE 1394): Agere Systems (former Lucent Microelectronics) FW323 (rev 61)
:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
:00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
:00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB 1.1 Controller (rev 81)
:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
:00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
:00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
:01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev b2)


Many thanks,
Chris

--
Chris Boot
[EMAIL PROTECTED]
http://www.bootc.net/




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Cosmetic JFFS patch.

2001-06-29 Thread Chris Boot

Hi,

> Many new Linux users go through an extended period of dual-booting.

And many users also have to sleep in the same room as their computers (still
live w/ parents or are in college) and the fans bother them, so they turn
them off every night.

Just my 2 eurocents.

-- 
Chris Boot
[EMAIL PROTECTED]

"use the source, luke." (obi-wan gnuobi)




Re: temperature standard - global config option?

2001-06-09 Thread Chris Boot

Hi,

> I haven't encountered any CPU with builtin temperature sensors.

Well, I've got an Apple iMac (tee hee hee) with a PowerPC G3 (or 750 for you
number guys).  I know for sure that all of the G3 / G4 chips have
temperature sensors built onto the CPU core.

Mine's showing 23 degrees Celsius at the moment.

>> This thread keeps going and going and going...
> 
> and going, and going . and still going .

and going, and going, and going...

-- 
    .-.       Chris Boot
    /v\       [EMAIL PROTECTED]
   // \\
  /(   )\     L   I   N   U   X
   ^^-^^      > Phear the Penguin <




Re: temperature standard - global config option?

2001-06-08 Thread Chris Boot

Hi,

> Only the truly stupid would assume accuracy from decimal places.

Well then, tell all the teachers in this world that they're stupid, and tell
everyone who learnt from them as well.  I'm in high school (gd. 11, junior)
and my physics teacher is always screaming at us for putting too many
decimal places or having them inconsistent.  There are certain situations
where adding a ±1 is too cumbersome and / or clumsy, so you can specify the
accuracy using just decimal places.

For example, 5.00 would mean pretty much spot on 5 (anywhere from 4.995 up
to just under 5.005), whereas 5 could mean anywhere from 4.5 up to just
under 5.5.
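
The convention described here, where the number of decimal places written implies the accuracy, is easy to make concrete. The following is a minimal sketch in Python, assuming the usual "half a unit in the last written place" rule; the function name is made up for this note and nothing like it exists in the kernel.

```python
from decimal import Decimal


def implied_interval(text: str) -> tuple[Decimal, Decimal]:
    """Range of true values that would round to the number as written.

    The last written digit is taken as the precision, so the value may be
    off by up to half a unit in that place in either direction.
    """
    value = Decimal(text)
    exponent = value.as_tuple().exponent      # -2 for "5.00", 0 for "5"
    half_step = Decimal(1).scaleb(exponent) / 2
    return value - half_step, value + half_step


if __name__ == "__main__":
    for reading in ("5.00", "5", "45.4"):
        low, high = implied_interval(reading)
        print(f"{reading}: from {low} up to {high}")
```

Decimal is used instead of float because it remembers trailing zeros, so "5.00" and "5" stay distinguishable, which is the whole point of the convention.
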

Please, let's quit this dumb argument.  We all know that thermistors and
other types of cheap temperature gauges are very inaccurate, and I don't
think expensive thermocouples will make it into computer sensors very soon.
Plus, who the hell could care whether their chip is at 45.4 or 45.5 degrees?
Does it really matter?  A difference of 0.1 will not decide whether your
chip will fry.

Just my 2 eurocents.

-- 
Chris Boot
[EMAIL PROTECTED]

DOS Computers manufactured by companies such as IBM, Compaq, Tandy, and
millions of others are by far the most popular, with about 70 million
machines in use worldwide. Macintosh fans, on the other hand, may note
that cockroaches are far more numerous than humans, and that numbers
alone do not denote a higher life form.
New York Times, November 26, 1991




Re: temperature standard - global config option?

2001-06-08 Thread Chris Boot

Hi,

> Then you must have blown your quantum finals.  Royally.  ESPECIALLY
> after that statement about "temperature is nothing but the movement of
> pieces of materie".  Not even close, once you get into the quant.
> 
> Mathematically and quantum mechanically, negative absolute
> temperatures do exist.  In quantum mechanics, temperature is expressed as
> probability populations in various quantum states.

Excuse me, but I don't think that we can get computer temperature sensors as
we know them to measure temperatures of matter in quantum states.  Even if,
one day, we built a usable quantum computer which might need temperature
measurements, I doubt that the Linux kernel would run on it without being
totally rewritten.

Anyhow, I like the discussion.  I love anything to do with quantum physics!

-- 
Chris Boot
[EMAIL PROTECTED]

#define QUESTION ((2b) || (!2b))




Re: temperature standard - global config option?

2001-06-08 Thread Chris Boot

Hi,

 Then you must have blown your quantum finals.  Royally.  ESPECIALLY
 after that statement about temperature is nothing but the movement of
 pieces of materie.  Not even close, once you get into the quant.
 
 Mathematically and quantum mechanically, negative absolute
 temperatures do exist.  In quantum mechanics, temperature is expressed as
 probability populations in various quantum states.

Excuse me, but I don't think that we can get computer temperature sensors as
we know them to measure temperatures of matter in quantum states.  Even if,
one day, we built a usable quantum computer which might need temperature
measurements, I doubt that the Linux kernel would run on it without being
totally rewritten.

Anyhow, I like the discussion.  I love anything to do with quantum physics!

-- 
Chris Boot
[EMAIL PROTECTED]

#define QUESTION ((2b) || (!2b))

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: temperature standard - global config option?

2001-06-08 Thread Chris Boot

Hi,

> Only the truly stupid would assume accuracy from decimal places.

Well then, tell all the teachers in this world that they're stupid, and tell
everyone who learnt from them as well.  I'm in high school (gd. 11, junior)
and my physics teacher is always screaming at us for putting too many
decimal places or having them inconsistent.  There are certain situations
where adding a ±1 is too cumbersome and/or clumsy, so you can specify the
accuracy using just decimal places.

For example, 5.00 would mean pretty much spot on 5 (anywhere from 4.995 to
5.00499), whereas 5 could mean anywhere from 4.5 to 5.499.
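
To make the rounding argument concrete, here is a minimal sketch in C of the
range a written value implies.  The helper name implied_range is made up for
illustration; it just assumes the value was rounded to the given number of
decimal places, so the true value lies within half a unit of the last digit.

```c
/*
 * Hypothetical helper (not from any library): given a value written with
 * `decimals` digits after the point, compute the range of true values
 * that would round to it.  The half-width is 0.5 * 10^(-decimals).
 */
static void implied_range(double value, int decimals, double *lo, double *hi)
{
	double half = 0.5;

	/* Shrink the half-width by a factor of ten per decimal place. */
	for (int i = 0; i < decimals; i++)
		half /= 10.0;

	*lo = value - half;
	*hi = value + half;
}
```

Calling it with (5.0, 2) gives bounds close to 4.995 and 5.005, and with
(5.0, 0) gives 4.5 and 5.5, matching the two cases above.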

Please, let's quit this dumb argument.  We all know that thermistors and
other types of cheap temperature gauges are very inaccurate, and I don't
think expensive thermocouples will make it into computer sensors very soon.
Plus, who the hell could care whether their chip is at 45.4 or 45.5 degrees?
Does it really matter?  A difference of 0.1 will not decide whether your
chip will fry.

Just my 2 eurocents.

-- 
Chris Boot
[EMAIL PROTECTED]

DOS Computers manufactured by companies such as IBM, Compaq, Tandy, and
millions of others are by far the most popular, with about 70 million
machines in use worldwide. Macintosh fans, on the other hand, may note
that cockroaches are far more numerous than humans, and that numbers
alone do not denote a higher life form.
New York Times, November 26, 1991




Re: temperature standard - global config option?

2001-06-07 Thread Chris Boot

Hi,

>>> Kelvins good idea in general - it is always positive ;-)
>>> 
>>> 0.01*K fits in 16 bits and gives reasonable range.
>>> 
>>> but may be something like K<<6 could be a option? (to allow use of shifts
>>> instead of muls/divs). It would be much more easier to extract int part.
>>> 
>>> just my 2 eurocents.
>> 
>> Why not make it in Celsius ? Is more easy to read it this way.
> 
> It's easier for you as a user to read, but slightly harder to deal with inside
> the code.  It's really a user-space issue, inside the kernel should be as
> standardized as possible, and Kelvins make the most sense there.

OK, I think by now we've all agreed on the following:
 - The issue is NOT displaying temperatures to the user, but a userspace
   program reading them from the kernel.  The userspace program itself can
   do temperature conversions for the user if he/she wants.
 - The preferred units would be decikelvins, as a 16-bit value gives a
   relatively precise as well as wide range of numbers, from absolute
   zero to about 6280 degrees Celsius ((65535 / 10) - 273), which is well
   beyond anything that a computer can operate at.  It also gives us a good
   base for all sorts of other temperature sensing devices.
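
As a rough sketch of the arithmetic, assuming readings are stored as unsigned
16-bit decikelvins (0.1 K units), the conversion a userspace program would do
looks something like this.  The helper names are invented for illustration and
are not an existing kernel interface.  With the offset rounded to 273.2, the
top of the 16-bit range lands near 6280 degrees Celsius.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not an existing kernel interface.
 * Converts an unsigned 16-bit reading in decikelvins (0.1 K units)
 * to tenths of a degree Celsius.  The 273.15 K offset is rounded
 * to 2732 tenths.
 */
static int32_t dk_to_decicelsius(uint16_t dk)
{
	return (int32_t)dk - 2732;
}

/* Whole-degree part of a value in tenths, truncated toward zero. */
static int32_t decicelsius_whole(int32_t dc)
{
	return dc / 10;
}
```

The result is kept in tenths so that no floating point is needed; a reading of
0 maps to -2732 (absolute zero) and 65535 maps to 62803.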

Do we all agree on those now?

-- 
Chris Boot
[EMAIL PROTECTED]

#define QUESTION ((2b) || (!2b))




Re: temperature standard - global config option?

2001-06-06 Thread Chris Boot

Hi,

> Please, don't.
> 
> Use kelvins *0.1, and use them consistently everywhere. This is what
> ACPI does, and it is probably right.

I'm sorry, but I don't feel like adding 273 to every number I get just to
find the temperature of something.  What I would do is give configuration
options to choose the default (Celsius/centigrade, Kelvin, or [shudder]
Fahrenheit) then, when you need to print or output a temperature, send it
off to a common converter function so you don't repeat code all over the
place.
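
Something along these lines is what I have in mind for the common converter.
This is only a sketch: the enum, the function name, and the choice of tenths of
a kelvin as the raw input (as the quoted message suggests) are all assumptions
for illustration, not existing kernel code.

```c
#include <stdint.h>

/* Hypothetical output units, selectable by a configuration option. */
enum temp_unit {
	TEMP_KELVIN,
	TEMP_CELSIUS,
	TEMP_FAHRENHEIT
};

/*
 * Hypothetical common converter.  Input is a raw reading in tenths of a
 * kelvin; output is in tenths of the requested unit.  Integer-only, so
 * the Fahrenheit result is truncated rather than rounded.
 */
static int32_t temp_convert(uint16_t dk, enum temp_unit unit)
{
	switch (unit) {
	case TEMP_CELSIUS:
		/* 273.15 K offset, rounded to 2732 tenths. */
		return (int32_t)dk - 2732;
	case TEMP_FAHRENHEIT:
		/* F = K * 9/5 - 459.67, offset rounded to 4597 tenths. */
		return (int32_t)dk * 9 / 5 - 4597;
	case TEMP_KELVIN:
	default:
		return dk;
	}
}
```

Every place that prints a temperature would call this one function with the
configured unit, so the offsets live in exactly one spot.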

Just my 0.02 Eurocents (what an ugly word).

-- 
Chris Boot
[EMAIL PROTECTED]

"Modem error handling really su~c%dk,s.^D^D@R*cCKo#?CB,*o#?C!!b%o#?
NO CARRIER



