Dave Watkins wrote:
I'll start with a better description and then get on to working with
the new packages.

I'm running iometer on the Windows Server 2003 x64 box against an iSCSI
volume using the MS iSCSI initiator (2.02).

Running iometer with any of the 0% read, 0% random access
specifications returns the expected performance numbers. Stopping that
test and switching to any 100% read, 0% random access specification
generates the errors. Larger block sizes _seem_ to make it happen more
frequently, so I have created a 256k block size test for the above
access specifications. I have been using 16 and 64 outstanding I/Os,
but anything above zero seems to show the error.
Networking on both ends is via Intel e1000 cards, with jumbo frames and
flow control enabled. NAPI is also enabled on the Openfiler box. All
other network settings are default.

Try without all the tweaking and then enable them one by one.

Also have you successfully run the benchmarks on the local box without going through iSCSI?
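
For the local check suggested above, a minimal sketch of a sequential read pass with the same 256k block size. It is not iometer: it is single-threaded with one outstanding I/O, and the page cache can inflate the result on a file that was read recently. The function name and whatever path is passed in are illustrative assumptions, not part of the thread.

```python
import time


def sequential_read_mb_per_s(path, block_size=256 * 1024):
    """Read `path` from start to end in fixed-size blocks.

    Returns (bytes_read, throughput in MiB per second).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 opens a raw, unbuffered file object, so each read()
    # asks the OS for at most block_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small inputs.
    return total, total / (1024 * 1024) / max(elapsed, 1e-9)
```

Pointing it at a large file on the data volume of the target box gives a rough local baseline to compare against the iSCSI figures.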


R.

Dave

-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 28 November 2006 1:30 a.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Dave Watkins wrote:
Still the same, both with bonding enabled and disabled, unfortunately.
First:

try adding "nosoftlockup" to the grub boot options and then run the benchmarks again.
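
After rebooting, it may be worth confirming that the option actually reached the kernel. A minimal sketch, assuming the running kernel exposes its command line at /proc/cmdline; the helper names are illustrative.

```python
def has_boot_option(cmdline, option):
    """True if `option` is present as a whole token or as option=value."""
    for token in cmdline.split():
        if token == option or token.startswith(option + "="):
            return True
    return False


def running_kernel_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()
```

On the Openfiler box, `has_boot_option(running_kernel_cmdline(), "nosoftlockup")` should return True once the grub entry has been edited and the machine rebooted.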

Next:

http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-r78.ccs
http://www.openfiler.com/download/PACKAGES/iscsi_trgt-r78.ccs

(kernel and userland)

same as before (--replace-files)

Finally:

Also a bit more detail about your test set-up (components, parameters, triggers etc) would be great.

R.
-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]]
Sent: Monday, 27 November 2006 2:52 p.m.
To: Rafiu Fakunle
Cc: Dave Watkins; [email protected]
Subject: Re: [OF-users] iSCSI bug?

Rafiu Fakunle wrote:
Dave Watkins wrote:
Ok, UP is fine. To be sure it wasn't the e1000 driver I also tried
using only the Broadcom NIC's as well. Under UP there is no error,
under SMP the error reoccurs even with e1000 not loaded and no bonding.

Hope this helps
Immensely. I'm just doing up a changeset for you.
http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-0.4.14.ccs
conary update iscsi_trgt-kernel-0.4.14.ccs --replace-files

Then test again with 2.6.17.14-0.3.smp.x86_64 (with and without
bonding).
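
Since results now depend on both the kernel flavour and whether bonding is active, a small sketch for labelling each test run. It assumes the Linux bonding driver lists its interfaces under /proc/net/bonding, and it treats ".smp." in the release string as marking an SMP kernel, which is inferred only from the one release name quoted in this thread.

```python
import os
import platform


def active_bonds(bonding_dir="/proc/net/bonding"):
    """Names of bonding interfaces, or [] if the directory is absent."""
    if not os.path.isdir(bonding_dir):
        return []
    return sorted(os.listdir(bonding_dir))


def run_label(release=None, bonds=None):
    """Build a one-line description of the environment for a test run."""
    # platform.release() reports the same string as `uname -r`.
    release = platform.release() if release is None else release
    bonds = active_bonds() if bonds is None else bonds
    # Assumption: SMP builds carry ".smp." in the release string.
    kind = "smp" if ".smp." in release else "non-smp"
    bonding = ("bonding:" + ",".join(bonds)) if bonds else "no-bonding"
    return f"{release} [{kind}] {bonding}"
```

Calling `run_label()` with no arguments before each benchmark and pasting the result next to the numbers keeps the SMP/UP and bonding combinations from getting mixed up.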


Thx,

R.
R.

Dave

-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]]
Sent: Monday, 27 November 2006 1:15 p.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

OK, and UP without trunking?

R.

Dave Watkins wrote:
With or without trunking seems to generate the same problem.

Without trunking I got
BUG: soft lockup detected on CPU#0!

Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
       <ffffffff80289151>{update_process_times+66}
<ffffffff802713fe>{smp_local_timer_interrupt+35}
       <ffffffff80271463>{smp_apic_timer_interrupt+65}
<ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
       <ffffffff80224b87>{tcp_sendmsg+0}
<ffffffff80413bba>{inet_ioctl+0}
       <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
       <ffffffff881419e7>{:iscsi_trgt:istd+1460}
<ffffffff80403ea6>{tcp_sendpage+0}
       <ffffffff8027fef6>{__wake_up_common+67}
<ffffffff8029131c>{keventd_create_kthread+0}
       <ffffffff88141433>{:iscsi_trgt:istd+0}
<ffffffff8029131c>{keventd_create_kthread+0}
       <ffffffff80231a7d>{kthread+200}
<ffffffff8025f8a2>{child_rip+8}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff8027308f>{flat_send_IPI_mask+0}
       <ffffffff8027308f>{flat_send_IPI_mask+0}
<ffffffff8027308f>{flat_send_IPI_mask+0}
       <ffffffff802319b5>{kthread+0}
<ffffffff8025f89a>{child_rip+0}
BUG: soft lockup detected on CPU#0!

Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
       <ffffffff80289151>{update_process_times+66}
<ffffffff802713fe>{smp_local_timer_interrupt+35}
       <ffffffff80271463>{smp_apic_timer_interrupt+65}
<ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
       <ffffffff80224b87>{tcp_sendmsg+0}
<ffffffff881411c0>{:iscsi_trgt:nthread_wakeup+35}
       <ffffffff881411b3>{:iscsi_trgt:nthread_wakeup+22}
<ffffffff8814219a>{:iscsi_trgt:istd+3431}
       <ffffffff80403ea6>{tcp_sendpage+0}
<ffffffff8027fef6>{__wake_up_common+67}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff88141433>{:iscsi_trgt:istd+0}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff80231a7d>{kthread+200}
       <ffffffff8025f8a2>{child_rip+8}
<ffffffff8029131c>{keventd_create_kthread+0}
       <ffffffff8027308f>{flat_send_IPI_mask+0}
<ffffffff8027308f>{flat_send_IPI_mask+0}
       <ffffffff8027308f>{flat_send_IPI_mask+0}
<ffffffff802319b5>{kthread+0}
       <ffffffff8025f89a>{child_rip+0}

Re-enabling trunking again and I get
BUG: soft lockup detected on CPU#0!

Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
       <ffffffff80289151>{update_process_times+66}
<ffffffff802713fe>{smp_local_timer_interrupt+35}
       <ffffffff80271463>{smp_apic_timer_interrupt+65}
<ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
       <ffffffff80254356>{tcp_ioctl+0}
<ffffffff8020af50>{__might_sleep+30}
       <ffffffff802326d7>{lock_sock+28}
<ffffffff80263257>{_spin_lock_bh+9}
       <ffffffff8022fd23>{release_sock+15}
<ffffffff802543a2>{tcp_ioctl+76}
       <ffffffff80413c44>{inet_ioctl+138}
<ffffffff88141216>{:iscsi_trgt:is_data_available+62}
       <ffffffff8814125a>{:iscsi_trgt:do_recv+41}
<ffffffff8023081f>{qdisc_restart+24}
       <ffffffff8022eaa6>{dev_queue_xmit+510}
<ffffffff8807c266>{:bonding:bond_dev_queue_xmit+489}
       <ffffffff8023277e>{lock_sock+195}
<ffffffff8807fd96>{:bonding:bond_xmit_roundrobin+154}
       <ffffffff80232136>{__tcp_push_pending_frames+1367}
<ffffffff8022fd23>{release_sock+15}
       <ffffffff80225551>{tcp_sendmsg+2506}
<ffffffff80236f84>{do_sock_write+199}
       <ffffffff803dbac1>{sock_writev+220}
<ffffffff8025db21>{cache_alloc_refill+237}
       <ffffffff80220d80>{tcp_transmit_skb+1579}
<ffffffff80408067>{tcp_retransmit_skb+1352}
       <ffffffff80254356>{tcp_ioctl+0}
<ffffffff8024f5a4>{finish_wait+52}
       <ffffffff803e0d10>{sk_stream_wait_memory+458}
<ffffffff80291608>{autoremove_wake_function+0}
       <ffffffff80291608>{autoremove_wake_function+0}
<ffffffff8022fd23>{release_sock+15}
       <ffffffff80246a25>{try_to_wake_up+955}
<ffffffff88141609>{:iscsi_trgt:istd+470}
       <ffffffff80403ea6>{tcp_sendpage+0}
<ffffffff8027fef6>{__wake_up_common+67}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff88141433>{:iscsi_trgt:istd+0}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff80231a7d>{kthread+200}
       <ffffffff8025f8a2>{child_rip+8}
<ffffffff8029131c>{keventd_create_kthread+0}
       <ffffffff8027308f>{flat_send_IPI_mask+0}
<ffffffff8027308f>{flat_send_IPI_mask+0}
       <ffffffff8027308f>{flat_send_IPI_mask+0}
<ffffffff802319b5>{kthread+0}
       <ffffffff8025f89a>{child_rip+0}

Without trunking, though, the write performance after this doesn't seem
to be affected (still at about 80-90MB/sec rather than down at less
than 10MB/sec).
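
To compare the traces above without reading them line by line, a minimal sketch that pulls the frames out of a pasted trace and counts them per module. The regular expression is an assumption fitted to the `<address>{:module:symbol+offset}` layout seen in this thread, not a general oops parser.

```python
import re
from collections import Counter

# <16 hex digits>{symbol+offset} or <16 hex digits>{:module:symbol+offset}
FRAME = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def parse_frames(trace_text):
    """Return (module, symbol, offset) per frame.

    module is None for symbols that belong to the core kernel.
    Markers such as <IRQ> and <EOI> are skipped because they do not
    match the address pattern.
    """
    return [
        (m.group(2), m.group(3), int(m.group(4)))
        for m in FRAME.finditer(trace_text)
    ]


def module_frame_counts(trace_text):
    """Count frames per module, grouping core-kernel frames as 'kernel'."""
    return Counter(module or "kernel" for module, _, _ in parse_frames(trace_text))
```

Run on the trunked and untrunked traces, this makes it easy to see that iscsi_trgt frames appear in both while bonding frames appear only in the trunked one.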
-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]]
Sent: Monday, 27 November 2006 12:27 p.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Dave Watkins wrote:
Sorry about that, I remembered as soon as I sent it that I hadn't
included the version. It's x86_64 version 2.2 (did a conary updateall
from 2.1 beta). uname -r gives 2.6.17.14-0.3.smp.x86_64.

I'll try with a UP kernel, although it will take some time as I have to
rebuild the e1000 module from the UP kernel sources.
Try without the network trunking anyway in the meantime. Would be an
interesting test.

R.


I'll let you know if I can reproduce on the UP kernel.

I don't think it's related to that ticket as they are all writes anyway
and they only see the problem on large files.

Dave

-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]]
Sent: Monday, 27 November 2006 11:40 a.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Hi Dave,

Excellent test and bug report.

I wonder whether it may be related to this:

https://project.openfiler.com/tracker/ticket/435

Can you try to reproduce with a UP kernel pls.

Also I need the output of `uname -r`

Thx,

R.

FTR: this is running r58 from IET svn


Dave Watkins wrote:
Hi All

I think I've found a bug in the iscsi target software in my
benchmarking/testing.

Some background on the hardware first in case it may be related.
Dual core/dual opteron with 2GB of ram
3ware 8006 2 port raid card for OS drives
3ware 9550SX card for data drives
Dual GB Broadcom on-board NIC's teamed into bond0 (management)
Quad port Intel PCI-E GB NIC with all 4 ports teamed into bond1 (main
iscsi data network)
4 x 250GB WD SATA HDD's in RAID5

Of note here is that I have had to replace the e1000 driver with the
latest from Intel to support the quad port card.

I have made some volumes and mounted them on various Windows servers
and have been using iobench to tune performance of the system. When
using a read only test pattern I see this:

BUG: soft lockup detected on CPU#0!

Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
       <ffffffff80289151>{update_process_times+66}
<ffffffff802713fe>{smp_local_timer_interrupt+35}
       <ffffffff80271463>{smp_apic_timer_interrupt+65}
<ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
       <ffffffff88141486>{:iscsi_trgt:istd+83}
<ffffffff88141476>{:iscsi_trgt:istd+67}
       <ffffffff80403ea6>{tcp_sendpage+0}
<ffffffff8027fef6>{__wake_up_common+67}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff88141433>{:iscsi_trgt:istd+0}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff80231a7d>{kthread+200}
       <ffffffff8025f8a2>{child_rip+8}
<ffffffff8029131c>{keventd_create_kthread+0}
       <ffffffff802319b5>{kthread+0}
<ffffffff8025f89a>{child_rip+0}
BUG: soft lockup detected on CPU#0!

Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
       <ffffffff80289151>{update_process_times+66}
<ffffffff802713fe>{smp_local_timer_interrupt+35}
       <ffffffff80271463>{smp_apic_timer_interrupt+65}
<ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
       <ffffffff802631ec>{_spin_unlock_irqrestore+8}
<ffffffff80246a25>{try_to_wake_up+955}
       <ffffffff881411cc>{:iscsi_trgt:nthread_wakeup+47}
<ffffffff8814219a>{:iscsi_trgt:istd+3431}
       <ffffffff80403ea6>{tcp_sendpage+0}
<ffffffff8027fef6>{__wake_up_common+67}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff88141433>{:iscsi_trgt:istd+0}
       <ffffffff8029131c>{keventd_create_kthread+0}
<ffffffff80231a7d>{kthread+200}
       <ffffffff8025f8a2>{child_rip+8}
<ffffffff8029131c>{keventd_create_kthread+0}
       <ffffffff802319b5>{kthread+0}
<ffffffff8025f89a>{child_rip+0}
With write only patterns this doesn't come up. Once the lockup has
appeared, performance of the system dives (from about 110MB/sec of
iscsi performance to about 10MB/sec).

This is fairly reproducible here so if you need any more information
just ask.

Dave

------------------------------------------------------------------------
_______________________________________________
Openfiler-users mailing list
[email protected]
https://lists.openfiler.com/mailman/listinfo/openfiler-users