It doesn't look like it's going to compile cleanly on an openfiler
platform unfortunately. I'm going to see if I can recreate the problem
by just moving a lot of data around on the iSCSI volumes.
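
For anyone who wants to try the same thing, here is a rough sketch of what "moving a lot of data around" could look like as a script. It is only an illustration: the 256 KiB block size mirrors the iometer test described further down the thread, while the function names, the file path and the sizes are placeholders of mine and not part of any Openfiler or iometer tooling.

```python
import os
import time

# 256 KiB blocks, mirroring the block size of the iometer test in this thread.
BLOCK_SIZE = 256 * 1024


def sequential_write(path, total_bytes, block_size=BLOCK_SIZE):
    """Write total_bytes to path in block_size chunks; return bytes written."""
    block = os.urandom(block_size)
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device before returning
    return written


def sequential_read(path, block_size=BLOCK_SIZE):
    """Read path from start to end in block_size chunks; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(block_size)
            if not data:
                break
            total += len(data)
    return total


def run_pass(path, total_bytes):
    """One write pass then one read pass; return (write MB/s, read MB/s)."""
    start = time.perf_counter()
    sequential_write(path, total_bytes)
    write_secs = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    sequential_read(path)
    read_secs = max(time.perf_counter() - start, 1e-9)

    return total_bytes / write_secs / 1e6, total_bytes / read_secs / 1e6
```

Calling `run_pass()` in a loop against a file on the volume under test, with a file size well above the machine's RAM, should keep the read pass from being served out of the page cache. Whether this kind of load is enough to trigger the lockup is an open question.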

Dave

-----Original Message-----
From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 28 November 2006 10:49 a.m.
To: Dave Watkins
Cc: [email protected]
Subject: Re: [OF-users] iSCSI bug?

Dave Watkins wrote:
> When you say "local box" do you mean the openfiler box
Yes please.
>  or the windows
> box? I have run local tests on the windows box without issue but haven't
> tried the openfiler box. I will try disabling jumbo frames. I have tried
> without NAPI and flow control with no success.
>
> -----Original Message-----
> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, 28 November 2006 10:27 a.m.
> To: Dave Watkins
> Cc: [email protected]
> Subject: Re: [OF-users] iSCSI bug?
>
> Dave Watkins wrote:
>
>> I'll start with a better description and get onto working with the new
>> packages.
>>
>> I'm running iometer on the Windows Server 2003 x64 box against an iSCSI
>> volume using the MS iSCSI initiator (2.02).
>>
>> Using iometer and selecting any of the 0% read, 0% random access
>> specifications will return expected performance numbers, but stopping
>> that test, and changing the access specification to any 100% read, 0%
>> random test will generate the errors. Larger block sizes _seem_ to make
>> it happen more frequently so I have created a 256k block size test for
>> the above access specifications. I have been using 16 and 64 outstanding
>> I/Os but anything above zero seems to show the error.
>>
>> Networking on both ends is via Intel e1000 cards. Jumbo frames and flow
>> control are enabled, and NAPI is also enabled on the Openfiler box. All
>> other network settings are default.
>>
>
> Try without all the tweaking and then enable them one by one.
>
> Also have you successfully run the benchmarks on the local box without
> going through iSCSI?
>
>
> R.
>
>   
>> Dave
>>
>> -----Original Message-----
>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
>> Sent: Tuesday, 28 November 2006 1:30 a.m.
>> To: Dave Watkins
>> Cc: [email protected]
>> Subject: Re: [OF-users] iSCSI bug?
>>
>> Dave Watkins wrote:
>>
>>> Still the same, both with bonding enabled and disabled, unfortunately.
>>
>> First:
>>
>> try adding "nosoftlockup" to the grub boot options and then run the 
>> benchmarks again.
>>
>> Next:
>>
>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-r78.ccs
>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-r78.ccs
>>
>> (kernel and userland)
>>
>> same as before (--replace-files)
>>
>> Finally:
>>
>> Also a bit more detail about your test set-up (components, parameters,
>> triggers etc) would be great.
>>
>> R.
>>
>>> -----Original Message-----
>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED] 
>>> Sent: Monday, 27 November 2006 2:52 p.m.
>>> To: Rafiu Fakunle
>>> Cc: Dave Watkins; [email protected]
>>> Subject: Re: [OF-users] iSCSI bug?
>>>
>>> Rafiu Fakunle wrote:
>>>
>>>> Dave Watkins wrote:
>>>>
>>>>> Ok, UP is fine. To be sure it wasn't the e1000 driver I also tried using
>>>>> only the Broadcom NIC's as well. Under UP there is no error, under SMP
>>>>> the error reoccurs even with e1000 not loaded and no bonding.
>>>>>
>>>>> Hope this helps
>>>>>
>>>> Immensely. I'm just doing up a changeset for you.
>>>>
>>> http://www.openfiler.com/download/PACKAGES/iscsi_trgt-kernel-0.4.14.ccs
>>>
>>> conary update iscsi_trgt-kernel-0.4.14.ccs --replace-files
>>>
>>> Then test again with 2.6.17.14-0.3.smp.x86_64 (with and without bonding)
>>>
>>> Thx,
>>>
>>> R.
>>>   
>>>     
>>>       
>>>> R.
>>>>
>>>>     
>>>>       
>>>>         
>>>>> Dave
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>> Sent: Monday, 27 November 2006 1:15 p.m.
>>>>> To: Dave Watkins
>>>>> Cc: [email protected]
>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>
>>>>> OK, and UP without trunking?
>>>>>
>>>>> R.
>>>>>
>>>>> Dave Watkins wrote:
>>>>>  
>>>>>       
>>>>>         
>>>>>           
>>>>>> With or without trunking seem to generate the same problem
>>>>>>
>>>>>> Without trunking I got
>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>
>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>        <ffffffff80224b87>{tcp_sendmsg+0}
>>>>>> <ffffffff80413bba>{inet_ioctl+0}
>>>>>>        <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>>>        <ffffffff881419e7>{:iscsi_trgt:istd+1460}
>>>>>> <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>        <ffffffff8027fef6>{__wake_up_common+67}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>        <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>        <ffffffff80231a7d>{kthread+200}
>>>>>> <ffffffff8025f8a2>{child_rip+8}
>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>>        <ffffffff802319b5>{kthread+0}
>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>
>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>        <ffffffff80224b87>{tcp_sendmsg+0}
>>>>>> <ffffffff881411c0>{:iscsi_trgt:nthread_wakeup+35}
>>>>>>        <ffffffff881411b3>{:iscsi_trgt:nthread_wakeup+22}
>>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>>        <ffffffff8025f89a>{child_rip+0}
>>>>>>
>>>>>> Re-enabling trunking again and I get
>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>
>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>        <ffffffff80254356>{tcp_ioctl+0}
>>>>>> <ffffffff8020af50>{__might_sleep+30}
>>>>>>        <ffffffff802326d7>{lock_sock+28}
>>>>>> <ffffffff80263257>{_spin_lock_bh+9}
>>>>>>        <ffffffff8022fd23>{release_sock+15}
>>>>>> <ffffffff802543a2>{tcp_ioctl+76}
>>>>>>        <ffffffff80413c44>{inet_ioctl+138}
>>>>>> <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
>>>>>>        <ffffffff8814125a>{:iscsi_trgt:do_recv+41}
>>>>>> <ffffffff8023081f>{qdisc_restart+24}
>>>>>>        <ffffffff8022eaa6>{dev_queue_xmit+510}
>>>>>> <ffffffff8807c266>{:bonding:bond_dev_queue_xmit+489}
>>>>>>        <ffffffff8023277e>{lock_sock+195}
>>>>>> <ffffffff8807fd96>{:bonding:bond_xmit_roundrobin+154}
>>>>>>        <ffffffff80232136>{__tcp_push_pending_frames+1367}
>>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>>        <ffffffff80225551>{tcp_sendmsg+2506}
>>>>>> <ffffffff80236f84>{do_sock_write+199}
>>>>>>        <ffffffff803dbac1>{sock_writev+220}
>>>>>> <ffffffff8025db21>{cache_alloc_refill+237}
>>>>>>        <ffffffff80220d80>{tcp_transmit_skb+1579}
>>>>>> <ffffffff80408067>{tcp_retransmit_skb+1352}
>>>>>>        <ffffffff80254356>{tcp_ioctl+0}
>>>>>> <ffffffff8024f5a4>{finish_wait+52}
>>>>>>        <ffffffff803e0d10>{sk_stream_wait_memory+458}
>>>>>> <ffffffff80291608>{autoremove_wake_function+0}
>>>>>>        <ffffffff80291608>{autoremove_wake_function+0}
>>>>>> <ffffffff8022fd23>{release_sock+15}
>>>>>>        <ffffffff80246a25>{try_to_wake_up+955}
>>>>>> <ffffffff88141609>{:iscsi_trgt:istd+470}
>>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>>        <ffffffff8027308f>{flat_send_IPI_mask+0}
>>>>>> <ffffffff802319b5>{kthread+0}
>>>>>>        <ffffffff8025f89a>{child_rip+0}
>>>>>>
>>>>>> Without trunking, though, the write performance after this doesn't seem
>>>>>> to be affected (still at about 80-90MB/sec rather than down at less than
>>>>>> 10MB/sec)
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>>> Sent: Monday, 27 November 2006 12:27 p.m.
>>>>>> To: Dave Watkins
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>>
>>>>>> Dave Watkins wrote:
>>>>>>
>>>>>>> Sorry about that, I remembered as soon as I sent it that I hadn't
>>>>>>> included the version. It's x86_64 version 2.2 (did a conary updateall
>>>>>>> from 2.1 beta). Uname -r gives 2.6.17.14-0.3.smp.x86_64.
>>>>>>>
>>>>>>> I'll try with a UP kernel although it will take some time as I have to
>>>>>>> rebuild the e1000 module from the UP kernel sources.
>>>>>>>
>>>>>> Try without the network trunking anyway in the meantime. Would be an
>>>>>> interesting test.
>>>>>>
>>>>>> R.
>>>>>>
>>>>>>> I'll let you know
>>>>>>> if I can reproduce on the UP kernel.
>>>>>>>
>>>>>>> I don't think it's related to that ticket as they are all writes anyway
>>>>>>> and they only see the problem on large files.
>>>>>>>
>>>>>>> Dave
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Rafiu Fakunle [mailto:[EMAIL PROTECTED]
>>>>>>> Sent: Monday, 27 November 2006 11:40 a.m.
>>>>>>> To: Dave Watkins
>>>>>>> Cc: [email protected]
>>>>>>> Subject: Re: [OF-users] iSCSI bug?
>>>>>>>
>>>>>>> Hi Dave,
>>>>>>>
>>>>>>> Excellent test and bug report.
>>>>>>>
>>>>>>> I wonder whether it may be related to this:
>>>>>>>
>>>>>>> https://project.openfiler.com/tracker/ticket/435
>>>>>>>
>>>>>>> Can you try to reproduce with a UP kernel pls.
>>>>>>>
>>>>>>> Also I need the output of `uname -r`
>>>>>>>
>>>>>>> Thx,
>>>>>>>
>>>>>>> R.
>>>>>>>
>>>>>>> FTR: this is running r58 from IET svn
>>>>>>>
>>>>>>>
>>>>>>> Dave Watkins wrote:
>>>>>>>            
>>>>>>>           
>>>>>>>             
>>>>>>>               
>>>>>>>> Hi All
>>>>>>>>
>>>>>>>> I think I've found a bug in the iscsi target software in my
>>>>>>>> benchmarking/testing.
>>>>>>>>
>>>>>>>> Some background on the hardware first in case it may be related.
>>>>>>>> Dual core/dual opteron with 2GB of ram
>>>>>>>> 3ware 8006 2 port raid card for OS drives
>>>>>>>> 3ware 9550SX card for data drives
>>>>>>>> Dual GB Broadcom on-board NIC's teamed into bond0 (management)
>>>>>>>> Quad port Intel PCI-E GB NIC with all 4 ports teamed into bond1 (main
>>>>>>>> iscsi data network)
>>>>>>>> 4 x 250GB WD SATA HDD's in RAID5
>>>>>>>>
>>>>>>>> Of note here is that I have had to replace the e1000 driver with the
>>>>>>>> latest from Intel to support the quad port card
>>>>>>>>
>>>>>>>> I have made some volumes and mounted them on various windows servers
>>>>>>>> and have been using iobench to tune performance of the system. When
>>>>>>>> using a read only test pattern I see this
>>>>>>>>
>>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>>
>>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>>>        <ffffffff88141486>{:iscsi_trgt:istd+83}
>>>>>>>> <ffffffff88141476>{:iscsi_trgt:istd+67}
>>>>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>>        <ffffffff802319b5>{kthread+0}
>>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>>> BUG: soft lockup detected on CPU#0!
>>>>>>>>
>>>>>>>> Call Trace: <IRQ> <ffffffff8029f73c>{softlockup_tick+210}
>>>>>>>>        <ffffffff80289151>{update_process_times+66}
>>>>>>>> <ffffffff802713fe>{smp_local_timer_interrupt+35}
>>>>>>>>        <ffffffff80271463>{smp_apic_timer_interrupt+65}
>>>>>>>> <ffffffff8025f54c>{apic_timer_interrupt+132} <EOI>
>>>>>>>>        <ffffffff802631ec>{_spin_unlock_irqrestore+8}
>>>>>>>> <ffffffff80246a25>{try_to_wake_up+955}
>>>>>>>>        <ffffffff881411cc>{:iscsi_trgt:nthread_wakeup+47}
>>>>>>>> <ffffffff8814219a>{:iscsi_trgt:istd+3431}
>>>>>>>>        <ffffffff80403ea6>{tcp_sendpage+0}
>>>>>>>> <ffffffff8027fef6>{__wake_up_common+67}
>>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff88141433>{:iscsi_trgt:istd+0}
>>>>>>>>        <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>> <ffffffff80231a7d>{kthread+200}
>>>>>>>>        <ffffffff8025f8a2>{child_rip+8}
>>>>>>>> <ffffffff8029131c>{keventd_create_kthread+0}
>>>>>>>>        <ffffffff802319b5>{kthread+0}
>>>>>>>> <ffffffff8025f89a>{child_rip+0}
>>>>>>>> With write-only patterns this doesn't come up. After the lockup,
>>>>>>>> performance of the system dives (from about 110MB/sec of iscsi
>>>>>>>> performance to about 10MB/sec).
>>>>>>>>
>>>>>>>> This is fairly reproducible here so if you need any more information
>>>>>>>> just ask.
>>>>>>>>
>>>>>>>> Dave
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Openfiler-users mailing list
>>>>>>>> [email protected]
>>>>>>>> https://lists.openfiler.com/mailman/listinfo/openfiler-users
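
The soft lockup traces quoted above all use the `<address>{symbol+offset}` frame format, with module frames written as `{:module:symbol+offset}`. As a rough aid for comparing them, the sketch below pulls the frames out of a pasted trace (quote markers and all) and counts which `iscsi_trgt` symbols show up. The function names are my own, and labelling frames without a module prefix as "kernel" is an assumption made for convenience; it is not something the kernel prints.

```python
import re
from collections import Counter

# Matches frames such as <ffffffff88141216>{:iscsi_trgt:is_data_available+62}
# and <ffffffff80224b87>{tcp_sendmsg+0}. Markers like <IRQ> and <EOI> are skipped
# because they are not followed by a {symbol+offset} part.
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def parse_frames(text):
    """Return a list of (module, symbol, offset) tuples in trace order."""
    frames = []
    for _addr, module, symbol, offset in FRAME_RE.findall(text):
        # Assumption: frames without a ":module:" prefix are labelled "kernel".
        frames.append((module or "kernel", symbol, int(offset)))
    return frames


def module_symbols(text, module):
    """Count how often each symbol of the given module appears in the text."""
    return Counter(sym for mod, sym, _ in parse_frames(text) if mod == module)
```

Running `module_symbols(trace_text, "iscsi_trgt")` on each of the quoted traces gives a quick view of which target functions (`istd`, `nthread_wakeup`, `is_data_available`, `do_recv`) were on the stack when the watchdog fired.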

