Hi Mike,

Well, I'm not too familiar with rpm src's... I've used them a couple of times 
so here goes:

This is OVM 2.2.1.  So I went to Oracle's edelivery site, downloaded the 
source ISO for OVM 2.2.1, mounted the ISO, and cp'd off the kernel source:

  kernel-2.6.18-128.2.1.4.25.el5.src.rpm

I unpacked it with rpm2cpio | cpio -id and see a bunch of patches and a few tarballs:

-- linux-2.6.18.tar.bz2 
-- patch-2.6.18.4.bz2

Additionally, I see the following patches that I think might be related:


[r...@fedora14 ovm]# ls -l | grep iscsi
-rw-rw-r--. 1 root root    16053 Dec  1 08:31 linux-2.6-firmware-ibft_iscsi-prevent-misconfigured-ibfts.patch
-rw-rw-r--. 1 root root   236632 Dec  1 08:31 linux-2.6-iscsi-add-qla4xxx2.patch
-rw-rw-r--. 1 root root     3164 Dec  1 08:31 linux-2.6-iscsi-fix-ping-ovm.patch
-rw-rw-r--. 1 root root     2967 Dec  1 08:31 linux-2.6-iscsi-remove-old-code.patch
-rw-rw-r--. 1 root root    65402 Dec  1 08:31 linux-2.6-iscsi-update-to-2-6-19-rc1.upstream.patch
-rw-rw-r--. 1 root root      731 Dec  1 08:31 linux-2.6-net-qla3xxx-read-iscsi-target-disk-fail.patch
-rw-rw-r--. 1 root root    12236 Dec  1 08:31 linux-2.6-scsi-fix-iscsi-write-handling-regression.patch
-rw-rw-r--. 1 root root    32919 Dec  1 08:31 linux-2.6-scsi-iscsi-boot-firmware-table-tool-support.patch
-rw-rw-r--. 1 root root      842 Dec  1 08:31 linux-2.6-scsi-iscsi-borked-kmalloc.patch
-rw-rw-r--. 1 root root     3710 Dec  1 08:31 linux-2.6-scsi-iscsi-fix-nop-timeout-detection.patch
-rw-rw-r--. 1 root root     1476 Dec  1 08:31 linux-2.6-scsi-iscsi-fix-sense-len-handling.patch
-rw-rw-r--. 1 root root     2115 Dec  1 08:31 linux-2.6-scsi-iscsi-set-host-template.patch
-rw-rw-r--. 1 root root    50469 Dec  1 08:31 linux-2.6-scsi-iscsi_tcp-update.patch
-rw-rw-r--. 1 root root     5409 Dec  1 08:31 linux-2.6-scsi-libiscsi-data-corruption-when-resending-packets.patch
-rw-rw-r--. 1 root root     2859 Dec  1 08:31 linux-2.6-scsi-libiscsi-fix-nop-response-reply-and-session-cleanup-race.patch
-rw-rw-r--. 1 root root     5178 Dec  1 08:31 linux-2.6-scsi-oops-in-iscsi-packet-transfer-path.patch
-rw-rw-r--. 1 root root     4331 Dec  1 08:31 linux-2.6-scsi-qla4xxx-increase-iscsi-session-check-to-3-tuple.patch
-rw-rw-r--. 1 root root    93728 Dec  1 08:31 linux-2.6-scsi-update-iscsi_tcp-driver.patch
-rw-rw-r--. 1 root root     1887 Dec  1 08:31 linux-2.6-xen-iscsi-oops-on-x86_64-xen-domu.patch
-rw-rw-r--. 1 root root   506042 Dec  1 08:31 ovs-iscsi-port-from-el5u4.patch
-rw-rw-r--. 1 root root     2060 Dec  1 08:31 ovs-iscsi-wake-xmit-thread-when-killing-session.patch
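
In case it helps narrow things down, here is a rough sketch of how I could pick out only the patches that modify the two files you asked about. It assumes the patches carry standard unified-diff "+++ path" headers, which I have not checked for every file, and patches_touching is just a helper name I made up.

```shell
#!/bin/sh
# Hypothetical helper: list the *.patch files in directory $1 whose
# "+++ " header names a file called $2 (e.g. libiscsi.c).
# Assumes unified-diff headers. The "." in $2 acts as a regex wildcard,
# which is close enough for this purpose.
patches_touching() {
    grep -l "^+++ .*/$2" "$1"/*.patch 2>/dev/null
}

# Example, run in the directory holding the unpacked patches:
#   patches_touching . libiscsi.c
#   patches_touching . iscsi_tcp.c
```

Running it once per file in the directory where I unpacked the src.rpm should give a shorter list than a plain grep on "iscsi" in the file names.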

And with regard to the kernel that we are actually running -- 
2.6.18-128.2.1.4.27 -- here is the changelog describing the changes made since 
the src.rpm and patches from the ISO listed above:

[r...@oim6102501 ~]# rpm -q kernel-2.6.18-128.2.1.4.27.el5 --changelog
* Fri Jul 23 2010 Guru Anbalagane <[email protected]> [2.6.18-128.2.1.4.27.el5]
- fix the ocfs2 hang caused by orphan scan lock [Orabug 9736359] ([email protected])
- Fix netpoll to only poll devices that are present [orabug 9651647] ([email protected])
- Fix pci hot remove crash [orabug 9686788] ([email protected])

* Tue May 11 2010 Joe Jin <[email protected]> [2.6.18-128.2.1.4.26.el5]
- update bnx2x to 1.52.12.
- update bnx2 to 2.0.8b.
- update cnic to 1.9.13b.
- add bnx2i support(2.0.1e).

* Mon Mar 15 2010 Kevin Lyons <[email protected]> [2.6.18-128.2.1.4.25.el5]
- Disable RSC in ixgbe driver to avoid BUG during shutdown ([email protected])
  [bugz 10164]

* Fri Mar 12 2010 Kevin Lyons <[email protected]> [2.6.18-128.2.1.4.24.el5]
- [xen]: fix blkback and blktap read sysfs statistics panic [orabug 9294434]
  (Joe Jin)

...

Do you have a recommendation for me to help get those files to you?   Like I 
said, I'm just not very versed in patching and rpm stuff (other than updates 
and such... or compiling kernels from 10+ years ago...).  If you can toss a 
couple links or commands my way, I'd be more than happy to dig to get at the 
info you're looking for.  Or if you want me to just tar.bz2 up the patches from 
above and other files, I can do that as well.

Let me know how you think I can help you help me :)


Also, as for the tunables, I'll check into those today.

Thanks again,
Joe


On Nov 30, 2010, at 9:09 PM, Mike Christie wrote:

> On 11/30/2010 11:05 AM, hootjr29 wrote:
>> Hi all,
>> 
>> I am running into issues where I am getting iscsid ping timeouts for
>> my connections (not all.. just some... and it appears to be when the
>> EqualLogic system is busier).
>> 
>> Example:
>> =======
>> Nov 29 01:03:47 oim6102505 kernel:  connection90:0: ping timeout of 10 secs expired, recv timeout 5, last rx 198077764, last ping 198079014, now 198081514
>> Nov 29 01:03:47 oim6102505 kernel:  connection90:0: detected conn error (1011)
>> Nov 29 01:03:47 oim6102505 multipathd: sdam: readsector0 checker reports path is down
>> 
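Looking at that log line again, the three counters appear to be jiffies, and the gaps between them seem to line up with the two timeouts. Here is a quick sanity check, with the wrapped "last ping" value rejoined as 198079014 and assuming both intervals correspond exactly to the timeouts reported in the same line:

```shell
#!/bin/sh
# Values copied from the "ping timeout" log line (all in jiffies).
last_rx=198077764
last_ping=198079014
now=198081514

# Timeouts reported in the same log line (seconds).
recv_timeout=5
ping_timeout=10

# Jiffies between the last received data and the nop-out being sent.
idle_jiffies=$((last_ping - last_rx))
# Jiffies between the nop-out being sent and the timeout firing.
wait_jiffies=$((now - last_ping))

# Implied tick rate, assuming each interval equals its timeout exactly.
echo "idle: $idle_jiffies jiffies, implied HZ=$((idle_jiffies / recv_timeout))"
echo "wait: $wait_jiffies jiffies, implied HZ=$((wait_jiffies / ping_timeout))"
```

Both intervals imply the same tick rate (250 per second), so as far as I can tell the initiator saw nothing for 5 seconds, sent a nop-out, and then waited the full 10 seconds without getting a reply.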
>> GIVENS:
>> =======
>> [r...@servernamehere ~]# iscsiadm -m host -P 1
>> Host Number: 10
>>         State: running
>>         Transport: tcp
>>         Initiatorname:<empty>
>>         IPaddress: 192.168.9.9
>>         HWaddress: 00:10:18:3B:e5:23
>>         Netdev:<empty>
>> [r...@servernamehere ~]# rpm -qa | grep iscsi
>> iscsi-initiator-utils-6.2.0.871-0.7.el5
>> [r...@servernamehere ~]# uname -a
>> Linux servernamehere 2.6.18-128.2.1.4.27.el5xen #1 SMP Sat Jul 24 02:16:40 EDT 2010 i686 i686 i386 GNU/Linux
>> 
>> 
>> I've run into issues in the past where this was related to nop-out
>> code.  Mike Christie had provided the patches that appear to have
>> resolved it in the open-iscsi 871 code.  I worked with Oracle support
>> (this is an Oracle VM 2.2.1 environment), and they were able to update
>> their yum repos to reflect this open-iscsi update.
> 
> 
> 
> Could you send me the libiscsi.c and iscsi_tcp.c files in the kernel you are 
> using or could you point me to the kernel source?
> 
> 
>> 
>> Now (a year or so later), I'm starting to see more connection timeout
>> messages.  After digging into this I determined that it looks like we
>> may be hitting possible EqualLogic problems with it sending pings in a
>> different way than is expected in the nop-out standard/code?
>> 
>> I found this thread which may be related:
>> 
>>   
>> http://groups.google.com/group/open-iscsi/browse_thread/thread/a220595ec4f5f1d2/e90fc5d983a6186c?lnk=gst&q=bnx2i#e90fc5d983a6186c
>> 
> 
> I think those issues were related to and the fault of the offload bnx2i 
> driver. There were several bugs in that code related to nops/pings. They 
> should not affect you.
> 
> 
>> QUESTIONS:
>> ===========
>> 1) I guess what I'm wondering (and I've asked Oracle support to dig
>> further into this as well, btw) is if anyone knows if bnx2 falls into
>> the same type of bugs as bnx2i with regards to nop-out code?
> 
> No. If you are using bnx2 + iscsi_tcp then bnx2i does not come into play.
> 
> 
>> 2) If I disable nop-outs, this will likely remove these errors.  But
>> will it negatively affect my connections?  Even if the EQLX is 100%
>> busy doing stuff, will the scsi and dm-multipath code just handle that
>> outside of iSCSI code?  In other words, I guess I don't know what
>> question I'm really asking here, but just am nervous about disabling
>> nop-outs :/
> 
> If you disable initiator nops and there is a valid problem then it will take 
> longer to fail a path in cases where the network layer does not give us an error 
> and we were detecting the problem from the nop timing out.
> 
> The scsi layer and dm-multipath will eventually figure things out. It will 
> just take longer. The scsi layer's per command timeout 
> (/sys/block/sdX/device/timeout) will eventually expire. This will start the 
> scsi error handler which tries to send aborts and resets. If the path is 
> really bad those will fail since we cannot reach the target. The iscsi layer 
> will then try to relogin for node.session.timeo.replacement_timeout seconds. 
> When that fails, the iscsi layer will tell the scsi layer that we have failed 
> and the scsi layer will then notify the multipath layer which will retry the 
> IO on another path.
> 
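Just to check that I follow the sequence, here is a back-of-the-envelope sketch that adds the stages up. It assumes they run strictly one after another, which is probably a simplification, and the numbers in the example call are placeholders rather than values read from my systems. failover_estimate is a made-up name.

```shell
#!/bin/sh
# Rough estimate of path failover time with initiator nops disabled.
# Arguments, all in seconds:
#   $1  scsi command timeout (/sys/block/sdX/device/timeout)
#   $2  time the scsi error handler spends on aborts/resets (a guess)
#   $3  node.session.timeo.replacement_timeout
failover_estimate() {
    echo $(( $1 + $2 + $3 ))
}

# Placeholder numbers only, not measured on this system.
failover_estimate 60 30 120
```

With placeholders like these the total comes to 210 seconds, so I will need to look up the real values here before deciding whether that delay is acceptable.
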
> 
>> 
>> Any help/advice is appreciated :)
>> 
> 
> 
> Another EQL customer contacted EQL/Dell support, and they had them try these 
> settings:
> 
> 
> 1) sysctl.conf
> 
> net.ipv4.conf.all.arp_ignore=1
> net.ipv4.conf.all.arp_announce=2
> net.ipv4.netfilter.ip_conntrack_tcp_be_liberal=1
> 
> 
> 2) iscsid.conf
> 
> node.session.cmds_max = 1024
> node.session.queue_depth = 128
> 
