On Wed, Feb 20, 2008 at 06:09:50PM -0800, Sparqz wrote:
> Hi All,
> I have a nasty problem with open-iscsi on SLES10 + an Infortrend iSCSI
> array.
> Basically it looks like everything goes wrong as soon as the read/
> write load becomes heavy, although network dumps suggest the problem
> is always there, it just goes critical when the load is too heavy.
> My setup:
> 1x HP DL585 - SLES10 x86_64
> 1x HP DL585 - RHEL4 x86_64
> 1x HP DL380 - SLES10 i586

SLES10 or SLES10 SP1?

Have you tried installing and using the latest open-iscsi from open-iscsi.org?
> 2x Cisco 2960G (gigabit) switches
> 2x Infortrend A16E-G2130-4 with 16x 1TB disks each
> The two Infortrend arrays have all their gigabit ethernet ports
> plugged into one of the cisco switches, then we have 2 fibre
> connections leading to the other cisco switch which has the three
> servers plugged into it.  The network is completely isolated from our
> other company networks.

So you have only 2 Gbit/sec of bandwidth between the Cisco switches?

How many ethernet ports do your iSCSI arrays have (plugged in to the
switch)?

How many ethernet ports is each server using (plugged in to the switch)?

> At first I thought it was a network problem, so we replaced our dodgy
> Netgear switches with quality Cisco networking gear, but the problem
> is the same, if anything it's worse because the Cisco switches
> facilitate higher bandwidth (extra ~20mb/s) and the errors seem to be
> more reliably producible.

Do you see packet drops/errors on any of the ports? Check all ports on both
switches.
> None of the linux ethernet statistics report any errors (ifconfig) and
> the cisco switches don't report any packet errors either.  The
> Infortrend arrays don't provide ethernet statistics.

Have you checked the Linux TCP statistics for TCP retransmits? See netstat -s.
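
If the netstat -s output is too noisy, the same counters can be read from
/proc/net/snmp. Something along these lines might do as a rough sketch;
tcp_retrans is a hypothetical helper name, and it assumes the usual layout
where one "Tcp:" line holds the column names and the next one the values:

```shell
# Sketch: report TCP retransmits from the kernel counters in /proc/net/snmp.
# tcp_retrans is a hypothetical helper name; the optional argument lets you
# point it at a saved copy of the file instead of the live one.
tcp_retrans() {
    awk '
        # The first "Tcp:" line holds the column names; remember positions.
        $1 == "Tcp:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        # The second "Tcp:" line holds the matching values.
        $1 == "Tcp:" {
            out = $(col["OutSegs"])
            re  = $(col["RetransSegs"])
            pct = (out > 0) ? 100 * re / out : 0
            printf "OutSegs=%s RetransSegs=%s (%.2f%%)\n", out, re, pct
            exit
        }
    ' "${1:-/proc/net/snmp}"
}
```

Run tcp_retrans before and after a heavy I/O test and compare the two
RetransSegs values; the counters are cumulative since boot, so only the
difference says anything about the test itself.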
> Wireshark (ethereal) shows many errors - clusters of Duplicate ACKs,
> and a few "previous segment lost".

Are you using ethernet flow control? Check the switch settings, the server
NIC settings, and possibly the iSCSI array settings.

In a bigger IP-SAN setup with many servers and switches, flow control might be
needed to get good performance and to prevent TCP retransmits from
happening (i.e. to keep the switch port buffers from filling up and dropping
packets).
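
On the Linux servers the NIC pause settings can be read with ethtool -a.
A small sketch for checking them on several interfaces; pause_status is a
hypothetical helper name and eth0 is an assumed interface name, so adjust
both to your setup:

```shell
# Sketch: summarise the pause (flow control) settings reported by
# "ethtool -a". pause_status is a hypothetical helper name; it reads the
# ethtool output on stdin and prints one summary line.
pause_status() {
    awk '
        $1 == "Autonegotiate:" { an = $2 }
        $1 == "RX:"            { rx = $2 }
        $1 == "TX:"            { tx = $2 }
        END { print "autoneg=" an, "rx=" rx, "tx=" tx }
    '
}

# Usage (eth0 is an assumed interface name):
#   ethtool -a eth0 | pause_status
# To turn pause frames on, if the NIC and driver support it:
#   ethtool -A eth0 rx on tx on
```

Whether pause frames actually take effect also depends on the switch port
at the other end of the link, so check both sides.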

> Any help would be much appreciated !!!

Btw, have you tried with ext3? XFS is known to have problems with some setups
and versions.

I'm not familiar with Infortrend iSCSI arrays so I can't comment much about
them.

-- Pasi

