Re: Kernel / iscsi problem under high load

2009-04-03 Thread Konrad Rzeszutek

On Fri, Apr 03, 2009 at 10:42:31AM +0100, Gonçalo Borges wrote:
> >
> >
> >
> > Sure.. but the normal rdac handler (that comes with the kernel) doesn't
> > spit those errors. It looks like a proprietary module.
> >
> > If this is the proprietary module, what happens when you use the one that
> > comes with
> > the RHEL5U2 kernel?
> >
> 
> 
> This RDAC handler is suggested in
> http://publib.boulder.ibm.com/infocenter/systems/topic/liaai/rdac/BPMultipathRDAC.pdf,
> and I had to download it from
> http://www.lsi.com/rdac/rdac-LINUX-09.02.C5.16-source.tar.gz, and compile
> it. I haven't tested the RDAC from the Kernel... Do you have any info on how
> to do it?

Move the modules it created out of the way (those would be the mpp*.ko files)
and make sure that there is a dm-rdac.ko in your /lib/modules/`uname -r`/
directory.

Boot a normal initrd, not the one the LSI package created.

The multipath.conf that you posted will work. You can check that by running
lsmod | grep rdac

and you should see dm_rdac loaded.
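
For reference, a rough sketch of the whole switch (the backup directory and the
exact steps are my assumptions, not something from this thread):

# 1. move the LSI/proprietary modules out of the way
mkdir -p /root/mpp-backup
find /lib/modules/$(uname -r) -name 'mpp*.ko' -exec mv {} /root/mpp-backup/ \;
# 2. confirm the in-kernel handler is present
find /lib/modules/$(uname -r) -name 'dm-rdac.ko'
# 3. after rebooting on the standard initrd, confirm the handler is in use
lsmod | grep rdac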

> 
> What I have done previously was to test the DM-multipath with the
> "path_checker readsector0" in /etc/multipath. I got the same problems in

Yikes. You don't want that.

> this Raid 10 configuration for the DS3300. However, dividing the same DS3300
> in 6 R1, I had no problems either with the present RDAC or with readsector0,

6 R1 ?

> but I got better I/O performance with the RDAC.

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/open-iscsi
-~--~~~~--~~--~--~---



Re: Kernel / iscsi problem under high load

2009-04-03 Thread Ulrich Windl

On 3 Apr 2009 at 11:42, Gonçalo Borges wrote:

[...]
> Is this 2048 GB limit imposed on iSCSI? Because there is nothing in SCSI
> itself which forces you to this limit... Nowadays, you could have huge
> partitions (if you do GPT partitions with PARTED)... So, if there is a
> limit, it should come from iSCSI...

Hi, I just looked it up (use: "T10 SBC-2"): SCSI (See SBC-2, section 4.1) seems to
use "short LBA" (four bytes) and "long LBA" (eight bytes) to address blocks in
block devices. So it seems our storage system only supports "short LBA". I don't
know what Linux supports. I'd guess sizes up to 2^32-1 blocks are safe, however.
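
As a quick sanity check (my own sketch, not from the original mail; the device
name is taken from later in this thread), the "short LBA" ceiling can be
compared with the size the kernel reports:

# 2^32 blocks of 512 bytes = the short-LBA ceiling
echo $(( 2**32 * 512 ))                           # 2199023255552 bytes, ~2 TiB
blockdev --getsize64 /dev/mapper/iscsi06-apoio2   # size the kernel believes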

> 
> 
> >
> > > [r...@core26 ~]# fdisk -l /dev/sdb1
> > > Disk /dev/sdb1: 499.9 GB, 49983104 bytes
> >
> > Isn't that a bit small for 2.7TB ? I think you should use fdisk on the
> > disk, not
> > on the partition!
> 
> 
> 
> Here goes the output of fdisk on the disk:
> 
>  [r...@core26 ~]# fdisk -l /dev/sdb
> WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk
> doesn't support GPT. Use GNU Parted.
> Disk /dev/sdb: 2998.9 GB, 2998998663168 bytes
> 255 heads, 63 sectors/track, 364607 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>Device Boot  Start End  Blocks   Id  System
> /dev/sdb1   1  267350  2147483647+  ee  EFI GPT

As the utility said, MS-DOS partition tables can only handle partitions up to 2TB
(2047-point-something GB). I have little experience with parted yet, so you must
find out yourself. At least your utilities seem to have done the right thing.

> 
> We do GPT partitions with parted in order to overcome the deficiency of the
> (old!) 2048GB limit. Here goes the output of parted (just to be sure):
> 
> [r...@core26 ~]# parted /dev/mapper/iscsi06-apoio2
> GNU Parted 1.8.1
> Using /dev/mapper/iscsi06-apoio2
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) print
> Model: Linux device-mapper (dm)
> Disk /dev/mapper/iscsi06-apoio2: 2999GB
> Sector size (logical/physical): 512B/512B
> Partition Table: gpt
> 
> Number  Start   EndSize   File system  NameFlags
>  1  17.4kB  500GB  500GB  ext3 iscsi06-apoio2

Ok, so your 500GB partition is at the start of the device. That should be safe
for Linux.

[...]
> In principle, the partition should be at the beginning of the logical
> volume, but I can not confirm it with parted. If this is the case, everything
> should work fine. However, if there is the limit of 2048 GB of storage per
> LUN, this may confuse the setup.. don't know for sure.

Now you'll have to compare the sector number Linux complains about (said to be
past the end of the device / partition) with the actual limit. Linux shouldn't
access a device past the limit. Usually the commands that create filesystems do
that correctly, so Linux shouldn't exceed the limits.
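
One hedged way to do that comparison (the device name and the message text are
taken from earlier mails in this thread):

blockdev --getsz /dev/sdb                    # device size in 512-byte sectors
dmesg | grep 'attempt to access beyond end'  # sector number the kernel complained about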

If there is an access outside the valid range, it could be some corruption via
iSCSI. You could use something like "dd if=/dev/zero
of=a_big_file_in_your_filesystem" to fill your filesystem completely. Linux
shouldn't complain about access past the end of the device. If it does, you'll
have to dig further into details.
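
Spelled out, such a fill test might look like this (the mount point is the one
from the df output earlier in the thread; block size and file name are my own
choices):

dd if=/dev/zero of=/apoio06-2/fillfile bs=1M   # runs until the filesystem is full
# then watch the kernel log for "beyond end of device" messages, as above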

Regards,
Ulrich





Re: Kernel / iscsi problem under high load

2009-04-03 Thread Gonçalo Borges
Hi...

> [r...@core26 ~]# multipath -ll
> > sda: checker msg is "rdac checker reports path is down"
> > iscsi06-apoio1 (3600a0b80003ad1e50f2e49ae6d3e) dm-0 IBM,VirtualDisk
> > [size=2.7T][features=1 queue_if_no_path][hwhandler=0]
>
> Very interesting: Out SAN system allows only 2048 GB of storage per LUN.
> Looking into the SCSI protocol, it seems there is a 32bit number of blocks
> (512Bytes) to count the LUN capacity. Thus roughly 4Gig times 0.4kB makes
> 2TB. I
> wonder how your system represents 2.7TB in the SCSI protocol.
>


Is this 2048 GB limit imposed on iSCSI? Because there is nothing in SCSI
itself which forces you to this limit... Nowadays, you could have huge
partitions (if you do GPT partitions with PARTED)... So, if there is a
limit, it should come from iSCSI...


>
> > [r...@core26 ~]# fdisk -l /dev/sdb1
> > Disk /dev/sdb1: 499.9 GB, 49983104 bytes
>
> Isn't that a bit small for 2.7TB ? I think you should use fdisk on the
> disk, not
> on the partition!



Here goes the output of fdisk on the disk:

 [r...@core26 ~]# fdisk -l /dev/sdb
WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk
doesn't support GPT. Use GNU Parted.
Disk /dev/sdb: 2998.9 GB, 2998998663168 bytes
255 heads, 63 sectors/track, 364607 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot  Start End  Blocks   Id  System
/dev/sdb1   1  267350  2147483647+  ee  EFI GPT

We do GPT partitions with parted in order to overcome the deficiency of the
(old!) 2048GB limit. Here goes the output of parted (just to be sure):

[r...@core26 ~]# parted /dev/mapper/iscsi06-apoio2
GNU Parted 1.8.1
Using /dev/mapper/iscsi06-apoio2
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: Linux device-mapper (dm)
Disk /dev/mapper/iscsi06-apoio2: 2999GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   EndSize   File system  NameFlags
 1  17.4kB  500GB  500GB  ext3 iscsi06-apoio2
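
For reference, a GPT label and a partition like the one above could be created
roughly like this (a hedged sketch, not the exact commands we ran):

parted /dev/mapper/iscsi06-apoio2 mklabel gpt
parted /dev/mapper/iscsi06-apoio2 mkpart primary ext3 0 500GB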


> [r...@core26 ~]# df -k
> > Filesystem   1K-blocks  Used Available Use% Mounted on
> > /dev/sda1 90491396   2008072  83812428   3% /
> > tmpfs   524288 0524288   0% /dev/shm
> > /dev/mapper/iscsi06-apoio1p1
> >  480618344202804 456001480   1% /apoio06-1
> > /dev/mapper/iscsi06-apoio2p1
> >  480618344202800 456001484   1% /apoio06-2
> >
> > The sizes, although not exactly the same (but that doesn't happen also
> for
> > the system disk), are very close.
>
> So you have roughly 500GB on a 2.7TB LUN in use.
>

That is right... I have a logical volume of  2.7TB but a partition of 500GB.
But isn't this allowed?

> I do not think the difference I see in previous commands is big enough to
> justify a wrong setup. But I'm just guessing and I'm not really an expert.

> It now depends where the partition is located on the disk (use a corrected
> fdisk invocation to find out).
>

In principle, the partition should be at the beginning of the logical
volume, but I can not confirm it with parted. If this is the case, everything
should work fine. However, if there is the limit of 2048 GB of storage per
LUN, this may confuse the setup.. don't know for sure.


Cheers and Thanks
Goncalo




Re: Kernel / iscsi problem under high load

2009-04-03 Thread Gonçalo Borges
>
>
>
> Sure.. but the normal rdac handler (that comes with the kernel) doesn't
> spit those errors. It looks like a proprietary module.
>
> If this is the proprietary module, what happens when you use the one that
> comes with
> the RHEL5U2 kernel?
>


This RDAC handler is suggested in
http://publib.boulder.ibm.com/infocenter/systems/topic/liaai/rdac/BPMultipathRDAC.pdf,
and I had to download it from
http://www.lsi.com/rdac/rdac-LINUX-09.02.C5.16-source.tar.gz, and compile
it. I haven't tested the RDAC from the Kernel... Do you have any info on how
to do it?

What I have done previously was to test the DM-multipath with the
"path_checker readsector0" in /etc/multipath. I got the same problems in
this Raid 10 configuration for the DS3300. However, dividing the same DS3300
in 6 R1, I had no problems either with the present RDAC or with readsector0,
but I got better I/O performance with the RDAC.
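
For comparison, the relevant piece of a multipath.conf using the rdac checker
could look roughly like this (a hedged sketch based on the multipath -ll output
posted earlier in this thread, not the exact configuration in use):

devices {
        device {
                vendor            "IBM"
                product           "VirtualDisk"
                hardware_handler  "1 rdac"
                path_checker      rdac
                failback          immediate
        }
}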

Cheers
Goncalo




Re: Kernel / iscsi problem under high load

2009-04-03 Thread Ulrich Windl

On 3 Apr 2009 at 9:27, I wrote:

> Very interesting: Out SAN system allows only 2048 GB of storage per LUN. 
---^Our

> Lookinginto the SCSI protocol, it seems there is a 32bit number of blocks 
> (512Bytes) to count the LUN capacity. Thus roughly 4Gig times 0.4kB makes 
> 2TB. I 
---^0.5
> wonder how your system represents 2.7TB in the SCSI protocol.

Sorry for the typing!




Re: Kernel / iscsi problem under high load

2009-04-03 Thread Ulrich Windl

On 2 Apr 2009 at 18:19, Gonçalo Borges wrote:

[...]
> I have the following multipath devices:
[...]
> [r...@core26 ~]# multipath -ll
> sda: checker msg is "rdac checker reports path is down"
> iscsi06-apoio1 (3600a0b80003ad1e50f2e49ae6d3e) dm-0 IBM,VirtualDisk
> [size=2.7T][features=1 queue_if_no_path][hwhandler=0]

Very interesting: Out SAN system allows only 2048 GB of storage per LUN.
Looking into the SCSI protocol, it seems there is a 32bit number of blocks
(512Bytes) to count the LUN capacity. Thus roughly 4Gig times 0.4kB makes 2TB. I
wonder how your system represents 2.7TB in the SCSI protocol.

[...]
> [r...@core26 ~]# fdisk -l /dev/sdb1
> Disk /dev/sdb1: 499.9 GB, 49983104 bytes

Isn't that a bit small for 2.7TB ? I think you should use fdisk on the disk, not
on the partition!

> 255 heads, 63 sectors/track, 60788 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk /dev/sdb1 doesn't contain a valid partition table

See above!
[...]
> [r...@core26 ~]# df -k
> Filesystem   1K-blocks  Used Available Use% Mounted on
> /dev/sda1 90491396   2008072  83812428   3% /
> tmpfs   524288 0524288   0% /dev/shm
> /dev/mapper/iscsi06-apoio1p1
>  480618344202804 456001480   1% /apoio06-1
> /dev/mapper/iscsi06-apoio2p1
>  480618344202800 456001484   1% /apoio06-2
> 
> The sizes, although not exactly the same (but that doesn't happen also for
> the system disk), are very close.

So you have roughly 500GB on a 2.7TB LUN in use.

> 
> 
> 
> > Then one could compare those sizes to those reported by the kernel. Maybe
> > the
> > setup just wrong, and it takes a while until the end of the device is
> > reached.
> >
> 
> 
> I do not think the difference I see in previous commands is big enough to
> justify a wrong setup. But I'm just guessing and I'm not really an expert.

It now depends where the partition is located on the disk (use a corrected fdisk
invocation to find out).

> 
> 
> >
> > Then I would start slowly, i.e. with one iozone running on one client.
> >
> 
> 
> I've already performed the same tests with 6 Raid 0 and 6 Raid 1 instead of
> 2 Raid 10 in similar DS 3300 systems without having this kind of errors. But
> probably, I could be hitting some kind of limit..
> 
> 
> >
> > BTW, what do you want to measure: the kernel throughput, the network
> > throughput,
> > the iSCSI throughput, the controller throughput, or the disk throughput?
> > You
> > should have some concrete idea before starting the benchmark. Also with
> > just 12
> > disks I see little sense in having that many threads accessing the disk. To
> > shorten a lengthy test, it may be advisable to reduce the system memory
> > (iozone
> > recommends to create a file size at least three times the amount of RAM,
> > and even
> > 8GB on a local disk takes hours to perform)
> 
> 
> I want to measure the I/O performance for the RAID in sequential and random
> write/reads. What matters for the final user is that he was able to
> write/read at XXX MB/s. I want to stress the system to know the limit of the
> ISCSI controllers (this is why I'm starting so many threads). In theory, at
> the controllers limit, they should take a lot of time to deal with the I/O
> traffic from the different clients but they are not supposed to die.

I was able to reach the limit of our system (380MB/s over 4Gb FC) with one single
machine. As a summary: Performance is best if you write large blocks (1MB)
sequentially. Anything else is bad.
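
To illustrate (my own sketch; the target file and count are assumptions): a
large sequential write with 1MB blocks that bypasses the page cache is a quick
way to see that per-path ceiling.

dd if=/dev/zero of=/apoio06-1/seqtest bs=1M count=8192 oflag=direct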

Regards,
Ulrich





Re: Kernel / iscsi problem under high load

2009-04-02 Thread Konrad Rzeszutek

On Thu, Apr 02, 2009 at 06:27:56PM +0100, Gonçalo Borges wrote:
> > > Apr  1 11:44:13 core26 kernel: 122 [RAIDarray.mpp]iscsi06:1:0:1
> > > Controller IO time expired. Delta 43701 secs
> > > Apr  1 11:44:13 core26 kernel: 497 [RAIDarray.mpp]iscsi06:1:0:1 Failed
> > > controller to 0. retry. vcmnd SN 458970 pdev H6:C0:T0:L1
> > > 0x00/0x00/0x00 0x0002 mpp_status:2
> >
> > What is the RAIDArray.mpp  program? Is that something the IBM docs
> > mentioned needs to be installed? Is that a version of Open-iSCSI
> > module .. or maybe the rdac handler??
> >
> 
> This is just the rdac handler! The RDAC handler is activated booting your

Sure.. but the normal rdac handler (that comes with the kernel) doesn't
spit those errors. It looks like a proprietary module.

If this is the proprietary module, what happens when you use the one that comes
with the RHEL5U2 kernel?

> system with the mpp module which may be configured in a /etc/grub.conf such
> as:
> 
> root (hd0,0)
> kernel /boot/xen.gz-2.6.18-92.1.22.el5 dom0_mem=1024M
> module /boot/vmlinuz-2.6.18-92.1.22.el5xen ro root=LABEL=/
> module /boot/mpp-2.6.18-92.1.22.el5xen.img
> 
> Cheers
> Goncalo
> 
> > 




Re: Kernel / iscsi problem under high load

2009-04-02 Thread Gonçalo Borges
>
> Where did you get this kernel? Is it from xen or from Red Hat? If it is
> from Red Hat? I have not seen some of the error messages in your log in
> the upstream or RHEL code.
>
>
This is a xen kernel but distributed in the Scientific Linux official
releases. Check, for example:


http://ftp.scientificlinux.org/linux/scientific/5rolling/x86_64/SL/kernel-xen-2.6.18-128.1.1.el5.x86_64.rpm

Cheers
Gonçalo




Re: Kernel / iscsi problem under high load

2009-04-02 Thread Gonçalo Borges
> > Apr  1 11:44:13 core26 kernel: 122 [RAIDarray.mpp]iscsi06:1:0:1
> > Controller IO time expired. Delta 43701 secs
> > Apr  1 11:44:13 core26 kernel: 497 [RAIDarray.mpp]iscsi06:1:0:1 Failed
> > controller to 0. retry. vcmnd SN 458970 pdev H6:C0:T0:L1
> > 0x00/0x00/0x00 0x0002 mpp_status:2
>
> What is the RAIDArray.mpp  program? Is that something the IBM docs
> mentioned needs to be installed? Is that a version of Open-iSCSI
> module .. or maybe the rdac handler??
>

This is just the rdac handler! The RDAC handler is activated by booting your
system with the mpp module, which may be configured in /etc/grub.conf such
as:

root (hd0,0)
kernel /boot/xen.gz-2.6.18-92.1.22.el5 dom0_mem=1024M
module /boot/vmlinuz-2.6.18-92.1.22.el5xen ro root=LABEL=/
module /boot/mpp-2.6.18-92.1.22.el5xen.img
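
A hedged way to double-check which multipath stack a given initrd will load
(the image path is the one from the grub.conf above; the mpp/rdac module names
are my assumption about what to look for):

zcat /boot/mpp-2.6.18-92.1.22.el5xen.img | cpio -t | grep -iE 'mpp|rdac'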

Cheers
Goncalo




Re: Kernel / iscsi problem under high load

2009-04-02 Thread Gonçalo Borges
Hi...

First of all, thanks for the reply.

After recovering my system, I tried to perform the tests you asked for.

> It might be good to know what scsiinfo (or similar) says about the size of the LUN
> at the start of your tests. Likewise, show what "fdisk -l" tells about the
> partitions, and finally what "df -k" tells about the capacity of the file
> system.



I have the following multipath devices:

[r...@core26 ~]# dmsetup ls
iscsi06-apoio1(253, 0)   -> dm-0
iscsi06-apoio1p1(253, 3)-> dm-3
iscsi06-apoio2p1(253, 2)-> dm-2 (the one which gave problems previously,
it was called dm-10),
iscsi06-apoio2(253, 1)-> dm-1

[r...@core26 ~]# multipath -ll
sda: checker msg is "rdac checker reports path is down"
iscsi06-apoio1 (3600a0b80003ad1e50f2e49ae6d3e) dm-0 IBM,VirtualDisk
[size=2.7T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=100][active]
 \_ 25:0:0:0 sdb 8:16  [active][ready]
iscsi06-apoio2 (3600a0b80003ad2130f8649ae6d5b) dm-1 IBM,VirtualDisk
[size=2.7T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=100][active]
 \_ 25:0:0:1 sdc 8:32  [active][ready]

So, we are interested in iscsi06-apoio2 (dm-2, sdc) and in iscsi06-apoio1
(dm-3, sdb)


[r...@core26 ~]# fdisk -l /dev/sdb1
Disk /dev/sdb1: 499.9 GB, 49983104 bytes
255 heads, 63 sectors/track, 60788 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdb1 doesn't contain a valid partition table


[r...@core26 ~]# fdisk -l /dev/sdc1
Disk /dev/sdc1: 499.9 GB, 49983104 bytes
255 heads, 63 sectors/track, 60788 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk /dev/sdc1 doesn't contain a valid partition table


[r...@core26 ~]# df -k
Filesystem   1K-blocks  Used Available Use% Mounted on
/dev/sda1 90491396   2008072  83812428   3% /
tmpfs   524288 0524288   0% /dev/shm
/dev/mapper/iscsi06-apoio1p1
 480618344202804 456001480   1% /apoio06-1
/dev/mapper/iscsi06-apoio2p1
 480618344202800 456001484   1% /apoio06-2

The sizes, although not exactly the same (but that doesn't happen also for
the system disk), are very close.



> Then one could compare those sizes to those reported by the kernel. Maybe
> the
> setup just wrong, and it takes a while until the end of the device is
> reached.
>


I do not think the difference I see in previous commands is big enough to
justify a wrong setup. But I'm just guessing and I'm not really an expert.


>
> Then I would start slowly, i.e. with one iozone running on one client.
>


I've already performed the same tests with 6 Raid 0 and 6 Raid 1 instead of
2 Raid 10 in similar DS 3300 systems without having this kind of errors. But
probably, I could be hitting some kind of limit..


>
> BTW, what do you want to measure: the kernel throughput, the network
> throughput,
> the iSCSI throughput, the controller throughput, or the disk throughput?
> You
> should have some concrete idea before starting the benchmark. Also with
> just 12
> disks I see little sense in having that many threads accessing the disk. To
> shorten a lengthy test, it may be advisable to reduce the system memory
> (iozone
> recommends to create a file size at least three times the amount of RAM,
> and even
> 8GB on a local disk takes hours to perform)


I want to measure the I/O performance for the RAID in sequential and random
writes/reads. What matters for the final user is that he is able to
write/read at XXX MB/s. I want to stress the system to know the limit of the
iSCSI controllers (this is why I'm starting so many threads). In theory, at
the controllers' limit, they should take a lot of time to deal with the I/O
traffic from the different clients, but they are not supposed to die.

Cheers
Goncalo




Re: Kernel / iscsi problem under high load

2009-04-02 Thread Konrad Rzeszutek

> Apr  1 11:44:13 core26 kernel: 122 [RAIDarray.mpp]iscsi06:1:0:1
> Controller IO time expired. Delta 43701 secs
> Apr  1 11:44:13 core26 kernel: 497 [RAIDarray.mpp]iscsi06:1:0:1 Failed
> controller to 0. retry. vcmnd SN 458970 pdev H6:C0:T0:L1
> 0x00/0x00/0x00 0x0002 mpp_status:2

What is the RAIDArray.mpp  program? Is that something the IBM docs
mentioned needs to be installed? Is that a version of Open-iSCSI
module .. or maybe the rdac handler??





Re: Kernel / iscsi problem under high load

2009-04-02 Thread Mike Christie

Gonçalo Borges wrote:
> Dear open-iscsi gurus...
> 
> I'm working with an IBM DS3300 storage system accessible via open-
> iscsi technology to many different clients. To give you the proper
> context before describing my problem, I'll introduce my setup:
> 
> 
> *** The target setup ***
> 
> - IBM DS3300 storage system as ISCSI target with 12 SATA disks.
> - 2 ISCSI controllers taking care of 2 network interfaces each (with
> Jumbo frames enabled).
> - Each controller owns a Raid 10 with one logical unit / partition
> each
> 
> 
> *** The initiators setup (with network interfaces with Jumbo frames
> enabled) ***
> 
> - OS:
> [r...@core26]# cat /etc/redhat-release
> Scientific Linux SL release 5.2 (Boron)
> 
> - Kernel:
> [r...@core26]# uname -a
> Linux core26.ncg.ingrid.pt 2.6.18-92.1.22.el5xen #1 SMP Tue Dec 16
> 07:06:23 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
> 

Where did you get this kernel? Is it from xen or from Red Hat? If it is
from Red Hat, I have not seen some of the error messages in your log in
the upstream or RHEL code.




Re: Kernel / iscsi problem under high load

2009-04-02 Thread Ulrich Windl

On 2 Apr 2009 at 6:34, Gonçalo Borges wrote:

[...]
> In conclusion, it seems that at a given time there are attempts to
> access beyond end of device. I don't know who is the guilty guy, if
> the kernel itself or the iscsi framework. Then the ISCSI controller
> start to fail also with IO expired messages.

It might be good to know what scsiinfo (or similar) says about the size of the LUN
at the start of your tests. Likewise, show what "fdisk -l" tells about the
partitions, and finally what "df -k" tells about the capacity of the file system.
Then one could compare those sizes to those reported by the kernel. Maybe the
setup is just wrong, and it takes a while until the end of the device is reached.

Then I would start slowly, i.e. with one iozone running on one client.

BTW, what do you want to measure: the kernel throughput, the network throughput,
the iSCSI throughput, the controller throughput, or the disk throughput? You
should have some concrete idea before starting the benchmark. Also, with just 12
disks I see little sense in having that many threads accessing the disk. To
shorten a lengthy test, it may be advisable to reduce the system memory (iozone
recommends to create a file size at least three times the amount of RAM, and even
8GB on a local disk takes hours to perform).
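
A single-client starting point along those lines might be (my own hedged
sketch; the iozone options and the file location are assumptions, with -s
chosen to exceed the initiator's RAM): one sequential write/read pass with 1MB
records.

iozone -i 0 -i 1 -r 1m -s 8g -f /apoio06-1/iozone.tmp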

Regards,
Ulrich


> 
> Do you have suggestion of what can I be doing wrong?


