Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-12 Thread Nicholas A. Bellinger
Greetings all,

On Tue, 2008-02-12 at 17:05 +0100, Bart Van Assche wrote:
> On Feb 6, 2008 1:11 AM, Nicholas A. Bellinger <[EMAIL PROTECTED]> wrote:
> > I have always observed the case with LIO SE/iSCSI target mode ...
> 
> Hello Nicholas,
> 
> Are you sure that the LIO-SE kernel module source code is ready for
> inclusion in the mainstream Linux kernel ? As you know I tried to test
> the LIO-SE iSCSI target. Already while configuring the target I
> encountered a kernel crash that froze the whole system. I can
> reproduce this kernel crash easily, and I reported it 11 days ago on
> the LIO-SE mailing list (February 4, 2008). One of the call stacks I
> posted shows a crash in mempool_alloc() called from jbd. In other
> words: the crash is most likely the result of memory corruption caused
> by LIO-SE.
> 

So I was able to FINALLY track this down to:

-# CONFIG_SLUB_DEBUG is not set
-# CONFIG_SLAB is not set
-CONFIG_SLUB=y
+CONFIG_SLAB=y

in both your and Chris Weiss's configs that was causing the
reproducible general protection faults.  I also disabled
CONFIG_RELOCATABLE and crash dump because I was debugging using kdb in
an x86_64 VM on 2.6.24 with your config.  I am pretty sure you can leave
this (crash dump) in your config for testing.

This can take a while to compile and takes up a lot of space, esp. with
all of the kernel debug options enabled, which on 2.6.24 really amounts
to a lot of CPU time when building.  Also, with your original config, I
was seeing some strange undefined module objects after the Stage 2 link
of iscsi_target_mod with modpost, along with the SLUB lockups (which are
not random, btw, and are tracked back to __kmalloc())..  Also, at module
load time with the original config, there were some warnings about
symbol objects (I believe they were SCSI related, the same ones as with
modpost).

In any event, the dozen 1000-loop discovery tests are now working fine
(as is IPoIB) with the above config change, and you should be ready to
go for your testing.

Tomo, Vlad, Andrew and Co:

Do you have any ideas why this would be the case with LIO-Target..?  Is
anyone else seeing something similar to this with their target mode code
(maybe it's all out-of-tree code..?) that is having an issue..?  I am
using Debian x86_64, and Bart and Chris are using Ubuntu x86_64, and we
both have this problem with CONFIG_SLUB on >= 2.6.22 kernel.org
kernels. 

Also, I will recompile some of my non-x86 machines with the above
enabled and see if I can reproduce it..  Here is Bart's config again:

http://groups.google.com/group/linux-iscsi-target-dev/browse_thread/thread/30835aede1028188


> Because I was curious to know why it took so long to fix such a severe
> crash, I started browsing through the LIO-SE source code. Analysis of
> the LIO-SE kernel module source code taught me that this crash is not
> a coincidence. Dynamic memory allocation (kmalloc()/kfree()) in the
> LIO-SE kernel module is complex and hard to verify.

What the LIO-SE Target module does is complex. :P  Sorry for taking so
long; I had to track this down CONFIG_ option by CONFIG_ option with
your config on an x86_64 VM. 

>  There are 412
> memory allocation/deallocation calls in the current version of the
> LIO-SE kernel module source code, which is a lot. Additionally,
> because of the complexity of the memory handling in LIO-SE, it is not
> possible to verify the correctness of the memory handling by analyzing
> a single function at a time. In my opinion this makes the LIO-SE
> source code hard to maintain.
> Furthermore, the LIO-SE kernel module source code does not follow
> conventions that have proven their value in the past like grouping all
> error handling at the end of a function. As could be expected, the
> consequence is that error handling is not correct in several
> functions, resulting in memory leaks in case of an error.

I would be more than happy to point out the release paths for the iSCSI
Target and LIO-SE to show that these are not actual memory leaks (as I
mentioned, this code has been stable for a number of years) for any
particular SE or iSCSI Target logic, if you are interested..

Also, if we are talking about a target mode storage engine that should
be going upstream, it needs an API to the current stable and future
storage systems, and of course the Mem->SG and SG->Mem mapping that
handles all possible cases of max_sectors and sector_size for past,
present, and future hardware.  I am really glad that you have been
taking a look at this, because some of the code (as you mention) can get
very complex in order to make this a reality, as it has been with
LIO-Target since v2.2.  

>  Some
> examples of functions in which error handling is clearly incorrect:
> * transport_allocate_passthrough().
> * iscsi_do_build_list().
> 

You did find the one in transport_allocate_passthrough() and the
strncpy() + strlen() issue in userspace.  Also, thanks for pointing me
to the missing sg_init_table() and sg_mark_end() usage for 2.6.24.  I
will post an update to my thread about how to do this for other
drivers..
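
(For anyone hitting the same thing: a minimal sketch of the sg_init_table() +
sg_mark_end() pattern that the 2.6.24 chained scatterlists expect.  The
fill_sgl() helper and its parameters are illustrative, not the actual LIO-SE
code.)

#include <linux/scatterlist.h>

/* Build an sg list from a driver-private page array.  The table must be
 * initialized before use; if fewer entries end up being used than were
 * allocated, the last used entry is terminated explicitly. */
static void fill_sgl(struct scatterlist *sgl, unsigned int alloc_nents,
                     struct page **pages, unsigned int used)
{
        unsigned int i;

        sg_init_table(sgl, alloc_nents);  /* zeroes table, marks last entry */

        for (i = 0; i < used; i++)
                sg_set_page(&sgl[i], pages[i], PAGE_SIZE, 0);

        if (used && used < alloc_nents)
                sg_mark_end(&sgl[used - 1]);  /* terminate the shorter list */
}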

I will have a look at your new changes and post them on 

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-12 Thread Bart Van Assche
On Feb 6, 2008 1:11 AM, Nicholas A. Bellinger <[EMAIL PROTECTED]> wrote:
> I have always observed the case with LIO SE/iSCSI target mode ...

Hello Nicholas,

Are you sure that the LIO-SE kernel module source code is ready for
inclusion in the mainstream Linux kernel ? As you know I tried to test
the LIO-SE iSCSI target. Already while configuring the target I
encountered a kernel crash that froze the whole system. I can
reproduce this kernel crash easily, and I reported it 11 days ago on
the LIO-SE mailing list (February 4, 2008). One of the call stacks I
posted shows a crash in mempool_alloc() called from jbd. In other
words: the crash is most likely the result of memory corruption caused
by LIO-SE.

Because I was curious to know why it took so long to fix such a severe
crash, I started browsing through the LIO-SE source code. Analysis of
the LIO-SE kernel module source code taught me that this crash is not
a coincidence. Dynamic memory allocation (kmalloc()/kfree()) in the
LIO-SE kernel module is complex and hard to verify. There are 412
memory allocation/deallocation calls in the current version of the
LIO-SE kernel module source code, which is a lot. Additionally,
because of the complexity of the memory handling in LIO-SE, it is not
possible to verify the correctness of the memory handling by analyzing
a single function at a time. In my opinion this makes the LIO-SE
source code hard to maintain.
Furthermore, the LIO-SE kernel module source code does not follow
conventions that have proven their value in the past like grouping all
error handling at the end of a function. As could be expected, the
consequence is that error handling is not correct in several
functions, resulting in memory leaks in case of an error. Some
examples of functions in which error handling is clearly incorrect:
* transport_allocate_passthrough().
* iscsi_do_build_list().

Bart Van Assche.


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-08 Thread Luben Tuikov
--- On Fri, 2/8/08, Nicholas A. Bellinger <[EMAIL PROTECTED]> wrote:
> > Is there an open iSCSI Target implementation which
> does NOT
> > issue commands to sub-target devices via the SCSI
> mid-layer, but
> > bypasses it completely?
> > 
> >Luben
> > 
> 
> Hi Luben,
> 
> I am guessing you mean further down the stack, which I
> don't know this to

Yes, that's what I meant.

> be the case.  Going further up the layers is the design of
> v2.9 LIO-SE.
> There is a diagram explaining the basic concepts from a
> 10,000 foot
> level.
> 
> http://linux-iscsi.org/builds/user/nab/storage-engine-concept.pdf

Thanks!

   Luben


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-08 Thread Nicholas A. Bellinger
On Thu, 2008-02-07 at 12:37 -0800, Luben Tuikov wrote:
> Is there an open iSCSI Target implementation which does NOT
> issue commands to sub-target devices via the SCSI mid-layer, but
> bypasses it completely?
> 
>Luben
> 

Hi Luben,

I am guessing you mean further down the stack, which I don't know to be
the case.  Going further up the layers is the design of v2.9 LIO-SE.
There is a diagram explaining the basic concepts from a 10,000 foot
level.

http://linux-iscsi.org/builds/user/nab/storage-engine-concept.pdf

Note that only the traditional iSCSI target is currently implemented in
the v2.9 LIO-SE codebase, out of the list of target mode fabrics on the
left side of the layout.  The API between the protocol headers that does
the encoding/decoding of target mode storage packets is probably the
least mature area of the LIO stack (because it has always been iSCSI
looking towards iSER :).  I don't know who has the more mature API
between the storage engine and the target storage protocol for doing
this, SCST or STGT; I am guessing SCST because of the difference in age
of the projects.  Could someone be so kind as to fill me in on this..?

Also note that the storage engine plugin for doing userspace
passthrough on the right is currently not implemented either.  Userspace
passthrough in this context is a target engine I/O path that enforces
max_sectors and sector_size limitations, and encodes/decodes target
storage protocol packets entirely out of view of userspace.  The
addressing will be completely different if we are pointing SE target
packets at non-SCSI target ports in userspace.
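
(As a rough illustration of the max_sectors / sector_size enforcement described
above, a minimal sketch follows; the se_obj_limits structure and the
se_task_count() helper are hypothetical names, not the LIO-SE API.)

#include <linux/kernel.h>
#include <linux/types.h>

struct se_obj_limits {
        u32 max_sectors;        /* largest single I/O the object accepts */
        u32 sector_size;        /* logical block size in bytes */
};

/* How many backend tasks a transfer of 'length' bytes must be split into
 * so that no single task exceeds the object's max_sectors. */
static u32 se_task_count(const struct se_obj_limits *lim, u32 length)
{
        u32 sectors = DIV_ROUND_UP(length, lim->sector_size);

        return DIV_ROUND_UP(sectors, lim->max_sectors);
}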

--nab


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-07 Thread Luben Tuikov
Is there an open iSCSI Target implementation which does NOT
issue commands to sub-target devices via the SCSI mid-layer, but
bypasses it completely?

   Luben


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-07 Thread Nicholas A. Bellinger
On Thu, 2008-02-07 at 14:13 +0100, Bart Van Assche wrote: 
> Since the focus of this thread shifted somewhat in the last few
> messages, I'll try to summarize what has been discussed so far:
> - There was a number of participants who joined this discussion
> spontaneously. This suggests that there is considerable interest in
> networked storage and iSCSI.
> - It has been motivated why iSCSI makes sense as a storage protocol
> (compared to ATA over Ethernet and Fibre Channel over Ethernet).
> - The direct I/O performance results for block transfer sizes below 64
> KB are a meaningful benchmark for storage target implementations.
> - It has been discussed whether an iSCSI target should be implemented
> in user space or in kernel space. It is clear now that an
> implementation in the kernel can be made faster than a user space
> implementation 
> (http://kerneltrap.org/mailarchive/linux-kernel/2008/2/4/714804).
> Regarding existing implementations, measurements have a.o. shown that
> SCST is faster than STGT (30% with the following setup: iSCSI via
> IPoIB and direct I/O block transfers with a size of 512 bytes).
> - It has been discussed which iSCSI target implementation should be in
> the mainstream Linux kernel. There is no agreement on this subject
> yet. The short-term options are as follows:
> 1) Do not integrate any new iSCSI target implementation in the
> mainstream Linux kernel.
> 2) Add one of the existing in-kernel iSCSI target implementations to
> the kernel, e.g. SCST or PyX/LIO.
> 3) Create a new in-kernel iSCSI target implementation that combines
> the advantages of the existing iSCSI kernel target implementations
> (iETD, STGT, SCST and PyX/LIO).
> 
> As an iSCSI user, I prefer option (3). The big question is whether the
> various storage target authors agree with this ?
> 

I think the other data point here would be that the final target design
needs to be as generic as possible.  Generic in the sense that the
engine eventually needs to be able to accept NBD and other
Ethernet-based target mode storage configurations to an abstracted
device object (struct scsi_device, struct block_device, or struct file)
just as it would for an IP Storage based request.
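
(As a rough sketch of what such an abstracted device object could look like;
the se_backend names below are illustrative, not LIO-SE's or SCST's actual
interfaces.)

#include <linux/fs.h>
#include <linux/blkdev.h>
#include <linux/scatterlist.h>
#include <linux/types.h>
#include <scsi/scsi_device.h>

struct se_backend;

struct se_backend_ops {
        int (*submit_rw)(struct se_backend *be, int write,
                         struct scatterlist *sgl, u32 nents, sector_t lba);
        u32 (*get_max_sectors)(struct se_backend *be);
};

/* One descriptor per exported object, whatever actually backs it. */
struct se_backend {
        const struct se_backend_ops *ops;
        union {
                struct scsi_device   *sdev;     /* SCSI passthrough */
                struct block_device  *bdev;     /* IBLOCK-style */
                struct file          *filp;     /* FILEIO-style */
        } obj;
};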

We know that NBD and *oE will have their own naming and discovery, and
the first set of IO tasks to be completed would be those using
(iscsi_cmd_t->cmd_flags & ICF_SCSI_DATA_SG_IO_CDB) in
iscsi_target_transport.c in the current code.  These are the single
READ_* and WRITE_* codepaths that perform DMA memory pre-processing in
v2.9 LIO-SE. 

Also, by being able to tell the engine to accelerate to DMA ring
operation (say, to an underlying struct scsi_device or struct
block_device) instead of fileio, in some cases you will see better
performance when using hardware (i.e., not an underlying kernel thread
queueing IO into the block layer).  But I have found FILEIO with
sendpage on MD to be faster in single-threaded tests than struct
block_device.  I am currently using IBLOCK on LVM for core LIO operation
(which actually sits on software MD raid6).  I do this because using
submit_bio() with se_mem_t mapped arrays of struct scatterlist ->
struct bio_vec can handle power failures properly, and not send back
StatSN Acks to an Initiator who thinks that everything has already made
it to disk.  This is the case with doing IO to a struct file in the
kernel today without a kernel-level O_DIRECT.
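
(For illustration, a minimal sketch of that scatterlist -> struct bio_vec
mapping against the 2.6.24-era block API; error handling and the se_mem_t
bookkeeping are omitted, and sketch_submit_sg() is a hypothetical helper, not
the IBLOCK code.)

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/scatterlist.h>

static int sketch_submit_sg(struct block_device *bdev, int rw,
                            struct scatterlist *sgl, int nents,
                            sector_t sector, bio_end_io_t *end_io, void *priv)
{
        struct scatterlist *sg;
        struct bio *bio;
        int i;

        bio = bio_alloc(GFP_KERNEL, nents);
        if (!bio)
                return -ENOMEM;

        bio->bi_bdev = bdev;
        bio->bi_sector = sector;
        bio->bi_end_io = end_io;   /* completion -> StatSN ack happens here */
        bio->bi_private = priv;

        for_each_sg(sgl, sg, nents, i) {
                /* each sg entry becomes one bio_vec */
                if (bio_add_page(bio, sg_page(sg), sg->length,
                                 sg->offset) != sg->length)
                        break;  /* real code would chain a second bio here */
        }

        submit_bio(rw, bio);
        return 0;
}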

Also, for proper kernel-level target mode support, using struct file
with O_DIRECT for storage blocks and emulating control path CDBs is one
of the work items.  This can be made generic, or obtained from the
underlying storage object (anything that can be exported from an LIO
Subsystem TPI) for real hardware (struct scsi_device in just about all
the cases these days).  Last time I looked, this was due to
fs/direct-io.c:dio_refill_pages() using get_user_pages()...

For the really transport-specific CDB and control code, which in a good
number of cases we are eventually going to be expected to emulate in
software, I really like how STGT breaks this up into per-device-type
code segments: spc.c, sbc.c, mmc.c, ssc.c, smc.c, etc.  Having all of
these split out properly is one strong point of STGT IMHO, and really
makes learning things much easier.  Also valuable is being able to queue
these IOs into userspace and receive an asynchronous response back up
the storage stack.  I think there is some pretty interesting potential
here for passing storage protocol packets into userspace apps while
leaving the protocol state machines and recovery paths in the kernel
with a generic target engine.

Also, I know that the SCST folks have put a lot of time into getting
the very SCSI-hardware-specific target mode control modes to work.  I
personally own a bunch of these adapters, and would really like to see
better support for target mode on non-iSCSI type adapters with a single
target mode storage engine that abstracts storage subsystems and wire
protocol fabrics.

--nab


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-07 Thread Bart Van Assche
Since the focus of this thread shifted somewhat in the last few
messages, I'll try to summarize what has been discussed so far:
- There were a number of participants who joined this discussion
spontaneously. This suggests that there is considerable interest in
networked storage and iSCSI.
- It has been motivated why iSCSI makes sense as a storage protocol
(compared to ATA over Ethernet and Fibre Channel over Ethernet).
- The direct I/O performance results for block transfer sizes below 64
KB are a meaningful benchmark for storage target implementations.
- It has been discussed whether an iSCSI target should be implemented
in user space or in kernel space. It is clear now that an
implementation in the kernel can be made faster than a user space
implementation (http://kerneltrap.org/mailarchive/linux-kernel/2008/2/4/714804).
Regarding existing implementations, measurements have, among other things, shown that
SCST is faster than STGT (30% with the following setup: iSCSI via
IPoIB and direct I/O block transfers with a size of 512 bytes).
- It has been discussed which iSCSI target implementation should be in
the mainstream Linux kernel. There is no agreement on this subject
yet. The short-term options are as follows:
1) Do not integrate any new iSCSI target implementation in the
mainstream Linux kernel.
2) Add one of the existing in-kernel iSCSI target implementations to
the kernel, e.g. SCST or PyX/LIO.
3) Create a new in-kernel iSCSI target implementation that combines
the advantages of the existing iSCSI kernel target implementations
(iETD, STGT, SCST and PyX/LIO).

As an iSCSI user, I prefer option (3). The big question is whether the
various storage target authors agree with this ?

Bart Van Assche.


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Wed, 2008-02-06 at 10:29 +0900, FUJITA Tomonori wrote:
> On Tue, 05 Feb 2008 18:09:15 +0100
> Matteo Tescione <[EMAIL PROTECTED]> wrote:
> 
> > On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> > 
> > > On Tue, 05 Feb 2008 08:14:01 +0100
> > > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > > 
> > >> James Bottomley schrieb:
> > >> 
> > >>> These are both features being independently worked on, are they not?
> > >>> Even if they weren't, the combination of the size of SCST in kernel plus
> > >>> the problem of having to find a migration path for the current STGT
> > >>> users still looks to me to involve the greater amount of work.
> > >> 
> > >> I don't want to be mean, but does anyone actually use STGT in
> > >> production? Seriously?
> > >> 
> > >> In the latest development version of STGT, it's only possible to stop
> > >> the tgtd target daemon using KILL / 9 signal - which also means all
> > >> iSCSI initiator connections are corrupted when tgtd target daemon is
> > >> started again (kernel upgrade, target daemon upgrade, server reboot 
> > >> etc.).
> > > 
> > > I don't know what "iSCSI initiator connections are corrupted"
> > > mean. But if you reboot a server, how can an iSCSI target
> > > implementation keep iSCSI tcp connections?
> > > 
> > > 
> > >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> > >> server. Not only that - your data is probably corrupted, or at least the
> > >> filesystem deserves checking...
> > 

The TCP connection will drop, remember that the TCP connection state for
one side has completely vanished.  Depending on iSCSI/iSER
ErrorRecoveryLevel that is set, this will mean:

1) Session Recovery, ERL=0 - Restarting the entire nexus and all
connections across all of the possible subnets or comm-links.  All
outstanding un-StatSN acknowledged commands will be returned back to the
SCSI subsystem with RETRY status.  Once a single connection has been
reestablished to start the nexus, the CDBs will be resent.

2) Connection Recovery, ERL=2 - CDBs from the failed connection(s) will
be retried (nothing changes in the PDU) to fill the iSCSI CmdSN ordering
gap, or be explicitly retried with TMR TASK_REASSIGN for ones already
acknowledged by the ExpCmdSN that are returned to the initiator in
response packets or by way of unsolicited NopINs.
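
(A minimal sketch of the recovery decision described in 1) and 2) above; the
enum and function names are illustrative, not the actual LIO iSCSI code.)

enum iscsi_erl { ERL0 = 0, ERL2 = 2 };

enum recovery_action {
        RESTART_NEXUS_RETRY_CDBS,       /* ERL=0: session recovery */
        REASSIGN_TASKS_SAME_NEXUS,      /* ERL=2: connection recovery */
};

static enum recovery_action on_connection_failure(enum iscsi_erl erl)
{
        switch (erl) {
        case ERL2:
                /* Keep the nexus: retry CmdSN-gap PDUs unchanged, use TMR
                 * TASK_REASSIGN for commands already acked by ExpCmdSN. */
                return REASSIGN_TASKS_SAME_NEXUS;
        case ERL0:
        default:
                /* Tear down every connection of the nexus; unacknowledged
                 * commands go back to the SCSI layer with RETRY status and
                 * are resent once one connection restarts the nexus. */
                return RESTART_NEXUS_RETRY_CDBS;
        }
}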

> > Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> > rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> > manages stop/crash, by sending unit attention to clients on reconnect.
> > Drbd+heartbeat correctly manages those things too.
> > Still from an end-user POV, i was able to reboot/survive a crash only with
> > SCST, IETD still has reconnect problems and STGT are even worst.
> 
> Please tell us on stgt-devel mailing list if you see problems. We will
> try to fix them.
> 

FYI, the LIO code also supports rmmod'ing iscsi_target_mod while at
full 10 Gb/sec speed.  I think it should be a requirement to be able to
control per-initiator, per-portal-group, per-LUN, per-device, and
per-HBA objects in the design without restarting anything else.

--nab

> Thanks,


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >> 
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> 
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >> 
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> > 
> > 
> >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> >> server. Not only that - your data is probably corrupted, or at least the
> >> filesystem deserves checking...
> 
> Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> manages stop/crash, by sending unit attention to clients on reconnect.
> Drbd+heartbeat correctly manages those things too.
> Still from an end-user POV, i was able to reboot/survive a crash only with
> SCST, IETD still has reconnect problems and STGT are even worst.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Matteo Tescione
On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:

> On Tue, 05 Feb 2008 08:14:01 +0100
> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> 
>> James Bottomley schrieb:
>> 
>>> These are both features being independently worked on, are they not?
>>> Even if they weren't, the combination of the size of SCST in kernel plus
>>> the problem of having to find a migration path for the current STGT
>>> users still looks to me to involve the greater amount of work.
>> 
>> I don't want to be mean, but does anyone actually use STGT in
>> production? Seriously?
>> 
>> In the latest development version of STGT, it's only possible to stop
>> the tgtd target daemon using KILL / 9 signal - which also means all
>> iSCSI initiator connections are corrupted when tgtd target daemon is
>> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> 
> I don't know what "iSCSI initiator connections are corrupted"
> mean. But if you reboot a server, how can an iSCSI target
> implementation keep iSCSI tcp connections?
> 
> 
>> Imagine you have to reboot all your NFS clients when you reboot your NFS
>> server. Not only that - your data is probably corrupted, or at least the
>> filesystem deserves checking...

Don't know if it matters, but in my setup (iSCSI on top of
DRBD+Heartbeat), rebooting the primary server doesn't affect my iSCSI
traffic; SCST correctly manages the stop/crash by sending a unit
attention to clients on reconnect.  DRBD+Heartbeat correctly manages
those things too.  Still, from an end-user POV, I was able to
reboot/survive a crash only with SCST; IETD still has reconnect problems
and STGT is even worse.

Regards,
--matteo


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 17:07:07 +0100
Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori schrieb:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and it's configuration in a script (I 
> replaced some parameters with "..." to make it shorter and more readable):

Thanks for the details. So the way to stop the daemon is not related
to your problem.

It's easily fixable. Can you start a new thread about this on the
stgt-devel mailing list? When we agree on the interface to start the
daemon, I'll implement it.


> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...

(snip)

> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables 
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.

I don't know how many people use stgt in a production environment but
I'm not sure that this problem prevents many people from using it in a
production environment.

You want to reboot a server running target devices while initiators
connect to it. Rebooting the target server behind the initiators
seldom works. System administrators in my workplace reboot storage
devices once a year and tell us to shut down the initiator machines
that use them before that.


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Ming Zhang
On Tue, 2008-02-05 at 17:07 +0100, Tomasz Chmielewski wrote:
> FUJITA Tomonori schrieb:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and it's configuration in a script (I 
> replaced some parameters with "..." to make it shorter and more readable):
> 
> 
> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> 
> 
> However, this won't work - tgtd goes immediately in the background as it 
> is still starting, and the first tgtadm commands will fail:

This should be an easy fix: start tgtd, get the port setup ready in the
forked process, then signal the parent that it is ready to quit.  Or set
the port up in the parent, then fork and hand it over to the daemon.
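
(A minimal C sketch of that "signal the parent when ready" daemonization, so a
script's "tgtd && tgtadm ..." sequence only proceeds once the management port
is listening; this is not tgtd's actual code.)

#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int pipefd[2];
        char ok;

        if (pipe(pipefd) < 0)
                return 1;

        switch (fork()) {
        case -1:
                return 1;
        case 0:                         /* child: the real daemon */
                close(pipefd[0]);
                /* ... bind the management socket and iSCSI portal here ... */
                write(pipefd[1], "1", 1);  /* tell the parent we are ready */
                close(pipefd[1]);
                /* ... enter the event loop ... */
                pause();
                return 0;
        default:                        /* parent: block until child is ready */
                close(pipefd[1]);
                if (read(pipefd[0], &ok, 1) != 1) {
                        fprintf(stderr, "daemon failed to start\n");
                        return 1;
                }
                return 0;               /* now tgtadm can connect safely */
        }
}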


> 
> # bash -x tgtd-start
> + tgtd
> + tgtadm --op new --mode target ...
> tgtadm: can't connect to the tgt daemon, Connection refused
> tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
> not connected
> + tgtadm --lld iscsi --op new --mode account ...
> tgtadm: can't connect to the tgt daemon, Connection refused
> tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
> not connected
> + tgtadm --lld iscsi --op bind --mode account --tid 1 ...
> tgtadm: can't find the target
> + tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
> tgtadm: can't find the target
> + tgtadm --op bind --mode target --tid 1 -I ALL
> tgtadm: can't find the target
> + tgtadm --op new --mode target --tid 2 ...
> + tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
> + tgtadm --op bind --mode target --tid 2 -I ALL
> 
> 
> OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
> second right after tgtd?
> 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> 
> 
> No, it is not a good idea - if tgtd listens on port 3260 *and* is 
> unconfigured yet,  any reconnecting initiator will fail, like below:

This is another easy fix: tgtd could start in an unconfigured state,
and then tgtadm can configure it and flip it to a ready state.
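
(And a minimal sketch of that "unconfigured until told otherwise" idea:
initiator logins on port 3260 are refused until the management path flips the
state.  The names here are illustrative only.)

#include <stdbool.h>

enum target_state { TGT_UNCONFIGURED, TGT_READY };

static enum target_state state = TGT_UNCONFIGURED;

/* Called from the tgtadm/management path once targets and LUNs exist. */
static void mgmt_set_ready(void)
{
        state = TGT_READY;
}

/* Called for every new initiator connection on the iSCSI portal. */
static bool accept_initiator(void)
{
        /* Refusing the login lets the initiator back off and retry,
         * instead of seeing an empty, half-configured target. */
        return state == TGT_READY;
}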


Those are really minor usability issues (I know it is painful for
users, I agree).


The major problem here is to discuss, architecture-wise, which one is
better... the Linux kernel should have one implementation that is good
from the foundation up...





> 
> end_request: I/O error, dev sdb, sector 7045192
> Buffer I/O error on device sdb, logical block 880649
> lost page write due to I/O error on sdb
> Aborting journal on device sdb.
> ext3_abort called.
> EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
> end_request: I/O error, dev sdb, sector 7045880
> Buffer I/O error on device sdb, logical block 880735
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 6728
> Buffer I/O error on device sdb, logical block 841
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 7045192
> Buffer I/O error on device sdb, logical block 880649
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 7045880
> Buffer I/O error on device sdb, logical block 880735
> lost page write due to I/O error on sdb
> __journal_remove_journal_head: freeing b_frozen_data
> __journal_remove_journal_head: freeing b_frozen_data
> 
> 
> Ouch.
> 
> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables 
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.
> 
> 
> 
> -- 
> Tomasz Chmielewski
> http://wpkg.org
> 
> 
> 

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Tomasz Chmielewski

FUJITA Tomonori schrieb:
> On Tue, 05 Feb 2008 08:14:01 +0100
> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> 
>> James Bottomley schrieb:
>>
>>> These are both features being independently worked on, are they not?
>>> Even if they weren't, the combination of the size of SCST in kernel plus
>>> the problem of having to find a migration path for the current STGT
>>> users still looks to me to involve the greater amount of work.
>>
>> I don't want to be mean, but does anyone actually use STGT in
>> production? Seriously?
>>
>> In the latest development version of STGT, it's only possible to stop
>> the tgtd target daemon using KILL / 9 signal - which also means all
>> iSCSI initiator connections are corrupted when tgtd target daemon is
>> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> 
> I don't know what "iSCSI initiator connections are corrupted"
> mean. But if you reboot a server, how can an iSCSI target
> implementation keep iSCSI tcp connections?


The problem with tgtd is that you can't start it (configured) in an
"atomic" way.
Usually, one will start tgtd and its configuration in a script (I 
replaced some parameters with "..." to make it shorter and more readable):



tgtd
tgtadm --op new ...
tgtadm --lld iscsi --op new ...


However, this won't work - tgtd goes immediately in the background as it 
is still starting, and the first tgtadm commands will fail:


# bash -x tgtd-start
+ tgtd
+ tgtadm --op new --mode target ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
not connected

+ tgtadm --lld iscsi --op new --mode account ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
not connected

+ tgtadm --lld iscsi --op bind --mode account --tid 1 ...
tgtadm: can't find the target
+ tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
tgtadm: can't find the target
+ tgtadm --op bind --mode target --tid 1 -I ALL
tgtadm: can't find the target
+ tgtadm --op new --mode target --tid 2 ...
+ tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
+ tgtadm --op bind --mode target --tid 2 -I ALL


OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
second right after tgtd?


tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...


No, it is not a good idea - if tgtd listens on port 3260 *and* is 
unconfigured yet,  any reconnecting initiator will fail, like below:


end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
Aborting journal on device sdb.
ext3_abort called.
EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 6728
Buffer I/O error on device sdb, logical block 841
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data


Ouch.

So the only way to start/restart tgtd reliably is to do hacks which are 
needed with yet another iSCSI kernel implementation (IET): use iptables.


iptables 
tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...
iptables 


A bit ugly, isn't it?
Having to tinker with a firewall in order to start a daemon is by no 
means a sign of a well-tested and mature project.


That's why I asked how many people use stgt in a production environment 
- James was worried about a potential migration path for current users.




--
Tomasz Chmielewski
http://wpkg.org


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Mon, 4 Feb 2008 20:07:01 -0600
"Chris Weiss" <[EMAIL PROTECTED]> wrote:

> On Feb 4, 2008 11:30 AM, Douglas Gilbert <[EMAIL PROTECTED]> wrote:
> > Alan Cox wrote:
> > >> better. So for example, I personally suspect that ATA-over-ethernet is 
> > >> way
> > >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> > >> low-level, and against those crazy SCSI people to begin with.
> > >
> > > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > > would probably trash iSCSI for latency if nothing else.
> >
> > And a variant that doesn't do ATA or IP:
> > http://www.fcoe.com/
> >
> 
> however, and interestingly enough, the open-fcoe software target
> depends on scst (for now anyway)

STGT also supports a software FCoE target driver, though it's still
experimental.

http://www.mail-archive.com/[EMAIL PROTECTED]/msg12705.html

It works in user space like STGT's iSCSI (and iSER) target driver
(i.e. no kernel/user space interaction).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 05:43:10 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> Hi all,
> And sorry for intrusion, i am not a developer but i work everyday with iscsi
> and i found it fantastic.
> Altough Aoe, Fcoe and so on could be better, we have to look in real world
> implementations what is needed *now*, and if we look at vmware world,
> virtual iron, microsoft clustering etc, the answer is iSCSI.
> And now, SCST is the best open-source iSCSI target. So, from an end-user
> point of view, what are the really problems to not integrate scst in the
> mainstream kernel?

Currently, the best open-source iSCSI target implementation in Linux is
Nicholas's LIO, I guess.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 08:14:01 +0100
Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

> James Bottomley schrieb:
> 
> > These are both features being independently worked on, are they not?
> > Even if they weren't, the combination of the size of SCST in kernel plus
> > the problem of having to find a migration path for the current STGT
> > users still looks to me to involve the greater amount of work.
> 
> I don't want to be mean, but does anyone actually use STGT in
> production? Seriously?
> 
> In the latest development version of STGT, it's only possible to stop
> the tgtd target daemon using KILL / 9 signal - which also means all
> iSCSI initiator connections are corrupted when tgtd target daemon is
> started again (kernel upgrade, target daemon upgrade, server reboot etc.).

I don't know what "iSCSI initiator connections are corrupted"
means. But if you reboot a server, how can an iSCSI target
implementation keep iSCSI TCP connections alive?


> Imagine you have to reboot all your NFS clients when you reboot your NFS
> server. Not only that - your data is probably corrupted, or at least the
> filesystem deserves checking...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Ming Zhang
On Tue, 2008-02-05 at 17:07 +0100, Tomasz Chmielewski wrote:
 FUJITA Tomonori schrieb:
  On Tue, 05 Feb 2008 08:14:01 +0100
  Tomasz Chmielewski [EMAIL PROTECTED] wrote:
  
  James Bottomley schrieb:
 
  These are both features being independently worked on, are they not?
  Even if they weren't, the combination of the size of SCST in kernel plus
  the problem of having to find a migration path for the current STGT
  users still looks to me to involve the greater amount of work.
  I don't want to be mean, but does anyone actually use STGT in
  production? Seriously?
 
  In the latest development version of STGT, it's only possible to stop
  the tgtd target daemon using KILL / 9 signal - which also means all
  iSCSI initiator connections are corrupted when tgtd target daemon is
  started again (kernel upgrade, target daemon upgrade, server reboot etc.).
  
  I don't know what iSCSI initiator connections are corrupted
  mean. But if you reboot a server, how can an iSCSI target
  implementation keep iSCSI tcp connections?
 
 The problem with tgtd is that you can't start it (configured) in an
 atomic way.
 Usually, one will start tgtd and it's configuration in a script (I 
 replaced some parameters with ... to make it shorter and more readable):
 
 
 tgtd
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...
 
 
 However, this won't work - tgtd goes immediately in the background as it 
 is still starting, and the first tgtadm commands will fail:

This should be an easy fix: start tgtd, get the port set up and ready in the
forked child, then have the child signal its parent that it is ready so the
parent can quit. Or set the port up in the parent, then fork and hand it
over to the daemon.
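
A minimal sketch of that idea (illustrative only, not actual tgtd code): the
child finishes its socket setup before the parent exits, so a tgtadm command
issued right after the parent returns always finds a daemon that is ready.


/* daemonize-after-ready sketch: parent exits only when the child says so */
#include <stdio.h>
#include <unistd.h>

static int setup_sockets(void)
{
        /* placeholder: create the management socket, bind port 3260, ... */
        return 0;
}

int main(void)
{
        int pfd[2];
        char ok;

        if (pipe(pfd) < 0) {
                perror("pipe");
                return 1;
        }

        switch (fork()) {
        case -1:
                perror("fork");
                return 1;
        case 0:                         /* child: becomes the daemon */
                close(pfd[0]);
                if (setup_sockets() < 0)
                        _exit(1);       /* parent sees EOF and reports failure */
                if (write(pfd[1], "1", 1) != 1)
                        _exit(1);
                close(pfd[1]);
                /* ... enter the normal event loop here ... */
                pause();
                _exit(0);
        default:                        /* parent: the start script waits on us */
                close(pfd[1]);
                if (read(pfd[0], &ok, 1) != 1)
                        return 1;       /* child died before becoming ready */
                return 0;               /* safe to run tgtadm now */
        }
}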


 
 # bash -x tgtd-start
 + tgtd
 + tgtadm --op new --mode target ...
 tgtadm: can't connect to the tgt daemon, Connection refused
 tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
 not connected
 + tgtadm --lld iscsi --op new --mode account ...
 tgtadm: can't connect to the tgt daemon, Connection refused
 tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
 not connected
 + tgtadm --lld iscsi --op bind --mode account --tid 1 ...
 tgtadm: can't find the target
 + tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
 tgtadm: can't find the target
 + tgtadm --op bind --mode target --tid 1 -I ALL
 tgtadm: can't find the target
 + tgtadm --op new --mode target --tid 2 ...
 + tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
 + tgtadm --op bind --mode target --tid 2 -I ALL
 
 
 OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
 second right after tgtd?
 
 tgtd
 sleep 1
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...
 
 
 No, it is not a good idea - if tgtd listens on port 3260 *and* is 
 unconfigured yet,  any reconnecting initiator will fail, like below:

This is another easy fix: tgtd could start in an unconfigured state, and a
later tgtadm command could then configure it and switch it to a ready state.


Those are really minor usability issues (I know it is painful for users, I
agree).


The major problem here is to discuss, architecture-wise, which one is
better... the Linux kernel should have one implementation that is good from
the foundation up...





 
 end_request: I/O error, dev sdb, sector 7045192
 Buffer I/O error on device sdb, logical block 880649
 lost page write due to I/O error on sdb
 Aborting journal on device sdb.
 ext3_abort called.
 EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
 Remounting filesystem read-only
 end_request: I/O error, dev sdb, sector 7045880
 Buffer I/O error on device sdb, logical block 880735
 lost page write due to I/O error on sdb
 end_request: I/O error, dev sdb, sector 6728
 Buffer I/O error on device sdb, logical block 841
 lost page write due to I/O error on sdb
 end_request: I/O error, dev sdb, sector 7045192
 Buffer I/O error on device sdb, logical block 880649
 lost page write due to I/O error on sdb
 end_request: I/O error, dev sdb, sector 7045880
 Buffer I/O error on device sdb, logical block 880735
 lost page write due to I/O error on sdb
 __journal_remove_journal_head: freeing b_frozen_data
 __journal_remove_journal_head: freeing b_frozen_data
 
 
 Ouch.
 
 So the only way to start/restart tgtd reliably is to do hacks which are 
 needed with yet another iSCSI kernel implementation (IET): use iptables.
 
 iptables block iSCSI traffic
 tgtd
 sleep 1
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...
 iptables unblock iSCSI traffic
 
 
 A bit ugly, isn't it?
 Having to tinker with a firewall in order to start a daemon is by no 
 means a sign of a well-tested and mature project.
 
 That's why I asked how many people use stgt in a production environment 
 - James was worried about a potential migration path for current users.
 
 
 
 -- 
 Tomasz Chmielewski
 http://wpkg.org
 
 
 -
 This SF.net email is sponsored by: Microsoft
 Defy all challenges. Microsoft(R) Visual Studio 2008.
 

Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 17:07:07 +0100
Tomasz Chmielewski [EMAIL PROTECTED] wrote:

 FUJITA Tomonori schrieb:
  On Tue, 05 Feb 2008 08:14:01 +0100
  Tomasz Chmielewski [EMAIL PROTECTED] wrote:
  
  James Bottomley schrieb:
 
  These are both features being independently worked on, are they not?
  Even if they weren't, the combination of the size of SCST in kernel plus
  the problem of having to find a migration path for the current STGT
  users still looks to me to involve the greater amount of work.
  I don't want to be mean, but does anyone actually use STGT in
  production? Seriously?
 
  In the latest development version of STGT, it's only possible to stop
  the tgtd target daemon using KILL / 9 signal - which also means all
  iSCSI initiator connections are corrupted when tgtd target daemon is
  started again (kernel upgrade, target daemon upgrade, server reboot etc.).
  
  I don't know what iSCSI initiator connections are corrupted
  mean. But if you reboot a server, how can an iSCSI target
  implementation keep iSCSI tcp connections?
 
 The problem with tgtd is that you can't start it (configured) in an
 atomic way.
 Usually, one will start tgtd and it's configuration in a script (I 
 replaced some parameters with ... to make it shorter and more readable):

Thanks for the details. So the way the daemon is stopped is not related
to your problem.

It's easily fixable. Can you start a new thread about this on
stgt-devel mailing list? When we agree on the interface to start the
daemon, I'll implement it.


 tgtd
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...

(snip)

 So the only way to start/restart tgtd reliably is to do hacks which are 
 needed with yet another iSCSI kernel implementation (IET): use iptables.
 
 iptables block iSCSI traffic
 tgtd
 sleep 1
 tgtadm --op new ...
 tgtadm --lld iscsi --op new ...
 iptables unblock iSCSI traffic
 
 
 A bit ugly, isn't it?
 Having to tinker with a firewall in order to start a daemon is by no 
 means a sign of a well-tested and mature project.
 
 That's why I asked how many people use stgt in a production environment 
 - James was worried about a potential migration path for current users.

I don't know how many people use stgt in a production environment but
I'm not sure that this problem prevents many people from using it in a
production environment.

You want to reboot a server running target devices while initiators are
connected to it. Rebooting the target server behind the initiators
seldom works. System administrators in my workplace reboot storage
devices once a year and tell us to shut down the initiator machines
that use them before that.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Matteo Tescione
On 5-02-2008 14:38, FUJITA Tomonori [EMAIL PROTECTED] wrote:

 On Tue, 05 Feb 2008 08:14:01 +0100
 Tomasz Chmielewski [EMAIL PROTECTED] wrote:
 
 James Bottomley schrieb:
 
 These are both features being independently worked on, are they not?
 Even if they weren't, the combination of the size of SCST in kernel plus
 the problem of having to find a migration path for the current STGT
 users still looks to me to involve the greater amount of work.
 
 I don't want to be mean, but does anyone actually use STGT in
 production? Seriously?
 
 In the latest development version of STGT, it's only possible to stop
 the tgtd target daemon using KILL / 9 signal - which also means all
 iSCSI initiator connections are corrupted when tgtd target daemon is
 started again (kernel upgrade, target daemon upgrade, server reboot etc.).
 
 I don't know what iSCSI initiator connections are corrupted
 mean. But if you reboot a server, how can an iSCSI target
 implementation keep iSCSI tcp connections?
 
 
 Imagine you have to reboot all your NFS clients when you reboot your NFS
 server. Not only that - your data is probably corrupted, or at least the
 filesystem deserves checking...

Don't know if it matters, but in my setup (iSCSI on top of DRBD+Heartbeat),
rebooting the primary server doesn't affect my iSCSI traffic; SCST correctly
manages a stop/crash by sending a unit attention to clients on reconnect.
DRBD+Heartbeat correctly manages those things too.
Still, from an end-user POV, I was able to reboot/survive a crash only with
SCST; IETD still has reconnect problems and STGT is even worse.

Regards,
--matteo


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione [EMAIL PROTECTED] wrote:

 On 5-02-2008 14:38, FUJITA Tomonori [EMAIL PROTECTED] wrote:
 
  On Tue, 05 Feb 2008 08:14:01 +0100
  Tomasz Chmielewski [EMAIL PROTECTED] wrote:
  
  James Bottomley schrieb:
  
  These are both features being independently worked on, are they not?
  Even if they weren't, the combination of the size of SCST in kernel plus
  the problem of having to find a migration path for the current STGT
  users still looks to me to involve the greater amount of work.
  
  I don't want to be mean, but does anyone actually use STGT in
  production? Seriously?
  
  In the latest development version of STGT, it's only possible to stop
  the tgtd target daemon using KILL / 9 signal - which also means all
  iSCSI initiator connections are corrupted when tgtd target daemon is
  started again (kernel upgrade, target daemon upgrade, server reboot etc.).
  
  I don't know what iSCSI initiator connections are corrupted
  mean. But if you reboot a server, how can an iSCSI target
  implementation keep iSCSI tcp connections?
  
  
  Imagine you have to reboot all your NFS clients when you reboot your NFS
  server. Not only that - your data is probably corrupted, or at least the
  filesystem deserves checking...
 
 Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
 rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
 manages stop/crash, by sending unit attention to clients on reconnect.
 Drbd+heartbeat correctly manages those things too.
 Still from an end-user POV, i was able to reboot/survive a crash only with
 SCST, IETD still has reconnect problems and STGT are even worst.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Wed, 2008-02-06 at 10:29 +0900, FUJITA Tomonori wrote:
 On Tue, 05 Feb 2008 18:09:15 +0100
 Matteo Tescione [EMAIL PROTECTED] wrote:
 
  On 5-02-2008 14:38, FUJITA Tomonori [EMAIL PROTECTED] wrote:
  
   On Tue, 05 Feb 2008 08:14:01 +0100
   Tomasz Chmielewski [EMAIL PROTECTED] wrote:
   
   James Bottomley schrieb:
   
   These are both features being independently worked on, are they not?
   Even if they weren't, the combination of the size of SCST in kernel plus
   the problem of having to find a migration path for the current STGT
   users still looks to me to involve the greater amount of work.
   
   I don't want to be mean, but does anyone actually use STGT in
   production? Seriously?
   
   In the latest development version of STGT, it's only possible to stop
   the tgtd target daemon using KILL / 9 signal - which also means all
   iSCSI initiator connections are corrupted when tgtd target daemon is
   started again (kernel upgrade, target daemon upgrade, server reboot 
   etc.).
   
   I don't know what iSCSI initiator connections are corrupted
   mean. But if you reboot a server, how can an iSCSI target
   implementation keep iSCSI tcp connections?
   
   
   Imagine you have to reboot all your NFS clients when you reboot your NFS
   server. Not only that - your data is probably corrupted, or at least the
   filesystem deserves checking...
  

The TCP connection will drop, remember that the TCP connection state for
one side has completely vanished.  Depending on iSCSI/iSER
ErrorRecoveryLevel that is set, this will mean:

1) Session Recovery, ERL=0 - Restarting the entire nexus and all
connections across all of the possible subnets or comm-links.  All
outstanding un-StatSN acknowledged commands will be returned back to the
SCSI subsystem with RETRY status.  Once a single connection has been
reestablished to start the nexus, the CDBs will be resent.

2) Connection Recovery, ERL=2 - CDBs from the failed connection(s) will
be retried (nothing changes in the PDU) to fill the iSCSI CmdSN ordering
gap, or be explicitly retried with TMR TASK_REASSIGN for ones already
acknowledged by the ExpCmdSN that are returned to the initiator in
response packets or by way of unsolicited NopINs.
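
Purely to illustrate the two behaviors described above (this is not LIO
code), the decision taken when a connection drops boils down to:


#include <stdio.h>

/* illustrative sketch of the ERL-dependent recovery choice, not LIO code */
enum erl { ERL0 = 0, ERL2 = 2 };

static const char *connection_failure_action(enum erl level)
{
        switch (level) {
        case ERL0:
                /* Session Recovery: restart the whole nexus; commands not yet
                 * acknowledged by StatSN go back to the SCSI layer with RETRY
                 * and are resent once one connection is reestablished. */
                return "session recovery";
        case ERL2:
                /* Connection Recovery: retry CDBs to fill the CmdSN ordering
                 * gap, or explicitly reassign already-acknowledged ones with
                 * TMR TASK_REASSIGN on a reestablished connection. */
                return "connection recovery";
        }
        return "unknown";
}

int main(void)
{
        printf("ERL=0 -> %s\n", connection_failure_action(ERL0));
        printf("ERL=2 -> %s\n", connection_failure_action(ERL2));
        return 0;
}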

  Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
  rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
  manages stop/crash, by sending unit attention to clients on reconnect.
  Drbd+heartbeat correctly manages those things too.
  Still from an end-user POV, i was able to reboot/survive a crash only with
  SCST, IETD still has reconnect problems and STGT are even worst.
 
 Please tell us on stgt-devel mailing list if you see problems. We will
 try to fix them.
 

FYI, the LIO code also supports rmmoding iscsi_target_mod while at full
10 Gb/sec speed.  I think it should be a requirement to be able to
control per initiator, per portal group, per LUN, per device, per HBA in
the design without restarting any other objects.

--nab

 Thanks,
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to [EMAIL PROTECTED]
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Tomasz Chmielewski

James Bottomley schrieb:


These are both features being independently worked on, are they not?
Even if they weren't, the combination of the size of SCST in kernel plus
the problem of having to find a migration path for the current STGT
users still looks to me to involve the greater amount of work.


I don't want to be mean, but does anyone actually use STGT in
production? Seriously?

In the latest development version of STGT, it's only possible to stop
the tgtd target daemon using KILL / 9 signal - which also means all
iSCSI initiator connections are corrupted when tgtd target daemon is
started again (kernel upgrade, target daemon upgrade, server reboot etc.).

Imagine you have to reboot all your NFS clients when you reboot your NFS
server. Not only that - your data is probably corrupted, or at least the
filesystem deserves checking...


--
Tomasz Chmielewski
http://wpkg.org



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread James Bottomley

On Tue, 2008-02-05 at 05:43 +0100, Matteo Tescione wrote:
> Hi all,
> And sorry for intrusion, i am not a developer but i work everyday with iscsi
> and i found it fantastic.
> Altough Aoe, Fcoe and so on could be better, we have to look in real world
> implementations what is needed *now*, and if we look at vmware world,
> virtual iron, microsoft clustering etc, the answer is iSCSI.
> And now, SCST is the best open-source iSCSI target. So, from an end-user
> point of view, what are the really problems to not integrate scst in the
> mainstream kernel?

The fact that your last statement is conjecture.  It's definitely untrue
for non-IB networks, and the jury is still out on IB networks.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Matteo Tescione
Hi all,
And sorry for the intrusion; I am not a developer, but I work every day with
iSCSI and I find it fantastic.
Although AoE, FCoE and so on could be better, we have to look at what
real-world implementations need *now*, and if we look at the VMware world,
Virtual Iron, Microsoft clustering etc., the answer is iSCSI.
And now, SCST is the best open-source iSCSI target. So, from an end-user
point of view, what are the real problems preventing the integration of SCST
into the mainstream kernel?

Just my two cents,
--
So long and thanks for all the fish
--
#Matteo Tescione
#RMnet srl


> 
> 
> On Mon, 4 Feb 2008, Matt Mackall wrote:
>> 
>> But ATAoE is boring because it's not IP. Which means no routing,
>> firewalls, tunnels, congestion control, etc.
> 
> The thing is, that's often an advantage. Not just for performance.
> 
>> NBD and iSCSI (for all its hideous growths) can take advantage of these
>> things.
> 
> .. and all this could equally well be done by a simple bridging protocol
> (completely independently of any AoE code).
> 
> The thing is, iSCSI does things at the wrong level. It *forces* people to
> use the complex protocols, when it's a known that a lot of people don't
> want it. 
> 
> Which is why these AoE and FCoE things keep popping up.
> 
> It's easy to bridge ethernet and add a new layer on top of AoE if you need
> it. In comparison, it's *impossible* to remove an unnecessary layer from
> iSCSI.
> 
> This is why "simple and low-level is good". It's always possible to build
> on top of low-level protocols, while it's generally never possible to
> simplify overly complex ones.
> 
> Linus
> 
> -
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
> ___
> Scst-devel mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/scst-devel
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread Chris Weiss
On Feb 4, 2008 11:30 AM, Douglas Gilbert <[EMAIL PROTECTED]> wrote:
> Alan Cox wrote:
> >> better. So for example, I personally suspect that ATA-over-ethernet is way
> >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> >> low-level, and against those crazy SCSI people to begin with.
> >
> > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > would probably trash iSCSI for latency if nothing else.
>
> And a variant that doesn't do ATA or IP:
> http://www.fcoe.com/
>

however, and interestingly enough, the open-fcoe software target
depends on scst (for now anyway)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-04 Thread 4news
On lunedì 4 febbraio 2008, Linus Torvalds wrote:
> So from a purely personal standpoint, I'd like to say that I'm not really
> interested in iSCSI (and I don't quite know why I've been cc'd on this
> whole discussion) and think that other approaches are potentially *much*
> better. So for example, I personally suspect that ATA-over-ethernet is way
> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> low-level, and against those crazy SCSI people to begin with.

Surely AoE beats iSCSI on performance, mostly because of the thinner
protocol stack:
iscsi -> scsi - ip - eth
aoe -> ata - eth

But surely iSCSI is more of a standard than AoE and is more actively used in
the real world.

Other really useful features are that:
- iSCSI can move SCSI devices onto an IP-based SAN by routing them (I have
some tape changers routed by SCST to systems that have no other way to see
a tape).
- because it works at the IP layer it can be routed over long distances, so
given the needed bandwidth you can have a truly remote block device speaking
a standard protocol between heterogeneous systems.
- iSCSI is now the cheapest SAN available.

bye,
marco.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-01 Thread Vladislav Bolkhovitin

Vladislav Bolkhovitin wrote:

Bart Van Assche wrote:

On Jan 31, 2008 5:25 PM, Joe Landman <[EMAIL PROTECTED]> 
wrote:



Vladislav Bolkhovitin wrote:


Actually, I don't know what kind of conclusions it is possible to make
from disktest's results (maybe only how throughput gets bigger or 
slower

with increasing number of threads?), it's a good stress test tool, but
not more.



Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem to
bear far closer to "real world" tests than disktest and iozone, the
latter of which does more to test the speed of RAM cache and system call
performance than actual IO.




I have ran some tests with Bonnie++, but found out that on a fast
network like IB the filesystem used for the test has a really big
impact on the test results.

If anyone has a suggestion for a better test than dd to compare the
performance of SCSI storage protocols, please let it know.



I would suggest you to try something from real life, like:

 - Copying large file tree over a single or multiple IB links

 - Measure of some DB engine's TPC

 - etc.


Forgot to mention: during those tests, make sure that the devices imported
from both SCST and STGT report the same write cache and FUA capabilities
in the kernel log, since these significantly affect the initiator's
behavior. Like:


sd 4:0:0:5: [sdf] Write cache: enabled, read cache: enabled, supports 
DPO and FUA


For SCST the fastest mode is NV_CACHE, refer to its README file for details.
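
One quick way to compare what the initiator actually sees from each target
is to grep the kernel log and, on kernels that expose it, the scsi_disk
sysfs attributes (a sketch; the 4:0:0:5 address is taken from the example
line above):


dmesg | grep -i 'write cache'
cat /sys/class/scsi_disk/4:0:0:5/cache_type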


Bart Van Assche.

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
Scst-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/scst-devel



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-01 Thread Vladislav Bolkhovitin

David Dillow wrote:

On Thu, 2008-01-31 at 18:08 +0100, Bart Van Assche wrote:


If anyone has a suggestion for a better test than dd to compare the
performance of SCSI storage protocols, please let it know.



xdd on /dev/sda, sdb, etc. using -dio to do direct IO seems to work
decently, though it is hard (ie, impossible) to get a repeatable
sequence of IO when using higher queue depths, as it uses threads to
generate multiple requests.


This utility seems to be a good one, but it's basically the same as 
disktest, although much more advanced.



You may also look at sgpdd_survey from Lustre's iokit, but I've not done
much with that -- it uses the sg devices to send lowlevel SCSI commands.


Yes, it might be worth a try. Since it is fundamentally the same as
O_DIRECT dd, but with a bit less overhead on the initiator side (hence
less initiator-side latency), it will most likely show an even bigger
difference than dd does.
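
For reference, the O_DIRECT dd runs being compared would look something like
this (a sketch; device name, block size and count are placeholders, and the
write pass destroys data on the device):


dd if=/dev/sdb of=/dev/null bs=1M count=4096 iflag=direct
dd if=/dev/zero of=/dev/sdb bs=1M count=4096 oflag=direct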



I've been playing around with some benchmark code using libaio, but it's
not in generally usable shape.

xdd:
http://www.ioperformance.com/products.htm

Lustre IO Kit:
http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-20-1.html


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-01 Thread Vladislav Bolkhovitin

Bart Van Assche wrote:

On Jan 31, 2008 5:25 PM, Joe Landman <[EMAIL PROTECTED]> wrote:


Vladislav Bolkhovitin wrote:


Actually, I don't know what kind of conclusions it is possible to make
from disktest's results (maybe only how throughput gets bigger or slower
with increasing number of threads?), it's a good stress test tool, but
not more.


Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem to
bear far closer to "real world" tests than disktest and iozone, the
latter of which does more to test the speed of RAM cache and system call
performance than actual IO.



I have ran some tests with Bonnie++, but found out that on a fast
network like IB the filesystem used for the test has a really big
impact on the test results.

If anyone has a suggestion for a better test than dd to compare the
performance of SCSI storage protocols, please let it know.


I would suggest you try something from real life, like:

 - Copying a large file tree over a single or multiple IB links

 - Measuring some DB engine's TPC results

 - etc.


Bart Van Assche.

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
Scst-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/scst-devel



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-01-31 Thread David Dillow

On Thu, 2008-01-31 at 18:08 +0100, Bart Van Assche wrote:
> If anyone has a suggestion for a better test than dd to compare the
> performance of SCSI storage protocols, please let it know.

xdd on /dev/sda, sdb, etc. using -dio to do direct IO seems to work
decently, though it is hard (ie, impossible) to get a repeatable
sequence of IO when using higher queue depths, as it uses threads to
generate multiple requests.

You may also look at sgpdd_survey from Lustre's iokit, but I've not done
much with that -- it uses the sg devices to send lowlevel SCSI commands.

I've been playing around with some benchmark code using libaio, but it's
not in generally usable shape.

xdd:
http://www.ioperformance.com/products.htm

Lustre IO Kit:
http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-20-1.html
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-01-31 Thread Joe Landman

Bart Van Assche wrote:


I have ran some tests with Bonnie++, but found out that on a fast
network like IB the filesystem used for the test has a really big
impact on the test results.


This is true of the file systems when physically directly connected to 
the unit as well.  Some file systems are designed with high performance 
in mind, some are not.



If anyone has a suggestion for a better test than dd to compare the
performance of SCSI storage protocols, please let it know.


Hmmm... if you care about the protocol side, I can't help.  Our users 
are more concerned with the file system side, so this is where we focus 
our tuning attention.




Bart Van Assche.


Joe

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
   http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-01-31 Thread Bart Van Assche
On Jan 31, 2008 5:25 PM, Joe Landman <[EMAIL PROTECTED]> wrote:
> Vladislav Bolkhovitin wrote:
> > Actually, I don't know what kind of conclusions it is possible to make
> > from disktest's results (maybe only how throughput gets bigger or slower
> > with increasing number of threads?), it's a good stress test tool, but
> > not more.
>
> Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem to
> bear far closer to "real world" tests than disktest and iozone, the
> latter of which does more to test the speed of RAM cache and system call
> performance than actual IO.

I have run some tests with Bonnie++, but found out that on a fast
network like IB the filesystem used for the test has a really big
impact on the test results.

If anyone has a suggestion for a better test than dd to compare the
performance of SCSI storage protocols, please let it know.

Bart Van Assche.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-01-31 Thread Joe Landman

Vladislav Bolkhovitin wrote:

Bart Van Assche wrote:


[...]


I can run disktest on the same setups I ran dd on. This will take some
time however.


Disktest was already referenced in the beginning of the performance 
comparison thread, but its results are not very interesting if we are 
going to find out, which implementation is more effective, because in 
the modes, in which usually people run this utility, it produces latency 
insensitive workload (multiple threads working in parallel). So, such 


There are other issues with disktest, in that you can easily specify 
option combinations that generate apparently 5+ GB/s of IO, though 
actual traffic over the link to storage is very low.  Caveat disktest 
emptor.


multithreaded disktests results will be different between STGT and SCST 
only if STGT's implementation will get target CPU bound. If CPU on the 
target is powerful enough, even extra busy loops in the STGT or SCST hot 
path code will change nothing.


Additionally, multithreaded disktest over a RAM disk is a good example of 
a synthetic benchmark that has almost no relation to real-life 
workloads. But people like it, because it produces nice-looking results.


I agree.  The backing store should be a disk for it to have meaning, 
though please note my caveat above.




Actually, I don't know what conclusions can be drawn from disktest's 
results (maybe only how throughput grows or shrinks with an increasing 
number of threads?); it's a good stress test tool, but not more.


Unfortunately, I agree.  Bonnie++, dd tests, and a few others seem to 
come far closer to "real world" tests than disktest and iozone, the 
latter of which does more to test the speed of the RAM cache and system 
call performance than actual IO.




Disktest is new to me -- any hints with regard to suitable
combinations of command line parameters are welcome. The most recent
version I could find on http://ltp.sourceforge.net/ is ltp-20071231.

Bart Van Assche.


Here is what I have run:

disktest -K 8 -B 256k  -I F -N 2000 -P A -w /big/file
disktest -K 8 -B 64k   -I F -N 2000 -P A -w /big/file
disktest -K 8 -B 1k    -I B -N 200  -P A  /dev/sdb2

and many others.
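
(For readers new to disktest, here is one of the invocations above with the
option meanings spelled out as far as they can be inferred from these
examples; this is a rough gloss, not the manual -- check disktest -h in
ltp-20071231 for the authoritative description:)

  #   -K 8     eight concurrent worker threads
  #   -B 64k   64 KiB transfer size
  #   -I F     I/O to a file through the file system (-I B: directly to a block device)
  #   -w       write test (omit it for a read test)
  disktest -K 8 -B 64k -I F -N 2000 -P A -w /big/file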



Joe


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
   http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-01-31 Thread David Dillow

On Thu, 2008-01-31 at 18:08 +0100, Bart Van Assche wrote:
 If anyone has a suggestion for a better test than dd to compare the
 performance of SCSI storage protocols, please let it know.

xdd on /dev/sda, sdb, etc. using -dio to do direct IO seems to work
decently, though it is hard (i.e., impossible) to get a repeatable
sequence of IO when using higher queue depths, as it uses threads to
generate multiple requests.
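
(A starting point along those lines -- a sketch from memory of xdd's options,
so the exact names should be checked against the usage output of the version
from ioperformance.com; /dev/sdb and the sizes/counts below are placeholders:)

  # sequential direct-I/O read: 1 MB requests (1024 x 1024-byte blocks),
  # 4096 requests (~4 GB), 8 requests in flight, 3 passes
  xdd -op read -targets 1 /dev/sdb -blocksize 1024 -reqsize 1024 \
      -numreqs 4096 -queuedepth 8 -dio -passes 3 -verbose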

You may also look at sgpdd_survey from Lustre's iokit, but I've not done
much with that -- it uses the sg devices to send low-level SCSI commands.

I've been playing around with some benchmark code using libaio, but it's
not in generally usable shape.

xdd:
http://www.ioperformance.com/products.htm

Lustre IO Kit:
http://manual.lustre.org/manual/LustreManual16_HTML/DynamicHTML-20-1.html
-- 
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-01-29 Thread Vu Pham

FUJITA Tomonori wrote:

On Tue, 29 Jan 2008 13:31:52 -0800
Roland Dreier <[EMAIL PROTECTED]> wrote:


 > .                       .  STGT read     SCST read     .  STGT read     SCST read     .
 > .                       .  performance   performance   .  performance   performance   .
 > .                       .  (0.5K, MB/s)  (0.5K, MB/s)  .  (1 MB, MB/s)  (1 MB, MB/s)  .
 > . iSER (8 Gb/s network) .  250           N/A           .  360           N/A           .
 > . SRP  (8 Gb/s network) .  N/A           421           .  N/A           683           .

 > On the comparable figures, which only seem to be IPoIB, they're showing a
 > 13-18% variance, aren't they?  Which isn't an incredible difference.

Maybe I'm all wet, but I think iSER vs. SRP should be roughly
comparable.  The exact formatting of various messages etc. is
different but the data path using RDMA is pretty much identical.  So
the big difference between STGT iSER and SCST SRP hints at some big
difference in the efficiency of the two implementations.


iSER has parameters to limit the maximum size of an RDMA transfer (with a
poor configuration it needs to repeat RDMA operations)?


Anyway, here are the results from Robin Humble:

iSER to 7G ramfs, x86_64, centos4.6, 2.6.22 kernels, git tgtd,
initiator end booted with mem=512M, target with 8G ram

 direct i/o dd
  write/read  800/751 MB/s
dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct
dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct



Both Robin (iser/stgt) and Bart (scst/srp) using ramfs

Robin's numbers come from DDR IB HCAs

Bart's numbers come from SDR IB HCAs:
Results with /dev/ram0 configured as backing store on the 
target (buffered I/O):
              Read          Write         Read          Write
              performance   performance   performance   performance
              (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
STGT + iSER   250           48            349           781
SCST + SRP    411           66            659           746


Results with /dev/ram0 configured as backing store on the 
target (direct I/O):
              Read          Write         Read          Write
              performance   performance   performance   performance
              (0.5K, MB/s)  (0.5K, MB/s)  (1 MB, MB/s)  (1 MB, MB/s)
STGT + iSER   7.9           9.8           589           647
SCST + SRP    12.3          9.7           811           794


http://www.mail-archive.com/[EMAIL PROTECTED]/msg13514.html

Here are my numbers with DDR IB HCAs, SCST/SRP 5G /dev/ram0 
block_io mode, RHEL5 2.6.18-8.el5


direct i/o dd
   write/read  1100/895 MB/s
 dd if=/dev/zero of=/dev/sdc bs=1M count=5000 oflag=direct
 dd of=/dev/null if=/dev/sdc bs=1M count=5000 iflag=direct

buffered i/o dd
   write/read  950/770 MB/s
 dd if=/dev/zero of=/dev/sdc bs=1M count=5000
 dd of=/dev/null if=/dev/sdc bs=1M count=5000

So when using DDR IB HCAs (write/read, MB/s):

               stgt/iser   scst/srp
direct I/O     800/751     1100/895
buffered I/O   1109/350    950/770


-vu

http://www.mail-archive.com/[EMAIL PROTECTED]/msg13502.html

I think that STGT is pretty fast with fast backing storage.



I don't think that there is a notable performance difference between
kernel-space and user-space SRP (or iSER) implementations when it comes
to moving data between hosts. IB is expected to enable user-space
applications to move data between hosts quickly (if not, what can IB
provide us?).

I think that the question is how fast user-space applications can do
I/Os compared with I/Os in kernel space. STGT is eager for the advent
of good asynchronous I/O and event notification interfaces.


One more possible optimization for STGT is zero-copy data
transfer. STGT uses pre-registered buffers and moves data between the
page cache and these buffers, and then does the RDMA transfer. If we
implement our own caching mechanism to use the pre-registered buffers
directly (with AIO and O_DIRECT), then STGT can move data without extra
data copies.

___
Scst-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/scst-devel



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

