from:"Boaz Harrosh"

Re: [PATCH] scsi: libosd: Remove VLA usage

2018-05-13 Thread Boaz Harrosh

On 03/05/18 01:55, Kees Cook wrote:
> On the quest to remove all VLAs from the kernel[1] this rearranges the
> code to avoid a VLA warning under -Wvla (gcc doesn't recognize "const"
> variables as not triggering VLA creation). Additionally cleans up variable
> naming to avoid 80 character column limit.
> 
> [1] 
> https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qpxydaacu1rq...@mail.gmail.com
> 

ACK-BY: Boaz Harrosh <o...@electrozaur.com>

> Signed-off-by: Kees Cook <keesc...@chromium.org>
> ---
>  drivers/scsi/osd/osd_initiator.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/osd/osd_initiator.c 
> b/drivers/scsi/osd/osd_initiator.c
> index e18877177f1b..917a86a2ae8c 100644
> --- a/drivers/scsi/osd/osd_initiator.c
> +++ b/drivers/scsi/osd/osd_initiator.c
> @@ -1842,14 +1842,14 @@ int osd_req_decode_sense_full(struct osd_request *or,
>   case osd_sense_response_integrity_check:
>   {
>   struct osd_sense_response_integrity_check_descriptor
> - *osricd = cur_descriptor;
> - const unsigned len =
> -   sizeof(osricd->integrity_check_value);
> - char key_dump[len*4 + 2]; /* 2nibbles+space+ASCII */
> -
> - hex_dump_to_buffer(osricd->integrity_check_value, len,
> -32, 1, key_dump, sizeof(key_dump), true);
> - OSD_SENSE_PRINT2("response_integrity [%s]\n", key_dump);
> + *d = cur_descriptor;
> + /* 2nibbles+space+ASCII */
> + char dump[sizeof(d->integrity_check_value) * 4 + 2];
> +
> + hex_dump_to_buffer(d->integrity_check_value,
> + sizeof(d->integrity_check_value),
> + 32, 1, dump, sizeof(dump), true);
> + OSD_SENSE_PRINT2("response_integrity [%s]\n", dump);
>   }
>   case osd_sense_attribute_identification:
>   {
>

Re: [PATCH, RFC] MAINTAINERS: update OSD entries

2017-05-03 Thread Boaz Harrosh

On 05/02/2017 02:33 PM, Jeff Layton wrote:
> On Tue, 2017-05-02 at 09:57 +0200, Christoph Hellwig wrote:
>> The open-osd domain doesn't exist anymore, and mails to the list lead
>> to really annoying bounced that repeat every day.
>>
>> Also the primarydata address for Benny bounces, and while I have a new
>> one for him he doesn't seem to be maintaining the OSD code any more.
>>
>> Which beggs the question:  should we really leave the Supported status
>> in MAINTAINERS given that the code is barely maintained?
>>
>> Signed-off-by: Christoph Hellwig <h...@lst.de>
>> ---
>>  MAINTAINERS | 4 
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 1bb06c5f7716..28dd83a1d9e2 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -9418,10 +9418,6 @@ F:drivers/net/wireless/intersil/orinoco/
>>  
>>  OSD LIBRARY and FILESYSTEM
>>  M:  Boaz Harrosh <o...@electrozaur.com>
>> -M:  Benny Halevy <bhal...@primarydata.com>
>> -L:  osd-...@open-osd.org
>> -W:  http://open-osd.org
>> -T:  git git://git.open-osd.org/open-osd.git
>>  S:  Maintained
>>  F:  drivers/scsi/osd/
>>  F:  include/scsi/osd_*
> 
> Hah, you beat me to it! I was going to spin up a patch for this today.
> 
> Acked-by: Jeff Layton <jlay...@redhat.com>

Acked-by: Boaz Harrosh <o...@electrozaur.com>

>

Re: [PATCH 3/4] nfs: remove the objlayout driver

2017-04-23 Thread Boaz Harrosh

On 04/21/2017 05:00 PM, Trond Myklebust wrote:
> Maintenance is not development. It’s about doing all the followup
> _after_ the feature is declared to be developed. That’s been missing
> for quite some time in the case of the OSD pNFS code, which is why
> I’m not even bothering to consider staging. If you are saying you are
> still maintaining exofs, and want to continue doing so, then great,
> but note that there is a file called BUGS in that directory that has
> been untouched since 2008, and that’s why I thing staging is a good
> idea.
> 

No, the BUGS file is just stale. As you said was not ever updated. All
the bugs (1) in there do no longer exist for a long long time.
I will send a patch to remove the file.

Yes I maintain this fs. It has the complete fixture list, of what was
first intended. I keep running it every major kernel release to make
sure it actually runs, and there are no regressions.

If this FS stands in the way of any new development please anyone
contact me and I will help in the conversion. Of exofs and the
osd-uld.

Thanks
Boaz

Re: [PATCH 3/4] nfs: remove the objlayout driver

2017-04-21 Thread Boaz Harrosh

On 04/20/2017 11:18 PM, Trond Myklebust wrote:
<>
> 
> OK. I'm applying this patch for the 4.12 merge window. 

That is understandable this code was not tested for a long while

> If, as Boaz
> suggests, there is still an interest in exofs, then I suggest we put
> that to the test by moving it into the STAGING area, to see if someone
> will step up to maintain it.
> 

I do not understand what you mean by "someone will maintain it"
What is it that is missing in exofs that you see needs development?

As I said I regularly run and test this FS and update as things advance
there where no new fixtures for a while, but no regressions either.

Please explain what it means "will maintain it"

> Cheers
>   Trond
> 

Thanks
Boaz

Re: linux-next: manual merge of the scsi-mkp tree with the char-misc tree

2017-04-20 Thread Boaz Harrosh

On 04/07/2017 10:50 PM, Bart Van Assche wrote:
> On Fri, 2017-04-07 at 13:29 -0600, Logan Gunthorpe wrote:
>> On 07/04/17 09:49 AM, Bart Van Assche wrote:
>>> Sorry that I had not yet noticed Logan's patch series. Should my two
>>> patches that conflict with Logan's patch series be dropped and reworked
>>> after Logan's patches are upstream?
>>
>> Yeah, Greg took my patchset around a few maintainers relatively quickly.
>> This is the second conflict, so sorry about that. Looks like the easiest
>> thing would be to just base your change off of mine. It doesn't look too
>> difficult. If you can do it before my patch hits upstream, I'd
>> appreciate some testing and/or review as no one from the scsi side
>> responded and that particular patch was a bit more involved than I would
>> have liked.
> 
> Boaz, had you noticed Logan's osd patch? If not, can you have a look?
> 

I did look, I even sent an ACK on one of the early versions.
The merge breakage is more of a build issue because I never
had get_device fail for me in my testing so it is more
academic.

Yes they both look fine BTW
Boaz

> Thanks,
> Bart.
>

Re: [PATCH 1/4] block: remove the osdblk driver

2017-04-19 Thread Boaz Harrosh

On 04/12/2017 07:01 PM, Christoph Hellwig wrote:
> This was just a proof of concept user for the SCSI OSD library, and
> never had any real users.
> 
> Signed-off-by: Christoph Hellwig <h...@lst.de>

Yes please remove this driver

ACK-by Boaz Harrosh <o...@electrozaur.com>

> ---
>  drivers/block/Kconfig  |  16 --
>  drivers/block/Makefile |   1 -
>  drivers/block/osdblk.c | 693 
> -
>  3 files changed, 710 deletions(-)
>  delete mode 100644 drivers/block/osdblk.c
> 
> diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
> index a1c2e816128f..58e24c354933 100644
> --- a/drivers/block/Kconfig
> +++ b/drivers/block/Kconfig
> @@ -312,22 +312,6 @@ config BLK_DEV_SKD
>  
>   Use device /dev/skd$N amd /dev/skd$Np$M.
>  
> -config BLK_DEV_OSD
> - tristate "OSD object-as-blkdev support"
> - depends on SCSI_OSD_ULD
> - ---help---
> -   Saying Y or M here will allow the exporting of a single SCSI
> -   OSD (object-based storage) object as a Linux block device.
> -
> -   For example, if you create a 2G object on an OSD device,
> -   you can then use this module to present that 2G object as
> -   a Linux block device.
> -
> -   To compile this driver as a module, choose M here: the
> -   module will be called osdblk.
> -
> -   If unsure, say N.
> -
>  config BLK_DEV_SX8
>   tristate "Promise SATA SX8 support"
>   depends on PCI
> diff --git a/drivers/block/Makefile b/drivers/block/Makefile
> index b12c772bbeb3..42e8cee1cbc7 100644
> --- a/drivers/block/Makefile
> +++ b/drivers/block/Makefile
> @@ -22,7 +22,6 @@ obj-$(CONFIG_CDROM_PKTCDVD) += pktcdvd.o
>  obj-$(CONFIG_MG_DISK)+= mg_disk.o
>  obj-$(CONFIG_SUNVDC) += sunvdc.o
>  obj-$(CONFIG_BLK_DEV_SKD)+= skd.o
> -obj-$(CONFIG_BLK_DEV_OSD)+= osdblk.o
>  
>  obj-$(CONFIG_BLK_DEV_UMEM)   += umem.o
>  obj-$(CONFIG_BLK_DEV_NBD)+= nbd.o
> diff --git a/drivers/block/osdblk.c b/drivers/block/osdblk.c
> deleted file mode 100644
> index 8127b8201a01..
> --- a/drivers/block/osdblk.c
> +++ /dev/null
> @@ -1,693 +0,0 @@
> -
> -/*
> -   osdblk.c -- Export a single SCSI OSD object as a Linux block device
> -
> -
> -   Copyright 2009 Red Hat, Inc.
> -
> -   This program is free software; you can redistribute it and/or modify
> -   it under the terms of the GNU General Public License as published by
> -   the Free Software Foundation.
> -
> -   This program is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -   GNU General Public License for more details.
> -
> -   You should have received a copy of the GNU General Public License
> -   along with this program; see the file COPYING.  If not, write to
> -   the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
> -
> -
> -   Instructions for use
> -   
> -
> -   1) Map a Linux block device to an existing OSD object.
> -
> -  In this example, we will use partition id 1234, object id 5678,
> -  OSD device /dev/osd1.
> -
> -  $ echo "1234 5678 /dev/osd1" > /sys/class/osdblk/add
> -
> -
> -   2) List all active blkdev<->object mappings.
> -
> -  In this example, we have performed step #1 twice, creating two blkdevs,
> -  mapped to two separate OSD objects.
> -
> -  $ cat /sys/class/osdblk/list
> -  0 174 1234 5678 /dev/osd1
> -  1 179 1994 897123 /dev/osd0
> -
> -  The columns, in order, are:
> -  - blkdev unique id
> -  - blkdev assigned major
> -  - OSD object partition id
> -  - OSD object id
> -  - OSD device
> -
> -
> -   3) Remove an active blkdev<->object mapping.
> -
> -  In this example, we remove the mapping with blkdev unique id 1.
> -
> -  $ echo 1 > /sys/class/osdblk/remove
> -
> -
> -   NOTE:  The actual creation and deletion of OSD objects is outside the 
> scope
> -   of this driver.
> -
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -#define DRV_NAME "osdblk"
> -#define PFX DRV_NAME ": "
> -
> -/* #define _OSDBLK_DEBUG */
> -#ifdef _OSDBLK_DEBUG
> -#define OSDBLK_DEBUG(fmt, a...) \
> - printk(KERN_NOTICE "osdblk @%s:%d: " fmt, __func__, __LINE__, ##a)
> -#else
> -#define OSDBLK_DEBUG(fmt, a...) \
> - do { if (0) printk(fmt, ##a); } while

Re: RFC: drop the T10 OSD code and its users

2017-04-18 Thread Boaz Harrosh

On 04/12/2017 07:01 PM, Christoph Hellwig wrote:

Hi Sir Christoph

> The only real user of the T10 OSD protocol, the pNFS object layout
> driver never went to the point of having shipping products, and the
> other two users (osdblk and exofs) were simple example of it's usage.
> 

I understand why osdblk might be a pain, and was broken from day one, and
should by all means go away ASAP.

But exofs should not be bothering anyone, and as far as I know does
not use any special API's except the osd_uld code of course.

> The code has been mostly unmaintained for years and is getting in the
> way of block / SCSI changes, so I think it's finally time to drop it.
> 

Please tell me what are those changes you are talking about? I might be
able to help in conversion. I guess you mean osd_uld and the Upper SCSI API.
Just point me at a tree where osd_uld is broken, and perhaps with a little
guidance from you I can do a satisfactory conversion.

Is true that no new code went in for a long while, but I still from time
to time run a setup and test that the all stack, like iscsi-bidi and so on still
works.

That said, yes only a stand alone exofs was tested for a long time, a full
pnfs setup is missing any supporting server. So Yes I admit that pnfs-obj is
getting very rotten. And is most probably broken, on the pnfs side of things.
[Which I admit makes my little plea kind of mute ;-) ]

Every once in a while I get emails from Students basing all kind of interesting
experiments on top of the exofs and object base storage. So for the sake of 
academics
and for the sake of a true bidi-stack testing, might we want to evaluate what 
is the
up coming cost, and what is a minimum set we are willing to keep?

Please advise?

thanks
Boaz

> These patches are against Jens' block for-next tree as that already
> has various modifications of the SCSI code.
>

Re: [PATCH] osd_uld: Check scsi_device_get() return value

2017-04-05 Thread Boaz Harrosh

On 03/30/2017 08:17 PM, Bart Van Assche wrote:
> scsi_device_get() can fail. Hence check its return value.
> 
> Signed-off-by: Bart Van Assche <bart.vanass...@sandisk.com>
> Cc: Boaz Harrosh <bharr...@panasas.com>

Cool thanks
ACK-by: Boaz Harrosh <o...@electrozaur.com>

> ---
>  drivers/scsi/osd/osd_uld.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
> index e0ce5d2fd14d..1dea4244dd0c 100644
> --- a/drivers/scsi/osd/osd_uld.c
> +++ b/drivers/scsi/osd/osd_uld.c
> @@ -464,14 +464,15 @@ static int osd_probe(struct device *dev)
>   /* hold one more reference to the scsi_device that will get released
>* in __release, in case a logout is happening while fs is mounted
>*/
> - scsi_device_get(scsi_device);
> + if (scsi_device_get(scsi_device))
> + goto err_put_disk;
>   osd_dev_init(>od, scsi_device);
>  
>   /* Detect the OSD Version */
>   error = __detect_osd(oud);
>   if (error) {
>   OSD_ERR("osd detection failed, non-compatible OSD device\n");
> - goto err_put_disk;
> + goto err_put_sdev;
>   }
>  
>   /* init the char-device for communication with user-mode */
> @@ -508,8 +509,9 @@ static int osd_probe(struct device *dev)
>  
>  err_put_cdev:
>   cdev_del(>cdev);
> -err_put_disk:
> +err_put_sdev:
>   scsi_device_put(scsi_device);
> +err_put_disk:
>   put_disk(disk);
>  err_free_osd:
>   dev_set_drvdata(dev, NULL);
>

Re: [PATCH] scsi: osd_uld: remove an unneeded NULL check

2017-03-23 Thread Boaz Harrosh

On 03/23/2017 12:41 PM, Dan Carpenter wrote:
> We don't call the remove() function unless probe() succeeds so "oud"
> can't be NULL here.  Plus, if it were NULL, we dereference it on the
> next line so it would crash anyway.
> 
> Signed-off-by: Dan Carpenter <dan.carpen...@oracle.com>
> 

Thanks sure!
ACK-by Boaz Harrosh <o...@electrozaur.com>

> diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
> index 4101c3178411..8b9941a5687a 100644
> --- a/drivers/scsi/osd/osd_uld.c
> +++ b/drivers/scsi/osd/osd_uld.c
> @@ -507,10 +507,9 @@ static int osd_remove(struct device *dev)
>   struct scsi_device *scsi_device = to_scsi_device(dev);
>   struct osd_uld_device *oud = dev_get_drvdata(dev);
>  
> - if (!oud || (oud->od.scsi_device != scsi_device)) {
> - OSD_ERR("Half cooked osd-device %p,%p || %p!=%p",
> - dev, oud, oud ? oud->od.scsi_device : NULL,
> - scsi_device);
> + if (oud->od.scsi_device != scsi_device) {
> + OSD_ERR("Half cooked osd-device %p, || %p!=%p",
> + dev, oud->od.scsi_device, scsi_device);
>   }
>  
>   cdev_device_del(>cdev, >class_dev);
>

Re: [Lsf-pc] [LSF/MM TOPIC] do we really need PG_error at all?

2017-02-28 Thread Boaz Harrosh

On 02/28/2017 03:11 AM, Jeff Layton wrote:
<>
> 
> I'll probably have questions about the read side as well, but for now it
> looks like it's mostly used in an ad-hoc way to communicate errors
> across subsystems (block to fs layer, for instance).

If memory does not fail me it used to be checked long time ago in the
read-ahead case. On the buffered read case, the first page is read synchronous
and any error is returned to the caller, but then a read-ahead chunk is
read async all the while the original thread returned to the application.
So any errors are only recorded on the page-bit, since otherwise the uptodate
is off and the IO will be retransmitted. Then the move to read_iter changed
all that I think.
But again this is like 5-6 years ago, and maybe I didn't even understand
very well.

> --
> Jeff Layton 
> 

I would like a Documentation of all this as well please. Where are the
tests for this?

Thanks
Boaz

Re: [PATCH v2] osd: remove deadcode

2016-02-24 Thread Boaz Harrosh

On 02/24/2016 01:21 PM, Sudip Mukherjee wrote:
> The variable is_ver1 is always true and so OSD_CAP_LEN can never be
> used.
> Reported by Coverity.
> 
> Signed-off-by: Sudip Mukherjee <sudip.mukher...@codethink.co.uk>

ACK-by: Boaz harrosh <o...@elecrozaur.com>

Thanks
> ---
> 
> v2: Joe Perches asked to mention the tool used in the commit log.
> 
>  drivers/scsi/osd/osd_initiator.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/osd/osd_initiator.c 
> b/drivers/scsi/osd/osd_initiator.c
> index d8a2b51..3b11aad 100644
> --- a/drivers/scsi/osd/osd_initiator.c
> +++ b/drivers/scsi/osd/osd_initiator.c
> @@ -2006,9 +2006,8 @@ EXPORT_SYMBOL(osd_sec_init_nosec_doall_caps);
>   */
>  void osd_set_caps(struct osd_cdb *cdb, const void *caps)
>  {
> - bool is_ver1 = true;
>   /* NOTE: They start at same address */
> - memcpy(>v1.caps, caps, is_ver1 ? OSDv1_CAP_LEN : OSD_CAP_LEN);
> + memcpy(>v1.caps, caps, OSDv1_CAP_LEN);
>  }
>  
>  bool osd_is_sec_alldata(struct osd_security_parameters *sec_parms __unused)
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] [SCSI] osd: fix signed char versus %02x issue

2015-12-08 Thread Boaz Harrosh

On 12/08/2015 04:25 PM, Rasmus Villemoes wrote:
> If char is signed and one of these bytes happen to have a value
> outside the ascii range, the corresponding output will consist of
> "ff" followed by the two hex chars that were actually
> intended. One way to fix it would be to change the casts to (u8*) aka
> (unsigned char*), but it is much simpler (and generates smaller code)
> to use the %ph extension which was created for such short hexdumps.
> 

Ha real cool, thanks I hated that crap

ACK-by: Boaz Harrosh <o...@electrozaur.com>

> Signed-off-by: Rasmus Villemoes <li...@rasmusvillemoes.dk>
> ---
>  drivers/scsi/osd/osd_initiator.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/osd/osd_initiator.c 
> b/drivers/scsi/osd/osd_initiator.c
> index 0cccd6033feb..d8a2b5185f56 100644
> --- a/drivers/scsi/osd/osd_initiator.c
> +++ b/drivers/scsi/osd/osd_initiator.c
> @@ -170,10 +170,7 @@ static int _osd_get_print_system_info(struct osd_dev *od,
>  
>   /* FIXME: Where are the time utilities */
>   pFirst = get_attrs[a++].val_ptr;
> - OSD_INFO("CLOCK  [0x%02x%02x%02x%02x%02x%02x]\n",
> - ((char *)pFirst)[0], ((char *)pFirst)[1],
> - ((char *)pFirst)[2], ((char *)pFirst)[3],
> - ((char *)pFirst)[4], ((char *)pFirst)[5]);
> + OSD_INFO("CLOCK  [0x%6phN]\n", pFirst);
>  
>   if (a < nelem) { /* IBM-OSD-SIM bug, Might not have it */
>   unsigned len = get_attrs[a].len;
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [REGRESSION v4.3] scsi_dh: use-after-free when removing scsi device

2015-09-30 Thread Boaz Harrosh

On 09/30/2015 12:28 PM, Hannes Reinecke wrote:
<>
> Pushing things into the background is typically not the best of
> ideas; actually I've been running into issues with udev not being
> complete by the time the next round is started. So more often than
> not I would be greeted with messages:
> 
> 'write: no such file or directory'
> 
> when executing this line. Removing the '&' at the end made this
> warning go away.
> 
> And actually I'm not sure if the above script is a valid testcase;

So are you saying it is allowed to crash the Kernel with a crappy
script?

> from what I've seen there is no locking / reference counting when
> accessing sysfs attributes. So as soon as you _can_ access the sysfs
> attribute it is implicitly assumed to be valid.
> In your case you will be _removing_ the sysfs attribute even though
> it is still accessed, which of course will crash.
> 

Is that allowed? for usermode script to race and crash the Kernel?

>From the original email it sounds like this used to be fine and it
now crashes (with the &)

Thanks
Boaz

> Can you still reproduce this problem after removing the '&' in that
> line?
> 
>>  echo "-- delete $dev --" > /dev/kmsg
>>  echo 1 > /sys/class/scsi_device/${dev}/device/delete
>>
>>  n=$((n + 1))
>> done
>> --- cut here --
> 
> Having said that I've retried your test script with my ALUA handler
> update, and didn't find any issues there.
> It happily completed about 500 rounds at which point I got bored.
> Of course, this is after removing the '&' in the said line.
> 
> Cheers,
> 
> Hannes
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] SCSI-OSD: Delete an unnecessary check before the function call put_disk

2015-06-28 Thread Boaz Harrosh

On 06/24/2015 05:16 PM, SF Markus Elfring wrote:
 From: Markus Elfring elfr...@users.sourceforge.net
 Date: Wed, 24 Jun 2015 16:06:21 +0200
 
 The put_disk() function tests whether its argument is NULL and then
 returns immediately. Thus the test around the call is not needed.
 
 This issue was detected by using the Coccinelle software.
 
 Signed-off-by: Markus Elfring elfr...@users.sourceforge.net

ACK-by: Boaz Harrosh o...@electrozaur.com

 ---
  drivers/scsi/osd/osd_uld.c | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)
 
 diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
 index 243eab3..e2075522 100644
 --- a/drivers/scsi/osd/osd_uld.c
 +++ b/drivers/scsi/osd/osd_uld.c
 @@ -407,9 +407,7 @@ static void __remove(struct device *dev)
  
   OSD_INFO(osd_remove %s\n,
oud-disk ? oud-disk-disk_name : NULL);
 -
 - if (oud-disk)
 - put_disk(oud-disk);
 + put_disk(oud-disk);
   ida_remove(osd_minor_ida, oud-minor);
  
   kfree(oud);
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 9/9] snic:Add Makefile, patch Kconfig, MAINTAINERS

2015-05-27 Thread Boaz Harrosh

On 05/27/2015 10:19 AM, Narsimhulu Musini wrote:
 Kconfig for kbuild
 Makefile to build snic module
 
 Updated MAINTAINERS file
 
 Signed-off-by: Narsimhulu Musini nmus...@cisco.com
 Signed-off-by: Sesidhar Baddela sebad...@cisco.com
 ---
 * v3
 - Added additional config section (CONFIG_SNIC_DEBUG_FS) for enabling 
 debugging
   functionality.
 
 * v2
 - Added compile time flags for debugfs dependent functionality.
 
  MAINTAINERS|  7 +++
  drivers/scsi/Kconfig   | 17 +
  drivers/scsi/Makefile  |  1 +
  drivers/scsi/snic/Makefile | 21 +
  4 files changed, 46 insertions(+)
  create mode 100644 drivers/scsi/snic/Makefile
 
 diff --git a/MAINTAINERS b/MAINTAINERS
 index 2a97e05..368fb76 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -2536,6 +2536,13 @@ L: linux-scsi@vger.kernel.org
  S:   Supported
  F:   drivers/scsi/fnic/
  
 +CISCO SCSI HBA DRIVER
 +M:   Narsimhulu Musini nmus...@cisco.com
 +M:   Sesidhar Baddela sebad...@cisco.com
 +L:   linux-scsi@vger.kernel.org
 +S:   Supported
 +F:   drivers/scsi/snic/
 +
  CMPC ACPI DRIVER
  M:   Thadeu Lima de Souza Cascardo casca...@holoscopio.com
  M:   Daniel Oliveira Nascimento d...@syst.com.br
 diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
 index 9c92f41..8baab3f 100644
 --- a/drivers/scsi/Kconfig
 +++ b/drivers/scsi/Kconfig
 @@ -634,6 +634,23 @@ config FCOE_FNIC
 file:Documentation/scsi/scsi.txt.
 The module will be called fnic.
  
 +config SCSI_SNIC
 + tristate Cisco SNIC Driver
 + depends on PCI  SCSI  X86_64
 + help
 +   This is support for the Cisco PCI-Express SCSI HBA.
 +
 +   To compile this driver as a module, choose M here and read
 +   file:Documentation/scsi/scsi.txt.
 +   The module will be called snic.
 +
 +config SCSI_SNIC_DEBUG_FS
 + bool Cisco SNIC Driver Debugfs Support
 + depends on SCSI_SNIC  DEBUG_FS
 + help
 +   This enables to list debugging information from SNIC Driver
 +   available via debugfs file system
 +
  config SCSI_DMX3191D
   tristate DMX3191D SCSI support
   depends on PCI  SCSI
 diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
 index 58158f1..f643942 100644
 --- a/drivers/scsi/Makefile
 +++ b/drivers/scsi/Makefile
 @@ -39,6 +39,7 @@ obj-$(CONFIG_LIBFC) += libfc/
  obj-$(CONFIG_LIBFCOE)+= fcoe/
  obj-$(CONFIG_FCOE)   += fcoe/
  obj-$(CONFIG_FCOE_FNIC)  += fnic/
 +obj-$(CONFIG_SCSI_SNIC)  += snic/
  obj-$(CONFIG_SCSI_BNX2X_FCOE)+= libfc/ fcoe/ bnx2fc/
  obj-$(CONFIG_ISCSI_TCP)  += libiscsi.o   libiscsi_tcp.o iscsi_tcp.o
  obj-$(CONFIG_INFINIBAND_ISER)+= libiscsi.o
 diff --git a/drivers/scsi/snic/Makefile b/drivers/scsi/snic/Makefile
 new file mode 100644
 index 000..572102a
 --- /dev/null
 +++ b/drivers/scsi/snic/Makefile
 @@ -0,0 +1,21 @@
 +obj-$(CONFIG_SCSI_SNIC) += snic.o
 +
 +snic-y := \
 + snic_attrs.o \
 + snic_main.o \
 + snic_res.o \
 + snic_isr.o \
 + snic_ctl.o \
 + snic_io.o \
 + snic_scsi.o \
 + snic_disc.o \
 + vnic_cq.o \
 + vnic_intr.o \
 + vnic_dev.o \
 + vnic_wq.o
 +
 +ifeq ($(CONFIG_SCSI_SNIC_DEBUG_FS), y)
 +ccflags-y += -DSNIC_DEBUG_FS

Why do you need an extra define here just use
CONFIG_SCSI_SNIC_DEBUG_FS in source code directly

 +snic-y += snic_debugfs.o \
 + snic_trc.o
 +endif
 

snic-$(CONFIG_SCSI_SNIC_DEBUG_FS) += snic_debugfs.o

You do not the  ifeq () thing at all

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 9/9] snic:Add Makefile, patch Kconfig, MAINTAINERS

2015-05-27 Thread Boaz Harrosh

On 05/27/2015 01:02 PM, Narsimhulu Musini (nmusini) wrote:

 +ifeq ($(CONFIG_SCSI_SNIC_DEBUG_FS), y)
 +ccflags-y += -DSNIC_DEBUG_FS

 Why do you need an extra define here just use
 CONFIG_SCSI_SNIC_DEBUG_FS in source code directly
 Agree, I just want to use a shorter macro in the source.

Don't do this. It is a convention that if a programmer
sees a CONFIG_XXX he knows that this is settable by a
Kconfig. By making it shorter the programmer will think that
this is dead code because it is not set anywhere.


 +snic-y += snic_debugfs.o \
 +   snic_trc.o
 +endif


 snic-$(CONFIG_SCSI_SNIC_DEBUG_FS) += snic_debugfs.o
 If CONFIG_SCSI_SNIC_DEBUGFS is not set, then it leaves a build variable
 ³snic- in build system. ifeq() avoids such thing.

That is fine. This is done all over the Kernel Makefile.
There are two tons of these do nothing make variables

It the way we do it in the Kernel. (I actually like it)


 You do not the  ifeq () thing at all

 Cheers
 Boaz

 Thanks
 simha

 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scatterlist: enable sg chaining for all architectures

2015-04-29 Thread Boaz Harrosh

On 04/29/2015 05:15 AM, James Bottomley wrote:
 
 Perhaps the best thing to do is just fix target and call it quits?
 

Right! drivers write code for sg_chaining and on ARCHs that do not
support it the code just works.
Only the max_sg is smaller and the chaining code never kicks in
and is dead code for these ARCHs.

 James

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/17] Clear up bidi command confusion

2015-01-29 Thread Boaz Harrosh

On 01/29/2015 04:41 PM, James Bottomley wrote:
 On Thu, 2015-01-29 at 15:00 +0200, Boaz Harrosh wrote:
 On 01/23/2015 03:12 PM, Christoph Hellwig wrote:
 On Fri, Jan 23, 2015 at 01:05:30PM +0100, Bart Van Assche wrote:
 There is some confusion in the SCSI core and in SCSI LLDs around the
 meaning of sc_data_direction and whether or not this member can have the
 value DMA_BIDIRECTIONAL. Clear up this confusion. The patches in this
 series are:

 I wonder if we should change the code to set DMA_BIDIRECTIONAL for
 bidi commands.  That seems a lot more logical than the current
 version.


 You cannot do this. Because a Bidi Command is actually two commands
 one for the WRITE side (DMA_TO_DEVICE) which is this one, and another
 command for the READ side (DMA_FROM_DEVICE).
 
 You're not thinking about this the correct way.  DMA_BIDIRECTIONAL is a
 DMA engine flag.  We use it in SCSI for historical reasons (mainly to
 prevent a translation from the SCSI version).  Since it was never used,
 most SCSI drivers have code to check and reject commands with it set.
 For actual bidirectional commands, you have to check scsi_bidi_command()
 and programme the DMA engine separately for both phases, so the flag is
 useless in this case.
 

This is what I meant how is above different from what I said?

 The proposal is to make it the signal for bidirectional commands

This I disagree, and would be a very bad idea and will take us back, to
hacking time. As you noted, please let us leave DMA_BIDIRECTIONAL to the
DMA engines. Let us just stir away from the DMA_XXX flags altogether and
move back to block layer flags. 

 (in which case most HBAs would make it do the right thing; i.e. reject) 

This is not necessary, (the check and rejection), commands will not be issued
to LLDs that did not mark support for BiDi. So they will never receive
such commands and it is added dead code.

 but
 it still wouldn't be how you'd program the DMA enigne; that still would
 have to be done in two phases as it is today.
 

Exactly my point. Only one small correction, these are not two phases, as in
phases-1 then phases-2. These are two memory buffers independently programmed
for DMA transfer. The actual transfer occurs simultaneously, on both buffers.

 James
 
 

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/17] Clear up bidi command confusion

2015-01-29 Thread Boaz Harrosh

On 01/23/2015 03:12 PM, Christoph Hellwig wrote:
 On Fri, Jan 23, 2015 at 01:05:30PM +0100, Bart Van Assche wrote:
 There is some confusion in the SCSI core and in SCSI LLDs around the
 meaning of sc_data_direction and whether or not this member can have the
 value DMA_BIDIRECTIONAL. Clear up this confusion. The patches in this
 series are:
 
 I wonder if we should change the code to set DMA_BIDIRECTIONAL for
 bidi commands.  That seems a lot more logical than the current
 version.
 

You cannot do this. Because a Bidi Command is actually two commands
one for the WRITE side (DMA_TO_DEVICE) which is this one, and another
command for the READ side (DMA_FROM_DEVICE).

The DMA_XXX_ enum must not to be confused with the t10-scsi Bidi commands.
In DMA world the enum means: What are the allowed access on the memory buffer,
is Device allowed to read-from-memory-only or write-to-memory-only or both
simultaneously on the same buffer.
It is a flag to the IO-mmu on the direction of its mappings.

A t10-scsi bidi command is two buffers. The command caries two distinct
buffers one is only-written-to 2nd only-read-from. But these are two distinct
buffers each his proper direction.

If you ask me you need to remove sc_data_direction all together and just
go directly to the READ/WRITE flag at request level. SCSI has no business
duplicating the same information in another member another way.

In any way t10-scsi has no use in the entire of its STD of the DMA_BIDIRECTIONAL
sense. ie. same buffer, two directional access. So DMA_BIDIRECTIONAL is just
dead code for sc_data_direction. It could be imagined but t10-scsi does not
have a use for it.

Again Not to be confused with one-command two buffers, which is exactly the 
opposite.

 Also I don't think all the debug checks for bidi commands that you
 change should stay at all - driver need to set the QUEUE_FLAG_BIDI to
 ever see a bidi command.
 

Exactly just remove the all checks.

 It would also nice to add a host template flag for bidi support instead
 of having to poke into the block layer request_queue while we're at it.

Cheers
Boaz


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/17] Clear up bidi command confusion

2015-01-29 Thread Boaz Harrosh

On 01/26/2015 11:58 AM, Bart Van Assche wrote:

 Hello Christoph,
 
 This makes sense to me. I will rework this patch series as you proposed.
 

Do you have a bidi setup to test against?

Sending xor command to scsi_dbg is only half the test for me because, yes
there are two buffers, but they are of same size so bugs might be masked.
(From experience)

Thanks
Boaz

 Bart.

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/17] Clear up bidi command confusion

2015-01-29 Thread Boaz Harrosh

On 01/29/2015 03:20 PM, Bart Van Assche wrote:
 On 01/29/15 14:07, Boaz Harrosh wrote:
 On 01/26/2015 11:58 AM, Bart Van Assche wrote:

 Hello Christoph,

 This makes sense to me. I will rework this patch series as you proposed.

 Do you have a bidi setup to test against?

 Sending xor command to scsi_dbg is only half the test for me because, yes
 there are two buffers, but they are of same size so bugs might be masked.
 (From experience)
 
 Hello Boaz,
 
 If anyone would like to submit a scsi_debug patch that adds support to
 that kernel driver for a bidi command for which the input and output
 buffers have different sizes I think that would be helpful.
 

Hi Bart.

scsi_dbg is a scsi-blk-device type device (forgot the exact name).
I do not think there is such a command defined by the STD for this command
set.

its only for other device types like osd, printer and so on.

But sending XOR commands to scsi_dbg is a good start.

 Bart.
 

Please Note: I do not agree for the use of the constant DMA_BIDIRECTIONAL
to denote scsi_bidi_cmnd(cmd). This is wrong and takes us back not
forward.

If anything at all is done, cmd-sc_data_direction should be just removed
all together. It is not needed and denotes nothing more than what is already
available at the request level. Please do not add any more new code on top
of cmd-sc_data_direction. (At minimum just remove the all DMA_BIDIRECTIONAL
checks and you are done)

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Large disk drives

2014-11-06 Thread Boaz Harrosh

On 11/06/2014 05:54 PM, James Bottomley wrote:
 
 We don't have a failure.  This is the problem.  Determining that a
 problem exists 
 

OK Sorry. I assumed the bridge is smart enough to do nothing,

ie READ_CAPACITY_10 is passed as is via sata to the device that
actually supports READ_CAPACITY_16, as I understand then the 
actual good drive is not suppose to send size-modulue-2G in
response to READ_CAPACITY_10. Should it ?

Then the bridge just sends that back to me, now if I send
READ_CAPACITY_16 the bridge will return NOT-SUPPORTED because
it is unexpected.

But what are you saying that the bridge was smart enough to
do READ_CAPACITY_16 get a 64bit value from the drive and then
return the lower 32bit to me ? Really ? I would not imagine
in the life of me someone so dumb. And surly it is against
any spec.

Are you sure ? I think you are wrong I think the guy reported
that he can only see 2T out of his 3T drive which means the
bridge returned 0x, exactly 2T

 James
 
 

But hey, I guess stupidity is limitless, this one here pushing
real far.

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Large disk drives

2014-11-05 Thread Boaz Harrosh

On 11/05/2014 06:34 PM, Alan Stern wrote:

 
 It's simpler than that: The drive is attached directly to the computer
 (i.e., via SATA rather than USB) when the partition table is created.  
 With no USB-SATA bridge chip to mess things up, there's no problem
 determining the correct capacity.
 

Right!

I think it should be very simple to, just as we send a
READ_CAPACITY16 / READ_CAPACITY32 / READ_CAPACITY64 or is
it a GET_CODE_PAGE ? whatever we send today we can also
have READ_CAPACITY_PART() which will send a READ of
sector 0 the raw PC_COMMAND way and run it through some
partition analyzer. We only need to support gpt or msdos
partitions I think in this case.

Then if READ_CAPACITY16 returns all (s) and
READ_CAPACITY32 is not supported and/or blacklisted,
then yes attempt a READ_CAPACITY_PART() which should
be sam-2 compatible.

It should not take longer than a weekend afternoon over
cup of Machha. If any one sends a bad device my way, I'll
do it. But anyone should be able to code something so simple.

 Alan Stern
 

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [GIT PULL] osd: Boaz Harrosh - change of email

2014-10-22 Thread Boaz Harrosh

On 10/22/2014 12:04 AM, James Bottomley wrote:
 On Tue, 2014-10-21 at 17:05 +0300, Boaz Harrosh wrote:
 Hi Sir Linus

 A small administrative stuff, was on vacation and the old email bounced on 
 me.
 I was hoping to still make the 3 weeks merge window, but it was 2 weeks at 
 the
 end.
 Your call if to make this wait for next window. Needless to say that it is
 ZERO risk, just change of email.

 Based on commit [bfe01a5b] Linux 3.17

 3 patches available in the git repository at:

   git://git.open-osd.org/linux-open-osd.git for-linus

 for you to fetch changes up to 1fa3a002b2546c42c343c77c144871285896ced5:

   Boaz Harrosh - fix email in Documentation (2014-10-19 20:36:36 +0300)

 
 Boaz Harrosh (3):
   MAINTAINERS: Change Boaz Harrosh's email
   Boaz Harrosh - Fix broken email address
   Boaz Harrosh - fix email in Documentation

  Documentation/scsi/osd.txt  | 3 +--
  MAINTAINERS | 2 +-
  drivers/scsi/osd/Kbuild | 2 +-
  drivers/scsi/osd/Kconfig| 2 +-
  drivers/scsi/osd/osd_debug.h| 2 +-
  drivers/scsi/osd/osd_initiator.c| 4 ++--
  drivers/scsi/osd/osd_uld.c  | 4 ++--
  fs/exofs/Kbuild | 2 +-
  fs/exofs/common.h   | 2 +-
  fs/exofs/dir.c  | 2 +-
  fs/exofs/exofs.h| 2 +-
  fs/exofs/file.c | 2 +-
  fs/exofs/inode.c| 2 +-
  fs/exofs/namei.c| 2 +-
  fs/exofs/ore.c  | 4 ++--
  fs/exofs/ore_raid.c | 2 +-
  fs/exofs/ore_raid.h | 2 +-
  fs/exofs/super.c| 2 +-
  fs/exofs/symlink.c  | 2 +-
  fs/exofs/sys.c  | 2 +-
  fs/nfs/objlayout/objio_osd.c| 2 +-
  fs/nfs/objlayout/objlayout.c| 2 +-
  fs/nfs/objlayout/objlayout.h| 2 +-
  fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 2 +-
  include/linux/pnfs_osd_xdr.h| 2 +-
  include/scsi/osd_initiator.h| 2 +-
  include/scsi/osd_ore.h  | 2 +-
  include/scsi/osd_protocol.h | 4 ++--
  include/scsi/osd_sec.h  | 2 +-
  include/scsi/osd_sense.h| 2 +-
  include/scsi/osd_types.h| 2 +-
  31 files changed, 35 insertions(+), 36 deletions(-)
 
 This is a bit of an unnecessary massive churn.  No one expects the
 author named in the file to stay up to date, especially because the
 @domain.com usually credits the company who paid for the code, so it's
 left in as a kind of mark of respect for them.  I'm not saying it
 applies in your case, just that it creates the common expectation of
 in-file authors needing to be traced through the MAINTAINERS file.
 
 Could you not just update the MAINTAINERS file only, like everyone else?

OK, I did not know this. The Panasas credit is still there so no
harm done. Anyway Linus pulled it, so lets leave it at that for now.

Next time I'll know.

 James

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[GIT PULL] osd: Boaz Harrosh - change of email

2014-10-21 Thread Boaz Harrosh

Hi Sir Linus

A small administrative stuff, was on vacation and the old email bounced on me.
I was hoping to still make the 3 weeks merge window, but it was 2 weeks at the
end.
Your call if to make this wait for next window. Needless to say that it is
ZERO risk, just change of email.

Based on commit [bfe01a5b] Linux 3.17

3 patches available in the git repository at:

  git://git.open-osd.org/linux-open-osd.git for-linus

for you to fetch changes up to 1fa3a002b2546c42c343c77c144871285896ced5:

  Boaz Harrosh - fix email in Documentation (2014-10-19 20:36:36 +0300)


Boaz Harrosh (3):
  MAINTAINERS: Change Boaz Harrosh's email
  Boaz Harrosh - Fix broken email address
  Boaz Harrosh - fix email in Documentation

 Documentation/scsi/osd.txt  | 3 +--
 MAINTAINERS | 2 +-
 drivers/scsi/osd/Kbuild | 2 +-
 drivers/scsi/osd/Kconfig| 2 +-
 drivers/scsi/osd/osd_debug.h| 2 +-
 drivers/scsi/osd/osd_initiator.c| 4 ++--
 drivers/scsi/osd/osd_uld.c  | 4 ++--
 fs/exofs/Kbuild | 2 +-
 fs/exofs/common.h   | 2 +-
 fs/exofs/dir.c  | 2 +-
 fs/exofs/exofs.h| 2 +-
 fs/exofs/file.c | 2 +-
 fs/exofs/inode.c| 2 +-
 fs/exofs/namei.c| 2 +-
 fs/exofs/ore.c  | 4 ++--
 fs/exofs/ore_raid.c | 2 +-
 fs/exofs/ore_raid.h | 2 +-
 fs/exofs/super.c| 2 +-
 fs/exofs/symlink.c  | 2 +-
 fs/exofs/sys.c  | 2 +-
 fs/nfs/objlayout/objio_osd.c| 2 +-
 fs/nfs/objlayout/objlayout.c| 2 +-
 fs/nfs/objlayout/objlayout.h| 2 +-
 fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 2 +-
 include/linux/pnfs_osd_xdr.h| 2 +-
 include/scsi/osd_initiator.h| 2 +-
 include/scsi/osd_ore.h  | 2 +-
 include/scsi/osd_protocol.h | 4 ++--
 include/scsi/osd_sec.h  | 2 +-
 include/scsi/osd_sense.h| 2 +-
 include/scsi/osd_types.h| 2 +-
 31 files changed, 35 insertions(+), 36 deletions(-)

Thanks, have a good Kernel Cycle
Boaz
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/2] MAINTAINERS: Change Boaz Harrosh's email

2014-10-19 Thread Boaz Harrosh

From: Boaz Harrosh o...@electrozaur.com

I have moved on, and do no longer have Panasas email access.
Update to an email that can reach me.

So change bharr...@panasas.com = o...@electrozaur.com

Explain of email address:
* electrozaur.com is a domain owned by me.
* ooo - Stands for Open Osd . Org

Another email alias that can be used is:
open...@gmail.com

CC: Greg KH gre...@linuxfoundation.org
Signed-off-by: Boaz Harrosh o...@electrozaur.com
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f10ed39..fc2c9a8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6725,7 +6725,7 @@ S:Orphan
 F: drivers/net/wireless/orinoco/
 
 OSD LIBRARY and FILESYSTEM
-M: Boaz Harrosh bharr...@panasas.com
+M: Boaz Harrosh o...@electrozaur.com
 M: Benny Halevy bhal...@primarydata.com
 L: osd-...@open-osd.org
 W: http://open-osd.org
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] Boaz Harrosh - Fix broken email address

2014-10-19 Thread Boaz Harrosh

From: Boaz Harrosh o...@electrozaur.com

I no longer have access to the Panasas email.
So change to an email that can always reach me.

Signed-off-by: Boaz Harrosh o...@electrozaur.com
---
 drivers/scsi/osd/Kbuild | 2 +-
 drivers/scsi/osd/Kconfig| 2 +-
 drivers/scsi/osd/osd_debug.h| 2 +-
 drivers/scsi/osd/osd_initiator.c| 4 ++--
 drivers/scsi/osd/osd_uld.c  | 4 ++--
 fs/exofs/Kbuild | 2 +-
 fs/exofs/common.h   | 2 +-
 fs/exofs/dir.c  | 2 +-
 fs/exofs/exofs.h| 2 +-
 fs/exofs/file.c | 2 +-
 fs/exofs/inode.c| 2 +-
 fs/exofs/namei.c| 2 +-
 fs/exofs/ore.c  | 4 ++--
 fs/exofs/ore_raid.c | 2 +-
 fs/exofs/ore_raid.h | 2 +-
 fs/exofs/super.c| 2 +-
 fs/exofs/symlink.c  | 2 +-
 fs/exofs/sys.c  | 2 +-
 fs/nfs/objlayout/objio_osd.c| 2 +-
 fs/nfs/objlayout/objlayout.c| 2 +-
 fs/nfs/objlayout/objlayout.h| 2 +-
 fs/nfs/objlayout/pnfs_osd_xdr_cli.c | 2 +-
 include/linux/pnfs_osd_xdr.h| 2 +-
 include/scsi/osd_initiator.h| 2 +-
 include/scsi/osd_ore.h  | 2 +-
 include/scsi/osd_protocol.h | 4 ++--
 include/scsi/osd_sec.h  | 2 +-
 include/scsi/osd_sense.h| 2 +-
 include/scsi/osd_types.h| 2 +-
 29 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/scsi/osd/Kbuild b/drivers/scsi/osd/Kbuild
index 5fd73d77..58cecd4 100644
--- a/drivers/scsi/osd/Kbuild
+++ b/drivers/scsi/osd/Kbuild
@@ -4,7 +4,7 @@
 # Copyright (C) 2008 Panasas Inc.  All rights reserved.
 #
 # Authors:
-#   Boaz Harrosh bharr...@panasas.com
+#   Boaz Harrosh o...@electrozaur.com
 #   Benny Halevy bhal...@panasas.com
 #
 # This program is free software; you can redistribute it and/or modify
diff --git a/drivers/scsi/osd/Kconfig b/drivers/scsi/osd/Kconfig
index a070351..347cc5e 100644
--- a/drivers/scsi/osd/Kconfig
+++ b/drivers/scsi/osd/Kconfig
@@ -4,7 +4,7 @@
 # Copyright (C) 2008 Panasas Inc.  All rights reserved.
 #
 # Authors:
-#   Boaz Harrosh bharr...@panasas.com
+#   Boaz Harrosh o...@electrozaur.com
 #   Benny Halevy bhal...@panasas.com
 #
 # This program is free software; you can redistribute it and/or modify
diff --git a/drivers/scsi/osd/osd_debug.h b/drivers/scsi/osd/osd_debug.h
index 579e491..2634126 100644
--- a/drivers/scsi/osd/osd_debug.h
+++ b/drivers/scsi/osd/osd_debug.h
@@ -4,7 +4,7 @@
  * Copyright (C) 2008 Panasas Inc.  All rights reserved.
  *
  * Authors:
- *   Boaz Harrosh bharr...@panasas.com
+ *   Boaz Harrosh o...@electrozaur.com
  *   Benny Halevy bhal...@panasas.com
  *
  * This program is free software; you can redistribute it and/or modify
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index 5f4cbf0..52e243b 100644
--- a/drivers/scsi/osd/osd_initiator.c
+++ b/drivers/scsi/osd/osd_initiator.c
@@ -7,7 +7,7 @@
  * Copyright (C) 2008 Panasas Inc.  All rights reserved.
  *
  * Authors:
- *   Boaz Harrosh bharr...@panasas.com
+ *   Boaz Harrosh o...@electrozaur.com
  *   Benny Halevy bhal...@panasas.com
  *
  * This program is free software; you can redistribute it and/or modify
@@ -57,7 +57,7 @@
 
 enum { OSD_REQ_RETRIES = 1 };
 
-MODULE_AUTHOR(Boaz Harrosh bharr...@panasas.com);
+MODULE_AUTHOR(Boaz Harrosh o...@electrozaur.com);
 MODULE_DESCRIPTION(open-osd initiator library libosd.ko);
 MODULE_LICENSE(GPL);
 
diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
index e1d9a4c..92cdd4b 100644
--- a/drivers/scsi/osd/osd_uld.c
+++ b/drivers/scsi/osd/osd_uld.c
@@ -10,7 +10,7 @@
  * Copyright (C) 2008 Panasas Inc.  All rights reserved.
  *
  * Authors:
- *   Boaz Harrosh bharr...@panasas.com
+ *   Boaz Harrosh o...@electrozaur.com
  *   Benny Halevy bhal...@panasas.com
  *
  * This program is free software; you can redistribute it and/or modify
@@ -74,7 +74,7 @@
 static const char osd_name[] = osd;
 static const char *osd_version_string = open-osd 0.2.1;
 
-MODULE_AUTHOR(Boaz Harrosh bharr...@panasas.com);
+MODULE_AUTHOR(Boaz Harrosh o...@electrozaur.com);
 MODULE_DESCRIPTION(open-osd Upper-Layer-Driver osd.ko);
 MODULE_LICENSE(GPL);
 MODULE_ALIAS_CHARDEV_MAJOR(SCSI_OSD_MAJOR);
diff --git a/fs/exofs/Kbuild b/fs/exofs/Kbuild
index 389ba83..b47c7b8 100644
--- a/fs/exofs/Kbuild
+++ b/fs/exofs/Kbuild
@@ -4,7 +4,7 @@
 # Copyright (C) 2008 Panasas Inc.  All rights reserved.
 #
 # Authors:
-#   Boaz Harrosh bharr...@panasas.com
+#   Boaz Harrosh o...@electrozaur.com
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License version 2
diff --git a/fs/exofs/common.h b/fs/exofs/common.h
index 3bbd469..7d88ef5 100644
--- a/fs/exofs/common.h
+++ b/fs/exofs/common.h
@@ -4,7 +4,7 @@
  * Copyright (C) 2005, 2006
  * Avishay Traeger (avis

[PATCH 3/3] Boaz Harrosh - fix email in Documentation

2014-10-19 Thread Boaz Harrosh

From: Boaz Harrosh o...@electrozaur.com

I forgot to update Documentation/*.txt

Signed-off-by: Boaz Harrosh o...@electrozaur.com
---
 Documentation/scsi/osd.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/scsi/osd.txt b/Documentation/scsi/osd.txt
index da162f7..5a9879b 100644
--- a/Documentation/scsi/osd.txt
+++ b/Documentation/scsi/osd.txt
@@ -184,8 +184,7 @@ Any problems, questions, bug reports, lonely OSD nights, 
please email:
 More up-to-date information can be found on:
 http://open-osd.org
 
-Boaz Harrosh bharr...@panasas.com
-Benny Halevy bhal...@panasas.com
+Boaz Harrosh o...@electrozaur.com
 
 References
 ==
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 2/3] libiscsi, iser: Adjust data_length to include protection information

2014-08-14 Thread Boaz Harrosh

On 08/13/2014 04:09 PM, Sagi Grimberg wrote:
 On 8/6/2014 4:25 PM, Boaz Harrosh wrote:
 
 Hey Boaz,
 
 So in the current flow, I still don't think it is wrong/buggy, the
 transfer byte length related to scsi buffer length 
(In iscsi for sure 
 but I think that for all transports it is the same).
 
 But,
 Since you have such a strong objection on this I'm also OK with changing
 stuff... We can pass a buffer to scsi_transfer_length (although it has 
 no meaning effectively) and we can keep in/out_len local variables and
 print them along with total transfer.
 
 Do you want me to send out a patch here (since I have some hardware to 
 test it...) or are you still planning to send out something?
 

I'll do it. As you said there is no bugs, just ugly code. I will send a
cleanup

Thanks
Boaz

 Cheers,
 Sagi.
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH V5] Save command pool address of Scsi_Host

2014-08-04 Thread Boaz Harrosh

On 08/04/2014 02:30 PM, jgr...@suse.com wrote:
 From: Juergen Gross jgr...@suse.com
 
 If a scsi host driver specifies .cmd_len in it's scsi_host_template, a 
 driver's
 private command pool is needed. scsi_find_host_cmd_pool() will locate it, but
 scsi_alloc_host_cmd_pool() isn't saving the pool address in the host template.
 
 This will result in an access error when the host is removed.
 
 Avoid the problem by saving the address of a new allocated command pool where
 it is expected.
 
 Signed-off-by: Juergen Gross jgr...@suse.com
 ---
  drivers/scsi/scsi.c | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
 index 88d46fe..b0cef5b 100644
 --- a/drivers/scsi/scsi.c
 +++ b/drivers/scsi/scsi.c
 @@ -380,6 +380,10 @@ scsi_alloc_host_cmd_pool(struct Scsi_Host *shost)
   pool-slab_flags |= SLAB_CACHE_DMA;
   pool-gfp_mask = __GFP_DMA;
   }
 +
 + if (hostt-cmd_size)
 + hostt-cmd_pool = pool;
 +
   return pool;
  }
  
 @@ -424,8 +428,10 @@ out:
  out_free_slab:
   kmem_cache_destroy(pool-cmd_slab);
  out_free_pool:
 - if (hostt-cmd_size)
 + if (hostt-cmd_size) {
   scsi_free_host_cmd_pool(pool);

I disagree I liked the V4 version better. It is much nicer on the review
that we NULL the same one we pass in with no need for context that's outside
of this scope

Just my $0.017
Boaz

 + hostt-cmd_pool = NULL;
 + }
   goto out;
  }
  
 @@ -447,8 +453,10 @@ static void scsi_put_host_cmd_pool(struct Scsi_Host 
 *shost)
   if (!--pool-users) {
   kmem_cache_destroy(pool-cmd_slab);
   kmem_cache_destroy(pool-sense_slab);
 - if (hostt-cmd_size)
 + if (hostt-cmd_size) {
   scsi_free_host_cmd_pool(pool);
 + hostt-cmd_pool = NULL;
 + }
   }
   mutex_unlock(host_cmd_pool_mutex);
  }
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFD] brd.ko Never supported partitions should we remove the dead code ?

2014-07-30 Thread Boaz Harrosh

On 07/29/2014 08:19 PM, Ross Zwisler wrote:
 On Tue, 2014-07-29 at 19:37 +0300, Boaz Harrosh wrote:
 Hi folks

 I've been playing with yet unseen prd block device, and hit an issue with 
 partitioning
 but since prd.c is a copy/paste of brd.c the same exact issue exists with 
 brd.

 It does not support partitions!

 An attempt to fdisk say a couple of partitions, then mkfs  mount an 
 individual
 partition will result in the primary device being accessed and a corrupted 
 disk data.

 If I enable max_part(=2) properly on modprobe, then fdisk a couple of 
 partitions all
 goes well

 But then if I pass the device to Kernel (Say via mount), after a call to
  bdev = blkdev_get_by_path(dev_name, mode, fs_type);

 I get the following results: 
 (size is extracted using:i_size_read(bdev-bd_inode);
  part_size is extracted using:   bdev-bd_part-nr_sects  9;
 )

 dev_name == /dev/ram0
 [Jul29 17:40] foo: [foo_mount:880] size=0x4000 bdev=88003dc24340 
 bd_inode=88003dc24430 bd_part=88003ca22848 part_size=0x4000

 dev_name == /dev/ram0p1
 [ +16.816065] foo: [foo_mount:880] size=0x4000 bdev=88003d2f6d80 
 bd_inode=88003d2f6e70 bd_part=88003ca22848 part_size=0x4000

 dev_name == /dev/ram0p2
 [Jul29 17:41] foo: [foo_mount:880] size=0x4000 bdev=88003dc24680 
 bd_inode=88003dc24770 bd_part=88003ca22848 part_size=0x4000

 We can see that all 3 different bdev's point to the same bd_part, which is 
 the BUG. Funny is that bd_inode is different but contains
 same wrong size, for exactly the same reason.
 
 Yep, I've seen this as well.  It turns out that partitions *do* work in brd if
 you don't specify the 'rd_nr' module parameter, but if you do specify the
 module parameter you get the partition overlapping behavior that you observed.
 
 AFAICT the difference between these two scenarios is that when you set rd_nr,
 the 'range' variable is set so something small, but when you don't set it
 'range' is 1,048,576.  That's done by this bit of code in brd_init:
 
   if (rd_nr) {
   nr = rd_nr;
   range = rd_nr  part_shift;
   } else {
   nr = CONFIG_BLK_DEV_RAM_COUNT;
   range = 1UL  MINORBITS;
   }
 
 The difference in behavior happens up the stack somewhere as a result of you
 passing 'range' into blk_register_region().
 
 Anyway, the quick and dirty workaround is to always have 'range' be something
 large.  This is what I've done in the current master branch of the upstream
 PRD tree - see commit 0256739ccab03d1662221f595b03797370c2ac12.
 
 For what it's worth, I'm planning on updating prd (and probably brd) so that
 it dynamically allocates partition numbers.  See commit
 469071a37afc8a627b6b2ddf29db0a097d864845 for how this was done in the nvme
 driver.
 

Hi Ross

I've looked at 469071a37afc8a627b6b2ddf29db0a097d864845, but surly as you said
it is not the fix. I have posted the fix, in the code you snipped out.

So I get it that you guys are not convinced that we need to remove support for
Partitions for brd? For me a partition for /dev/ramX is totally bogus. What's
the point at all? Since we have the same effect if we just do rd_nr and
partition memory this way, and save 1M of memory. And it is not as if you can
loose power and want the partitioning persistent for the next boot. But sigh I
will not fight you about it, I made my point of this being pointless and that is
that.

I will post a proper patch for brd on top of Jens's tree unless you want it
rebased over some other tree, I guess it can just be queued for 3.17.

 - Ross

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 1/2] brd: Fix the partitions BUG

2014-07-30 Thread Boaz Harrosh


With current code after a call to:
bdev = blkdev_get_by_path(dev_name, mode, fs_type);
size = i_size_read(bdev-bd_inode);
part_size  = bdev-bd_part-nr_sects  9;

I get the following bad results:
dev_name == /dev/ram0
  foo: [foo_mount:880] size=0x4000 bdev=88003dc24340 \
bd_inode=88003dc24430 bd_part=88003ca22848 part_size=0x4000
dev_name == /dev/ram0p1
  foo: [foo_mount:880] size=0x4000 bdev=88003d2f6d80 \
bd_inode=88003d2f6e70 bd_part=88003ca22848 part_size=0x4000
dev_name == /dev/ram0p2
  foo: [foo_mount:880] size=0x4000 bdev=88003dc24680 \
bd_inode=88003dc24770 bd_part=88003ca22848 part_size=0x4000
Note how all three bdev(s) point to the same bd_part.

This is do to a single bad clubber in brd_probe() which is
removed in this patch:
-   *part = 0;

because of this all 3 bdev(s) above get to point to the same bd_part[0]

While at it fix/rename brd_init_one() since all devices are created on
load of driver, brd_probe() will never be called with a new un-created
device.
brd_init_one() is now renamed to brd_find() which is what it does.

TODO: There is one more partitions BUG regarding
  brd_direct_access() which is fixed on the next patch.

Signed-off-by: Boaz Harrosh b...@plexistor.com
---
 drivers/block/brd.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index c7d138e..92334f6 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -523,22 +523,20 @@ static void brd_free(struct brd_device *brd)
kfree(brd);
 }
 
-static struct brd_device *brd_init_one(int i)
+static struct brd_device *brd_find(int i)
 {
struct brd_device *brd;
 
list_for_each_entry(brd, brd_devices, brd_list) {
if (brd-brd_number == i)
-   goto out;
+   return brd;
}
 
-   brd = brd_alloc(i);
-   if (brd) {
-   add_disk(brd-brd_disk);
-   list_add_tail(brd-brd_list, brd_devices);
-   }
-out:
-   return brd;
+   /* brd always allocates all its devices at load time, therefor
+* brd_probe will never be called with a new brd_number
+*/
+   printk(KERN_EROR brd: brd_find unexpected device %d\n, i);
+   return NULL;
 }
 
 static void brd_del_one(struct brd_device *brd)
@@ -554,11 +552,10 @@ static struct kobject *brd_probe(dev_t dev, int *part, 
void *data)
struct kobject *kobj;
 
mutex_lock(brd_devices_mutex);
-   brd = brd_init_one(MINOR(dev)  part_shift);
+   brd = brd_find(MINOR(dev)  part_shift);
kobj = brd ? get_disk(brd-brd_disk) : NULL;
mutex_unlock(brd_devices_mutex);
 
-   *part = 0;
return kobj;
 }
 
-- 
1.9.3


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 2/2] brd: Fix brd_direct_access with partitions

2014-07-30 Thread Boaz Harrosh


When brd_direct_access() is called on a partition-bdev
it would access the wrong sector. And caller would then
corrupt the device's data.

This is a preliminary fix, Matthew Wilcox has a patch
in his DAX patchset which will define a global wrapper
to bdev-bd_disk-fops-direct_access that will do the
proper checks and translations before calling a driver
global member. (The way it is done at the rest of the
block stack)

CC: Matthew Wilcox matthew.r.wil...@intel.com
Signed-off-by: Boaz Harrosh b...@plexistor.com
---
 drivers/block/brd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 92334f6..7506864 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -378,9 +378,10 @@ static int brd_direct_access(struct block_device *bdev, 
sector_t sector,
 
if (!brd)
return -ENODEV;
+   sector += get_start_sect(bdev);
if (sector  (PAGE_SECTORS-1))
return -EINVAL;
-   if (sector + PAGE_SECTORS  get_capacity(bdev-bd_disk))
+   if (unlikely(sector + PAGE_SECTORS  part_nr_sects_read(bdev-bd_part)))
return -ERANGE;
page = brd_insert_page(brd, sector);
if (!page)
-- 
1.9.3


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] brd: Fix brd_direct_access with partitions

2014-07-30 Thread Boaz Harrosh

On 07/30/2014 06:34 PM, Matthew Wilcox wrote:
 On Wed, Jul 30, 2014 at 05:18:47PM +0300, Boaz Harrosh wrote:
 When brd_direct_access() is called on a partition-bdev
 it would access the wrong sector. And caller would then
 corrupt the device's data.

 This is a preliminary fix, Matthew Wilcox has a patch
 in his DAX patchset which will define a global wrapper
 to bdev-bd_disk-fops-direct_access that will do the
 proper checks and translations before calling a driver
 global member. (The way it is done at the rest of the
 block stack)
 
 Uh, no, let's just focus on getting the DAX code in instead of putting
 in interim patches that will conflict.  Patch 4/22 is uncontroversial,
 fixes this problem, has no dependencies, and is key to the rest of the DAX
 patchset.  If this problem wants to be fixed, then put 4/22 in instead.
 

OK, I agree, could you please push  4/22 as reply to here for
Jens to take for 3.17 ? (together with my 1/2)

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] brd: Fix the partitions BUG

2014-07-30 Thread Boaz Harrosh

On 07/30/2014 07:50 PM, Ross Zwisler wrote:

 + */
 +printk(KERN_EROR brd: brd_find unexpected device %d\n, i);
 
 s/KERN_EROR/KERN_ERR/
 

Yes thanks, sigh, code should compile

driver error. I used pr_err but last inspection I saw that printk is used
everywhere and, crapped ...

 +return NULL;
  }
  
  static void brd_del_one(struct brd_device *brd)
 @@ -554,11 +552,10 @@ static struct kobject *brd_probe(dev_t dev, int *part, 
 void *data)
  struct kobject *kobj;
  
  mutex_lock(brd_devices_mutex);
 -brd = brd_init_one(MINOR(dev)  part_shift);
 +brd = brd_find(MINOR(dev)  part_shift);
  kobj = brd ? get_disk(brd-brd_disk) : NULL;
  mutex_unlock(brd_devices_mutex);
  
 -*part = 0;
  return kobj;
  }
 
 It is possible to create new block devices with BRD at runtime:
 
   # mknod /dev/new_brd b 1 4 
   # fdisk -l /dev/new_brd
 
 This causes a new BRD disk to be created, and hits your error case:
 
   Jul 30 10:40:57 alara kernel: brd: brd_find unexpected device 4
   

Ha OK, So this is the mystery key. I never knew that trick, OK
I guess I need to leave this in

 I guess in general I'm not saying that BRD needs to have partitions - indeed
 it may not give you much in the way of added functionality.  As the code
 currently stands partitions aren't surfaced anyway
 (GENHD_FL_SUPPRESS_PARTITION_INFO is set).  For PRD, however, I *do* want to
 enable partitions correctly because eventually I'd like to enhance PRD so that
 it *does* actually handle NVDIMMs correctly, and for that partitions do make
 sense.  

So lets talk about that for a bit. Why would you want legacy partitions for
NVDIMMs. For one fdisk will waste 1M of memory on this. And with NVDIMMs you
actually need a different manager all together, more like lvm stuff.

I have patches here that changes prd a lot.
- Global memory parameters are gone each device remaps and operates on it's own 
memory range.
- the prd_params are gone instead you have one memmap= parameter of the form
  memmap=nnn$ooo,[nnn$ooo]...
  where:
nnn - is size in bytes of the form number[K/M/G]
ooo - is offset in bytes of the form number[K/M/G]

  Very much the same as the memmap= Kernel parameters. So just copy/paste what
  you did there and it will work.

  Now each memory range has one prd_device created for it. Note that if
  specified ranges are just contiguous then it is like your prd_count thing 
where
  you took a contiguous range and sliced it up, only here they can be of
  different size.
  Off course it has an added fixture of dis-contiguous ranges like we have
  in our multy-nodes NUMA system in the lab that host DDR3 NvDIMMs in two
  banks.

- A sysfs interface is added to add/remove memory range on the fly like
  osdblk

- If no parameters are specified at all, the Kernel command-line is parsed
  and all memmap= sections are attempted and used if are not already claimed.

An interface like mknod above is not supported since it is actually pointless.

My current code still supports partitions but I still think it is silly.

[
 This is all speculative for DDR3-NVDIMMs, we have seen that the memory 
controller
 actually tags these DIMMs with type 12 but it looks like this is all vendor
 specific right now and I understand that DDR4 standardize all that. So I was
 hoping you guys are working on all that with the NvDIMM stuff.

 Now lets say that I have established auto-detection for each DIMM and have
 extracted its SN (Our DDR3 DIMMs each have an SN that can be displayed by the
 vendor supplied tool)

 An NvDIMM manager will need to establish an NvDIMM table and set order.
 The motivation of a partition table is that a disk moved to another machine
 and/or booted into another OS will have persistent and STD way to not
 clobber over established DATA. But here we have like a disk cluster/raid,
 an ordered set of drives.

 Unique to NvDIMMs is the interleaving. If pairs are then inserted in wrong
 order or are re-mixed. This should be detected and refused to be mounted
 (unless a re-initialize is forced) So data is not silently lost on operator
 errors.
 
 All this is then persisted on some DIMM, first or last. If code is able to
 re-arrange order, like when physical address have changed it will, but in the
 wrong interleaving case it will refuse to mount.
]

 And if I have to implement and debug partitions for PRD, it's easy to
 stick them in BRD in case anyone wants to use them.
 

I was finally playing with BRD, and it is missing your getgeo patch, so
it is currently completely alien to fdisk and partitions.

So why, why keep a fixture that never existed, I still hope to convince you
that this is crap. Specially with brd that can have devices added on the fly
like you showed above. Who needs them at all? Why waist 1M of memory
on each device for no extra gain?

Specially in light of my new prd that does away of any needs of a partitioning
since it supports arbitrary slicing in another way.

 -

[RFD] brd.ko Never supported partitions should we remove the dead code ?

2014-07-29 Thread Boaz Harrosh

Hi folks

I've been playing with yet unseen prd block device, and hit an issue with 
partitioning
but since prd.c is a copy/paste of brd.c the same exact issue exists with brd.

It does not support partitions!

An attempt to fdisk say a couple of partitions, then mkfs  mount an individual
partition will result in the primary device being accessed and a corrupted disk 
data.

If I enable max_part(=2) properly on modprobe, then fdisk a couple of 
partitions all
goes well

But then if I pass the device to Kernel (Say via mount), after a call to
bdev = blkdev_get_by_path(dev_name, mode, fs_type);

I get the following results: 
(size is extracted using:   i_size_read(bdev-bd_inode);
 part_size is extracted using:  bdev-bd_part-nr_sects  9;
)

dev_name == /dev/ram0
[Jul29 17:40] foo: [foo_mount:880] size=0x4000 bdev=88003dc24340 
bd_inode=88003dc24430 bd_part=88003ca22848 part_size=0x4000

dev_name == /dev/ram0p1
[ +16.816065] foo: [foo_mount:880] size=0x4000 bdev=88003d2f6d80 
bd_inode=88003d2f6e70 bd_part=88003ca22848 part_size=0x4000

dev_name == /dev/ram0p2
[Jul29 17:41] foo: [foo_mount:880] size=0x4000 bdev=88003dc24680 
bd_inode=88003dc24770 bd_part=88003ca22848 part_size=0x4000

We can see that all 3 different bdev's point to the same bd_part, which is the 
BUG. Funny is that bd_inode is different but contains
same wrong size, for exactly the same reason.

The fix for this is easy:

diff --git drivers/block/brd.c drivers/block/brd.c
index c7d138e..0d982d7 100644
--- drivers/block/brd.c
+++ drivers/block/brd.c
@@ -378,10 +378,12 @@ static int brd_direct_access(struct block_device *bdev, 
sector_t sector,
 
if (!brd)
return -ENODEV;
+   sector += get_start_sect(bdev);
if (sector  (PAGE_SECTORS-1))
return -EINVAL;
+/* Check is wrong here we need to check against bdev-bd_part-nr_sects */
if (sector + PAGE_SECTORS  get_capacity(bdev-bd_disk))
return -ERANGE;
page = brd_insert_page(brd, sector);
if (!page)
return -ENOSPC;
@@ -532,11 +532,17 @@ static struct brd_device *brd_init_one(int i)
goto out;
}
 
+/* In the structure of this driver this will never happen and is wrong
+ * to do. We should just return NULL if not found.
+ * This is not the bug it is just DEAD CODE
+ * 
brd = brd_alloc(i);
if (brd) {
add_disk(brd-brd_disk);
list_add_tail(brd-brd_list, brd_devices);
}
+*/
+   prd = NULL;
 out:
return brd;
 }
@@ -558,7 +564,8 @@ static struct kobject *brd_probe(dev_t dev, int *part, void 
*data)
kobj = brd ? get_disk(brd-brd_disk) : NULL;
mutex_unlock(brd_devices_mutex);
 
-   *part = 0;
+// Fix the partition BUG *part comes in correctly must not need to touch it
+// *part = 0;
return kobj;
 }
---

After this simple fix of commenting out *part = 0, running again I get

dev_name == /dev/ram0
[  +0.04] foo: [foo_mount:880] size=0x4000 bdev=88003d2f7740 
inode=88003d2f7830 part=88001da8c048 part_size=0x4000
dev_name == /dev/ram0p1
[  +0.02] foo: [foo_mount:880] size=0x1e748200 bdev=88003dc25040 
inode=88003dc25130 part=880036f6a000 part_size=0x1e748200
dev_name == /dev/ram0p2
[  +0.02] foo: [foo_mount:880] size=0x217b7e00 bdev=88003d2f7a80 
inode=88003d2f7b70 part=880036f69000 part_size=0x217b7e00
 

But before we are running to fix this bug. Could we please do better and just 
remove the support for partitions
all together.
Since it *never* worked anyway, so probably no one needs it! Surly no 
one used it

Thanks
Boaz
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFD] brd.ko Never supported partitions should we remove the dead code ?

2014-07-29 Thread Boaz Harrosh

On 07/29/2014 07:56 PM, Matthew Wilcox wrote:
 On Tue, Jul 29, 2014 at 07:37:49PM +0300, Boaz Harrosh wrote:
 But before we are running to fix this bug. Could we please do better and 
 just remove the support for partitions
 all together.
  Since it *never* worked anyway, so probably no one needs it! Surly no 
 one used it
 
 I fixed this in patch 4/22 of the v8 series.  The correct place to
 handle partitioning is in the block layer, not the individual drivers
 (nor the filesystems).
 

Sir Mathew hi

 @@ -378,10 +378,12 @@ static int brd_direct_access(struct block_device *bdev, 
 sector_t sector,
  
   if (!brd)
   return -ENODEV;
 + sector += get_start_sect(bdev);
   if (sector  (PAGE_SECTORS-1))
   return -EINVAL;
 +/* Check is wrong here we need to check against bdev-bd_part-nr_sects */
   if (sector + PAGE_SECTORS  get_capacity(bdev-bd_disk))
   return -ERANGE;
   page = brd_insert_page(brd, sector);
   if (!page)
   return -ENOSPC;

you mean you fixed the brd_direct_access() hunk by pointing all users to a 
wrapper
that does the proper offsetting, sure.

But that is not the main bug I was talking about, the main BUG is that 
partitions are
not supported at all because of the clubber of *part in brd_probe()

 @@ -558,7 +564,8 @@ static struct kobject *brd_probe(dev_t dev, int *part, 
 void *data)
   kobj = brd ? get_disk(brd-brd_disk) : NULL;
   mutex_unlock(brd_devices_mutex);
  
 - *part = 0;
 +//   Fix the partition BUG *part comes in correctly must not need to touch it
 +//   *part = 0;
   return kobj;
  }

And my point is that: No one uses partitions with brd, why should we not remove 
it
all together, why fix it and keep it?

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 1/3] scsi_cmnd: Introduce scsi_transfer_length helper

2014-07-27 Thread Boaz Harrosh

On 06/25/2014 04:17 AM, Vladislav Bolkhovitin wrote:
 Martin K. Petersen, on 06/23/2014 06:58 PM wrote:
 Mike == Mike Christie micha...@cs.wisc.edu writes:
 + unsigned int xfer_len = blk_rq_bytes(scmd-request);

 Mike Can you do bidi and dif/dix?

 Nope.
 
 Correction: at the moment.
 
 There is a proposal of READ GATHERED command, which is bidirectional and 
 potentially 
 DIF/DIX.
 

That's all very good. But current infrastructure can not support BIDI+DIFF.
Because you we'll need *two* diff vectors one for each side.

Now in fact at the block layer this is actually supported. Since BIDI is
two requests, and each can have its own DIFF bio. But on the scsi layer
this is not implemented.

So I'd say: *For now DIFF and BIDI are mutually exclusive.*

(You'll need someone to love these new commands a lot in order to
 enable it)

 Vlad
 

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 1/3] scsi_cmnd: Introduce scsi_transfer_length helper

2014-07-27 Thread Boaz Harrosh

On 06/25/2014 01:32 PM, Sagi Grimberg wrote:
 On 6/25/2014 11:48 AM, Sagi Grimberg wrote:
 
 SNIP

 I made the patch below which should fix both bidi
 support in iscsi and also WRITE_SAME (and similar commands) support.

 I'm a bit confused, for all commands (bidi or not) the iscsi header 
 data_length
 should describe the scsi_data_buffer length, bidi information will lie 
 in AHS header.
 (in case bidi will ever co-exist with PI, we might need another helper 
 that will look
 in req-special + PI, something like scsi_bidi_transfer_length)

 So why not keep scsi_transfer_length to work on sdb length (take 
 scsi_bufflen(scmnd) or
 scsi_out(scmnd)-length as MKP suggested) and that's it - don't touch 
 libiscsi.

 Let me test that.

 
 So I tested a bidirectional command using:
 $ sg_raw --infile=/root/filex --send=1024 --request=1024 
 --outfile=/root/filex /dev/bsg/7:0:0:0 53 00 00 00 00 00 00 00 02 00
 
 And I see:
 kernel: session1: iscsi_prep_scsi_cmd_pdu iscsi prep [bidirectional cid 
 0 sc 880468ca1e00 cdb 0x53 itt 0x16 len 1024 bidi_len 1024 cmdsn 223 
 win 64]
 

This is a very bad example and tested nothing, since len  bidi_len
are the same. So even if you had a bug and took length from the
wrong side it would come out the same.

You must test with a bidi command that has two different lengths for
each side

If you have a tree that you want me to test I will be glad too.
From this thread I'm confused as to what patches you want me to
test? please point me to a tree you need testing. You can bug me
any time, any tree. I will be happy to test.

Cheers
Boaz

 This confirms what I wrote above, so AFAICT no additional fix is 
 required for libiscsi wrt bidi commands support.
 
 Mike?
 
 Sagi.
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 2/3] libiscsi, iser: Adjust data_length to include protection information

2014-07-27 Thread Boaz Harrosh

On 06/11/2014 12:09 PM, Sagi Grimberg wrote:
 In case protection information exists over the wire
 iscsi header data length is required to include it.
 Use protection information aware scsi helpers to set
 the correct transfer length.
 
 In order to avoid breakage, remove iser transfer length
 checks for each task as they are not always true and
 somewhat redundant anyway.
 
 Signed-off-by: Sagi Grimberg sa...@mellanox.com
 Reviewed-by: Mike Christie micha...@cs.wisc.edu
 ---
  drivers/infiniband/ulp/iser/iser_initiator.c |   34 +++--
  drivers/scsi/libiscsi.c  |   18 +++---
  2 files changed, 19 insertions(+), 33 deletions(-)
 
 diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c 
 b/drivers/infiniband/ulp/iser/iser_initiator.c
 index 2e2d903..8d44a40 100644
 --- a/drivers/infiniband/ulp/iser/iser_initiator.c
 +++ b/drivers/infiniband/ulp/iser/iser_initiator.c
 @@ -41,11 +41,11 @@
  #include iscsi_iser.h
  
  /* Register user buffer memory and initialize passive rdma
 - *  dto descriptor. Total data size is stored in
 - *  iser_task-data[ISER_DIR_IN].data_len
 + *  dto descriptor. Data size is stored in
 + *  task-data[ISER_DIR_IN].data_len, Protection size
 + *  os stored in task-prot[ISER_DIR_IN].data_len
   */
 -static int iser_prepare_read_cmd(struct iscsi_task *task,
 -  unsigned int edtl)
 +static int iser_prepare_read_cmd(struct iscsi_task *task)
  
  {
   struct iscsi_iser_task *iser_task = task-dd_data;
 @@ -73,14 +73,6 @@ static int iser_prepare_read_cmd(struct iscsi_task *task,
   return err;
   }
  
 - if (edtl  iser_task-data[ISER_DIR_IN].data_len) {
 - iser_err(Total data length: %ld, less than EDTL: 
 -  %d, in READ cmd BHS itt: %d, conn: 0x%p\n,
 -  iser_task-data[ISER_DIR_IN].data_len, edtl,
 -  task-itt, iser_task-ib_conn);
 - return -EINVAL;
 - }
 -
   err = device-iser_reg_rdma_mem(iser_task, ISER_DIR_IN);
   if (err) {
   iser_err(Failed to set up Data-IN RDMA\n);
 @@ -100,8 +92,9 @@ static int iser_prepare_read_cmd(struct iscsi_task *task,
  }
  
  /* Register user buffer memory and initialize passive rdma
 - *  dto descriptor. Total data size is stored in
 - *  task-data[ISER_DIR_OUT].data_len
 + *  dto descriptor. Data size is stored in
 + *  task-data[ISER_DIR_OUT].data_len, Protection size
 + *  is stored at task-prot[ISER_DIR_OUT].data_len
   */
  static int
  iser_prepare_write_cmd(struct iscsi_task *task,
 @@ -135,14 +128,6 @@ iser_prepare_write_cmd(struct iscsi_task *task,
   return err;
   }
  
 - if (edtl  iser_task-data[ISER_DIR_OUT].data_len) {
 - iser_err(Total data length: %ld, less than EDTL: %d, 
 -  in WRITE cmd BHS itt: %d, conn: 0x%p\n,
 -  iser_task-data[ISER_DIR_OUT].data_len,
 -  edtl, task-itt, task-conn);
 - return -EINVAL;
 - }
 -
   err = device-iser_reg_rdma_mem(iser_task, ISER_DIR_OUT);
   if (err != 0) {
   iser_err(Failed to register write cmd RDMA mem\n);
 @@ -417,11 +402,12 @@ int iser_send_command(struct iscsi_conn *conn,
   if (scsi_prot_sg_count(sc)) {
   prot_buf-buf  = scsi_prot_sglist(sc);
   prot_buf-size = scsi_prot_sg_count(sc);
 - prot_buf-data_len = sc-prot_sdb-length;
 + prot_buf-data_len = data_buf-data_len 
 +  ilog2(sc-device-sector_size) * 8;
   }
  
   if (hdr-flags  ISCSI_FLAG_CMD_READ) {
 - err = iser_prepare_read_cmd(task, edtl);
 + err = iser_prepare_read_cmd(task);
   if (err)
   goto send_command_error;
   }
 diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
 index 26dc005..3f46234 100644
 --- a/drivers/scsi/libiscsi.c
 +++ b/drivers/scsi/libiscsi.c
 @@ -338,7 +338,7 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_task 
 *task)
   struct iscsi_session *session = conn-session;
   struct scsi_cmnd *sc = task-sc;
   struct iscsi_scsi_req *hdr;
 - unsigned hdrlength, cmd_len;
 + unsigned hdrlength, cmd_len, transfer_length;

I hate that you introduced this new transfer_length variable. It does
not exist. In BIDI supporting driver there is out_len and in_len just
as original code.

   itt_t itt;
   int rc;
  
 @@ -391,11 +391,11 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_task 
 *task)
   if (scsi_get_prot_op(sc) != SCSI_PROT_NORMAL)
   task-protected = true;
  
 + transfer_length = scsi_transfer_length(sc);
 + hdr-data_length = cpu_to_be32(transfer_length);
   if (sc-sc_data_direction == DMA_TO_DEVICE) {
 - unsigned out_len = scsi_out(sc)-length;

In light of all the comments and the obvious bugs, please just
re do this patch. Do not just

Re: [PATCH 2/2] block: support 16 byte CDBs for SG_IO

2014-07-20 Thread Boaz Harrosh

On 07/20/2014 01:23 PM, Christoph Hellwig wrote:
 Signed-off-by: Christoph Hellwig h...@lst.de

Hi Christoph I've quickly reviewed your code and have a few questions

 ---
  block/scsi_ioctl.c | 24 +---
  1 file changed, 17 insertions(+), 7 deletions(-)
 
 diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
 index c4e6633..a804f3e 100644
 --- a/block/scsi_ioctl.c
 +++ b/block/scsi_ioctl.c
 @@ -288,8 +288,6 @@ static int sg_io(struct request_queue *q, struct gendisk 
 *bd_disk,
  
   if (hdr-interface_id != 'S')
   return -EINVAL;
 - if (hdr-cmd_len  BLK_MAX_CDB)
 - return -EINVAL;
  
   if (hdr-dxfer_len  (queue_max_hw_sectors(q)  9))
   return -EIO;
 @@ -306,14 +304,21 @@ static int sg_io(struct request_queue *q, struct 
 gendisk *bd_disk,
   break;
   }
  
 + ret = -ENOMEM;
   rq = blk_get_request(q, writing ? WRITE : READ, GFP_KERNEL);
   if (!rq)
 - return -ENOMEM;
 + goto out;
   blk_rq_set_block_pc(rq);
  
 + if (hdr-cmd_len  BLK_MAX_CDB) {
 + rq-cmd = kzalloc(hdr-cmd_len, GFP_KERNEL);

So two things here:
- hdr-cmd_len is char so can be MAX of 255. I understand that 4 bytes 
alignment is a SCSI
  thing so you found no point of checking any max size?

- Why the zero alloc, if you are going to paste over it the exact same length. 
Now if like in scsi
  you need 4 bytes alignment and zero padding yes, but here you do not do this 
(and probably shouldn't).
  Hence why zero-alloc?

- Looking at sg.h where hdr-cmd_len is defined it stills has a comment of = 
16 you might want to
  remove the comment by now.

 + if (!rq-cmd)
 + goto out_put_request;
 + }
 +
   ret = -EFAULT;
   if (blk_fill_sghdr_rq(q, rq, hdr, mode))

Inside here: blk_fill_sghdr_rq() calls blk_verify_command() which does:
(Below cmd[] is the buffer copied from user)

/* Anybody who can open the device can do a read-safe command */
if (test_bit(cmd[0], filter-read_ok))
return 0;

/* Write-safe commands require a writable open */
if (test_bit(cmd[0], filter-write_ok)  has_write_perm)
return 0;

Now I am not sure what type Commands you guys intend for these large commands
but if they are say SCSI-VARLEN this test will not work. Also a user might fool
the system with pretending to be read a device modifying command.

I would pass len to this test function and only permit these above if command is
= 16. Any special large command is root only.

Thanks
Boaz

 - goto out;
 + goto out_free_cdb;
  
   if (hdr-iovec_count) {
   size_t iov_data_len;
 @@ -323,7 +328,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
 *bd_disk,
   0, NULL, iov);
   if (ret  0) {
   kfree(iov);
 - goto out;
 + goto out_free_cdb;
   }
  
   iov_data_len = ret;
 @@ -346,7 +351,7 @@ static int sg_io(struct request_queue *q, struct gendisk 
 *bd_disk,
 GFP_KERNEL);
  
   if (ret)
 - goto out;
 + goto out_free_cdb;
  
   bio = rq-bio;
   memset(sense, 0, sizeof(sense));
 @@ -365,8 +370,13 @@ static int sg_io(struct request_queue *q, struct gendisk 
 *bd_disk,
   hdr-duration = jiffies_to_msecs(jiffies - start_time);
  
   ret = blk_complete_sghdr_rq(rq, hdr, bio);
 -out:
 +
 +out_free_cdb:
 + if (rq-cmd != rq-__cmd)
 + kfree(rq-cmd);
 +out_put_request:
   blk_put_request(rq);
 +out:
   return ret;
  }
  
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] block: support 16 byte CDBs for SG_IO

2014-07-20 Thread Boaz Harrosh

On 07/20/2014 04:27 PM, Christoph Hellwig wrote:
 On Sun, Jul 20, 2014 at 02:47:49PM +0300, Boaz Harrosh wrote:

 So two things here:
 - hdr-cmd_len is char so can be MAX of 255. I understand that 4 bytes 
 alignment is a SCSI
   thing so you found no point of checking any max size?
 
 I don't see any point to force the aligmnet - the devices need to reject
 garbage commands, and if for some reason we'd see future commands
 that are  252 and  255 we are prepared to handle them.
 

I agree

 - Why the zero alloc, if you are going to paste over it the exact same 
 length. Now if like in scsi
   you need 4 bytes alignment and zero padding yes, but here you do not do 
 this (and probably shouldn't).
   Hence why zero-alloc?
 
 No good reason except that's what sg and bsg do.
 

Ha sorry didn't look there. Looks redundant to me that's all


 Inside here: blk_fill_sghdr_rq() calls blk_verify_command() which does:
 (Below cmd[] is the buffer copied from user)

  /* Anybody who can open the device can do a read-safe command */
  if (test_bit(cmd[0], filter-read_ok))
  return 0;

  /* Write-safe commands require a writable open */
  if (test_bit(cmd[0], filter-write_ok)  has_write_perm)
  return 0;

 Now I am not sure what type Commands you guys intend for these large 
 commands
 but if they are say SCSI-VARLEN this test will not work. Also a user might 
 fool
 the system with pretending to be read a device modifying command.

 I would pass len to this test function and only permit these above if 
 command is
 = 16. Any special large command is root only.
 
 Honestly that whole filter crap should never have been merged to start with,
 you'll just need proper CAP_SYS_RAWIO for variable length commands.
 
 

I agree. What I'm saying is - Are you sure that with current code as is we will 
not
pass on large commands without the proper CAP_SYS_RAWIO.

Should we not make sure and add:
/* root can do any command. */
if (capable(CAP_SYS_RAWIO))
return 0;
+
+   if (cmnd_len  BLK_MAX_CDB)
+   return -EPERM;

Rrrr you are right. I finally get the filter code. Anything that is not 
white-listed
is rejected. OK sorry for the noise.

Reviewed-by: Boaz Harrosh b...@plexistor.com

Thanks
Boaz



--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] sg: add SG_FLAG_Q_AT_TAIL flag

2014-06-05 Thread Boaz Harrosh

On 06/04/2014 05:58 PM, Douglas Gilbert wrote:
 When the SG_IO ioctl was copied into the block layer and
 later into the bsg driver, subtle differences emerged.
 
 One difference is the way injected commands are queued through
 the block layer (i.e. this is not SCSI device queueing nor SATA
 NCQ). Summarizing:
- SG_IO in the block layer: blk_exec*(at_head=false)
- sg SG_IO: at_head=true
- bsg SG_IO: at_head=true
 
 Some time ago Boaz Harrosh introduced a sg v4 flag called
 BSG_FLAG_Q_AT_TAIL to override the bsg driver default.
 This patch does the equivalent for the sg driver.
 

Yep

 
 ChangeLog:
  Introduce SG_FLAG_Q_AT_TAIL flag to cause commands
  to be injected into the block layer with
  at_head=false.
 
 Changes since v1:
  Make guard condition (only take sg v3 interface or later
  invocations) clearer.
 
 Signed-off-by: Douglas Gilbert dgilb...@interlog.com
 

Douglas Hi

Please I'm just curious? up until now all application users
used SG_FLAG_Q_AT_HEAD. Therefor I deduce that you guys
came across a new application that can make use of the new
SG_FLAG_Q_AT_TAIL facility.

What was the application's writer considerations for using
the old sg interface and not use the newer bsg interface
that already has this support. For me I can see 2 main areas
that I find bsg missing.

1. aio scatter_gather type io. 
   (ie multiple pointers multiple length buffers that are
written / read from same linear range on device)
  [The async aspect of aio can be implemented via bsg
   with the write+read system calls]
2. mmap of direct device range to user vm memory

Which of these, or other, considerations where cardinal
in using of the old interface in new code?

(For me personally both above shortcomings are very
 much missed]

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5] sg: relax 16 byte cdb restriction

2014-06-05 Thread Boaz Harrosh

On 06/05/2014 06:40 PM, Boaz Harrosh wrote:
 On 06/03/2014 08:18 PM, Douglas Gilbert wrote:
 v4 of this patch was sent 20131201.

 ChangeLog:
  - remove the 16 byte CDB (SCSI command) length limit
from the sg driver by handling longer CDBs the same
way as the bsg driver. Remove comment from sg.h
public interface about the cmd_len field being
limited to 16 bytes.
  - remove some dead code caused by this change
  - cleanup comment block at the top of sg.h, fix urls

 Signed-off-by: Douglas Gilbert dgilb...@interlog.com


 sg_cdb_dg5.patch


 
  
 +/* SG_MAX_CDB_SIZE should be 260 (spc4r37 section 3.1.30) however the type
 + * of sg_io_hdr::cmd_len can only represent 255
 + */
 +#define SG_MAX_CDB_SIZE 255
 +


BTW here too. Why new code is using the old interface?
I thought that the all point for having bsg at all is
that new stuff are implemented there in a cleaner interface
for example large CDBs.
Otherwise what was the point for bsg in the first place?
(It was before my time, It would be nice to know?)

What is then left unique at bsg after this patch? only bidi?

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ANNOUNCE] scsi patch queue tree

2014-06-02 Thread Boaz Harrosh

On 06/02/2014 10:58 AM, Christoph Hellwig wrote:
 With Linus opening the merge window things should calm down now.
 
 I've pushed the mvsas pciid update, and a revert of the be2scsi patch
 that wasn't quite ready to the drivers-for-3.16 branch.
 
 I've also created a new drivers-for-3.16-2 branch that contains the
 remaining hpsa updates that might be too big for Linus taste that late
 but should probably be fine.  I also expect to add the relatively small
 lpfc update to it once I get second review in proper form.
 

Hi Christoph

Could you please add this very trivial patch for this merge window?
http://www.spinics.net/lists/linux-scsi/msg74774.html

my hack here: http://www.spinics.net/lists/linux-scsi/msg74775.html

Forgot to CC you sorry
Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ANNOUNCE] scsi patch queue tree

2014-06-02 Thread Boaz Harrosh

 
 Looks more like an ack than a hack :)
 

Oops so I do need that coffee  ;0)


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/1] include/scsi/osd_protocol.h: remove unnecessary __constant

2014-06-01 Thread Boaz Harrosh

On 06/01/2014 05:06 PM, Fabian Frederick wrote:
 __constant_cpu_to_be16 converted to cpu_to_be16
 
 This patch fixes checkpatch warnings:
 
 WARNING: __constant_cpu_to_be16 should be cpu_to_be16
 
 Cc: Boaz Harrosh bharr...@panasas.com
 Cc: Benny Halevy bhal...@primarydata.com
 Signed-off-by: Fabian Frederick f...@skynet.be

Ack-by: Boaz Harrosh bharr...@panasas.com

James please apply
Thanks
Boaz

 ---
  include/scsi/osd_protocol.h | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/include/scsi/osd_protocol.h b/include/scsi/osd_protocol.h
 index 25ac628..a2594af 100644
 --- a/include/scsi/osd_protocol.h
 +++ b/include/scsi/osd_protocol.h
 @@ -263,16 +263,16 @@ static inline struct osd_cdb_head *osd_cdb_head(struct 
 osd_cdb *ocdb)
   * Ex name = FORMAT_OSD we have OSD_ACT_FORMAT_OSD  OSDv1_ACT_FORMAT_OSD
   */
  #define OSD_ACT___(Name, Num) \
 - OSD_ACT_##Name = __constant_cpu_to_be16(0x8880 + Num), \
 - OSDv1_ACT_##Name = __constant_cpu_to_be16(0x8800 + Num),
 + OSD_ACT_##Name = cpu_to_be16(0x8880 + Num), \
 + OSDv1_ACT_##Name = cpu_to_be16(0x8800 + Num),
  
  /* V2 only actions */
  #define OSD_ACT_V2(Name, Num) \
 - OSD_ACT_##Name = __constant_cpu_to_be16(0x8880 + Num),
 + OSD_ACT_##Name = cpu_to_be16(0x8880 + Num),
  
  #define OSD_ACT_V1_V2(Name, Num1, Num2) \
 - OSD_ACT_##Name = __constant_cpu_to_be16(Num2), \
 - OSDv1_ACT_##Name = __constant_cpu_to_be16(Num1),
 + OSD_ACT_##Name = cpu_to_be16(Num2), \
 + OSDv1_ACT_##Name = cpu_to_be16(Num1),
  
  enum osd_service_actions {
   OSD_ACT_V2(OBJECT_STRUCTURE_CHECK,  0x00)
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: lpfc: lpfc_init: use kcalloc for allocating memory

2014-04-08 Thread Boaz Harrosh

On 03/18/2014 10:51 PM, Matei Oprea wrote:
 It's easier to use kcalloc for allocating arrays. While at it
 also remove useless casting value.
 
 Signed-off-by: Matei Oprea e...@opreamatei.ro
 Cc: ROSEdu Kernel Community fire...@lists.rosedu.org
 ---
  drivers/scsi/lpfc/lpfc_init.c |4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
 index 68c94cc..0a51ca5 100644
 --- a/drivers/scsi/lpfc/lpfc_init.c
 +++ b/drivers/scsi/lpfc/lpfc_init.c
 @@ -4731,9 +4731,9 @@ lpfc_sli_driver_resource_setup(struct lpfc_hba *phba)
   }
  
   if (!phba-sli.ring)
 - phba-sli.ring = (struct lpfc_sli_ring *)
 - kzalloc(LPFC_SLI3_MAX_RING *
 + phba-sli.ring = kcalloc(LPFC_SLI3_MAX_RING,
   sizeof(struct lpfc_sli_ring), GFP_KERNEL);
 +

Just a nit please put this recheck inside the if(){} so in the hot path
when phba-sli.ring is already allocated it will not test twice.

   if (!phba-sli.ring)

and unlikely() on this if

   return -ENOMEM;
  
 

thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: misc scsi midlayer updates

2014-03-30 Thread Boaz Harrosh

On 03/27/2014 06:14 PM, Christoph Hellwig wrote:
 Various patches from the scsi multiqueue development that make sense on their
 own.
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 

Hi Christoph

Many years ago I have sent these exact patches but no-one cared Including
me I guess.

(one instance is here, I sent it several times)
http://comments.gmane.org/gmane.linux.scsi/63347

Please note the 4th patch which is a real BUG, titled:
scsi_lib: Can't RETRY scsi_cmnd if some bytes were completed

I think my:
scsi_lib: Remove that __scsi_release_buffers contraption
Is more elegant, is layered better and is smaller code. (please
consider my version)

Also the first patch is some more cleanup you'd like.

The main patch of collapsing  scsi_end_request is basically the same.

Funny that I've just rebased an exactly 2 years ago version and it
rebased cleanly. So here they are for your reference.

[PATCH 1/4] scsi_lib: request_queue is only needed inside
[PATCH 2/4] scsi_lib: Remove that __scsi_release_buffers contraption
[PATCH 3/4] scsi_lib: Collapse scsi_end_request into only user
[PATCH 4/4] scsi_lib: BUG: Can't RETRY scsi_cmnd if some bytes were

[Your patches have been tested within my MQ testing right?]

Thanks for pushing on this
Boaz

From 9809b82bbb9813370738ac00400c01d61b0c51b4 Mon Sep 17 00:00:00 2001
From: Boaz Harrosh bharr...@panasas.com
Date: Mon, 4 Jan 2010 10:45:56 +0200
Subject: [PATCH 1/4] scsi_lib: request_queue is only needed inside
 scsi_requeue_command

This is a pure cleanup. Just starting the engines for things
to come.

Signed-off-by: Boaz Harrosh bharr...@panasas.com
---
 drivers/scsi/scsi_lib.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 62ec84b..dd834ca 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -491,10 +491,11 @@ void scsi_requeue_run_queue(struct work_struct *work)
  *		sector.
  * Notes:	Upon return, cmd is a stale pointer.
  */
-static void scsi_requeue_command(struct request_queue *q, struct scsi_cmnd *cmd)
+static void scsi_requeue_command(struct scsi_cmnd *cmd)
 {
 	struct scsi_device *sdev = cmd-device;
 	struct request *req = cmd-request;
+	struct request_queue *q = req-q;
 	unsigned long flags;
 
 	/*
@@ -565,7 +566,6 @@ static void __scsi_release_buffers(struct scsi_cmnd *, int);
 static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
 	  int bytes, int requeue)
 {
-	struct request_queue *q = cmd-device-request_queue;
 	struct request *req = cmd-request;
 
 	/*
@@ -584,7 +584,7 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error,
  * queue, and goose the queue again.
  */
 scsi_release_buffers(cmd);
-scsi_requeue_command(q, cmd);
+scsi_requeue_command(cmd);
 cmd = NULL;
 			}
 			return cmd;
@@ -779,7 +779,6 @@ static int __scsi_error_from_host_byte(struct scsi_cmnd *cmd, int result)
 void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 {
 	int result = cmd-result;
-	struct request_queue *q = cmd-device-request_queue;
 	struct request *req = cmd-request;
 	int error = 0;
 	struct scsi_sense_hdr sshdr;
@@ -1003,7 +1002,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 			scsi_print_command(cmd);
 		}
 		if (blk_end_request_err(req, error))
-			scsi_requeue_command(q, cmd);
+			scsi_requeue_command(cmd);
 		else
 			scsi_next_command(cmd);
 		break;
@@ -1012,7 +1011,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 		 * A new command will be prepared and issued.
 		 */
 		scsi_release_buffers(cmd);
-		scsi_requeue_command(q, cmd);
+		scsi_requeue_command(cmd);
 		break;
 	case ACTION_RETRY:
 		/* Retry the same command immediately */
-- 
1.8.5.3

From aac579b02c7d3665665a1f1f7c064a82b70b873a Mon Sep 17 00:00:00 2001
From: Boaz Harrosh bharr...@panasas.com
Date: Mon, 4 Jan 2010 12:46:06 +0200
Subject: [PATCH 2/4] scsi_lib: Remove that __scsi_release_buffers contraption

Remove do_bidi_check by checking and nullifying cmd-request
when blk_end_* indicates the request is no longer valid.

Signed-off-by: Boaz Harrosh bharr...@panasas.com
---
 drivers/scsi/scsi_lib.c | 41 +
 1 file changed, 17 insertions(+), 24 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index dd834ca..c4cdc6a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -539,8 +539,6 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
 		scsi_run_queue(sdev-request_queue);
 }
 
-static void __scsi_release_buffers(struct scsi_cmnd *, int);
-
 /*
  * Function:scsi_end_request()
  *
@@ -591,11 +589,12 @@ static struct scsi_cmnd *scsi_end_request(struct scsi_cmnd *cmd, int error

Re: [PATCH v6 2/2] block,scsi: convert and handle ERR_PTR from blk_get_request

2013-11-04 Thread Boaz Harrosh

On 10/31/2013 03:50 PM, Joe Lawrence wrote:
 The blk_get_request function may fail in low-memory conditions or during
 device removal (even if __GFP_WAIT is set). To distinguish between these
 errors, modify the blk_get_request call stack to return the appropriate
 ERR_PTR. Verify that all callers check the return status and consider
 IS_ERR instead of a simple NULL pointer check.
 
 Signed-off-by: Joe Lawrence joe.lawre...@stratus.com
 ---

 diff --git a/drivers/scsi/osd/osd_initiator.c 
 b/drivers/scsi/osd/osd_initiator.c
 index aa66361ed44b..0250efda647b 100644
 --- a/drivers/scsi/osd/osd_initiator.c
 +++ b/drivers/scsi/osd/osd_initiator.c
 @@ -1567,8 +1567,8 @@ static struct request *_make_request(struct 
 request_queue *q, bool has_write,
   struct request *req;
  
   req = blk_get_request(q, has_write ? WRITE : READ, flags);
 - if (unlikely(!req))
 - return ERR_PTR(-ENOMEM);
 + if (unlikely(IS_ERR(req)))

Just a nit IS_ERR already has an unlikely so it can be dropped here. (No bigy)

ACK-by: Boaz Harrosh bharr...@panasas.com

 + return req;
  
   return req;
   }


Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Bypass block layer and Fill SCSI lower layer driver queue

2013-09-18 Thread Boaz Harrosh

On 09/18/2013 05:07 PM, Douglas Gilbert wrote:
 On 13-09-18 03:58 AM, Jack Wang wrote:
 On 09/18/2013 08:41 AM, Alireza Haghdoost wrote:
 Hi

 I am working on a high throughput and low latency application which
 does not tolerate block layer overhead to send IO request directly to
 fiber channel lower layer SCSI driver. I used to work with libaio but
 currently I am looking for a way to by pass the block layer and send
 SCSI commands from the application layer directly to the SCSI driver
 using /dev/sgX device and ioctl() system call.

 I have noticed that sending IO request through sg device even with
 nonblocking and direct IO flags is quite slow and does not fill up
 lower layer SCSI driver TCQ queue. i.e IO depth or
 /sys/block/sdX/in_flight is always ZERO. Therefore the application
 throughput is even lower that sending IO request through block layer
 with libaio and io_submit() system call. In both cases I used only one
 IO context (or fd) and single threaded.

 Hi Alireza,

 I think what you want is in_flight command scsi dispatch to low level
 device.
 I submit a simple patch to export device_busy

 http://www.spinics.net/lists/linux-scsi/msg68697.html

 I also notice fio sg engine will not fill queue properly, but haven't
 look into deeper.

 Cheers
 Jack

 I have noticed that some well known benchmarking tools like fio does
 not support IO depth for sg devices as well. Therefore, I was
 wondering if it is feasible to bypass block layer and achieve higher
 throughput and lower latency (for sending IO request only).


 Any comment on my issue is highly appreciated.
 
 I'm not sure if this is relevant to your problem but by
 default both the bsg and sg drivers queue at head
 when they inject SCSI commands into the block layer.
 
 The bsg driver has a BSG_FLAG_Q_AT_TAIL flag to change
 that queueing to what may be preferable for your purposes.
 The sg driver could, but does not, support that flag.
 

Yes!

The current best bet for keeping the queues full are with libaio
and direct + asynchronous IO. It should not be significantly slower than
bsg. (Believe me, with direct IO Block-Device cache is bypassed and
the only difference is in who prepares the struct requests for submission)

As Doug said sg can not do it. Also with bsg and above BSG_FLAG_Q_AT_TAIL
You will need to use the write() option and not the ioctl() because the
later is synchronous and you want an asynchronous submit of commands
and background completion of them. (Which is what libaio does with async IO)

With bsg you achieve that with using write() in combination of read() to
receive the completions.

 Doug Gilbert
 

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4] block: handle pointer error from blk_get_request

2013-05-23 Thread Boaz Harrosh

On 23/05/13 23:09, Joe Lawrence wrote:
 Hi Jens,
 
 Subject: [PATCH v4] block: handle pointer error from blk_get_request
 
 The blk_get_request function may fail in low-memory conditions or during
 device removal (even if __GFP_WAIT is set). To distinguish between these
 errors, modify the blk_get_request call stack to return the appropriate
 error pointer. Verify that all callers check the return status and
 consider IS_ERR instead of a simple NULL pointer check.
 
 Signed-off-by: Joe Lawrence joe.lawre...@stratus.com
 Cc: Jens Axboe ax...@kernel.dk
 Cc: James E.J. Bottomley jbottom...@parallels.com
 Cc: Bart Van Assche bvanass...@acm.org
 Cc: linux-scsi@vger.kernel.org

ACK-by: Boaz Harrosh bharr...@panasas.com

 ---

  drivers/scsi/osd/osd_initiator.c|  4 ++--

 diff --git a/drivers/scsi/osd/osd_initiator.c 
 b/drivers/scsi/osd/osd_initiator.c
 index d8293f2..b4cd56b 100644
 --- a/drivers/scsi/osd/osd_initiator.c
 +++ b/drivers/scsi/osd/osd_initiator.c
 @@ -1567,8 +1567,8 @@ static struct request *_make_request(struct 
 request_queue *q, bool has_write,
   struct request *req;
  
   req = blk_get_request(q, has_write ? WRITE : READ, flags);
 - if (unlikely(!req))
 - return ERR_PTR(-ENOMEM);
 + if (unlikely(IS_ERR(req)))
 + return req;
  
   return req;
   }


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

2013-02-07 Thread Boaz Harrosh

On 02/07/2013 01:27 PM, Hannes Reinecke wrote:
 On 02/07/2013 11:01 AM, Darrick J. Wong wrote:
 On Thu, Feb 07, 2013 at 01:40:14AM -0800, Joel Becker wrote:
 On Wed, Feb 06, 2013 at 03:34:49PM -0500, Chuck Lever wrote:

 On Feb 6, 2013, at 3:24 PM, Darrick J. Wong darrick.w...@oracle.com 
 wrote:

 On Wed, Feb 06, 2013 at 01:51:22PM -0600, Ben Myers wrote:
 Hi,

 I'm interested in discussing how to pass protection information to and 
 from
 userspace.  Maybe Martin could be enlisted for the discussion.

 I read that some work has already been done in this area but have not 
 been able
 to locate it.  It looks like the bio-integrity code already makes it 
 possible
 to generate the t10-dif crc in the filesystem.  It would be good to be 
 able to
 get the guard and application tags back out to backup applications such 
 as
 xfsdump.  Enabling other applications to generate their own tags in 
 userspace
 is also interesting.

 This one's been on my list for a couple of years (and companies) too.  A 
 few
 years ago Joel Becker had support for it in his sys_dio proposal (that 
 hasn't
 gone anywhere), and more recently I've theorized that we could add a magic
 fcntl/ioctl to make the kernel recognize, say, the first iovec of a 
 O_DIRECT
 *{read,write}v call as the PI buffer, which I think is similar to how DIX 
 gets
 PI data to a disk.  But it's not like I have any code to show for it.

 I /think/ it's fairly straightforward to change the directio submit code 
 to
 find the userspace PI buffer and amend the block integrity code to attach 
 our
 own PI buffer.  You'd still have to let the block layer set the sector # 
 field,
 but afaik that won't affect the crc or the app tag.

 I hear that the NFS guys want to propose some sort of protocol for 
 transmitting
 PI data (across NFS), but I haven't seen anything concrete yet.

 I'm writing a requirements document for the NFS protocol which I can 
 discuss at LSF.  The use cases for NFS for now would be virtual disk 
 devices (hypervisors) or direct NFS access to storage from user space.

 Like everyone else we are waiting for a magical VFS and user space API to 
 appear that can pass PI to and from storage.

 I'm happy to chat about it.  Unfortunately, like Darrick says, sys_dio()
 coding hasn't happened.  I do think we're better off with some kind of
 explicit API than some magic state on the file.  I mean, even something
 like:

 ssize_t write_with_pi(int fd, const void *buf, size_t count,
   const void *pi, size_t pi_count);

 It's not as nice as a non-historical API (eg sys_dio), but it also
 probably plays nicer with buffered I/O.

 I also pondered simply adding a new io_prep_* function + IO_CMD_ code to 
 libaio
 and all the other plumbing necessary to make that happen...

 void io_prep_preadv_pi(struct iocb *iocb, int fd, const struct iovec *iov,
 int iovcnt, long long offset, const void *pi,
 size_t pi_count);

 This is also what I've envisioned.
 Updating io_prep / async I/O is reasonably easy as its been using a 
 separate structure for passing in the I/O details.
 
 Normal read/write calls don't really map as you simply don't have 
 enough parameter to feed PI information into the kernel.
 So for that you'd need to invent a new interface / syscall.
 
 For aio we just need to add additional fields to an existing structure.
 
 So yeah, I'd be interested in that discussion as well.
 

Me too, in multiple fronts. It's part of my general concern about
   things we would like for user-mode servers

I think that the current aio and libaio Interface is broken for a long
time, for multitude of reasons. For instance the nested structure definitions
are COMPAT broken, and lots of missing pieces. (For example search in archives
for why bsg does not support sg-lists.)

And there are all these additions that everyone wants on top, that call for
a new interface anyway.

So I would like to see a deep fixup of this interface, with an aio version2
that can take into considerations, all of future needs including these
above. Kernel code will be very happy to be implemented with the new, interface
and a COMPAT layer could be put in place for the old interface.

All interested parties should bring to the table what is the extension/changes
they need. And we can try and union all of them together.

(My addition is for support of sg_lists to bsg, in a way that makes Tomo happy
 I know that qemu was wanting this for a while as well as the multitude of
 user-mode servers)

Thanks
Boaz

 Cheers,
 
 Hannes
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

2013-02-07 Thread Boaz Harrosh

On 02/07/2013 02:08 PM, Boaz Harrosh wrote:
 On 02/07/2013 01:27 PM, Hannes Reinecke wrote:
 On 02/07/2013 11:01 AM, Darrick J. Wong wrote:
 On Thu, Feb 07, 2013 at 01:40:14AM -0800, Joel Becker wrote:
 On Wed, Feb 06, 2013 at 03:34:49PM -0500, Chuck Lever wrote:

 On Feb 6, 2013, at 3:24 PM, Darrick J. Wong darrick.w...@oracle.com 
 wrote:

 On Wed, Feb 06, 2013 at 01:51:22PM -0600, Ben Myers wrote:
 Hi,

 I'm interested in discussing how to pass protection information to and 
 from
 userspace.  Maybe Martin could be enlisted for the discussion.

 I read that some work has already been done in this area but have not 
 been able
 to locate it.  It looks like the bio-integrity code already makes it 
 possible
 to generate the t10-dif crc in the filesystem.  It would be good to be 
 able to
 get the guard and application tags back out to backup applications such 
 as
 xfsdump.  Enabling other applications to generate their own tags in 
 userspace
 is also interesting.

 This one's been on my list for a couple of years (and companies) too.  A 
 few
 years ago Joel Becker had support for it in his sys_dio proposal (that 
 hasn't
 gone anywhere), and more recently I've theorized that we could add a 
 magic
 fcntl/ioctl to make the kernel recognize, say, the first iovec of a 
 O_DIRECT
 *{read,write}v call as the PI buffer, which I think is similar to how 
 DIX gets
 PI data to a disk.  But it's not like I have any code to show for it.

 I /think/ it's fairly straightforward to change the directio submit code 
 to
 find the userspace PI buffer and amend the block integrity code to 
 attach our
 own PI buffer.  You'd still have to let the block layer set the sector # 
 field,
 but afaik that won't affect the crc or the app tag.

 I hear that the NFS guys want to propose some sort of protocol for 
 transmitting
 PI data (across NFS), but I haven't seen anything concrete yet.

 I'm writing a requirements document for the NFS protocol which I can 
 discuss at LSF.  The use cases for NFS for now would be virtual disk 
 devices (hypervisors) or direct NFS access to storage from user space.

 Like everyone else we are waiting for a magical VFS and user space API to 
 appear that can pass PI to and from storage.

 I'm happy to chat about it.  Unfortunately, like Darrick says, sys_dio()
 coding hasn't happened.  I do think we're better off with some kind of
 explicit API than some magic state on the file.  I mean, even something
 like:

ssize_t write_with_pi(int fd, const void *buf, size_t count,
  const void *pi, size_t pi_count);

 It's not as nice as a non-historical API (eg sys_dio), but it also
 probably plays nicer with buffered I/O.

 I also pondered simply adding a new io_prep_* function + IO_CMD_ code to 
 libaio
 and all the other plumbing necessary to make that happen...

 void io_prep_preadv_pi(struct iocb *iocb, int fd, const struct iovec *iov,
int iovcnt, long long offset, const void *pi,
size_t pi_count);

 This is also what I've envisioned.
 Updating io_prep / async I/O is reasonably easy as its been using a 
 separate structure for passing in the I/O details.

 Normal read/write calls don't really map as you simply don't have 
 enough parameter to feed PI information into the kernel.
 So for that you'd need to invent a new interface / syscall.

 For aio we just need to add additional fields to an existing structure.

 So yeah, I'd be interested in that discussion as well.

 
 Me too, in multiple fronts. It's part of my general concern about
things we would like for user-mode servers
 
 I think that the current aio and libaio Interface is broken for a long
 time, for multitude of reasons. For instance the nested structure definitions
 are COMPAT broken, and lots of missing pieces. (For example search in archives
 for why bsg does not support sg-lists.)
 
 And there are all these additions that everyone wants on top, that call for
 a new interface anyway.
 
 So I would like to see a deep fixup of this interface, with an aio version2
 that can take into considerations, all of future needs including these
 above. Kernel code will be very happy to be implemented with the new, 
 interface
 and a COMPAT layer could be put in place for the old interface.
 
 All interested parties should bring to the table what is the extension/changes
 they need. And we can try and union all of them together.
 
 (My addition is for support of sg_lists to bsg, in a way that makes Tomo happy
  I know that qemu was wanting this for a while as well as the multitude of
  user-mode servers)
 

I wanted to add that there is another LSF/MM thread going on about:
[LSF TOPIC] What to do about O_DIRECT?

All these guys should be participating here, so to change core structures
and behavior to a better model, that helps us here, and not against us.

(Again libaio should be changed in concert with Kernel's new API, and we
 can sacrifice old user-mode performance

Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

2013-02-07 Thread Boaz Harrosh

On 02/07/2013 02:29 PM, Bart Van Assche wrote:
 On 02/07/13 13:08, Boaz Harrosh wrote:
 (My addition is for support of sg_lists to bsg, in a way that makes Tomo 
 happy
   I know that qemu was wanting this for a while as well as the multitude of
   user-mode servers)
 
 Do you think it would help / make sense if sg_alloc_table() would be 
 modified such that it allocates the entire scatterlist table via one 
 vmalloc() call instead of chaining several page-sized scatterlist tables 
 ? Note: such a change is not possible without modifying 
 scsi_alloc_sgtable().
 

I don't think so, no. sg_alloc_table() is used not only for direct IO
also for buffered, Now vmalloc() is terribly slow and would be a bottleneck
in today's SSD performance.

I love it that the Linux Kernel never uses vmalloc internally, and only ever
chains everything to upto PAGE_SIZE sized objects. Coming from all these
other OSs that don't, believe me, it is great great performance pain.
(TLBs are a bitch)

 Bart.
 

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [LSF/MM TOPIC][ATTEND] protection information and userspace

2013-02-07 Thread Boaz Harrosh

On 02/07/2013 02:33 PM, Hannes Reinecke wrote:
 On 02/07/2013 01:16 PM, Boaz Harrosh wrote:
 (Again libaio should be changed in concert with Kernel's new API, and we
   can sacrifice old user-mode performance, with a COMPAT layer. Distro
   maintainers should consider replacing libaio, together with the new
   Kernel, so it is only those that do their own mix-and-match, who can
   fix that mismatch too)

 And while we're at it, I still would _love_ to connect aio_cancel() 
 and blk_abort_request().
 
 That way we could sensibly abort an I/O and get out of the darn 'D' 
 state.
 

Yes!! Thanks. It is very interesting how the socket side of the world
had it correct for ages, and the same fd object on disks is second grade
citizen in UNIX land. (Anybody voting for epoll on async disk IO? )

Thanks Hannes yes that too. And wait_interuptable() too, at couple of
places, will need some serious error handling audit for that.

 Cheers,
 
 Hannes
 

Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [LSF/MM TOPIC] Thin provisioning SOFT_THRESHOLD error handling

2013-02-07 Thread Boaz Harrosh

On 01/29/2013 10:14 AM, Hannes Reinecke wrote:
 Hi all,
 
 Thin-provisioned devices have the ability to set a 'soft threshold', 
 which is triggered if the real free space for this device is beyond 
 this mark.
 
 The intention behind this is to allow the system to induce some 
 garbage collection with possibly freeing up unused space.
 
 Initially it would be possible to execute garbage collection on 
 filesystems (eg for btrfs).
 
 However, as this concept applies to other areas within the kernel
 (like dm-thinp or even btrfs itself) it might be an idea to have
 a general mechanism / error handling etc in place.
 
 I would like to discuss at LSF the possible implementations
 and handling mechanism for this kind of failure scenarios.
 

Hannes hi!

This is received as an unit attentions, right?
Will it not be worth while to solve the general unit attentions
under udev events, once and for all. Than such a btrfs GC above
can just be a simple oneline udev rule.

(I think that the event-storm problem you had at the time can
 be solved with some Kernel side unit attentions queue, and
 greatly reduce the chance for missed events)

Thanks
Boaz

 Cheers,
 
 Hannes
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch] [SCSI] libosd: check for kzalloc() failure

2013-01-30 Thread Boaz Harrosh

On 01/30/2013 04:34 PM, walter harms wrote:
 

 I start to see the complexity of the situation. Would you mind to add
 the comment it can be anything.  UTF-8 is more likely but not guaranteed 
 either ?
 For now using a pascal-string seems the best solution but it should be warned
 that get_attrs[a].val_ptr is NOT a c-string and therefore can not be used with
 the printf-family (i guess the situation will become more clear in future)
 

OK, So... The long story

The STD says that osdname can be *anything* binary or otherwise, and of *any*
length. And is used to uniquely identify the OSD-device/partition in a cluster.

We decided that if so we would stick an 40 char UUID in it, generated by a 
uuidgen
and all the external utilities and surrounding code forces it, and treat it as
a c-string. But this code here in the core cannot make that assumption and still
support the STD.

On the other hand we did want the osdname to be printable in traces and messages
because it is such an important identifier. So I have decided to sacrifice an
extra char in-memory to carry a \NUL and safely stick it into printk(s). Those
(none existent) OSD devices that will put unprintable characters in here will
still function fine, but will look real scary in printk(s).

Please note that the one that sets policy is the osd-target vendor. (And they
all currently use my code base)

So save the kzalloc check this code (tested) is safe and will show strings when
present, but will gracefully show ugly things but still work when not.

 I have no clue why you need that, using c-strings makes life more easy in the
 last minute a sprintf(buf,%u) will save the day ;)
 

Actually it is very funny, because just recently (last week) I have discovered
something that eliminates all those funny business.

printf(%*s, odi-osdname, odi-osdname_len);

The * will instruct c to expect an extra variable following %s which is the
max_length of %s. This is exactly for pascal strings and the such like above.

So I added a TODO to clean that a bit by always printing with %*s


 It look clever to add the NULL test here.
 Noone reviewing the code will understand that.
 (Rule of least surprise)
 

Thanks for looking, I agree it needs a fat comment, I'll do that when I'll
convert to above system. Thanks for looking, That code is really old and never
had any extra eyes on it.

 just my 2 cents,
 re,
  wh
 

Cheers
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [patch] [SCSI] libosd: check for kzalloc() failure

2013-01-30 Thread Boaz Harrosh

On 01/30/2013 09:06 AM, Dan Carpenter wrote:
 There wasn't any error handling for this kzalloc().
 

ACK-by: Boaz Harrosh bharr...@panasas.com

James please queue for inclusion

 Signed-off-by: Dan Carpenter dan.carpen...@oracle.com

Thanks Dan

 
 diff --git a/drivers/scsi/osd/osd_initiator.c 
 b/drivers/scsi/osd/osd_initiator.c
 index c06b8e5..d8293f2 100644
 --- a/drivers/scsi/osd/osd_initiator.c
 +++ b/drivers/scsi/osd/osd_initiator.c
 @@ -144,6 +144,10 @@ static int _osd_get_print_system_info(struct osd_dev *od,
   odi-osdname_len = get_attrs[a].len;
   /* Avoid NULL for memcmp optimization 0-length is good enough */
   odi-osdname = kzalloc(odi-osdname_len + 1, GFP_KERNEL);
 + if (!odi-osdname) {
 + ret = -ENOMEM;
 + goto out;
 + }
   if (odi-osdname_len)
   memcpy(odi-osdname, get_attrs[a].val_ptr, odi-osdname_len);
   OSD_INFO(OSD_NAME   [%s]\n, odi-osdname);
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [GIT PULL] exofs: 3 changes to exofs osd

2012-12-18 Thread Boaz Harrosh

On 12/18/2012 12:58 PM, James Bottomley wrote:
 On Mon, 2012-12-17 at 18:53 +0200, Boaz Harrosh wrote:
 Hi Linus.

 Please pull the following changes since commit [ddffeb8c] Linux 3.7-rc1
 They are available in the git repository at:

   git://git.open-osd.org/linux-open-osd.git for-linus

 for you to fetch changes up to
  [861d6660] exofs: don't leak io_state and pages on read error 
 (2012-12-14 12:17:32 +0200)

 These are just 3 patches, the last two are bug fixes on the error paths
 in exofs.

 The important patch is the one to osd_uld which adds sysfs info to osd
 devices for use by user-mode clustering discovery software. I'm already
 sitting on this patch since before February this year, It is important for
 some of the big installation cluster systems, who's been compiling their
 own kernel just for that patch.
 
 I'm a bit perplexed by this.  You got notice when it was added to the
 SCSI tree and now it's already upstream:
 
 commit 51976a8c85cec0c62e410bc38b8a11dbc690764d
 Author: Boaz Harrosh bharr...@panasas.com
 Date:   Wed Oct 24 14:51:41 2012 -0700
 
 [SCSI] osd_uld: Add osdname  systemid sysfs at scsi_osd class
 
 But the authorship info differs ... it looks like you forgot to include
 the From: tag in your original patch send.
 

I'm so sorry, I completely goofed on this one. It's what happens when
you are swamped with other work and are doing things without thinking.
I totally forgot that I need to remove this patch. 

Both these patches where in linux-next for a long time. So I believe
the merge will go just fine. Lets leave it like this, or I can rebase
and remove it?

 James
 
 

Thanks
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [GIT PULL] exofs: 3 changes to exofs osd

2012-12-18 Thread Boaz Harrosh

On 12/18/2012 01:28 PM, James Bottomley wrote:

 Both these patches where in linux-next for a long time. So I believe
 the merge will go just fine. Lets leave it like this, or I can rebase
 and remove it?
 
 If it merges OK, I'd just leave it as is.  It wouldn't be anywhere close
 to the first time we've had the same patch via different trees.
 

Thanks James, it will not happen again. I just had a look on linux-next
of Dec 17 they both where merged in and I did not see any complains.

 James
 
Boaz

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[GIT PULL] exofs: 3 changes to exofs osd

2012-12-17 Thread Boaz Harrosh

Hi Linus.

Please pull the following changes since commit [ddffeb8c] Linux 3.7-rc1
They are available in the git repository at:

  git://git.open-osd.org/linux-open-osd.git for-linus

for you to fetch changes up to
[861d6660] exofs: don't leak io_state and pages on read error 
(2012-12-14 12:17:32 +0200)

These are just 3 patches, the last two are bug fixes on the error paths
in exofs.

The important patch is the one to osd_uld which adds sysfs info to osd
devices for use by user-mode clustering discovery software. I'm already
sitting on this patch since before February this year, It is important for
some of the big installation cluster systems, who's been compiling their
own kernel just for that patch.

Thanks in advance
Boaz


Boaz Harrosh (1):
  exofs: don't leak io_state and pages on read error

Idan Kedar (1):
  exofs: clean up the correct page collection on write error

Sachin Bhamare (1):
  osduld: Add osdname  systemid sysfs at scsi_osd class

 drivers/scsi/osd/osd_uld.c | 28 
 fs/exofs/inode.c   | 16 +---
 2 files changed, 37 insertions(+), 7 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [osd-dev] [PATCH] osduld: Add osdname systemid sysfs at scsi_osd class

2012-11-05 Thread Boaz Harrosh

On 10/24/2012 02:55 PM, Boaz Harrosh wrote:
 On 10/24/2012 02:51 PM, Boaz Harrosh wrote:

 This patch adds the support for the following two read-only sysfs attributes
 to scsi_osd class members : osdname  systemid

 These attributes will show up as below in sysfs class hierarchy:
 /sys/class/scsi_osd/osdX/osdname
 /sys/class/scsi_osd/osdX/systemid

 The osdname  systemid are OSD device attributes which uniquely
 identify a device on the network, while it's IP and certainly
 it's /dev/osdX device path might change.
 Userspace utilities (e.g. mkfs.exofs) can parse these attributes to
 identify the correct OSD in safer and faster way.

 (Today osd apps open each device in the system and send a
  attributes query for these, in order to access the user
  requested device)

 Signed-off-by: Sachin Bhamare sbham...@panasas.com
 Signed-off-by: Boaz Harrosh bharr...@panasas.com
 
 James hi.
 
 This is for the next (v3.8) merge window. Please submit to scsi-misc tree.
 

James ping!

Do you want that I push it through my osd tree?

Thanks
Boaz

 It is actually a very old well tested, but forgotten patch.
 
 Thanks
 Boaz
 
 ---
  drivers/scsi/osd/osd_uld.c |   28 
  1 files changed, 28 insertions(+), 0 deletions(-)

 diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
 index d4ed9eb..4375417 100644
 --- a/drivers/scsi/osd/osd_uld.c
 +++ b/drivers/scsi/osd/osd_uld.c
 @@ -97,9 +97,37 @@ struct osd_dev_handle {
  
  static DEFINE_IDA(osd_minor_ida);
  
 +/*
 + * scsi sysfs attribute operations
 + */
 +static ssize_t osdname_show(struct device *dev, struct device_attribute 
 *attr,
 +char *buf)
 +{
 +struct osd_uld_device *ould = container_of(dev, struct osd_uld_device,
 +   class_dev);
 +return sprintf(buf, %s\n, ould-odi.osdname);
 +}
 +
 +static ssize_t systemid_show(struct device *dev, struct device_attribute 
 *attr,
 +char *buf)
 +{
 +struct osd_uld_device *ould = container_of(dev, struct osd_uld_device,
 +   class_dev);
 +
 +memcpy(buf, ould-odi.systemid, ould-odi.systemid_len);
 +return ould-odi.systemid_len;
 +}
 +
 +static struct device_attribute osd_uld_attrs[] = {
 +__ATTR(osdname, S_IRUGO, osdname_show, NULL),
 +__ATTR(systemid, S_IRUGO, systemid_show, NULL),
 +__ATTR_NULL,
 +};
 +
  static struct class osd_uld_class = {
  .owner  = THIS_MODULE,
  .name   = scsi_osd,
 +.dev_attrs  = osd_uld_attrs,
  };
  
  /*

 
 
 ___
 osd-dev mailing list
 osd-...@open-osd.org
 http://mailman.open-osd.org/mailman/listinfo/osd-dev
 


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] osduld: Add osdname systemid sysfs at scsi_osd class

2012-10-24 Thread Boaz Harrosh


This patch adds the support for the following two read-only sysfs attributes
to scsi_osd class members : osdname  systemid

These attributes will show up as below in sysfs class hierarchy:
/sys/class/scsi_osd/osdX/osdname
/sys/class/scsi_osd/osdX/systemid

The osdname  systemid are OSD device attributes which uniquely
identify a device on the network, while it's IP and certainly
it's /dev/osdX device path might change.
Userspace utilities (e.g. mkfs.exofs) can parse these attributes to
identify the correct OSD in safer and faster way.

(Today osd apps open each device in the system and send a
 attributes query for these, in order to access the user
 requested device)

Signed-off-by: Sachin Bhamare sbham...@panasas.com
Signed-off-by: Boaz Harrosh bharr...@panasas.com
---
 drivers/scsi/osd/osd_uld.c |   28 
 1 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
index d4ed9eb..4375417 100644
--- a/drivers/scsi/osd/osd_uld.c
+++ b/drivers/scsi/osd/osd_uld.c
@@ -97,9 +97,37 @@ struct osd_dev_handle {
 
 static DEFINE_IDA(osd_minor_ida);
 
+/*
+ * scsi sysfs attribute operations
+ */
+static ssize_t osdname_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct osd_uld_device *ould = container_of(dev, struct osd_uld_device,
+  class_dev);
+   return sprintf(buf, %s\n, ould-odi.osdname);
+}
+
+static ssize_t systemid_show(struct device *dev, struct device_attribute *attr,
+   char *buf)
+{
+   struct osd_uld_device *ould = container_of(dev, struct osd_uld_device,
+  class_dev);
+
+   memcpy(buf, ould-odi.systemid, ould-odi.systemid_len);
+   return ould-odi.systemid_len;
+}
+
+static struct device_attribute osd_uld_attrs[] = {
+   __ATTR(osdname, S_IRUGO, osdname_show, NULL),
+   __ATTR(systemid, S_IRUGO, systemid_show, NULL),
+   __ATTR_NULL,
+};
+
 static struct class osd_uld_class = {
.owner  = THIS_MODULE,
.name   = scsi_osd,
+   .dev_attrs  = osd_uld_attrs,
 };
 
 /*
-- 
1.7.6.5

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [osd-dev] [PATCH] osduld: Add osdname systemid sysfs at scsi_osd class

2012-10-24 Thread Boaz Harrosh

On 10/24/2012 02:51 PM, Boaz Harrosh wrote:
 
 This patch adds the support for the following two read-only sysfs attributes
 to scsi_osd class members : osdname  systemid
 
 These attributes will show up as below in sysfs class hierarchy:
 /sys/class/scsi_osd/osdX/osdname
 /sys/class/scsi_osd/osdX/systemid
 
 The osdname  systemid are OSD device attributes which uniquely
 identify a device on the network, while it's IP and certainly
 it's /dev/osdX device path might change.
 Userspace utilities (e.g. mkfs.exofs) can parse these attributes to
 identify the correct OSD in safer and faster way.
 
 (Today osd apps open each device in the system and send a
  attributes query for these, in order to access the user
  requested device)
 
 Signed-off-by: Sachin Bhamare sbham...@panasas.com
 Signed-off-by: Boaz Harrosh bharr...@panasas.com

James hi.

This is for the next (v3.8) merge window. Please submit to scsi-misc tree.

It is actually a very old well tested, but forgotten patch.

Thanks
Boaz

 ---
  drivers/scsi/osd/osd_uld.c |   28 
  1 files changed, 28 insertions(+), 0 deletions(-)
 
 diff --git a/drivers/scsi/osd/osd_uld.c b/drivers/scsi/osd/osd_uld.c
 index d4ed9eb..4375417 100644
 --- a/drivers/scsi/osd/osd_uld.c
 +++ b/drivers/scsi/osd/osd_uld.c
 @@ -97,9 +97,37 @@ struct osd_dev_handle {
  
  static DEFINE_IDA(osd_minor_ida);
  
 +/*
 + * scsi sysfs attribute operations
 + */
 +static ssize_t osdname_show(struct device *dev, struct device_attribute 
 *attr,
 + char *buf)
 +{
 + struct osd_uld_device *ould = container_of(dev, struct osd_uld_device,
 +class_dev);
 + return sprintf(buf, %s\n, ould-odi.osdname);
 +}
 +
 +static ssize_t systemid_show(struct device *dev, struct device_attribute 
 *attr,
 + char *buf)
 +{
 + struct osd_uld_device *ould = container_of(dev, struct osd_uld_device,
 +class_dev);
 +
 + memcpy(buf, ould-odi.systemid, ould-odi.systemid_len);
 + return ould-odi.systemid_len;
 +}
 +
 +static struct device_attribute osd_uld_attrs[] = {
 + __ATTR(osdname, S_IRUGO, osdname_show, NULL),
 + __ATTR(systemid, S_IRUGO, systemid_show, NULL),
 + __ATTR_NULL,
 +};
 +
  static struct class osd_uld_class = {
   .owner  = THIS_MODULE,
   .name   = scsi_osd,
 + .dev_attrs  = osd_uld_attrs,
  };
  
  /*
 


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] target: Remove unused se_cmd.cmd_spdtl

2012-08-17 Thread Boaz Harrosh

We should switch this topic to the scsi mailing list

On 08/17/2012 01:49 PM, Boaz Harrosh wrote:

 On 08/17/2012 01:12 AM, Nicholas A. Bellinger wrote:
 
 On Thu, 2012-08-16 at 09:24 -0700, Roland Dreier wrote:
 Actually I'm not sure removing cmd_spdtl is the right answer.

 If /dev/sda is a device on an iSCSI initiator exported by an LIO
 target, try doing:

 # sg_raw -r512 /dev/sda 28 0 0 0 0 0 0 0 2 0

 This issues a READ (10) for 2 sectors, but only sends a length of 512
 at the transfer level.

 The target responds by setting the residual to 512 but transmits all
 1024 bytes, 
 
 
 Is this the correct behavior from the Target? I would imagine that the
 target needs to only send 512 bytes (transfer level size) and set the
 OVERFLOW bit and residual to 512.
 
 Not that it really matter because as you stated below the Initiator makes
 sure nothing bad happens.
 
 BTW what target are we talking about, on the other side of the initiator
 here? (There are two targets in this setup right?)
 
 and the Linux initiator at least rejects it because it
 hits:

 if (tcp_task-data_offset + tcp_conn-in.datalen  total_in_length) {
 ISCSI_DBG_TCP(conn, data_offset(%d) + data_len(%d)  
   total_length_in(%d)\n, tcp_task-data_offset,
   tcp_conn-in.datalen, total_in_length);
 return ISCSI_ERR_DATA_OFFSET;
 }


 Ok, this is the 'overflow' case when the fabric -data_length (expected
 data transfer length of the initiator's buffer) is smaller than the SCSI
 CDB allocation length or sectors*block_size (attempted transfer length)
 the target has been asked to process.

 The following patch which appears to do the right thing from the
 perspective of the target for overflow, but AFAICT open-iscsi still
 returns GOOD status w/ no data-in payload (at least via sg_raw) when the
 iscsi-target is signaling overflow bit in iSCSI Data IN PDU.  Not sure
 yet why this is the case, but drivers/scsi/libiscsi.c:iscsi_data_in_rsp
 code:

if (rhdr-flags  (ISCSI_FLAG_DATA_UNDERFLOW |
ISCSI_FLAG_DATA_OVERFLOW)) {
 int res_count = be32_to_cpu(rhdr-residual_count);

 if (res_count  0 
 (rhdr-flags  ISCSI_FLAG_CMD_OVERFLOW ||
  res_count = scsi_in(sc)-length))
 scsi_in(sc)-resid = res_count;
 else
 sc-result = (DID_BAD_TARGET  16) | 
 rhdr-cmd_status;
 }

 
 
 OK I admit I kind of rearranged all this code a few years ago. Guilty ;-) 
 
 Well now that I look at it again, I think it is totally wrong!!
 The scsi and block layer do not know anything about CMD_OVERFLOW
 If a scsi_in/out(sc)-resid is set it only ever means UNDERFLOW.
 
 Both scsi and block expects to only do transfer_length - resid.
 This is why you get empty buffer back because the transfer_length=512
 minus resid=512 means zero bytes.
 
 So the || ISCSI_FLAG_CMD_OVERFLOW case is wrong.
 
 Now the big question is what to do. Fail the all command, or say
 nothing?
 
 The correct thing is to teach the middle layers about overflow,
 With some kind of warning level system.
 I was also thinking that we can make resid signed and signal
 an overflow with a negative resid. Now the math of
 transfer_length - resid will become correct since it means
 more, in the case above 512 - (-512) would be our 1024 CDB len.
 
 For now this code must be fixed. And the command must fail.
 Do you need that I prepare a patch? (Please you do it, I'm
 swamped, it'll take me time)
 There are 3 more places like this.
 
 BTW did you notice how in the code above we have mixed up the
 use of: ISCSI_FLAG_DATA_OVERFLOW and ISCSI_FLAG_CMD_OVERFLOW?
 That's another bug, here it should be ISCSI_FLAG_DATA_OVERFLOW
 only. The other places with the other flags.
 
 appears to be setting resid for non bidi cases correctly, right..? (mnc
 + boaz CC'ed)

 How about the following to ensure we restrict overflow to keep the
 existing (smaller) cmd-data_length assignment, and only re-assign this
 value for the underflow case..?  (hch CC'ed)

 WDYT..?

 diff --git a/drivers/target/target_core_transport.c 
 b/drivers/target/target_core_transport.c
 index 0eaae23..63b3594 100644
 --- a/drivers/target/target_core_transport.c
 +++ b/drivers/target/target_core_transport.c
 @@ -1183,15 +1183,20 @@ int target_cmd_size_check(struct se_cmd *cmd, 
 unsigned int size)
 /* Returns CHECK_CONDITION + INVALID_CDB_FIELD */
 goto out_invalid_cdb_field;
 }
 -
 +   /*
 +* For the overflow case keep the existing fabric provided
 +* -data_length.  Otherwise for the underflow case, reset
 +* -data_length to the smaller SCSI expected data transfer
 +* length.
 +*/
 if (size  cmd-data_length) {
 
 
 I'm a bit out of context

Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-30 Thread Boaz Harrosh

On 07/30/2012 10:12 AM, Paolo Bonzini wrote:

 Il 30/07/2012 01:50, Rusty Russell ha scritto:
 Also, being the first user of chained scatterlist doesn't exactly give
 me warm fuzzies.

 We're far from the first user: they've been in the kernel for well over
 7 years.  They were introduced for the block layer, but they tended to
 ignore private uses of scatterlists like this one.
 
 Yeah, but sg_chain has no users in drivers, only a private one in
 lib/scatterlist.c.  The internal API could be changed to something else
 and leave virtio-scsi screwed...
 
 Yes, we should do this.  But note that this means an iteration, so we
 might as well combine the loops :)
 
 I'm really bad at posting pseudo-code, but you can count the number of
 physically-contiguous entries at the beginning of the list only.  So if
 everything is contiguous, you use a single non-indirect buffer and save
 a kmalloc.  If you use indirect buffers, I suspect it's much less
 effective to collapse physically-contiguous entries.  More elaborate
 heuristics do need a loop, though.
 


[All the below with a grain of salt, from my senile memory]

You must not forget some facts about the scatterlist received here at the LLD.
It has already been DMA mapped and locked by the generic layer.
Which means that the DMA engine has already collapsed physically-contiguous
entries. Those you get here are already unique physically.
(There were bugs in the past, where this was not true, please complain
 if you find them again)

A scatterlist is two different lists taking the same space, but with two
different length.
- One list is the PAGE pointers plus offset  length, which is bigger or
  equal to the 2nd list. The end marker corresponds to this list.

  This list is the input into the DMA engine.

- Second list is the physical DMA addresses list. With their physical-lengths.
  Offset is not needed because it is incorporated in the DMA address.

  This list is the output from the DMA engine.

  The reason 2nd list is shorter is because the DMA engine tries to minimize
  the physical scatter-list entries which is usually a limited HW resource.

  This list might follow chains but it's end is determined by the received
  sg_count from the DMA engine, not by the end marker.

At the time my opinion, and I think Rusty agreed, was that the scatterlist
should be split in two. The input page-ptr list is just the BIO, and the
output of the DMA-engine should just be the physical part of the sg_list,
as a separate parameter. But all this was berried under too much APIs and
the noise was two strong, for any single brave sole.

So I'd just trust blindly the returned sg_count from the DMA engine, it is
already optimized. I THINK

 Paolo


Boaz
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 11:44 AM, Paolo Bonzini wrote:

 Il 25/07/2012 10:29, Wang Sen ha scritto:
 When using the commands below to write some data to a virtio-scsi LUN of the 
 QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
 crash.

  # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
  # sudo mount /dev/sdb /mnt
  # dd if=/dev/zero of=/mnt/file bs=1M count=1024

 In current implementation, sg_set_buf is called to add buffers to sg list 
 which
 is put into the virtqueue eventually. But there are some HighMem pages in 
 table-sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
 return NULL value. This will cause QEMU exit when virtqueue_map_sg is called 
 in QEMU because an invalid GPA is passed by virtqueue.
 
 Heh, I was compiling (almost) the same patch as we speak. :)
 
 I've never seen QEMU crash; the VM would more likely just fail to boot
 with a panic.  But it's the same bug anyway.
 
 My solution is using sg_set_page instead of sg_set_buf.

 I have tested the patch on my workstation. QEMU would not crash any more.

 Signed-off-by: Wang Sen senw...@linux.vnet.ibm.com
 ---
  drivers/scsi/virtio_scsi.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
 index 1b38431..fc5c88a 100644
 --- a/drivers/scsi/virtio_scsi.c
 +++ b/drivers/scsi/virtio_scsi.c
 @@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
 unsigned int *p_idx,
  int i;
  
  for_each_sg(table-sgl, sg_elem, table-nents, i)
 -sg_set_buf(sg[idx++], sg_virt(sg_elem), sg_elem-length);
 +sg_set_page(sg[idx++], sg_page(sg_elem), sg_elem-length,
 +sg_elem-offset);
 
 This can simply be
 
sg[idx++] = *sg_elem;
 
 Can you repost it with this change, and also add sta...@vger.kernel.org
 to the Cc?  Thanks very much!
 


No! Please use sg_set_page()! Look at sg_set_page(), which calls 
sg_assign_page().
It has all these jump over chained arrays. When you'll start using long
sg_lists (which you should) then jumping from chain to chain must go through
sg_page(sg_elem)  sg_assign_page(), As in the original patch.

Thanks
Boaz

 Paolo
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 12:41 PM, Paolo Bonzini wrote:

 Il 25/07/2012 11:22, Boaz Harrosh ha scritto:
  for_each_sg(table-sgl, sg_elem, table-nents, i)
 -sg_set_buf(sg[idx++], sg_virt(sg_elem), 
 sg_elem-length);
 +sg_set_page(sg[idx++], sg_page(sg_elem), 
 sg_elem-length,
 +sg_elem-offset);

 This can simply be

sg[idx++] = *sg_elem;

 Can you repost it with this change, and also add sta...@vger.kernel.org
 to the Cc?  Thanks very much!


 No! Please use sg_set_page()! Look at sg_set_page(), which calls 
 sg_assign_page().
 It has all these jump over chained arrays. When you'll start using long
 sg_lists (which you should) then jumping from chain to chain must go through
 sg_page(sg_elem)  sg_assign_page(), As in the original patch.
 
 Hi Boaz,
 
 actually it seems to me that using sg_set_page is wrong, because it will
 not copy the end marker from table-sgl to sg[].  If something chained
 the sg[] scatterlist onto something else, sg_next's test for sg_is_last
 would go beyond the table-nents-th item and access invalid memory.
 


Yes, you did not understand this structure. And Yes I am right, when
using chained lists you *must* use sg_set_page().

You see the chaining belongs to the allocation not the value of the
sg-elements. One must not copy the chaining marker to the destination
array which might have it's own. And one must not crap all over the
destination chaining markers, set at allocation time.

The sizes and mostly the pointers of source and destination are not
the same.

 Using chained sglists is on my to-do list, I expect that it would make a
 nice performance improvement.  However, I was a bit confused as to
 what's the plan there; there is hardly any user, and many arches still
 do not define ARCH_HAS_SG_CHAIN.  Do you have any pointer to discussions
 or LWN articles?
 


Only the source code I'm afraid.

In SCSI land most LLDs should support chaining just by virtu of using the
for_each_sg macro. That all it takes. Your code above does support it.
(In Wang version).

Though more code need probably be added at sg allocation to actually
allocate and prepare a chain.

 I would need to add support for the long sglists to virtio; this is not
 a problem, but in the past Rusty complained that long sg-lists are not
 well suited to virtio (which would like to add elements not just at the
 beginning of a given sglist, but also at the end).  


Well that can be done as well, (If done carefully) It should be easy to add
chained fragments to both the end of a given chain just as at beginning.

It only means that the last element of the appended-to chain moves to
the next fragment and it's place is replaced by a link.

If you have ready made two long segments A and C which you would like not
to reallocate and copy, you insert a two-elements segment in the middle, say
call it B.

The first element of B is the last element of A which is now used as the pointer
to B, and the 2nd element of B is a pointer to C.

 It seems to me that
 virtio would prefer to work with a struct scatterlist ** rather than a
 long sglist.
 


That's just going backwards, and lazy. As you said if you want to enjoy
the better performance cake you better break some eggs ;-)

 Paolo


Cheers
Boaz
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 02:44 PM, Sen Wang wrote:

 2012/7/25 Paolo Bonzini pbonz...@redhat.com:
 Il 25/07/2012 10:29, Wang Sen ha scritto:
 When using the commands below to write some data to a virtio-scsi LUN of the
 QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
 crash.

   # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
   # sudo mount /dev/sdb /mnt
   # dd if=/dev/zero of=/mnt/file bs=1M count=1024

 In current implementation, sg_set_buf is called to add buffers to sg list 
 which
 is put into the virtqueue eventually. But there are some HighMem pages in
 table-sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
 return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
 in QEMU because an invalid GPA is passed by virtqueue.

 Heh, I was compiling (almost) the same patch as we speak. :)
 
 Uh, what a coincidence! :)
 

 I've never seen QEMU crash; the VM would more likely just fail to boot
 with a panic.  But it's the same bug anyway.
 
 I never met this before. How this situation happens?
 

 My solution is using sg_set_page instead of sg_set_buf.

 I have tested the patch on my workstation. QEMU would not crash any more.

 Signed-off-by: Wang Sen senw...@linux.vnet.ibm.com
 ---
  drivers/scsi/virtio_scsi.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
 index 1b38431..fc5c88a 100644
 --- a/drivers/scsi/virtio_scsi.c
 +++ b/drivers/scsi/virtio_scsi.c
 @@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
 unsigned int *p_idx,
   int i;

   for_each_sg(table-sgl, sg_elem, table-nents, i)
 - sg_set_buf(sg[idx++], sg_virt(sg_elem), sg_elem-length);
 + sg_set_page(sg[idx++], sg_page(sg_elem), sg_elem-length,
 + sg_elem-offset);

 This can simply be

sg[idx++] = *sg_elem;

 
 Yes, I saw your another E-mail. I think you're right. Simply calling
 sg_set_page can not handle
 the flag bits correctly. So, I'll repost the patch soon. Thank you!
 


No this code is correct, though you will need to make sure to properly
terminate the destination sg_list.

But since old code was using sg_set_buf(), than it means it was properly
terminated before, and there for this code is good as is please don't
touch it.

Thanks
Boaz

 Can you repost it with this change, and also add sta...@vger.kernel.org
 to the Cc?  Thanks very much!

 Paolo
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 03:34 PM, Paolo Bonzini wrote:

 Il 25/07/2012 14:13, Wang Sen ha scritto:
 When using the commands below to write some data to a virtio-scsi LUN of the
 QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
 crash.

 # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
 # sudo mount /dev/sdb /mnt
 # dd if=/dev/zero of=/mnt/file bs=1M count=1024

 In current implementation, sg_set_buf is called to add buffers to sg list 
 which
 is put into the virtqueue eventually. But if there are some HighMem pages in
 table-sgl you can not get virtual address by sg_virt. So, sg_virt(sg_elem) 
 may
 return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
 in QEMU because an invalid GPA is passed by virtqueue.

 I take Paolo's solution mentioned in last thread to avoid failure on 
 handling 
 flag bits.
 
 Please include an URL or (better) summarize the reason why sg_set_page
 is not correct in the commit message.  For example, replace this
 paragraph with the following:
 
 To fix this, we can simply copy the original scatterlist entries into
 virtio-scsi's; we need to copy the entries entirely, including the flag
 bits, so using sg_set_page is not correct.
 

 Please send v3 with this change and I'll add my Acked-by.
 


NACK-by: Boaz Harrosh


Apart from the HighMem pages problem, where in previous sg_set_buf()
code was the marker copied? It was not because it is not needed because
the allocation of sg took care of that. For example in 64bit the is no
bugs, right?

If there was a destination sg_list termination bug, it should be fixed
as a separate patch from this HighMem pages problem. But I bet if
you inspect the code carefully there isn't such a bug.

Cheers
Boaz

 Paolo
 

 I have tested the patch on my workstation. QEMU would not crash any more.

 Signed-off-by: Wang Sen senw...@linux.vnet.ibm.com
 ---
  drivers/scsi/virtio_scsi.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
 index 1b38431..6661610 100644
 --- a/drivers/scsi/virtio_scsi.c
 +++ b/drivers/scsi/virtio_scsi.c
 @@ -198,7 +198,7 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
 unsigned int *p_idx,
  int i;
  
  for_each_sg(table-sgl, sg_elem, table-nents, i)
 -sg_set_buf(sg[idx++], sg_virt(sg_elem), sg_elem-length);
 +sg[idx++] = *sg_elem;
  
  *p_idx = idx;
  }

 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-scsi in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 03:49 PM, Paolo Bonzini wrote:

 
 Except here the destination array has to be given to virtio, which
 doesn't (yet) understand chaining.  I'm using for_each_sg rather than a
 simple memcpy exactly because I want to flatten the input scatterlist
 onto consecutive scatterlist entries, which is what virtio expects (and
 what I'll change when I get to it).
 
 for_each_sg guarantees that I get non-chain scatterlists only, so it is
 okay to value-assign them to sg[].
 


So if the virtio does not understand chaining at all then surly it will
not understand the 2-bit end marker and will get a wrong page pointer
with the 1st bit set.

As I said!! Since the last code did sg_set_buff() and worked then you must
change it with sg_set_page().

If there are *any* chaining-or-no-chaining markers errors these should
be fixed as a separate patch!

Please lets concentrate at the problem at hand.

snip

 
 It was _not_ properly terminated, and didn't matter because virtio
 doesn't care about termination.  Changing all the virtio devices to
 properly terminate chains (and to use for_each_sg) is a prerequisite for
 properly supporting long sglists).
 


Fine then your code is now a crash because the terminating bit was just
copied over, which it was not before.


Now Back to the how to support chaining:

Lets separate the two topics from now on. Send me one mail concerning
the proper above patch, And a different mail for how to support chaining.

 In SCSI land most LLDs should support chaining just by virtu of using the
 for_each_sg macro. That all it takes. Your code above does support it.
 
 Yes, it supports it but still has to undo them before passing to virtio.
 
 What my LLD does is add a request descriptor in front of the scatterlist
 that the LLD receives.  I would like to do this with a 2-item
 scatterlist: one for the request descriptor, and one which is a chain to
 the original scatterlist.  


I hate that plan. Why yet override the scatter element yet again with a third
union of a request descriptor

The reason it was overloaded as a link-pointer in the first place was because
of historical compatibility reasons and not because of a good design.

You should have a proper request descriptor structure defined, pointing to
(or followed by), an sglist-chain. And all of the above is mute.
 

 Except that if I call sg_chain and my
 architecture does not define ARCH_HAS_SG_CHAIN, it will BUG out.  So I
 need to keep all the scatterlist allocation and copying crap that I have
 now within #ifdef, and it will bitrot.
 


except that with the correct design you don't call sg_chain you just do:
request_descriptor-sg_list = sg;

 I would need to add support for the long sglists to virtio; this is not
 a problem, but in the past Rusty complained that long sg-lists are not
 well suited to virtio (which would like to add elements not just at the
 beginning of a given sglist, but also at the end).  

 Well that can be done as well, (If done carefully) It should be easy to add
 chained fragments to both the end of a given chain just as at beginning.
 It only means that the last element of the appended-to chain moves to
 the next fragment and it's place is replaced by a link.
 
 But you cannot do that in constant time, can you?  And that means you do
 not enjoy any benefit in terms of cache misses etc.
 


I did not understand constant time it is O(0) if that what you meant.

(And surly today's code of copy the full list cache misses)

 Also, this assumes that I can modify the appended-to chain.  I'm not
 sure I can do this?
 


Each case it's own. If the appended-to chain is const, yes you'll have
to reallocate it and copy. Is that your case?

Cheers
Boaz

 It seems to me that virtio would prefer to work with a struct
 scatterlist ** rather than a long sglist.

 That's just going backwards, and lazy. As you said if you want to enjoy
 the better performance cake you better break some eggs ;-)
 
 :)
 
 Paolo


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 04:36 PM, Paolo Bonzini wrote:

 Il 25/07/2012 15:26, Boaz Harrosh ha scritto:
 On 07/25/2012 03:49 PM, Paolo Bonzini wrote:


 Except here the destination array has to be given to virtio, which
 doesn't (yet) understand chaining.  I'm using for_each_sg rather than a
 simple memcpy exactly because I want to flatten the input scatterlist
 onto consecutive scatterlist entries, which is what virtio expects (and
 what I'll change when I get to it).

 for_each_sg guarantees that I get non-chain scatterlists only, so it is
 okay to value-assign them to sg[].

 So if the virtio does not understand chaining at all then surly it will
 not understand the 2-bit end marker and will get a wrong page pointer
 with the 1st bit set.
 
 It doesn't understand chaining, but it does use sg_phys(x) so it will
 not get a wrong page pointer for the end marker.
 
 Fine then your code is now a crash because the terminating bit was just
 copied over, which it was not before.
 
 I did test the patch with value-assignment.
 


Still you should use the sg_set_page()!!
1. It is not allowed to directly manipulate sg entries. One should always
   use the proper accessor. Even if open coding does work and is not a bug
   it should not be used anyway!
2. Future code that will support chaining will need to do as I say so why
   change it then, again?

Please don't change two things in one patch. The fix is for high-pages
please fix only that here. You can blasphemy open-code the sg manipulation
in a separate patch.

Please
Boaz

 Lets separate the two topics from now on. Send me one mail concerning
 the proper above patch, And a different mail for how to support chaining.
 
 Ok, and I'll change the topic.
 
 Paolo


--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Boaz Harrosh

On 07/25/2012 11:06 PM, Paolo Bonzini wrote:

 Il 25/07/2012 21:16, Boaz Harrosh ha scritto:
 The picture confused me. It looked like the first element is the 
 virtio_scsi_cmd_req
 not an sgilist-element that points to the struct's buffer.

 In that case then yes your plan of making a two-elements fragment that 
 points to the
 original scsi-sglist is perfect. All you have to do is that, and all you 
 have to do
 at virtio is use the sg_for_each macro and you are done.

 You don't need any sglist allocation or reshaping. And you can easily support
 chaining. Looks like order of magnitude more simple then what you do now
 
 It is.
 
 So what is the problem?
 
 That not all architectures have ARCH_HAS_SG_CHAIN (though all those I
 care about do).  So I need to go through all architectures and make sure
 they use for_each_sg, or at least to change ARCH_HAS_SG_CHAIN to a
 Kconfig define so that dependencies can be expressed properly.
 


What is actually preventing ARCH_HAS_SG_CHAIN from all these ARCHES?
is that the DMA drivers not using for_each_sg(). Sounds like easy
to fix.

But yes a deep change would convert ARCH_HAS_SG_CHAIN to a Kconfig.

If you want to be lazy, like me, You might just put a BUILD_BUG_ON
in code, requesting the user to disable the driver for this ARCH.

I bet there is more things to do at ARCH to enable virtualization
then just support ARCH_HAS_SG_CHAIN. Be it just another requirement.

If you Document it and make sure current ARCHs are fine, it should
not ever trigger.

 And BTW you won't need that new __sg_set_page API anymore.
 
 Kind of.
 
sg_init_table(sg, 2);
sg_set_buf(sg[0], req, sizeof(req));
sg_chain(sg[1], scsi_out(sc));
 
 is still a little bit worse than
 
__sg_set_buf(sg[0], req, sizeof(req));
__sg_chain(sg[1], scsi_out(sc));
 


I believe they are the same, specially for the
on the stack 2 elements array. Actually I think
In both cases you need to at least call sg_init_table()
once after allocation, No?

Your old code with big array copy and re-shaping was
a better example of the need for your new API. Which I agree.

But please for my sake do not call it __sg_chain. Call it
something like sg_chain_not_end(). I hate those __ which
for god sack means what? 
(A good name is when I don't have to read the code, an __
 means fuck you go read the code)

 Paolo


Thanks
Boaz
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: gdth new set of patches for 2.6.24 stable

2008-02-18 Thread Boaz Harrosh

On Sun, Feb 17 2008 at 19:24 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Sun, 2008-02-17 at 18:46 +0200, Boaz Harrosh wrote:
 On Thu, Feb 14 2008 at 20:47 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 Submitted are a new set of patches, that fix lots of problems
 with the gdth driver.

 It fixes the following problems:
 - scan for drives on hosts. (Already in mainline)
 - truly fixes the exit/reboot problems but does call flush() before
   reboot.
 - fix crash when accessing array with icpcon management application.
 - fix crash when doing $ cat /proc/sys/gdth/0.
   This one still has the below WARN_ON in messages (see gdth_info below)
   So there is one more thing hiding in there.
 - use pci_get_device
   One of the testers requested if we can also put the move to 
 pci_get_device 
   patch with removal of dependency on PCI_LEGACY, to the stable release.

 The patches are for and based on Linux-2.6.24. here is the list of patches:
   [PATCH 1/5] gdth: update deprecated pci_find_device
   [PATCH 2/5] gdth: scan for scsi devices
   [PATCH 3/5] gdth: bugfix for the at-exit problems
   [PATCH 4/5] gdth: fix to internal commands execution
   [PATCH 5/5] gdth: remove gdth cooked up command accessors

 Please all test and report your findings.

 Thanks in advance
 Boaz

 ---
 gdth_info
   WARNING: at arch/x86/kernel/pci-dma_32.c:66 dma_free_coherent()
   Pid: 5501, comm: cat Not tainted 2.6.24 #43
[c0107137] dma_free_coherent+0x93/0x95
[c025ef73] gdth_ioctl_free+0x4c/0x69
[c0264a36] gdth_proc_info+0x165f/0x182c
[c0111f7a] update_curr+0xeb/0xf2
[c01132aa] task_rq_lock+0x29/0x50
[c0113706] try_to_wake_up+0x42/0x342
[c0113706] try_to_wake_up+0x42/0x342
[c0111a9f] __wake_up_common+0x46/0x6d
[c0113569] __wake_up+0x32/0x42
[c022dad9] n_tty_receive_buf+0x2e8/0xe97
[c022dad9] n_tty_receive_buf+0x2e8/0xe97
[c0111f0a] update_curr+0x7b/0xf2
[c0112625] enqueue_task_fair+0x27/0x30
[c0111783] enqueue_task+0xa/0x14
[c025e351] proc_scsi_read+0x29/0x3d
[c025e328] proc_scsi_read+0x0/0x3d
[c0189704] proc_file_read+0x1c6/0x279
[c018953e] proc_file_read+0x0/0x279
[c0185eca] proc_reg_read+0x53/0x71
[c0185e77] proc_reg_read+0x0/0x71
[c0159968] vfs_read+0x85/0x11b
[c0159d9d] sys_read+0x41/0x6a
[c0102822] sysenter_past_esp+0x5f/0x85
  /gdth_info
 -
 James hi.

 All my testers have reported back that with these 5 patches applied they can
 now run with a 2.6.24 kernel the same way they ran before. However there is
 that reported issue, with the dma_free_coherent WARN_ON (above). The code 
 was 
 like that from day one and it is a very old issue, however it is a 
 regression 
 because 2.6.24 introduced that new WARN_ON.
 (infamous commit aa24886e379d2b641c5117e178b15ce1d5d366ba)
 From posts on lkml and even recent one in linux-scsi about the arcmsr driver
 it looks that all a driver can do is work around it with different kernel 
 mechanisms
 and driver rewrites. I'm afraid I need your help here. I'm not sure I 
 understand
 why does the gdth driver uses the pci_{alloc,free}_consistent() API's, and 
 what
 is needed to replace it. Could you please have a look in gdth_proc.c and 
 also in
 gdth.c for all the places that call gdth_ioctl_alloc/gdth_ioctl_free, and 
 advise
 what can I do in it's place. Please bear in mind that we need it for 2.6.24, 
 as
 a bugfix.

 Apart from the above issue, please accept patches 3,4,5 above they have now
 been tested and are reported to bring broken system back to production.
 (Given that you approve off course). And mark them for inclusion to the
 2.6.24 stable releases. (Or is there some thing that I should do)

 ---
 Meanwhile on x86 systems I understand the WARN_ON is cosmetic, and does not
 pose any harm. Some people have reported stability with temporarily disabling
 it. For testers that want to try, here it is below. At your own risk.
 
 Isn't this the correct fix?  pscratch is a permanent address (it's
 allocated at boot time and never changes).  All you need the smp lock
 for is mediating the scratch in use flag.
 
 James
 
 diff --git a/drivers/scsi/gdth_proc.c b/drivers/scsi/gdth_proc.c
 index de57734..ce0228e 100644
 --- a/drivers/scsi/gdth_proc.c
 +++ b/drivers/scsi/gdth_proc.c
 @@ -694,15 +694,13 @@ static void gdth_ioctl_free(gdth_ha_str *ha, int size, 
 char *buf, ulong64 paddr)
  {
  ulong flags;
  
 -spin_lock_irqsave(ha-smp_lock, flags);
 -
  if (buf == ha-pscratch) {
 + spin_lock_irqsave(ha-smp_lock, flags);
  ha-scratch_busy = FALSE;
 + spin_unlock_irqrestore(ha-smp_lock, flags);
  } else {
  pci_free_consistent(ha-pdev, size, buf, paddr);
  }
 -
 -spin_unlock_irqrestore(ha-smp_lock, flags);
  }
  
  #ifdef GDTH_IOCTL_PROC
 
 
 -

James
You are bung on the money. It was tested and it works. So simple, I was 
thinking it was accessed by DMA and freed at interrupt. But no, just a 
simple lock like this.

So that's it then, all reported

Re: gdth new set of patches for 2.6.24 stable

2008-02-18 Thread Boaz Harrosh

On Mon, Feb 18 2008 at 14:57 +0200, Andrew Morton [EMAIL PROTECTED] wrote:
 On Sun, 17 Feb 2008 18:46:03 +0200 Boaz Harrosh [EMAIL PROTECTED] wrote:
 
 ...

 All my testers have reported back that with these 5 patches applied they can
 now run with a 2.6.24 kernel the same way they ran before. However there is
 that reported issue, with the dma_free_coherent WARN_ON (above). The code 
 was 
 like that from day one and it is a very old issue, however it is a 
 regression 
 because 2.6.24 introduced that new WARN_ON.
 (infamous commit aa24886e379d2b641c5117e178b15ce1d5d366ba)
 From posts on lkml and even recent one in linux-scsi about the arcmsr driver
 it looks that all a driver can do is work around it with different kernel 
 mechanisms
 and driver rewrites. I'm afraid I need your help here. I'm not sure I 
 understand
 why does the gdth driver uses the pci_{alloc,free}_consistent() API's, and 
 what
 is needed to replace it. Could you please have a look in gdth_proc.c and 
 also in
 gdth.c for all the places that call gdth_ioctl_alloc/gdth_ioctl_free, and 
 advise
 what can I do in it's place. Please bear in mind that we need it for 2.6.24, 
 as
 a bugfix.

 Apart from the above issue, please accept patches 3,4,5 above they have now
 been tested and are reported to bring broken system back to production.
 (Given that you approve off course). And mark them for inclusion to the
 2.6.24 stable releases. (Or is there some thing that I should do)

 ---
 Meanwhile on x86 systems I understand the WARN_ON is cosmetic, and does not
 pose any harm. Some people have reported stability with temporarily disabling
 it. For testers that want to try, here it is below. At your own risk.

 ---
 From 50d3657bf6a138ee63ad1ce00052380edc75ace7 Mon Sep 17 00:00:00 2001
 From: Boaz Harrosh [EMAIL PROTECTED]
 Date: Sun, 17 Feb 2008 12:49:35 +0200
 Subject: [PATCH] gdth: Hack to remove WARN_ON in arch/x86/kernel/pci-dma_32.c

   gdth uses dma_free_coherent() with interrupts disabled. Which
   is not portable, but is safe on the HW that supports gdth.

 NOT Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  arch/x86/kernel/pci-dma_32.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/arch/x86/kernel/pci-dma_32.c b/arch/x86/kernel/pci-dma_32.c
 index 5133032..350dcfd 100644
 --- a/arch/x86/kernel/pci-dma_32.c
 +++ b/arch/x86/kernel/pci-dma_32.c
 @@ -63,7 +63,7 @@ void dma_free_coherent(struct device *dev, size_t size,
  struct dma_coherent_mem *mem = dev ? dev-dma_mem : NULL;
  int order = get_order(size);
  
 -WARN_ON(irqs_disabled());   /* for portability */
 +/*  WARN_ON(irqs_disabled());*/ /* for portability */
  if (mem  vaddr = mem-virt_base  vaddr  (mem-virt_base + 
 (mem-size  PAGE_SHIFT))) {
  int page = (vaddr - mem-virt_base)  PAGE_SHIFT;
  
 
 Yes.   Let's reprise aa24886e379d2b641c5117e178b15ce1d5d366ba:
 
 : commit aa24886e379d2b641c5117e178b15ce1d5d366ba
 : Author: David Brownell [EMAIL PROTECTED]
 : Date:   Fri Aug 10 13:10:27 2007 -0700
 : 
 : dma_free_coherent() needs irqs enabled (sigh)
 : 
 : On at least ARM (and I'm told MIPS too) dma_free_coherent() has a newish
 : call context requirement: unlike its dma_alloc_coherent() sibling, it 
 may
 : not be called with IRQs disabled.  (This was new behavior on ARM as of 
 late
 : 2005, caused by ARM SMP updates.) This little surprise can be annoyingly
 : driver-visible.
 : 
 : Since it looks like that restriction won't be removed, this patch 
 changes
 : the definition of the API to include that requirement.  Also, to help 
 catch
 : nonportable drivers, it updates the x86 and swiotlb versions to include 
 the
 : relevant warnings.  (I already observed that it trips on the
 : bus_reset_tasklet of the new firewire_ohci driver.)
 : 
 
 In general, all Linux memory-freeing functions can be called from all
 contexts.  (vfree is an irritating exception).  This is good, and provides
 maximum usefulness to callees, as all utility functions should seek to do. 
 It would be best to fix arm and mips.
 
 But arm and mips require enabled local irqs because their
 dma_free_coherent() needs to do a cross-cpu IPI call.  Presumably because
 of certain unusual TLB protocols.
 
 I'm not sure what we should do about this.  Presumably the gdth-on-arm
 usage base is, umm, zero, so we could lamely add
 CONFIG_DMA_FREE_COHERENT_WITH_LOCAL_IRQS_DISABLED_IS_OK and then use that
 to disable gdth (and similar) on arm amd mips.  But ugh.
 
 Russell, Ralf: is there something we can do here to relax this requirement?
 
 I'm thinking that perhaps we can do some rcu/refcounting tricks: launch the
 IPI from within dma_free_coherent(), but don't wait for it to complete. 
 When all CPUs have handled the IPI then (and only then) the virtual address
 becomes recyclable, or something like that?
 
 double-checks
 
 Actually I think David might have been wrong about mips.  afaict its

[PATCH 1/3 ver2] iscsi: extended cdb support

2008-02-18 Thread Boaz Harrosh


Support for extended CDBs in iscsi.
All we need is to check if command spills over 16 bytes then allocate
an iscsi-extended-header for the leftovers.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Reviewed-by: Pete Wyckoff [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c|   55 
 include/scsi/iscsi_proto.h |6 +++-
 2 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 59f8445..a43b8ee 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -137,6 +137,45 @@ static int iscsi_add_hdr(struct iscsi_cmd_task *ctask, 
unsigned len)
return 0;
 }
 
+/*
+ * make an extended cdb AHS
+ */
+static int iscsi_prep_ecdb_ahs(struct iscsi_cmd_task *ctask)
+{
+   struct scsi_cmnd *cmd = ctask-sc;
+   unsigned rlen, pad_len;
+   unsigned short ahslength;
+   struct iscsi_ecdb_ahdr *ecdb_ahdr;
+   int rc;
+
+   ecdb_ahdr = iscsi_next_hdr(ctask);
+   rlen = cmd-cmd_len - ISCSI_CDB_SIZE;
+
+   BUG_ON(rlen  sizeof(ecdb_ahdr-ecdb));
+   ahslength = rlen + sizeof(ecdb_ahdr-reserved);
+
+   pad_len = iscsi_padding(rlen);
+
+   rc = iscsi_add_hdr(ctask, sizeof(ecdb_ahdr-ahslength) +
+  sizeof(ecdb_ahdr-ahstype) + ahslength + pad_len);
+   if (rc)
+   return rc;
+
+   if (pad_len)
+   memset(ecdb_ahdr-ecdb[rlen], 0, pad_len);
+
+   ecdb_ahdr-ahslength = cpu_to_be16(ahslength);
+   ecdb_ahdr-ahstype = ISCSI_AHSTYPE_CDB;
+   ecdb_ahdr-reserved = 0;
+   memcpy(ecdb_ahdr-ecdb, cmd-cmnd + ISCSI_CDB_SIZE, rlen);
+
+   debug_scsi(iscsi_prep_ecdb_ahs: varlen_cdb_len %d 
+  rlen %d pad_len %d ahs_length %d iscsi_headers_size %u\n,
+  cmd-cmd_len, rlen, pad_len, ahslength, ctask-hdr_len);
+
+   return 0;
+}
+
 /**
  * iscsi_prep_scsi_cmd_pdu - prep iscsi scsi cmd pdu
  * @ctask: iscsi cmd task
@@ -150,7 +189,7 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
struct iscsi_session *session = conn-session;
struct iscsi_cmd *hdr = ctask-hdr;
struct scsi_cmnd *sc = ctask-sc;
-   unsigned hdrlength;
+   unsigned hdrlength, cmd_len;
int rc;
 
ctask-hdr_len = 0;
@@ -165,10 +204,16 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
hdr-cmdsn = cpu_to_be32(session-cmdsn);
session-cmdsn++;
hdr-exp_statsn = cpu_to_be32(conn-exp_statsn);
-   memcpy(hdr-cdb, sc-cmnd, sc-cmd_len);
-   if (sc-cmd_len  MAX_COMMAND_SIZE)
-   memset(hdr-cdb[sc-cmd_len], 0,
-   MAX_COMMAND_SIZE - sc-cmd_len);
+   cmd_len = sc-cmd_len;
+   if (cmd_len  ISCSI_CDB_SIZE)
+   memset(hdr-cdb[cmd_len], 0, ISCSI_CDB_SIZE - cmd_len);
+   else if (cmd_len  ISCSI_CDB_SIZE) {
+   rc = iscsi_prep_ecdb_ahs(ctask);
+   if (rc)
+   return rc;
+   cmd_len = ISCSI_CDB_SIZE;
+   }
+   memcpy(hdr-cdb, sc-cmnd, cmd_len);
 
ctask-imm_count = 0;
if (sc-sc_data_direction == DMA_TO_DEVICE) {
diff --git a/include/scsi/iscsi_proto.h b/include/scsi/iscsi_proto.h
index 5ffec8a..e0593bf 100644
--- a/include/scsi/iscsi_proto.h
+++ b/include/scsi/iscsi_proto.h
@@ -112,6 +112,7 @@ struct iscsi_ahs_hdr {
 
 #define ISCSI_AHSTYPE_CDB  1
 #define ISCSI_AHSTYPE_RLENGTH  2
+#define ISCSI_CDB_SIZE 16
 
 /* iSCSI PDU Header */
 struct iscsi_cmd {
@@ -125,7 +126,7 @@ struct iscsi_cmd {
__be32 data_length;
__be32 cmdsn;
__be32 exp_statsn;
-   uint8_t cdb[16];/* SCSI Command Block */
+   uint8_t cdb[ISCSI_CDB_SIZE];/* SCSI Command Block */
/* Additional Data (Command Dependent) */
 };
 
@@ -154,7 +155,8 @@ struct iscsi_ecdb_ahdr {
__be16 ahslength;   /* CDB length - 15, including reserved byte */
uint8_t ahstype;
uint8_t reserved;
-   uint8_t ecdb[260 - 16]; /* 4-byte aligned extended CDB spillover */
+   /* 4-byte aligned extended CDB spillover */
+   uint8_t ecdb[260 - ISCSI_CDB_SIZE];
 };
 
 /* SCSI Response Header */
-- 
1.5.3.3


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 0/3 ver2] iscsi bidi varlen support

2008-02-18 Thread Boaz Harrosh

On Thu, Jan 31 2008 at 20:08 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 Cheers after 1.3 years these can go in.
 
 [PATCH 1/3] iscsi: extended cdb support
The varlen support is not yet in mainline for
   block and scsi-ml. But the API for drivers will
   not change. All LLD need to do is max_command to
   the it's maximum and be ready for bigger commands.
   This is what's done here. Once these commands start
   coming iscsi will be ready for them.
 
 [PATCH 2/3] iscsi: bidi support - libiscsi
 [PATCH 3/3] iscsi: bidi support - iscsi_tcp
   bidirectional commands support in iscsi.
   iSER is not yet ready, but it will not break.
   There is already a mechanism in libiscsi that will
   return error if bidi commands are sent iSER way.
 
 Pete please send me the iSER bits so we can port them
 to this latest version.
 
 Mike these patches are ontop of iscs branch of the iscsi
 git tree, they will apply but for compilation you will need
 to sync with Linus mainline. The patches are for the in-tree
 iscsi code. I own you the compat patch for the out-off-tree
 code, but this I will only be Sunday.
 
 If we do it fast it might get accepted to 2.6.25 merge window
 
 Everybody is invited to a party at Shila ben-yhuda 52 Tel-Aviv
 9:45 pm. Drinks and wonderful see-food on us :)
 
 Boaz
  
 -
Everything the same as before. But working this time. Also
Pete's comment about second patch, was correct and code is now
fixed.

I have got Mike's Signed-off-by, on these they were tested and
approved by him. So they are for scsi-misc. But ... James? is
there any chance these can go into scsi-rc-fixes for the 2.6.25
kernel? The reason they are so late was mainly because of a fallout
in the merge process and a bug that was introduced because of that,
but they were intended to go together with bidi into 2.6.25. Also
as an important client code to the bidi-api that is introduced in
2.6.25 kernel.

Thanks
Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/3] iscsi bidi varlen support

2008-02-18 Thread Boaz Harrosh

On Tue, Feb 12 2008 at 22:17 +0200, Pete Wyckoff [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote on Tue, 12 Feb 2008 15:12 -0500:
 [EMAIL PROTECTED] wrote on Thu, 31 Jan 2008 20:08 +0200:
 Cheers after 1.3 years these can go in.

 [PATCH 1/3] iscsi: extended cdb support
The varlen support is not yet in mainline for
   block and scsi-ml. But the API for drivers will
   not change. All LLD need to do is max_command to
   the it's maximum and be ready for bigger commands.
   This is what's done here. Once these commands start
   coming iscsi will be ready for them.

 [PATCH 2/3] iscsi: bidi support - libiscsi
 [PATCH 3/3] iscsi: bidi support - iscsi_tcp
   bidirectional commands support in iscsi.
   iSER is not yet ready, but it will not break.
   There is already a mechanism in libiscsi that will
   return error if bidi commands are sent iSER way.

 Pete please send me the iSER bits so we can port them
 to this latest version.

 Mike these patches are ontop of iscs branch of the iscsi
 git tree, they will apply but for compilation you will need
 to sync with Linus mainline. The patches are for the in-tree
 iscsi code. I own you the compat patch for the out-off-tree
 code, but this I will only be Sunday.
 Here's the patch to add bidi support to iSER too.  It works
 with my setup, but could use more testing.  Note that this does
 rely on the 3 patches quoted above.
 
 Similar, for varlen support to iSER.  Probably apply this one before
 the bidi one I just sent.
 
   -- Pete
 
 
 From: Pete Wyckoff [EMAIL PROTECTED]
 Subject: [PATCH] iscsi iser: varlen
 
 Handle variable-length CDBs in iSER.
 
 Signed-off-by: Pete Wyckoff [EMAIL PROTECTED]
 ---
  drivers/infiniband/ulp/iser/iscsi_iser.c |5 +++--
  drivers/infiniband/ulp/iser/iscsi_iser.h |2 +-
  drivers/infiniband/ulp/iser/iser_initiator.c |   16 ++--
  3 files changed, 14 insertions(+), 9 deletions(-)
 
 diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c 
 b/drivers/infiniband/ulp/iser/iscsi_iser.c
 index 5f2284d..9dfc310 100644
 --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
 +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
 @@ -401,7 +401,8 @@ iscsi_iser_session_create(struct iscsi_transport *iscsit,
   ctask  = session-cmds[i];
   iser_ctask = ctask-dd_data;
   ctask-hdr = (struct iscsi_cmd *)iser_ctask-desc.iscsi_header;
 - ctask-hdr_max = sizeof(iser_ctask-desc.iscsi_header);
 + ctask-hdr_max = sizeof(iser_ctask-desc.iscsi_header) +
 +  sizeof(iser_ctask-desc.hdrextbuf);
   }
  
   for (i = 0; i  session-mgmtpool_max; i++) {
 @@ -604,7 +605,7 @@ static struct iscsi_transport iscsi_iser_transport = {
   .host_template  = iscsi_iser_sht,
   .conndata_size  = sizeof(struct iscsi_conn),
   .max_lun= ISCSI_ISER_MAX_LUN,
 - .max_cmd_len= ISCSI_ISER_MAX_CMD_LEN,
 + .max_cmd_len= 260,

Same bug I had. .max_cmd_len is still char, before the varlen patch to scsi-ml.
So it must be at most 252, Until that patch is introduced and it can return to
the correct 260 or better yet SCSI_MAX_VARLEN_CDB_SIZE. That also is only 
defined in the scsi-ml varlen patch.

   /* session management */
   .create_session = iscsi_iser_session_create,
   .destroy_session= iscsi_session_teardown,
 diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h 
 b/drivers/infiniband/ulp/iser/iscsi_iser.h
 index db8f81a..66905df 100644
 --- a/drivers/infiniband/ulp/iser/iscsi_iser.h
 +++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
 @@ -90,7 +90,6 @@
  /* FMR space for 1 MB of 4k-page transfers, plus 1 if not page aligned */
  #define ISCSI_ISER_SG_TABLESIZE (((120)  SHIFT_4K) + 1)
  #define ISCSI_ISER_MAX_LUN   256
 -#define ISCSI_ISER_MAX_CMD_LEN   16
  
  /* QP settings */
  /* Maximal bounds on received asynchronous PDUs */
 @@ -217,6 +216,7 @@ enum iser_desc_type {
  struct iser_desc {
   struct iser_hdr  iser_header;
   struct iscsi_hdr iscsi_header;
 + char hdrextbuf[ISCSI_MAX_AHS_SIZE];
   struct iser_regd_buf hdr_regd_buf;
   void *data; /* used by RX  TX_CONTROL 
 */
   struct iser_regd_buf data_regd_buf; /* used by RX  TX_CONTROL 
 */
 diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c 
 b/drivers/infiniband/ulp/iser/iser_initiator.c
 index 83247f1..ea3f5dc 100644
 --- a/drivers/infiniband/ulp/iser/iser_initiator.c
 +++ b/drivers/infiniband/ulp/iser/iser_initiator.c
 @@ -246,9 +246,13 @@ post_rx_kmalloc_failure:
   return err;
  }
  
 -/* creates a new tx descriptor and adds header regd buffer */
 +/**
 + * iser_create_send_desc - creates a new tx descriptor and adds header regd 
 buffer
 + * @iscsi_hdr_len: length of struct iscsi_hdr and any AHSes in hdrextbuf
 + */
  static void

[PATCH 3/3 ver2] iscsi: bidi support - iscsi_tcp

2008-02-18 Thread Boaz Harrosh


  bidi support for iscsi_tcp
  - access the right scsi_in() and/or scsi_out() side of things.
also for resid

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Reviewed-by: Pete Wyckoff [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/iscsi_tcp.c |   31 +--
 1 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 8a17867..72b9b2a 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -528,6 +528,7 @@ iscsi_data_rsp(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
struct iscsi_session *session = conn-session;
struct scsi_cmnd *sc = ctask-sc;
int datasn = be32_to_cpu(rhdr-datasn);
+   unsigned total_in_length = scsi_in(sc)-length;
 
iscsi_update_cmdsn(session, (struct iscsi_nopin*)rhdr);
if (tcp_conn-in.datalen == 0)
@@ -542,10 +543,10 @@ iscsi_data_rsp(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
tcp_ctask-exp_datasn++;
 
tcp_ctask-data_offset = be32_to_cpu(rhdr-offset);
-   if (tcp_ctask-data_offset + tcp_conn-in.datalen  scsi_bufflen(sc)) {
+   if (tcp_ctask-data_offset + tcp_conn-in.datalen  total_in_length) {
debug_tcp(%s: data_offset(%d) + data_len(%d)  
total_length_in(%d)\n,
  __FUNCTION__, tcp_ctask-data_offset,
- tcp_conn-in.datalen, scsi_bufflen(sc));
+ tcp_conn-in.datalen, total_in_length);
return ISCSI_ERR_DATA_OFFSET;
}
 
@@ -558,8 +559,8 @@ iscsi_data_rsp(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
 
if (res_count  0 
(rhdr-flags  ISCSI_FLAG_CMD_OVERFLOW ||
-res_count = scsi_bufflen(sc)))
-   scsi_set_resid(sc, res_count);
+res_count = total_in_length))
+   scsi_in(sc)-resid = res_count;
else
sc-result = (DID_BAD_TARGET  16) |
rhdr-cmd_status;
@@ -670,11 +671,11 @@ iscsi_r2t_rsp(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
r2t-data_length, session-max_burst);
 
r2t-data_offset = be32_to_cpu(rhdr-data_offset);
-   if (r2t-data_offset + r2t-data_length  scsi_bufflen(ctask-sc)) {
+   if (r2t-data_offset + r2t-data_length  scsi_out(ctask-sc)-length) {
iscsi_conn_printk(KERN_ERR, conn,
  invalid R2T with data len %u at offset %u 
  and total length %d\n, r2t-data_length,
- r2t-data_offset, scsi_bufflen(ctask-sc));
+ r2t-data_offset, 
scsi_out(ctask-sc)-length);
__kfifo_put(tcp_ctask-r2tpool.queue, (void*)r2t,
sizeof(void*));
return ISCSI_ERR_DATALEN;
@@ -771,6 +772,7 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct 
iscsi_hdr *hdr)
if (tcp_conn-in.datalen) {
struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data;
struct hash_desc *rx_hash = NULL;
+   struct scsi_data_buffer *sdb = scsi_in(ctask-sc);
 
/*
 * Setup copy of Data-In into the Scsi_Cmnd
@@ -788,8 +790,8 @@ iscsi_tcp_hdr_dissect(struct iscsi_conn *conn, struct 
iscsi_hdr *hdr)
  tcp_ctask-data_offset,
  tcp_conn-in.datalen);
return iscsi_segment_seek_sg(tcp_conn-in.segment,
-scsi_sglist(ctask-sc),
-scsi_sg_count(ctask-sc),
+sdb-table.sgl,
+sdb-table.nents,
 tcp_ctask-data_offset,
 tcp_conn-in.datalen,
 iscsi_tcp_process_data_in,
@@ -1332,7 +1334,8 @@ iscsi_tcp_ctask_init(struct iscsi_cmd_task *ctask)
return 0;
 
/* If we have immediate data, attach a payload */
-   err = iscsi_tcp_send_data_prep(conn, scsi_sglist(sc), scsi_sg_count(sc),
+   err = iscsi_tcp_send_data_prep(conn, scsi_out(sc)-table.sgl,
+  scsi_out(sc)-table.nents,
   0, ctask-imm_count);
if (err)
return err;
@@ -1386,6 +1389,7 @@ iscsi_tcp_ctask_xmit(struct iscsi_conn *conn, struct 
iscsi_cmd_task *ctask)
 {
struct iscsi_tcp_cmd_task *tcp_ctask = ctask-dd_data;
struct scsi_cmnd *sc = ctask-sc

[PATCH 2/3 ver2] iscsi: bidi support - libiscsi

2008-02-18 Thread Boaz Harrosh


  iscsi bidi support at the generic libiscsi level
  - prepare the additional bidi_read rlength header.
  - access the right scsi_in() and/or scsi_out() side of things.
also for resid.
  - Handle BIDI underflow overflow from target

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
Reviewed-by: Pete Wyckoff [EMAIL PROTECTED]
Signed-off-by: Mike Christie [EMAIL PROTECTED]
---
 drivers/scsi/libiscsi.c |   85 ++
 1 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index a43b8ee..9c12915 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -176,6 +176,31 @@ static int iscsi_prep_ecdb_ahs(struct iscsi_cmd_task 
*ctask)
return 0;
 }
 
+static int iscsi_prep_bidi_ahs(struct iscsi_cmd_task *ctask)
+{
+   struct scsi_cmnd *sc = ctask-sc;
+   struct iscsi_rlength_ahdr *rlen_ahdr;
+   int rc;
+
+   rlen_ahdr = iscsi_next_hdr(ctask);
+   rc = iscsi_add_hdr(ctask, sizeof(*rlen_ahdr));
+   if (rc)
+   return rc;
+
+   rlen_ahdr-ahslength =
+   cpu_to_be16(sizeof(rlen_ahdr-read_length) +
+ sizeof(rlen_ahdr-reserved));
+   rlen_ahdr-ahstype = ISCSI_AHSTYPE_RLENGTH;
+   rlen_ahdr-reserved = 0;
+   rlen_ahdr-read_length = cpu_to_be32(scsi_in(sc)-length);
+
+   debug_scsi(bidi-in rlen_ahdr-read_length(%d) 
+  rlen_ahdr-ahslength(%d)\n,
+  be32_to_cpu(rlen_ahdr-read_length),
+  be16_to_cpu(rlen_ahdr-ahslength));
+   return 0;
+}
+
 /**
  * iscsi_prep_scsi_cmd_pdu - prep iscsi scsi cmd pdu
  * @ctask: iscsi cmd task
@@ -200,7 +225,6 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
hdr-flags = ISCSI_ATTR_SIMPLE;
int_to_scsilun(sc-device-lun, (struct scsi_lun *)hdr-lun);
hdr-itt = build_itt(ctask-itt, session-age);
-   hdr-data_length = cpu_to_be32(scsi_bufflen(sc));
hdr-cmdsn = cpu_to_be32(session-cmdsn);
session-cmdsn++;
hdr-exp_statsn = cpu_to_be32(conn-exp_statsn);
@@ -216,7 +240,15 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
memcpy(hdr-cdb, sc-cmnd, cmd_len);
 
ctask-imm_count = 0;
+   if (scsi_bidi_cmnd(sc)) {
+   hdr-flags |= ISCSI_FLAG_CMD_READ;
+   rc = iscsi_prep_bidi_ahs(ctask);
+   if (rc)
+   return rc;
+   }
if (sc-sc_data_direction == DMA_TO_DEVICE) {
+   unsigned out_len = scsi_out(sc)-length;
+   hdr-data_length = cpu_to_be32(out_len);
hdr-flags |= ISCSI_FLAG_CMD_WRITE;
/*
 * Write counters:
@@ -237,19 +269,19 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
ctask-unsol_datasn = 0;
 
if (session-imm_data_en) {
-   if (scsi_bufflen(sc) = session-first_burst)
+   if (out_len = session-first_burst)
ctask-imm_count = min(session-first_burst,
conn-max_xmit_dlength);
else
-   ctask-imm_count = min(scsi_bufflen(sc),
+   ctask-imm_count = min(out_len,
conn-max_xmit_dlength);
hton24(hdr-dlength, ctask-imm_count);
} else
zero_data(hdr-dlength);
 
if (!session-initial_r2t_en) {
-   ctask-unsol_count = min((session-first_burst),
-   (scsi_bufflen(sc))) - ctask-imm_count;
+   ctask-unsol_count = min(session-first_burst, out_len)
+- ctask-imm_count;
ctask-unsol_offset = ctask-imm_count;
}
 
@@ -259,6 +291,7 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
} else {
hdr-flags |= ISCSI_FLAG_CMD_FINAL;
zero_data(hdr-dlength);
+   hdr-data_length = cpu_to_be32(scsi_in(sc)-length);
 
if (sc-sc_data_direction == DMA_FROM_DEVICE)
hdr-flags |= ISCSI_FLAG_CMD_READ;
@@ -277,10 +310,12 @@ static int iscsi_prep_scsi_cmd_pdu(struct iscsi_cmd_task 
*ctask)
return EIO;
 
conn-scsicmd_pdus_cnt++;
-   debug_scsi(iscsi prep [%s cid %d sc %p cdb 0x%x itt 0x%x len %d 
-   cmdsn %d win %d]\n,
-   sc-sc_data_direction == DMA_TO_DEVICE ? write : read,
-   conn-id, sc, sc-cmnd[0], ctask-itt, scsi_bufflen(sc),
+   debug_scsi(iscsi prep [%s cid %d sc %p cdb 0x%x itt 0x%x 
+   len %d bidi_len %d cmdsn %d win %d]\n,
+   scsi_bidi_cmnd(sc

Re: [PATCH] 2.6.25-rc1-git2: GDT SCSI: change drivers/scsi/gdth.c into using pci_get device

2008-02-17 Thread Boaz Harrosh

On Sat, Feb 16 2008 at 18:37 +0200, Matthew Wilcox [EMAIL PROTECTED] wrote:
 On Wed, Feb 13, 2008 at 10:57:37AM +0200, Boaz Harrosh wrote:
 I still don't have a card for testing myself. Again anyone
 wants to send me a card. Intel people anybody home?
 
 Apparently Intel sold this line of cards to Adaptec.  The copyright
 notice in the file backs this up:
 
  * Copyright (C) 1995-06 ICP vortex GmbH, Achim Leubner *
  * Copyright (C) 2002-04 Intel Corporation  *
  * Copyright (C) 2003-06 Adaptec Inc.   *
  * [EMAIL PROTECTED]  *
 

OK I never got this guy to ping back. CCing [EMAIL PROTECTED]

Who is the right contact person, regarding the HW that is supported
by the gdth driver? Form what we see in driver commit logs (above), 
it was transfered to  Adaptec from Intel in 2003. Is that still so?

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Boaz Harrosh

On Sat, Feb 16 2008 at 2:41 +0200, FUJITA Tomonori [EMAIL PROTECTED] wrote:
 On Fri, 15 Feb 2008 14:50:57 -0800
 Tim Pepper [EMAIL PROTECTED] wrote:
 
 On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
 The first one is just reverting the data buffer accessors
 conversion. It would be nice if we could just revert it but we
 can't. These changes are necessary to compile the driver against post
 2.6.24.
 Fujita-san,

 Unfortunately (and not too surprisingly given what we've tried so far) with
 only the first of your series reverted the driver is working fine for me
 again.
 
 Do you mean that you applied only the following two patches against
 2.6.24, and then it doesn't work?
 
 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
 0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch
 
 If so, the second patch is broken. Did you saw BUG_ON message (I added
 some BUG_ON to the patch)?
 
 
 I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
 similar sounding issues with other drivers.  Could there be some memory
 uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
 but that didn't help.  Also that thread ties into pci gart.  The machines
 we've been using are liable to getting pci calgary although given my
 .config has:
 CONFIG_GART_IOMMU=y
 CONFIG_CALGARY_IOMMU=y
 # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
 and that when booting this mainline I don't see any Calgary related
 messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
 actually running the calgary iommu code in these repros.
 
 Yes, probabaly, your machine doesn't use any IOMMU hardware
 (nommu_map_sg function was in your crash log).
 
 
 Anyway, I greatly appreciate your efforts so far in trying to find what
 could be wrong here!
 
 Really sorry about the troubles and thanks for testing.
 -
Tomo hi
It looks like the same bug we had with USB's isd200 and protocol.c. An overflow
of a data buffer bigger then then the sglist. There 2 it was in the INQUIRY 
command.

You just need to also check for sg != NULL in the for() loop.

Tim Please test below patch. It's ontop of 2.6.24 but should also apply to
2.6.25-rcx

Boaz
-- 
From ec20bea25c9fe2400378b19c128b15fef3c7cbb6 Mon Sep 17 00:00:00 2001
From: Boaz Harrosh [EMAIL PROTECTED]
Date: Sun, 17 Feb 2008 14:50:25 +0200
Subject: [PATCH] ips: Avoid overflow in writing scsi command data

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/ips.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 5c5a9b2..1d12253 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -3517,7 +3517,7 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, 
unsigned int count)
 struct scatterlist *sg = scsi_sglist(scmd);
 
 for (i = 0, xfer_cnt = 0;
- (i  scsi_sg_count(scmd))  (xfer_cnt  count); i++) {
+ (i  scsi_sg_count(scmd))  (xfer_cnt  count)  sg; i++) {
 min_cnt = min(count - xfer_cnt, sg[i].length);
 
 /* kmap_atomic() ensures addressability of the data buffer.*/
-- 
1.5.3.3

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: ips.c broken since 2.6.23 on x86_64?

2008-02-17 Thread Boaz Harrosh

On Sun, Feb 17 2008 at 14:52 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Sat, Feb 16 2008 at 2:41 +0200, FUJITA Tomonori [EMAIL PROTECTED] wrote:
 On Fri, 15 Feb 2008 14:50:57 -0800
 Tim Pepper [EMAIL PROTECTED] wrote:

 On Sat 16 Feb at 01:09:43 +0900 [EMAIL PROTECTED] said:
 The first one is just reverting the data buffer accessors
 conversion. It would be nice if we could just revert it but we
 can't. These changes are necessary to compile the driver against post
 2.6.24.
 Fujita-san,

 Unfortunately (and not too surprisingly given what we've tried so far) with
 only the first of your series reverted the driver is working fine for me
 again.
 Do you mean that you applied only the following two patches against
 2.6.24, and then it doesn't work?

 0001-ips-revert-the-changes-for-the-data-buffer-accessor.patch
 0002-ips-kill-the-map_single-path-in-ips_scmd_buf_write.patch

 If so, the second patch is broken. Did you saw BUG_ON message (I added
 some BUG_ON to the patch)?


 I saw (eg: replies to http://lkml.org/lkml/2007/5/11/132) some possibly
 similar sounding issues with other drivers.  Could there be some memory
 uninitialised?  I did try changing all the ips.c kmalloc's to kzalloc's,
 but that didn't help.  Also that thread ties into pci gart.  The machines
 we've been using are liable to getting pci calgary although given my
 .config has:
 CONFIG_GART_IOMMU=y
 CONFIG_CALGARY_IOMMU=y
 # CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT is not set
 and that when booting this mainline I don't see any Calgary related
 messages like I get from eg: Ubuntu's 2.6.22-14-server...I'm probably not
 actually running the calgary iommu code in these repros.
 Yes, probabaly, your machine doesn't use any IOMMU hardware
 (nommu_map_sg function was in your crash log).


 Anyway, I greatly appreciate your efforts so far in trying to find what
 could be wrong here!
 Really sorry about the troubles and thanks for testing.
 -
 Tomo hi
 It looks like the same bug we had with USB's isd200 and protocol.c. An 
 overflow
 of a data buffer bigger then then the sglist. There 2 it was in the INQUIRY 
 command.
 
 You just need to also check for sg != NULL in the for() loop.
 
 Tim Please test below patch. It's ontop of 2.6.24 but should also apply to
 2.6.25-rcx
 
 Boaz
 -- 
 From ec20bea25c9fe2400378b19c128b15fef3c7cbb6 Mon Sep 17 00:00:00 2001
 From: Boaz Harrosh [EMAIL PROTECTED]
 Date: Sun, 17 Feb 2008 14:50:25 +0200
 Subject: [PATCH] ips: Avoid overflow in writing scsi command data
 
 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  drivers/scsi/ips.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)
 
 diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
 index 5c5a9b2..1d12253 100644
 --- a/drivers/scsi/ips.c
 +++ b/drivers/scsi/ips.c
 @@ -3517,7 +3517,7 @@ ips_scmd_buf_write(struct scsi_cmnd *scmd, void *data, 
 unsigned int count)

Disregard that patch it is totally wrong. That's not it. Sorry

Boaz

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: gdth new set of patches for 2.6.24 stable

2008-02-17 Thread Boaz Harrosh

On Thu, Feb 14 2008 at 20:47 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 Submitted are a new set of patches, that fix lots of problems
 with the gdth driver.
 
 It fixes the following problems:
 - scan for drives on hosts. (Already in mainline)
 - truly fixes the exit/reboot problems but does call flush() before
   reboot.
 - fix crash when accessing array with icpcon management application.
 - fix crash when doing $ cat /proc/sys/gdth/0.
   This one still has the below WARN_ON in messages (see gdth_info below)
   So there is one more thing hiding in there.
 - use pci_get_device
   One of the testers requested if we can also put the move to pci_get_device 
   patch with removal of dependency on PCI_LEGACY, to the stable release.
 
 The patches are for and based on Linux-2.6.24. here is the list of patches:
   [PATCH 1/5] gdth: update deprecated pci_find_device
   [PATCH 2/5] gdth: scan for scsi devices
   [PATCH 3/5] gdth: bugfix for the at-exit problems
   [PATCH 4/5] gdth: fix to internal commands execution
   [PATCH 5/5] gdth: remove gdth cooked up command accessors
 
 Please all test and report your findings.
 
 Thanks in advance
 Boaz
 
 ---
 gdth_info
   WARNING: at arch/x86/kernel/pci-dma_32.c:66 dma_free_coherent()
   Pid: 5501, comm: cat Not tainted 2.6.24 #43
[c0107137] dma_free_coherent+0x93/0x95
[c025ef73] gdth_ioctl_free+0x4c/0x69
[c0264a36] gdth_proc_info+0x165f/0x182c
[c0111f7a] update_curr+0xeb/0xf2
[c01132aa] task_rq_lock+0x29/0x50
[c0113706] try_to_wake_up+0x42/0x342
[c0113706] try_to_wake_up+0x42/0x342
[c0111a9f] __wake_up_common+0x46/0x6d
[c0113569] __wake_up+0x32/0x42
[c022dad9] n_tty_receive_buf+0x2e8/0xe97
[c022dad9] n_tty_receive_buf+0x2e8/0xe97
[c0111f0a] update_curr+0x7b/0xf2
[c0112625] enqueue_task_fair+0x27/0x30
[c0111783] enqueue_task+0xa/0x14
[c025e351] proc_scsi_read+0x29/0x3d
[c025e328] proc_scsi_read+0x0/0x3d
[c0189704] proc_file_read+0x1c6/0x279
[c018953e] proc_file_read+0x0/0x279
[c0185eca] proc_reg_read+0x53/0x71
[c0185e77] proc_reg_read+0x0/0x71
[c0159968] vfs_read+0x85/0x11b
[c0159d9d] sys_read+0x41/0x6a
[c0102822] sysenter_past_esp+0x5f/0x85
  /gdth_info
 -
James hi.

All my testers have reported back that with these 5 patches applied they can
now run with a 2.6.24 kernel the same way they ran before. However there is
that reported issue, with the dma_free_coherent WARN_ON (above). The code was 
like that from day one and it is a very old issue, however it is a regression 
because 2.6.24 introduced that new WARN_ON.
(infamous commit aa24886e379d2b641c5117e178b15ce1d5d366ba)
From posts on lkml and even recent one in linux-scsi about the arcmsr driver
it looks that all a driver can do is work around it with different kernel 
mechanisms
and driver rewrites. I'm afraid I need your help here. I'm not sure I understand
why does the gdth driver uses the pci_{alloc,free}_consistent() API's, and what
is needed to replace it. Could you please have a look in gdth_proc.c and also in
gdth.c for all the places that call gdth_ioctl_alloc/gdth_ioctl_free, and advise
what can I do in it's place. Please bear in mind that we need it for 2.6.24, as
a bugfix.

Apart from the above issue, please accept patches 3,4,5 above they have now
been tested and are reported to bring broken system back to production.
(Given that you approve off course). And mark them for inclusion to the
2.6.24 stable releases. (Or is there some thing that I should do)

---
Meanwhile on x86 systems I understand the WARN_ON is cosmetic, and does not
pose any harm. Some people have reported stability with temporarily disabling
it. For testers that want to try, here it is below. At your own risk.

---
From 50d3657bf6a138ee63ad1ce00052380edc75ace7 Mon Sep 17 00:00:00 2001
From: Boaz Harrosh [EMAIL PROTECTED]
Date: Sun, 17 Feb 2008 12:49:35 +0200
Subject: [PATCH] gdth: Hack to remove WARN_ON in arch/x86/kernel/pci-dma_32.c

  gdth uses dma_free_coherent() with interrupts disabled. Which
  is not portable, but is safe on the HW that supports gdth.

NOT Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 arch/x86/kernel/pci-dma_32.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/pci-dma_32.c b/arch/x86/kernel/pci-dma_32.c
index 5133032..350dcfd 100644
--- a/arch/x86/kernel/pci-dma_32.c
+++ b/arch/x86/kernel/pci-dma_32.c
@@ -63,7 +63,7 @@ void dma_free_coherent(struct device *dev, size_t size,
struct dma_coherent_mem *mem = dev ? dev-dma_mem : NULL;
int order = get_order(size);
 
-   WARN_ON(irqs_disabled());   /* for portability */
+/* WARN_ON(irqs_disabled());*/ /* for portability */
if (mem  vaddr = mem-virt_base  vaddr  (mem-virt_base + 
(mem-size  PAGE_SHIFT))) {
int page = (vaddr - mem-virt_base)  PAGE_SHIFT;
 
-- 
1.5.3.3
 
 




-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-14 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 19:36 +0200, James Bottomley [EMAIL PROTECTED] wrote:
snip
 ---
 From: Boaz Harrosh [EMAIL PROTECTED]
 Subject: [PATCH] gdth: bugfix for the at-exit problems

 gdth_exit would first remove all cards then stop the timer
 and would not sync with the timer function. This caused a crash
 in gdth_timer() when module was unloaded.
 So del_timer_sync the timer before we delete the cards.

 also the reboot notifier function would crash. So unify
 the exit and halt functions with a gdth_shutdown() that's
 called by both.

 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
snip
 +static struct notifier_block gdth_notifier = {
 +gdth_halt, NULL, 0
 +};
 +
 +bool gdth_shutdown_done;

right forgot the static. But I use it in gdth_init(), so it
must be external. Unless you promise me that gdth_init() will
never ever be called after a call to shutdown.
Any way the hot-plug patch changes all that. This is only
for 2.6.24 bugfixs.

 Static police alert!  Just make it static and move it into
 gdth_shutdown()

 +static void gdth_shutdown()
 +{
 +gdth_ha_str *ha;
 +if (gdth_shutdown_done)
 +return;
 +
 +gdth_shutdown_done = true;
 +unregister_chrdev(major,gdth);
 +unregister_reboot_notifier(gdth_notifier);

 I'm not sure you can do this, aren't reboot notifiers called with the
 rwsem held?  In which case the unregister which also takes the rwsem
 will hang the system.

humm, can't remove a notifier from within the notifier. Thanks James for 
the catch, it's what happens when you don't test your own patches.

I have moved unregister_reboot_notifier to gdth_exit.
 James

Will send a new version for review. Please note that this is a bugfix patch
on top of 2.6.24. It is not needed for Jeff's hot-plug path.

There will be one more bugfix patch for a crash at the user-mode ioctl code.

Boaz

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] gdth: bugfix for the at-exit problems

2008-02-14 Thread Boaz Harrosh


This is a bugfix for the 2.6.24.x stable releases.

gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.

also the reboot notifier function would crash. So unify
the exit and halt functions with a gdth_shutdown() that's
called by both.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   90 --
 1 files changed, 36 insertions(+), 54 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 8eb78be..3828b23 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -183,7 +183,6 @@ static int gdth_ioctl(struct inode *inode, struct file 
*filep,
   unsigned int cmd, unsigned long arg);
 
 static void gdth_flush(gdth_ha_str *ha);
-static int gdth_halt(struct notifier_block *nb, ulong event, void *buf);
 static int gdth_queuecommand(Scsi_Cmnd *scp,void (*done)(Scsi_Cmnd *));
 static int __gdth_queuecommand(gdth_ha_str *ha, struct scsi_cmnd *scp,
struct gdth_cmndinfo *cmndinfo);
@@ -418,12 +417,6 @@ static inline void gdth_set_sglist(struct scsi_cmnd *cmd,
 #include gdth_proc.h
 #include gdth_proc.c
 
-/* notifier block to get a notify on system shutdown/halt/reboot */
-static struct notifier_block gdth_notifier = {
-gdth_halt, NULL, 0
-};
-static int notifier_disabled = 0;
-
 static gdth_ha_str *gdth_find_ha(int hanum)
 {
gdth_ha_str *ha;
@@ -3793,6 +3786,8 @@ static void gdth_timeout(ulong data)
 gdth_ha_str *ha;
 ulong flags;
 
+BUG_ON(list_empty(gdth_instances));
+
 ha = list_first_entry(gdth_instances, gdth_ha_str, list);
 spin_lock_irqsave(ha-smp_lock, flags);
 
@@ -4668,45 +4663,6 @@ static void gdth_flush(gdth_ha_str *ha)
 }
 }
 
-/* shutdown routine */
-static int gdth_halt(struct notifier_block *nb, ulong event, void *buf)
-{
-gdth_ha_str *ha;
-#ifndef __alpha__
-gdth_cmd_strgdtcmd;
-charcmnd[MAX_COMMAND_SIZE];   
-#endif
-
-if (notifier_disabled)
-return NOTIFY_OK;
-
-TRACE2((gdth_halt() event %d\n,(int)event));
-if (event != SYS_RESTART  event != SYS_HALT  event != SYS_POWER_OFF)
-return NOTIFY_DONE;
-
-notifier_disabled = 1;
-printk(GDT-HA: Flushing all host drives .. );
-list_for_each_entry(ha, gdth_instances, list) {
-gdth_flush(ha);
-
-#ifndef __alpha__
-/* controller reset */
-memset(cmnd, 0xff, MAX_COMMAND_SIZE);
-gdtcmd.BoardNode = LOCALBOARD;
-gdtcmd.Service = CACHESERVICE;
-gdtcmd.OpCode = GDT_RESET;
-TRACE2((gdth_halt(): reset controller %d\n, ha-hanum));
-gdth_execute(ha-shost, gdtcmd, cmnd, 10, NULL);
-#endif
-}
-printk(Done.\n);
-
-#ifdef GDTH_STATISTICS
-del_timer(gdth_timer);
-#endif
-return NOTIFY_OK;
-}
-
 /* configure lun */
 static int gdth_slave_configure(struct scsi_device *sdev)
 {
@@ -5141,13 +5097,13 @@ static void gdth_remove_one(gdth_ha_str *ha)
 
scsi_remove_host(shp);
 
+   gdth_flush(ha);
+
if (ha-sdev) {
scsi_free_host_dev(ha-sdev);
ha-sdev = NULL;
}
 
-   gdth_flush(ha);
-
if (shp-irq)
free_irq(shp-irq,ha);
 
@@ -5173,6 +5129,22 @@ static void gdth_remove_one(gdth_ha_str *ha)
scsi_host_put(shp);
 }
 
+static void gdth_shutdown(void);
+static int gdth_halt(struct notifier_block *nb, ulong event, void *buf)
+{
+   TRACE2((gdth_halt() event %d\n, (int)event));
+   if (event != SYS_RESTART  event != SYS_HALT  event != SYS_POWER_OFF)
+   return NOTIFY_DONE;
+
+   gdth_shutdown();
+   return NOTIFY_OK;
+}
+
+static struct notifier_block gdth_notifier = {
+gdth_halt, NULL, 0
+};
+static bool gdth_shutdown_done;
+
 static int __init gdth_init(void)
 {
if (disable) {
@@ -5185,6 +5157,7 @@ static int __init gdth_init(void)
   GDTH_VERSION_STR);
 
/* initializations */
+   gdth_shutdown_done = false;
gdth_polling = TRUE;
gdth_clear_events();
 
@@ -5235,23 +5208,32 @@ static int __init gdth_init(void)
add_timer(gdth_timer);
 #endif
major = register_chrdev(0,gdth, gdth_fops);
-   notifier_disabled = 0;
register_reboot_notifier(gdth_notifier);
gdth_polling = FALSE;
return 0;
 }
 
-static void __exit gdth_exit(void)
+static void gdth_shutdown()
 {
gdth_ha_str *ha;
 
-   list_for_each_entry(ha, gdth_instances, list)
-   gdth_remove_one(ha);
+   if (gdth_shutdown_done)
+   return;
+
+   gdth_shutdown_done = true;
+   unregister_chrdev(major, gdth);
 
 #ifdef GDTH_STATISTICS
-   del_timer(gdth_timer);
+   del_timer_sync(gdth_timer);
 #endif
-   unregister_chrdev(major,gdth);
+
+   list_for_each_entry(ha, gdth_instances

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-14 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 21:38 +0200, Jan Engelhardt [EMAIL PROTECTED] wrote:
 On Feb 13 2008 11:03, Boaz Harrosh wrote:
 I've tested this patch now - and it works fine. Now rmmod, halt and 
 reboot also works.

 Stefan Priebe

 This is grate news Stefan. Thank you very much for all your time
 and effort, with out we could not have fixed all this.
 
 Do you have a git tree with the latest pieces?
No, scsi-misc I guess ;)

I could put it here:
git://git.bhalevy.com/open-osd gdth

branch give me an hours

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] gdth: bugfix for the at-exit problems

2008-02-14 Thread Boaz Harrosh

On Thu, Feb 14 2008 at 18:10 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Thu, 2008-02-14 at 13:58 +0200, Boaz Harrosh wrote:
 This is a bugfix for the 2.6.24.x stable releases.

 gdth_exit would first remove all cards then stop the timer
 and would not sync with the timer function. This caused a crash
 in gdth_timer() when module was unloaded.
 So del_timer_sync the timer before we delete the cards.

 also the reboot notifier function would crash. So unify
 the exit and halt functions with a gdth_shutdown() that's
 called by both.
 
 The patch looks fine now, thanks.  Can we actually get a tester just to
 make sure there's nothing I missed.
 
 James
 
 

Yes, and the tester reported, a breakage. We are on it.
Apparently, you cannot do a full deallocation of resources
at reboot notifier, nor would you want to I guess.

But you can do the flush. The exit call is never called
on a reboot and the card access is valid to the end.

Please comment?

So I pretty much reverted that patch, but did leave some
cleanups.

Also we found the other problems reported with user-mode tools
and cat /proc/sys/gdth/0

so 2 patches on the way above reverted. Give us a few ours to test
every thing.

Thanks
Boaz

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

gdth new set of patches for 2.6.24 stable

2008-02-14 Thread Boaz Harrosh

Submitted are a new set of patches, that fix lots of problems
with the gdth driver.

It fixes the following problems:
- scan for drives on hosts. (Already in mainline)
- truly fixes the exit/reboot problems but does call flush() before
  reboot.
- fix crash when accessing array with icpcon management application.
- fix crash when doing $ cat /proc/sys/gdth/0.
  This one still has the below WARN_ON in messages (see gdth_info below)
  So there is one more thing hiding in there.
- use pci_get_device
  One of the testers requested if we can also put the move to pci_get_device 
  patch with removal of dependency on PCI_LEGACY, to the stable release.

The patches are for and based on Linux-2.6.24. here is the list of patches:
  [PATCH 1/5] gdth: update deprecated pci_find_device
  [PATCH 2/5] gdth: scan for scsi devices
  [PATCH 3/5] gdth: bugfix for the at-exit problems
  [PATCH 4/5] gdth: fix to internal commands execution
  [PATCH 5/5] gdth: remove gdth cooked up command accessors

Please all test and report your findings.

Thanks in advance
Boaz

---
gdth_info
  WARNING: at arch/x86/kernel/pci-dma_32.c:66 dma_free_coherent()
  Pid: 5501, comm: cat Not tainted 2.6.24 #43
   [c0107137] dma_free_coherent+0x93/0x95
   [c025ef73] gdth_ioctl_free+0x4c/0x69
   [c0264a36] gdth_proc_info+0x165f/0x182c
   [c0111f7a] update_curr+0xeb/0xf2
   [c01132aa] task_rq_lock+0x29/0x50
   [c0113706] try_to_wake_up+0x42/0x342
   [c0113706] try_to_wake_up+0x42/0x342
   [c0111a9f] __wake_up_common+0x46/0x6d
   [c0113569] __wake_up+0x32/0x42
   [c022dad9] n_tty_receive_buf+0x2e8/0xe97
   [c022dad9] n_tty_receive_buf+0x2e8/0xe97
   [c0111f0a] update_curr+0x7b/0xf2
   [c0112625] enqueue_task_fair+0x27/0x30
   [c0111783] enqueue_task+0xa/0x14
   [c025e351] proc_scsi_read+0x29/0x3d
   [c025e328] proc_scsi_read+0x0/0x3d
   [c0189704] proc_file_read+0x1c6/0x279
   [c018953e] proc_file_read+0x0/0x279
   [c0185eca] proc_reg_read+0x53/0x71
   [c0185e77] proc_reg_read+0x0/0x71
   [c0159968] vfs_read+0x85/0x11b
   [c0159d9d] sys_read+0x41/0x6a
   [c0102822] sysenter_past_esp+0x5f/0x85
 /gdth_info
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 4/5] gdth: fix to internal commands execution

2008-02-14 Thread Boaz Harrosh


The recent patch named:
  [SCSI] gdth: !use_sg cleanup and use of scsi accessors

has done a bad job in handling internal commands issued by gdth_execute().

Internal commands are issued with device gdth_cmd_str ready made directly
to the card, without any mapping or translations of scsi commands. So here
I added a gdth_cmd_str pointer to the gdth_cmndinfo private structure which
is then copied directly to host.

following this patch is a cleanup that removes the home cooked accessors
and reverts them to regular scsi_cmnd accessors. Since they are not used
anymore. After review maybe the 2 patches should be squashed together.

FIXME: There is still a problem with gdth_get_info(). as reported there
   is a WARN_ON trigerd in dma_free_coherent() when doing:
   $ cat /proc/sys/gdth/0

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   30 --
 drivers/scsi/gdth.h |1 +
 2 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 0979553..077f972 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -160,7 +160,7 @@ static void gdth_readapp_event(gdth_ha_str *ha, unchar 
application,
 static void gdth_clear_events(void);
 
 static void gdth_copy_internal_data(gdth_ha_str *ha, Scsi_Cmnd *scp,
-char *buffer, ushort count, int to_buffer);
+char *buffer, ushort count);
 static int gdth_internal_cache_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp);
 static int gdth_fill_cache_cmd(gdth_ha_str *ha, Scsi_Cmnd *scp, ushort hdrive);
 
@@ -439,8 +439,8 @@ static struct gdth_cmndinfo *gdth_get_cmndinfo(gdth_ha_str 
*ha)
for (i=0; iGDTH_MAXCMDS; ++i) {
if (ha-cmndinfo[i].index == 0) {
priv = ha-cmndinfo[i];
-   priv-index = i+1;
memset(priv, 0, sizeof(*priv));
+   priv-index = i+1;
break;
}
}
@@ -487,7 +487,6 @@ int __gdth_execute(struct scsi_device *sdev, gdth_cmd_str 
*gdtcmd, char *cmnd,
 gdth_ha_str *ha = shost_priv(sdev-host);
 Scsi_Cmnd *scp;
 struct gdth_cmndinfo cmndinfo;
-struct scatterlist one_sg;
 DECLARE_COMPLETION_ONSTACK(wait);
 int rval;
 
@@ -501,13 +500,10 @@ int __gdth_execute(struct scsi_device *sdev, gdth_cmd_str 
*gdtcmd, char *cmnd,
 /* use request field to save the ptr. to completion struct. */
 scp-request = (struct request *)wait;
 scp-timeout_per_command = timeout*HZ;
-sg_init_one(one_sg, gdtcmd, sizeof(*gdtcmd));
-gdth_set_sglist(scp, one_sg);
-gdth_set_sg_count(scp, 1);
-gdth_set_bufflen(scp, sizeof(*gdtcmd));
 scp-cmd_len = 12;
 memcpy(scp-cmnd, cmnd, 12);
 cmndinfo.priority = IOCTL_PRI;
+cmndinfo.internal_cmd_str = gdtcmd;
 cmndinfo.internal_command = 1;
 
 TRACE((__gdth_execute() cmd 0x%x\n, scp-cmnd[0]));
@@ -2351,7 +2347,7 @@ static void gdth_next(gdth_ha_str *ha)
  * buffers, kmap_atomic() as needed.
  */
 static void gdth_copy_internal_data(gdth_ha_str *ha, Scsi_Cmnd *scp,
-char *buffer, ushort count, int to_buffer)
+char *buffer, ushort count)
 {
 ushort cpcount,i, max_sg = gdth_sg_count(scp);
 ushort cpsum,cpnow;
@@ -2377,10 +2373,7 @@ static void gdth_copy_internal_data(gdth_ha_str *ha, 
Scsi_Cmnd *scp,
 }
 local_irq_save(flags);
 address = kmap_atomic(sg_page(sl), KM_BIO_SRC_IRQ) + sl-offset;
-if (to_buffer)
-memcpy(buffer, address, cpnow);
-else
-memcpy(address, buffer, cpnow);
+memcpy(address, buffer, cpnow);
 flush_dcache_page(sg_page(sl));
 kunmap_atomic(address, KM_BIO_SRC_IRQ);
 local_irq_restore(flags);
@@ -2434,7 +2427,7 @@ static int gdth_internal_cache_cmd(gdth_ha_str *ha, 
Scsi_Cmnd *scp)
 strcpy(inq.vendor,ha-oem_name);
 sprintf(inq.product,Host Drive  #%02d,t);
 strcpy(inq.revision,   );
-gdth_copy_internal_data(ha, scp, (char*)inq, sizeof(gdth_inq_data), 
0);
+gdth_copy_internal_data(ha, scp, (char*)inq, sizeof(gdth_inq_data));
 break;
 
   case REQUEST_SENSE:
@@ -2444,7 +2437,7 @@ static int gdth_internal_cache_cmd(gdth_ha_str *ha, 
Scsi_Cmnd *scp)
 sd.key   = NO_SENSE;
 sd.info  = 0;
 sd.add_length= 0;
-gdth_copy_internal_data(ha, scp, (char*)sd, sizeof(gdth_sense_data), 
0);
+gdth_copy_internal_data(ha, scp, (char*)sd, sizeof(gdth_sense_data));
 break;
 
   case MODE_SENSE:
@@ -2456,7 +2449,7 @@ static int gdth_internal_cache_cmd(gdth_ha_str *ha, 
Scsi_Cmnd *scp)
 mpd.bd.block_length[0] = (SECTOR_SIZE  0x00ff)  16;
 mpd.bd.block_length[1] = (SECTOR_SIZE  0xff00)  8;
 mpd.bd.block_length[2] = (SECTOR_SIZE

[PATCH 3/5] gdth: bugfix for the at-exit problems

2008-02-14 Thread Boaz Harrosh


This is a bugfix for the 2.6.24.x stable releases.

gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.

also the reboot notifier function would crash. So clean
that up and fix the crashes.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   82 +-
 1 files changed, 28 insertions(+), 54 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 4959037..0979553 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -183,7 +183,6 @@ static int gdth_ioctl(struct inode *inode, struct file 
*filep,
   unsigned int cmd, unsigned long arg);
 
 static void gdth_flush(gdth_ha_str *ha);
-static int gdth_halt(struct notifier_block *nb, ulong event, void *buf);
 static int gdth_queuecommand(Scsi_Cmnd *scp,void (*done)(Scsi_Cmnd *));
 static int __gdth_queuecommand(gdth_ha_str *ha, struct scsi_cmnd *scp,
struct gdth_cmndinfo *cmndinfo);
@@ -418,12 +417,6 @@ static inline void gdth_set_sglist(struct scsi_cmnd *cmd,
 #include gdth_proc.h
 #include gdth_proc.c
 
-/* notifier block to get a notify on system shutdown/halt/reboot */
-static struct notifier_block gdth_notifier = {
-gdth_halt, NULL, 0
-};
-static int notifier_disabled = 0;
-
 static gdth_ha_str *gdth_find_ha(int hanum)
 {
gdth_ha_str *ha;
@@ -3796,6 +3789,8 @@ static void gdth_timeout(ulong data)
 gdth_ha_str *ha;
 ulong flags;
 
+BUG_ON(list_empty(gdth_instances));
+
 ha = list_first_entry(gdth_instances, gdth_ha_str, list);
 spin_lock_irqsave(ha-smp_lock, flags);
 
@@ -4671,45 +4666,6 @@ static void gdth_flush(gdth_ha_str *ha)
 }
 }
 
-/* shutdown routine */
-static int gdth_halt(struct notifier_block *nb, ulong event, void *buf)
-{
-gdth_ha_str *ha;
-#ifndef __alpha__
-gdth_cmd_strgdtcmd;
-charcmnd[MAX_COMMAND_SIZE];   
-#endif
-
-if (notifier_disabled)
-return NOTIFY_OK;
-
-TRACE2((gdth_halt() event %d\n,(int)event));
-if (event != SYS_RESTART  event != SYS_HALT  event != SYS_POWER_OFF)
-return NOTIFY_DONE;
-
-notifier_disabled = 1;
-printk(GDT-HA: Flushing all host drives .. );
-list_for_each_entry(ha, gdth_instances, list) {
-gdth_flush(ha);
-
-#ifndef __alpha__
-/* controller reset */
-memset(cmnd, 0xff, MAX_COMMAND_SIZE);
-gdtcmd.BoardNode = LOCALBOARD;
-gdtcmd.Service = CACHESERVICE;
-gdtcmd.OpCode = GDT_RESET;
-TRACE2((gdth_halt(): reset controller %d\n, ha-hanum));
-gdth_execute(ha-shost, gdtcmd, cmnd, 10, NULL);
-#endif
-}
-printk(Done.\n);
-
-#ifdef GDTH_STATISTICS
-del_timer(gdth_timer);
-#endif
-return NOTIFY_OK;
-}
-
 /* configure lun */
 static int gdth_slave_configure(struct scsi_device *sdev)
 {
@@ -5144,13 +5100,13 @@ static void gdth_remove_one(gdth_ha_str *ha)
 
scsi_remove_host(shp);
 
+   gdth_flush(ha);
+
if (ha-sdev) {
scsi_free_host_dev(ha-sdev);
ha-sdev = NULL;
}
 
-   gdth_flush(ha);
-
if (shp-irq)
free_irq(shp-irq,ha);
 
@@ -5176,6 +5132,24 @@ static void gdth_remove_one(gdth_ha_str *ha)
scsi_host_put(shp);
 }
 
+static int gdth_halt(struct notifier_block *nb, ulong event, void *buf)
+{
+   gdth_ha_str *ha;
+
+   TRACE2((gdth_halt() event %d\n, (int)event));
+   if (event != SYS_RESTART  event != SYS_HALT  event != SYS_POWER_OFF)
+   return NOTIFY_DONE;
+
+   list_for_each_entry(ha, gdth_instances, list)
+   gdth_flush(ha);
+
+   return NOTIFY_OK;
+}
+
+static struct notifier_block gdth_notifier = {
+gdth_halt, NULL, 0
+};
+
 static int __init gdth_init(void)
 {
if (disable) {
@@ -5238,7 +5212,6 @@ static int __init gdth_init(void)
add_timer(gdth_timer);
 #endif
major = register_chrdev(0,gdth, gdth_fops);
-   notifier_disabled = 0;
register_reboot_notifier(gdth_notifier);
gdth_polling = FALSE;
return 0;
@@ -5248,14 +5221,15 @@ static void __exit gdth_exit(void)
 {
gdth_ha_str *ha;
 
-   list_for_each_entry(ha, gdth_instances, list)
-   gdth_remove_one(ha);
+   unregister_chrdev(major, gdth);
+   unregister_reboot_notifier(gdth_notifier);
 
 #ifdef GDTH_STATISTICS
-   del_timer(gdth_timer);
+   del_timer_sync(gdth_timer);
 #endif
-   unregister_chrdev(major,gdth);
-   unregister_reboot_notifier(gdth_notifier);
+
+   list_for_each_entry(ha, gdth_instances, list)
+   gdth_remove_one(ha);
 }
 
 module_init(gdth_init);
-- 
1.5.3.3


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http

Re: [PATCH] 2.6.25-rc1-git2: GDT SCSI: change drivers/scsi/gdth.c into using pci_get device

2008-02-13 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 2:17 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Tue, 2008-02-12 at 20:48 -0300, Sergio Luis wrote:
 reposting an updated version of it. Please check if it's ok.
 
 Looks fine, thanks!  You added an extra space at the end of 
 
 while ((pdev = pci_get_device(vendor, device, pdev)) 
 
 Which I fixed.  Unfortunately checkpatch isn't very helpful for this
 driver since it uses spaces not tabs everywhere, which checkpatch really
 hates.
 
 James
 
 
James hi
This patch is now obsolete. After Jeff's patch to convert it to 
hot plug API. Jeffs patch looks very good to me. I will try
to push it through the testers.

I still don't have a card for testing myself. Again anyone
wants to send me a card. Intel people anybody home?

I will add the missing Kconfig hunk to jeff's patch and will
push it after it is tested.

Thanks
Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 9:06 +0200, Stefan Priebe - allied internet ag [EMAIL 
PROTECTED] wrote:
 Hello!
 
 I've tested this patch now - and it works fine. Now rmmod, halt and 
 reboot also works.
 
 Stefan Priebe
 
This is grate news Stefan. Thank you very much for all your time
and effort, with out we could not have fixed all this.

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Fwd: tested your patch to gdth]

2008-02-13 Thread Boaz Harrosh

I have received below message from a tester. Can any one please help me
with the /proc/scsi/gdth/0 problem. Where should I look?

Boaz 

On Wed, Feb 13 2008 at 8:39 +0200, Tobias Oetiker [EMAIL PROTECTED] wrote:
 Hi Boaz,
 
 I just tested the patch to GDTH you published yesterday on LKML.
 With it I can boot of GDTH and things seem to work. Except when I
 do a cat to /proc/scsi/gdth/0 then the kernel panics and the
 machine reboots.
 
 cheers
 tobi
 

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Subject: [PATCH 1/3] Let scsi_cmnd-cmnd use request-cmd buffer

2008-02-13 Thread Boaz Harrosh

On Tue, Feb 12 2008 at 21:41 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Sun, 2008-02-10 at 21:05 +0200, Boaz Harrosh wrote:
 - struct scsi_cmnd had a 16 bytes command buffer of its own.
   This is an unnecessary duplication and copy of request's
   cmd. It is probably left overs from the time that scsi_cmnd
   could function without a request attached. So clean that up.

 - Once above is done, few places, apart from scsi-ml, needed
   adjustments due to changing the data type of scsi_cmnd-cmnd.

 - Lots of drivers still use MAX_COMMAND_SIZE. So I have left
   that #define but equate it to BLK_MAX_CDB. The way I see it
   and is reflected in the patch below is.
   MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB
  as per the SCSI standard and is not related
  to the implementation.
   BLK_MAX_CDB. - The allocated space at the request level

 (*)fixed-length here means commands that their size can be determined
by their opcode and the CDB does not carry a length specifier, like
the VARIABLE_LENGTH_CMD(0x7f) command. This is actually not exactly
true and the SCSI standard also defines extended commands and
vendor specific commands that can be bigger than 16 bytes. The kernel
will support these using the same infrastructure used for VARLEN CDB's.
So in effect MAX_COMMAND_SIZE means the maximum size command
scsi-ml supports without specifying a cmd_len by ULD's
 
 When we do this, what happens to the minority of drivers that need the
 command in DMAable memory ... or have you audited them all and we can
 now dump the DMA pool allocation for SCSI commands?
 
 James
 
 

Am I right in assuming that I only need to audited the drivers that
have .unchecked_isa_dma set? I will redo this audit again, and report
back.

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh

On Tue, Feb 12 2008 at 19:40 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 gdth _exit would first remove all cards then stop the timer
 and would not sync with the timer function. This caused a crash
 in gdth_timer() when module was unloaded.
 
 del_timer_sync the timer before we delete the cards.
 
 NOT YET TESTED
 
 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]

Tested-by: Stefan Priebe [EMAIL PROTECTED]

 ---
  drivers/scsi/gdth.c |   15 ---
  1 files changed, 8 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
 index 8eb78be..103280e 100644
 --- a/drivers/scsi/gdth.c
 +++ b/drivers/scsi/gdth.c
 @@ -3793,6 +3793,8 @@ static void gdth_timeout(ulong data)
  gdth_ha_str *ha;
  ulong flags;
  
 +BUG_ON(list_empty(gdth_instances));
 +
  ha = list_first_entry(gdth_instances, gdth_ha_str, list);
  spin_lock_irqsave(ha-smp_lock, flags);
  
 @@ -5146,8 +5148,6 @@ static void gdth_remove_one(gdth_ha_str *ha)
   ha-sdev = NULL;
   }
  
 - gdth_flush(ha);
 -
   if (shp-irq)
   free_irq(shp-irq,ha);
  
 @@ -5245,14 +5245,15 @@ static void __exit gdth_exit(void)
  {
   gdth_ha_str *ha;
  
 - list_for_each_entry(ha, gdth_instances, list)
 - gdth_remove_one(ha);
 + unregister_chrdev(major,gdth);
 + unregister_reboot_notifier(gdth_notifier);
  
  #ifdef GDTH_STATISTICS
 - del_timer(gdth_timer);
 + del_timer_sync(gdth_timer);
  #endif
 - unregister_chrdev(major,gdth);
 - unregister_reboot_notifier(gdth_notifier);
 +
 + list_for_each_entry(ha, gdth_instances, list)
 + gdth_remove_one(ha);
  }
  
  module_init(gdth_init);

James please put this patch in rc-fixes also. It has now been tested
by few people, and it solves a reproducible problem in the unloading
of the driver.

It was not yet confirmed by Andrew's reporter with the:
+if (list_empty(gdth_instances))
+   return;

at gdth_timer() In -mm tree. In my patch I have converted the if() to a 
BUG_ON because now it should not happen. But I figure it is not worse then 
what there is now, which is nothing.

With your recommendation I will push both patches to the stable branches
People have emailed me requesting it.

Thanks
Boaz
 
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] gdth: convert to PCI hotplug API

2008-02-13 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 1:49 +0200, Jeff Garzik [EMAIL PROTECTED] wrote:
 Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
 ---
  drivers/scsi/gdth.c |  143 
 +++-
  1 file changed, 86 insertions(+), 57 deletions(-)
 
snip
below is the same exact patch rebased on my last two bugfixes.
(Already in scsi-rc-fixes)

do you intend this to be pushed into 2.6.25-rcx or this is already
for 2.6.26? Should we put this in -mm tree for testing?

Boaz

THIS IS NOT YET TESTED

---
From: Jeff Garzik [EMAIL PROTECTED]
Date: Wed, 13 Feb 2008 13:01:16 +0200
Subject: [PATCH] gdth: convert to PCI hotplug API

Signed-off-by: Jeff Garzik [EMAIL PROTECTED]
Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |  143 ++
 1 files changed, 86 insertions(+), 57 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 103280e..107d2c7 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -596,85 +596,107 @@ static int __init gdth_search_isa(ulong32 bios_adr)
 #endif /* CONFIG_ISA */
 
 #ifdef CONFIG_PCI
-static void gdth_search_dev(gdth_pci_str *pcistr, ushort *cnt,
-ushort vendor, ushort dev);
+static gdth_pci_str gdth_pcistr[MAXHA];
+static int gdth_pci_cnt;
+static bool gdth_pci_registered;
 
-static int __init gdth_search_pci(gdth_pci_str *pcistr)
+static bool __init gdth_search_vortex(ushort device)
 {
-ushort device, cnt;
-
-TRACE((gdth_search_pci()\n));
-
-cnt = 0;
-for (device = 0; device = PCI_DEVICE_ID_VORTEX_GDT6555; ++device)
-gdth_search_dev(pcistr, cnt, PCI_VENDOR_ID_VORTEX, device);
-for (device = PCI_DEVICE_ID_VORTEX_GDT6x17RP; 
- device = PCI_DEVICE_ID_VORTEX_GDTMAXRP; ++device)
-gdth_search_dev(pcistr, cnt, PCI_VENDOR_ID_VORTEX, device);
-gdth_search_dev(pcistr, cnt, PCI_VENDOR_ID_VORTEX, 
-PCI_DEVICE_ID_VORTEX_GDTNEWRX);
-gdth_search_dev(pcistr, cnt, PCI_VENDOR_ID_VORTEX, 
-PCI_DEVICE_ID_VORTEX_GDTNEWRX2);
-gdth_search_dev(pcistr, cnt, PCI_VENDOR_ID_INTEL,
-PCI_DEVICE_ID_INTEL_SRC);
-gdth_search_dev(pcistr, cnt, PCI_VENDOR_ID_INTEL,
-PCI_DEVICE_ID_INTEL_SRC_XSCALE);
-return cnt;
+   if (device = PCI_DEVICE_ID_VORTEX_GDT6555)
+   return true;
+   if (device = PCI_DEVICE_ID_VORTEX_GDT6x17RP 
+   device = PCI_DEVICE_ID_VORTEX_GDTMAXRP)
+   return true;
+   if (device == PCI_DEVICE_ID_VORTEX_GDTNEWRX ||
+   device == PCI_DEVICE_ID_VORTEX_GDTNEWRX2)
+   return true;
+   return false;
 }
 
+static int gdth_pci_init_one(struct pci_dev *pdev,
+const struct pci_device_id *ent);
+static void gdth_pci_remove_one(struct pci_dev *pdev);
+static void gdth_remove_one(gdth_ha_str *ha);
+
 /* Vortex only makes RAID controllers.
  * We do not really want to specify all 550 ids here, so wildcard match.
  */
-static struct pci_device_id gdthtable[] __maybe_unused = {
-{PCI_VENDOR_ID_VORTEX,PCI_ANY_ID,PCI_ANY_ID, PCI_ANY_ID},
-{PCI_VENDOR_ID_INTEL,PCI_DEVICE_ID_INTEL_SRC,PCI_ANY_ID,PCI_ANY_ID}, 
-
{PCI_VENDOR_ID_INTEL,PCI_DEVICE_ID_INTEL_SRC_XSCALE,PCI_ANY_ID,PCI_ANY_ID}, 
-{0}
+static struct pci_device_id gdthtable[] __devinitdata = {
+   { PCI_VDEVICE(VORTEX, PCI_ANY_ID) },
+   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_SRC) },
+   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_SRC_XSCALE) },
+   { } /* terminate list */
+};
+MODULE_DEVICE_TABLE(pci, gdthtable);
+
+static struct pci_driver gdth_pci_driver = {
+   .name   = gdth,
+   .id_table   = gdthtable,
+   .probe  = gdth_pci_init_one,
+   .remove = gdth_pci_remove_one,
 };
-MODULE_DEVICE_TABLE(pci,gdthtable);
 
-static void __init gdth_search_dev(gdth_pci_str *pcistr, ushort *cnt,
-   ushort vendor, ushort device)
+static void gdth_pci_remove_one(struct pci_dev *pdev)
+{
+   gdth_ha_str *ha = pci_get_drvdata(pdev);
+
+   pci_set_drvdata(pdev, NULL);
+
+   list_del(ha-list);
+   gdth_remove_one(ha);
+
+   pci_disable_device(pdev);
+}
+
+static int __devinit gdth_pci_init_one(struct pci_dev *pdev,
+  const struct pci_device_id *ent)
 {
-ulong base0, base1, base2;
-struct pci_dev *pdev;
+   ushort vendor = pdev-vendor;
+   ushort device = pdev-device;
+   ulong base0, base1, base2;
+   int rc;
 
-TRACE((gdth_search_dev() cnt %d vendor %x device %x\n,
-  *cnt, vendor, device));
+   TRACE((gdth_search_dev() cnt %d vendor %x device %x\n,
+  gdth_pci_cnt, vendor, device));
+
+   if (vendor == PCI_VENDOR_ID_VORTEX  !gdth_search_vortex(device))
+   return -ENODEV;
+
+   rc = pci_enable_device(pdev);
+   if (rc)
+   return rc

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 -gdth_flush(ha);
 -
 
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.
 
 James
 
Yes. 
I have more problems reported, with exit, and am just sending one more patch 
that puts
this back in. Which was tested.

So I will resend this one plus one new one.

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh

On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 -   gdth_flush(ha);
 -
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.

 James

 Yes. 
 I have more problems reported, with exit, and am just sending one more patch 
 that puts
 this back in. Which was tested.
 
 So I will resend this one plus one new one.
 
 Boaz
 

The gdth driver would do a register_reboot_notifier(gdth_notifier);
to a gdth_halt() function, which would then redo half of what gdth_exit
does, and wrongly so, and crash.  

Are we guaranteed in todays kernel that modules .exit function be called
on an halt or reboot? If so then there is no need for duplications and
the gdth_halt() should go.

Submitted a patch that replaces the previous one I submitted with a deeper
fix. 
[PATCH] gdth: bugfix for the at-exit problems

If you ask me this all gdth_flush() is a crackup. sd and scsi-ml are doing 
scsi FLUSH commands when ever is needed. The controller as no business caching
data in memory longer then what is stated in standard. Raid controller or no 
raid
controller. Virtual or not virtual device. Data on Plate means data on plate.
What if there is a power outage? what the driver can do then?

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

1 2 3 4 5 6 >

1 - 100 of 561 matches

Mail list logo