Re: iomapping a big endian area

2005-04-04 Thread Benjamin Herrenschmidt
On Sat, 2005-04-02 at 22:27 -0600, James Bottomley wrote:
 On Sat, 2005-04-02 at 20:08 -0800, David S. Miller wrote:
   Did anyone have a preference for the API?  I was thinking
   ioread32_native, but ioread32be is fine too.
  
  I think doing foo{be,le}{8,16,32}() would be consistent with
  our byteorder.h interface names.
 
 Thinking about this some more, I know of no case of a BE bus connected
 to a LE system, nor do I think anyone would ever create such a beast, so
 our only missing interface is for a BE bus on a BE system.

It's more a matter of the device than the bus imho... 

 Thus, I think io{read,write}{16,32}_native are better interfaces ...

I disagree. The driver will never know ...

 they basically mean pass memory operations without byte swaps, so
 they're well defined on both BE and LE systems and correspond exactly to
 our existing _raw_{read,write}{w,l} calls (principle of least surprise).

I don't think it's sane. You know that your device is BE or LE and use
the appropriate interface. native doesn't make sense to me in this
context.

Ben.


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: iomapping a big endian area

2005-04-04 Thread James Bottomley
On Mon, 2005-04-04 at 17:50 +1000, Benjamin Herrenschmidt wrote:
 I disagree. The driver will never know ...

? the driver has to know.  Look at the 53c700 to see exactly how awful
it is.  This beast has byte and word registers.  When used BE, all the
byte registers alter their position (to both inb and readb).

 I don't think it's sane. You know that your device is BE or LE and use
 the appropriate interface. native doesn't make sense to me in this
 context.

Well ... it's like this. Native means pass through without swapping
and has an easy implementation on both BE and LE platforms.  Logically
io{read,write}{16,32}be would have to do byte swaps on LE platforms.
Being lazy, I'm opposed to doing the work if there's no actual use for
it, so can you provide an example of a BE bus (or device) used on a LE
platform that would actually benefit from this abstraction?

James
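
For readers following the naming argument, here is a minimal user-space sketch of the three semantics being debated. It is not kernel code: the sketch_* names are hypothetical, and a "register" is just four bytes of ordinary memory rather than real MMIO. The "be" and "le" readers return the same value on any host, while the "native" (noswap) reader returns whatever the host byte order makes of those bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the accessors under discussion; a register
 * here is four bytes in ordinary memory, not a real MMIO mapping. */

/* The device stores the value most-significant byte first. */
static inline uint32_t sketch_read32be(const volatile uint8_t *reg)
{
    return ((uint32_t)reg[0] << 24) | ((uint32_t)reg[1] << 16) |
           ((uint32_t)reg[2] << 8)  |  (uint32_t)reg[3];
}

/* The device stores the value least-significant byte first. */
static inline uint32_t sketch_read32le(const volatile uint8_t *reg)
{
    return ((uint32_t)reg[3] << 24) | ((uint32_t)reg[2] << 16) |
           ((uint32_t)reg[1] << 8)  |  (uint32_t)reg[0];
}

/* "native" / "noswap": copy the bytes exactly as the CPU would load them. */
static inline uint32_t sketch_read32_native(const volatile uint8_t *reg)
{
    uint32_t v;
    uint8_t *p = (uint8_t *)&v;

    for (int i = 0; i < 4; i++)
        p[i] = reg[i];
    return v;
}

/* Run-time probe of the host byte order. */
static inline int sketch_host_is_be(void)
{
    const uint16_t probe = 1;

    return *(const uint8_t *)&probe == 0;
}

/* Usage: the same four device bytes read through each accessor. */
static inline void sketch_demo(void)
{
    const uint8_t reg[4] = { 0x12, 0x34, 0x56, 0x78 };

    assert(sketch_read32be(reg) == 0x12345678u);   /* host independent */
    assert(sketch_read32le(reg) == 0x78563412u);   /* host independent */
    assert(sketch_read32_native(reg) ==
           (sketch_host_is_be() ? 0x12345678u : 0x78563412u));
}
```

On a BE host the native reader agrees with the "be" one, which is the only case the _native proposal needs; a "be" reader on an LE host is where the extra swap would come in.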




Re: iomapping a big endian area

2005-04-04 Thread Christoph Hellwig
On Mon, Apr 04, 2005 at 08:59:03AM -0500, James Bottomley wrote:
 Well ... it's like this. Native means pass through without swapping
 and has an easy implementation on both BE and LE platforms.  Logically
 io{read,write}{16,32}be would have to do byte swaps on LE platforms.
 Being lazy, I'm opposed to doing the work if there's no actual use for
 it, so can you provide an example of a BE bus (or device) used on a LE
 platform that would actually benefit from this abstraction?

The IOC4 device that provides IDE, serial ports and external interrupts
on Altix systems has a big endian register layout, and the PCI-X bridge
in those Altix systems can do the swapping if a special bit is set.

In older kernels that bit was set from the driver through a special API,
but it seems the firmware does that automatically now.



Re: iomapping a big endian area

2005-04-04 Thread Randy.Dunlap
James Bottomley wrote:
On Mon, 2005-04-04 at 17:50 +1000, Benjamin Herrenschmidt wrote:
I disagree. The driver will never know ...

? the driver has to know.  Look at the 53c700 to see exactly how awful
it is.  This beast has byte and word registers.  When used BE, all the
byte registers alter their position (to both inb and readb).

I don't think it's sane. You know that your device is BE or LE and use
the appropriate interface. native doesn't make sense to me in this
context.

Well ... it's like this. Native means pass through without swapping
and has an easy implementation on both BE and LE platforms.  Logically
io{read,write}{16,32}be would have to do byte swaps on LE platforms.
Being lazy, I'm opposed to doing the work if there's no actual use for
it, so can you provide an example of a BE bus (or device) used on a LE
platform that would actually benefit from this abstraction?

I would probably spell "native" as "noswap".
"native" just doesn't convey enough specific meaning...
--
~Randy


Re: [PATCH] compile out scsi_ioctl when no SCSI/IDE/etc. (take 2)

2005-04-04 Thread Matthew Wilcox
On Mon, Apr 04, 2005 at 03:47:01PM +0200, Fillod Stephane wrote:
 --- linux/drivers/ide/ide.c   26 Mar 2005 03:28:17 -  1.1.1.2
 +++ linux/drivers/ide/ide.c   4 Apr 2005 13:24:29 -
 @@ -1562,9 +1562,11 @@
   return 0;
   }
  
 +#ifdef CONFIG_SCSI_IOCTL
   case CDROMEJECT:
   case CDROMCLOSETRAY:
   return scsi_cmd_ioctl(file, bdev->bd_disk, cmd, p);
 +#endif
  
   case HDIO_GET_BUSSTATE:
   if (!capable(CAP_SYS_ADMIN))

Rather than putting ifdefs in the middle of functions, the normal way is
to define a dummy function:

+++ include/linux/blkdev.h  4 Apr 2005 14:41:23 -
@@ -548,7 +548,14 @@ extern int blk_remove_plug(request_queue
 extern void blk_recount_segments(request_queue_t *, struct bio *);
 extern int blk_phys_contig_segment(request_queue_t *q, struct bio *, struct bio *);
 extern int blk_hw_contig_segment(request_queue_t *q, struct bio *, struct bio *);
+#ifdef CONFIG_SCSI_IOCTL
 extern int scsi_cmd_ioctl(struct file *, struct gendisk *, unsigned int, void __user *);
+#else
+static inline int scsi_cmd_ioctl(struct file *f, struct gendisk *g, unsigned int cmd, void __user *ptr)
+{
+   return -ENOTTY;
+}
+#endif
 extern void blk_start_queue(request_queue_t *q);
 extern void blk_stop_queue(request_queue_t *q);
 extern void blk_sync_queue(struct request_queue *q);


(I'm not sure about the error number; ide.c returns -EINVAL if it doesn't
recognise the ioctl, which I think is wrong; it should return -ENOTTY.
OTOH, aren't we supposed to return -ENOSYS for "I recognise this ioctl,
it's supported on this device, but it's not implemented"?  Is there a
good reference for how errnos are supposed to be used in the kernel?)
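
The stub pattern described above can be sketched in user space as follows. Everything named sketch_* or SKETCH_* is hypothetical: SKETCH_HAVE_SCSI_IOCTL stands in for CONFIG_SCSI_IOCTL, the command numbers are made up, and -ENOTTY is simply the value proposed in the patch, not a settled answer to the errno question.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical feature switch standing in for CONFIG_SCSI_IOCTL; build
 * with -DSKETCH_HAVE_SCSI_IOCTL to use a real implementation instead. */
#ifdef SKETCH_HAVE_SCSI_IOCTL
int sketch_scsi_cmd_ioctl(unsigned int cmd, void *arg);
#else
/* Stub used when the feature is compiled out. */
static inline int sketch_scsi_cmd_ioctl(unsigned int cmd, void *arg)
{
    (void)cmd;
    (void)arg;
    return -ENOTTY;
}
#endif

/* Made-up command numbers, for illustration only. */
enum { SKETCH_CMD_EJECT = 1, SKETCH_CMD_UNKNOWN = 99 };

/* The caller needs no #ifdef: it dispatches the same way in both builds. */
static inline int sketch_ide_ioctl(unsigned int cmd, void *arg)
{
    switch (cmd) {
    case SKETCH_CMD_EJECT:
        return sketch_scsi_cmd_ioctl(cmd, arg);
    default:
        /* What the mail says ide.c returns today for unknown ioctls. */
        return -EINVAL;
    }
}

#ifndef SKETCH_HAVE_SCSI_IOCTL
/* Usage: with the feature compiled out, the stub's error reaches the caller. */
static inline void sketch_demo(void)
{
    assert(sketch_ide_ioctl(SKETCH_CMD_EJECT, 0) == -ENOTTY);
    assert(sketch_ide_ioctl(SKETCH_CMD_UNKNOWN, 0) == -EINVAL);
}
#endif
```

The point of the pattern is that the conditional lives in one header, next to the declaration, instead of at every call site.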

-- 
Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception. -- Mark Twain


Re: [PATCH scsi-misc-2.6 08/13] scsi: move request preps in other places into prep_fn()

2005-04-04 Thread James Bottomley
On Fri, 2005-04-01 at 14:25 +0900, Tejun Heo wrote:
  Ah.. with later requeue path consolidation patches, all requests get
 their sense buffer cleared during requeueing, which, IMHO, is more
 logical.  Moving scsi_init_cmd_errh() should come after the patch.
 Sorry. :-)
 
  I'll make another take of this patchset (maybe subset) after issues
 are resolved.  I'll split and reorder relocation of scsi_init_cmd_errh
 then.

Thanks.  It would help me enormously if you explained what bugs you were
fixing at the top of each patch, and also only do patchsets that are
dependent on each other (I already have your serial_number_at_timeout and
internal_timeout removal patches in the scsi-misc-2.6 tree).

James




dm-multipath+iscsi-initiator-core performance results

2005-04-04 Thread Nicholas A. Bellinger
Greetings all,

The following presents benchmarks for dm-multipath in combination with
iscsi-initiator-core over 10Gb/sec Ethernet.  This test's primary
purpose is to show how outer-nexus multipathing using dm-multipath in
conjunction with inter-nexus multiplexing using iscsi-initiator-core can
perform on Linux v2.6 with commodity 64-bit hardware.

For this test iSCSI/TCP over IPv4 was used with a single iSCSI Initiator
Node defining two iSCSI Channels to the same physical iSCSI Target Node
via different iSCSI Target Portal Groups. (see diagram below).  A table
is then loaded into dm-multipath containing references to the same LUN
over multiple iSCSI Channels. Tests were run in both single connection
(SC/S) and multiple connection (MC/S) scenarios.  There is a concrete
performance improvement on READs with MC/S (almost 100 MB/sec on 128k).

For more information on iiC, visit http://www.iscsi-initiator-core.org

--

A) iSCSI SAN Topography:

Initiator Node: iqn.2005-02.org.iscsi-initiator-core.org:linux-x86_64

iscsi-initiator-core v1.6.2.0-rc1 on Linux/x86_64 2.6.12-rc1
iscsi-initiator-core-tools v2.3
device-mapper: 4.4.0-ioctl (2005-01-12)
device-mapper: dm-multipath version 1.0.4
device-mapper: dm-round-robin version 1.0.0
multipath-tools v0.4.3

2x 1.8ghz 242 AMD Opterons w/ 2 GB Memory
Neterion Xframe 10Gb Ethernet

Target Node: iqn.2002-07.com.pyxtechnologies.target:lad220.linux.x86_64

PyX Technologies Storage Engine 2.x on Linux/x86_64 2.6.12-rc1
 
2x 1.8ghz 242 AMD Opterons w/ 2 GB Memory
Neterion Xframe 10Gb Ethernet

--

B) Diagram:

2 iSCSI Channels - One (1) Connection per Session (SC/S):

  Initiator                  Target
 ___________             _______________________
|           |           |                       |
| Channel 1 |-----------| TPG 1 192.168.3.220   |\
|           |           |                       |  LUN 0
| Channel 2 |-----------| TPG 2 192.168.3.221   |/
|___________|           |_______________________|


2 iSCSI Channels - Two (2) iSCSI Connections per Session (MC/S):

  Initiator                  Target
 ___________             _______________________
|           |/---------\|                       |
| Channel 1 |\---------/| TPG 1 192.168.3.220   |\
|           |           |                       |  LUN 0
| Channel 2 |/---------\| TPG 2 192.168.3.221   |/
|___________|\---------/|_______________________|

-

C) Performance Results:

Using LTP disktest (see attached files for full output, see below for
configuration):

disktest -K8 -ID -PT -h3 -T30 -r /dev/mapper/dm0 -B$BLOCK_SIZE

SC/S via 2 iSCSI Channels:

WRITE:

32k: Total write throughput: 227711385.6B/s (217.16MB/s), IOPS 6949.2/s.
64k: Total write throughput: 342947703.5B/s (327.06MB/s), IOPS 5233.0/s.
128k: Total write throughput: 412186487.5B/s (393.09MB/s), IOPS 3144.7/s

MC/S via 2 iSCSI Channels:

WRITE:

32k: Total write throughput: 232743458.1B/s (221.96MB/s), IOPS 7102.8/s.
64k: Total write throughput: 367285589.3B/s (350.27MB/s), IOPS 5604.3/s.
128k: Total write throughput: 451494980.3B/s (430.58MB/s), IOPS 3444.6/s

SC/S via 2 iSCSI Channels:

READ:

32k: Total read throughput: 330224981.3B/s (314.93MB/s), IOPS 10077.7/s.
64k: Total read throughput: 420939912.5B/s (401.44MB/s), IOPS 6423.0/s.
128k: Total read throughput: 435495458.1B/s (415.32MB/s), IOPS 3322.6/s.

2 iSCSI Channels w/ Multiple Connections per Session (MC/S):

READ:

32k: Total read throughput: 364500309.3B/s (347.61MB/s), IOPS 11123.7/s.
64k: Total read throughput: 471708467.2B/s (449.86MB/s), IOPS 7197.7/s.
128k: Total read throughput: 528338124.8B/s (503.86MB/s), IOPS 4030.9/s.
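
The columns in these results can be cross-checked against each other with a little arithmetic. The sketch below assumes that disktest's "MB/s" figure is bytes divided by 2^20 and that throughput equals IOPS times the block size; both are inferences from the numbers above, not something taken from the disktest documentation. The sketch_* helpers are hypothetical.

```c
#include <assert.h>

/* Bytes/s -> MiB/s. Assumption: the "MB/s" column divides by 2^20,
 * which is what the figures in the results above are consistent with. */
static inline double sketch_bytes_to_mib(double bytes_per_sec)
{
    return bytes_per_sec / (1024.0 * 1024.0);
}

/* IOPS at a fixed transfer size (given in KiB) -> bytes/s. */
static inline double sketch_iops_to_bytes(double iops, unsigned int block_kib)
{
    return iops * (double)block_kib * 1024.0;
}

/* Tolerance compare, written out to avoid depending on libm. */
static inline int sketch_close(double a, double b, double tol)
{
    double d = a - b;

    return (d < 0 ? -d : d) <= tol;
}

/* Usage: re-derive two rows of the results from their raw numbers. */
static inline void sketch_demo(void)
{
    /* 32k SC/S WRITE row: 227711385.6 B/s, 217.16 MB/s, 6949.2 IOPS */
    assert(sketch_close(sketch_bytes_to_mib(227711385.6), 217.16, 0.01));
    assert(sketch_close(sketch_iops_to_bytes(6949.2, 32), 227711385.6, 1.0));

    /* 128k READ: MC/S (528338124.8 B/s) minus SC/S (435495458.1 B/s) */
    double gain = sketch_bytes_to_mib(528338124.8) -
                  sketch_bytes_to_mib(435495458.1);
    assert(gain > 88.0 && gain < 89.0);
}
```

Under those assumptions the 128k READ gain from MC/S works out to roughly 88.5 MB/s, about 21% over the SC/S figure.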

-
 
D) Detailed iSCSI Initiator Node Setup Information:

Single Connection Setup from /etc/sysconfig/initiator

CHANNEL=1 1 eth2 192.168.3.220 3260 0
AuthMethod=CHAP,None;DataDigest=None;MaxRecvDataSegmentLength=262144;MaxBurstLength=262144;FirstBurstLength=262144
 tpgfailover=0
CHANNEL=2 1 eth2 192.168.3.221 3260 0
AuthMethod=CHAP,None;DataDigest=None;MaxRecvDataSegmentLength=262144;MaxBurstLength=262144;FirstBurstLength=262144
 tpgfailover=0

Multiple Connection Setup from /etc/sysconfig/initiator

CHANNEL=1 2 eth2 192.168.3.220 3260 0
AuthMethod=CHAP,None;DataDigest=None;MaxRecvDataSegmentLength=262144;MaxBurstLength=262144;FirstBurstLength=262144
 tpgfailover=0
CHANNEL=2 2 eth2 192.168.3.221 3260 0
AuthMethod=CHAP,None;DataDigest=None;MaxRecvDataSegmentLength=262144;MaxBurstLength=262144;FirstBurstLength=262144
 tpgfailover=0

iSCSI Channel 1 Login and LUN Scan:

CHANNEL[1] - Setting max_sectors: 512
scsi2 : PyX Technologies iSCSI Initiator Core Stack v1.6.2.0-rc1-OSS on

Re: iomapping a big endian area

2005-04-04 Thread James Bottomley
OK, I sent the patch off to Andrew.  To complete the original problem,
the attached is the patch that uses it in the parisc lasi driver
(although, actually, it sets up 53c700 to work everywhere including BE
on a LE system).

I changed some of the flags around to reflect the fact that we now have
generic BE support in the driver (rather than the more limited
force_le_on_be flag).

James

= drivers/scsi/53c700.h 1.25 vs edited =
--- 1.25/drivers/scsi/53c700.h  2005-04-04 09:55:44 -05:00
+++ edited/drivers/scsi/53c700.h2005-04-04 15:39:01 -05:00
@@ -177,10 +177,10 @@
struct device   *dev;
__u32   dmode_extra;/* adjustable bus settings */
__u32   differential:1; /* if we are differential */
-#ifdef CONFIG_53C700_LE_ON_BE
+#ifdef CONFIG_53C700_BE
/* This option is for HP only.  Set it if your chip is wired for
 * little endian on this platform (which is big endian) */
-   __u32   force_le_on_be:1;
+   __u32   chip_is_be:1;
 #endif
__u32   chip710:1;  /* set if really a 710 not 700 */
__u32   burst_disable:1;/* set to 1 to disable 710 bursting */
@@ -229,24 +229,29 @@
 /*
  * 53C700 Register Interface - the offset from the Selected base
  * I/O address */
-#ifdef CONFIG_53C700_LE_ON_BE
-#define bE (hostdata->force_le_on_be ? 0 : 3)
-#define bSWAP  (hostdata->force_le_on_be)
-/* This is terrible, but there's no raw version of ioread32.  That means
- * that on a be board we swap twice (once in ioread32 and once again to
- * get the value correct) */
-#define bS_to_io(x) ((hostdata->force_le_on_be) ? (x) : cpu_to_le32(x))
-#elif defined(__BIG_ENDIAN)
+#ifdef CONFIG_53C700_BE
+#define bE (hostdata->chip_is_be ? 3: 0)
+#ifdef __BIG_ENDIAN
+#define bSWAP  (!hostdata->chip_is_be)
+#else
+#define bSWAP  (hostdata->chip_is_be)
+#endif
+#define NCR_ioread32(x) ((hostdata->chip_is_be) ? ioread32be(x) : ioread32(x))
+#define NCR_iowrite32(v, x) \
+   ((hostdata->chip_is_be) ? iowrite32be((v), (x)) : iowrite32((v), (x)))
+#else
+#define NCR_ioread32(x) ioread32(x)
+#define NCR_iowrite32(v, x) iowrite32((v), (x))
+#if defined(__BIG_ENDIAN)
 #define bE 3
 #define bSWAP  0
-#define bS_to_io(x) (x)
 #elif defined(__LITTLE_ENDIAN)
 #define bE 0
 #define bSWAP  0
-#define bS_to_io(x) (x)
 #else
 #error __BIG_ENDIAN or __LITTLE_ENDIAN must be defined, did you include byteorder.h?
 #endif
+#endif
 #define bS_to_cpu(x)   (bSWAP ? le32_to_cpu(x) : (x))
 #define bS_to_host(x)  (bSWAP ? cpu_to_le32(x) : (x))
 
@@ -460,14 +465,13 @@
 {
const struct NCR_700_Host_Parameters *hostdata
= (struct NCR_700_Host_Parameters *)host->hostdata[0];
-   __u32 value = ioread32(hostdata->base + reg);
+   __u32 value = NCR_ioread32(hostdata->base + reg);
 #if 1
/* sanity check the register */
-   if((reg & 0x3) != 0)
-   BUG();
+   BUG_ON((reg & 0x3) != 0);
 #endif
 
-   return bS_to_io(value);
+   return value;
 }
 
 static inline void
@@ -487,11 +491,10 @@
 
 #if 1
/* sanity check the register */
-   if((reg & 0x3) != 0)
-   BUG();
+   BUG_ON((reg & 0x3) != 0);
 #endif
 
-   iowrite32(bS_to_io(value), hostdata->base + reg);
+   NCR_iowrite32(value, hostdata->base + reg);
 }
 
 #endif
= drivers/scsi/Kconfig 1.88 vs edited =
--- 1.88/drivers/scsi/Kconfig   2005-04-04 09:55:45 -05:00
+++ edited/drivers/scsi/Kconfig 2005-04-04 15:34:40 -05:00
@@ -951,7 +951,7 @@
  many PA-RISC workstations & servers.  If you do not know whether you
  have a Lasi chip, it is safe to say Y here.
 
-config 53C700_LE_ON_BE
+config 53C700_BE
bool
depends on SCSI_LASI700
default y
= drivers/scsi/lasi700.c 1.27 vs edited =
--- 1.27/drivers/scsi/lasi700.c 2005-04-04 09:55:45 -05:00
+++ edited/drivers/scsi/lasi700.c   2005-04-04 15:31:19 -05:00
@@ -117,15 +117,13 @@
 
if (dev->id.sversion == LASI_700_SVERSION) {
hostdata->clock = LASI700_CLOCK;
-   hostdata->force_le_on_be = 1;
+   hostdata->chip_is_be = 0;
} else {
hostdata->clock = LASI710_CLOCK;
-   hostdata->force_le_on_be = 0;
+   hostdata->chip_is_be = 1;
hostdata->chip710 = 1;
hostdata->dmode_extra = DMODE_FC2;
}
-
-   NCR_700_set_mem_mapped(hostdata);
 
host = NCR_700_detect(&lasi700_template, hostdata, &dev->dev);
if (!host)
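
As a rough illustration of what the chip_is_be flag buys, the following user-space sketch simulates a four-byte register window wired both ways. Only the chip_is_be name and the 0/3 values of bE come from the patch; struct sketch_host and the sketch_* readers are hypothetical, and applying bE by XOR-ing it into the byte-register offset is an assumption, since the hunks above only define bE and do not show where it is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host description; only chip_is_be is a name from the patch. */
struct sketch_host {
    uint8_t *base;              /* simulated register window, not real MMIO */
    unsigned int chip_is_be:1;  /* set when the chip is wired big endian */
};

/* 32-bit read: choose the byte order per chip at run time, in the spirit
 * of the NCR_ioread32() macro in the patch. */
static inline uint32_t sketch_read32(const struct sketch_host *h, unsigned int reg)
{
    const uint8_t *p = h->base + reg;

    assert((reg & 0x3) == 0);   /* mirrors the BUG_ON() sanity check */
    if (h->chip_is_be)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* 8-bit read: on a BE-wired chip the byte lanes within each 32-bit word
 * are reversed. Assumption: the fix-up is an XOR of the offset with bE. */
static inline uint8_t sketch_read8(const struct sketch_host *h, unsigned int reg)
{
    unsigned int lane_fix = h->chip_is_be ? 3 : 0;  /* plays the role of bE */

    return h->base[reg ^ lane_fix];
}

/* Usage: the same logical register contents, laid out for each wiring. */
static inline void sketch_demo(void)
{
    uint8_t le_regs[4] = { 0xDD, 0xCC, 0xBB, 0xAA };
    uint8_t be_regs[4] = { 0xAA, 0xBB, 0xCC, 0xDD };
    struct sketch_host le = { le_regs, 0 };
    struct sketch_host be = { be_regs, 1 };

    /* Both wirings yield the same logical values. */
    assert(sketch_read32(&le, 0) == 0xAABBCCDDu);
    assert(sketch_read32(&be, 0) == 0xAABBCCDDu);
    assert(sketch_read8(&le, 0) == 0xDD);
    assert(sketch_read8(&be, 0) == 0xDD);
}
```

With the flag set per chip, the rest of a driver can use logical register offsets and values without caring how the part is wired or which byte order the host uses.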

