Re: DMA mapping on SCSI device?

2008-01-30 Thread Mark Lord

Robert Hancock wrote:

Luben Tuikov wrote:

--- On Mon, 1/28/08, Robert Hancock [EMAIL PROTECTED] wrote:

The trick is that if an ATAPI device is connected, we (as
far as I'm aware) can't use ADMA mode, so we have to switch that
port into legacy mode.


Can you double check this with the HW architect of the
HW DMA engine of the ASIC?


Will do so. However, previous statements from NVIDIA fairly clearly 
indicate that this is the case.





This means it's only capable of 32-bit DMA.
However the other port on the controller may be connected to a hard 
drive and therefore still capable of 64-bit DMA.


If this is indeed the case as you've presented it here,
it sounds like a HW shortcoming.  I cannot see how the device
type (or protocol) dictate how the DMA engine operates.
They live in two different domains.


Well, there is an indirect link. The ADMA interface (which supports 
64-bit DMA) cannot be used to issue ATAPI commands, so if an ATAPI 
device is connected we have to go to legacy mode, which supports only 
32-bit DMA.


I'm not sure why ADMA mode doesn't support ATAPI. The only reason I can 
think of is that there's issues since ATAPI commands can potentially be 
of unpredictable transfer size. The real ADMA spec that the NVIDIA 
implementation is loosely based on does have some special ignore 
excess controls that don't seem to be in the NVIDIA version (or at 
least not to the knowledge I have on this hardware).

..

The original Pacific Digital ADMA cores *do* support most ATAPI commands
in ADMA mode, including READ_CD, READ_10, etc., with the caveat that if
DSC completion state is required, the driver has to drop out of ADMA
and poll for it after the ADMA command completes.

Commands which were not ADMA compatible (eg. MODE_SENSE, TEST_UNIT_READY, ..)
were simply handled with PIO (in the driver) rather than any form of DMA,
which is okay because those commands are relatively infrequent.

Note that Pacific Digital officially said no ATAPI for the ADMA design,
but I implemented it regardless (for Linux) and it worked rather well.
We could burn DVDs and back up to tape simultaneously, with the burner
and the tape unit sharing a single IDE cable/channel.

Cheers
-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: DMA mapping on SCSI device?

2008-01-30 Thread Mark Lord

Mark Lord wrote:

..
Commands which were not ADMA compatible (eg. MODE_SENSE, 
TEST_UNIT_READY, ..)

were simply handled with PIO (in the driver) rather than any form of DMA,
which is okay because those commands are relatively infrequent.

..

A slight correction there:  TEST_UNIT_READY was fine in ADMA mode as well.
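
As a rough illustration of the split described above (with the
TEST_UNIT_READY correction applied), here is a minimal user-space sketch
in C. The opcode values are the standard SCSI/MMC ones; the enum, the
name choose_path() and the conservative default of PIO for any opcode not
mentioned in this thread are assumptions made for the example, not code
from the Pacific Digital driver.

```c
#include <stdint.h>

/* Standard SCSI/MMC opcodes for the commands named in this thread. */
#define OP_TEST_UNIT_READY 0x00
#define OP_MODE_SENSE_6    0x1a
#define OP_READ_10         0x28
#define OP_MODE_SENSE_10   0x5a
#define OP_READ_CD         0xbe

/* Hypothetical transfer paths: the ADMA engine, or PIO in the driver. */
enum xfer_path { XFER_ADMA, XFER_PIO };

/*
 * Pick a path per command. Only opcodes the thread reports as working
 * in ADMA mode go to ADMA; everything else falls back to PIO (an
 * assumption, chosen because PIO is the safe path).
 */
static enum xfer_path choose_path(uint8_t opcode)
{
    switch (opcode) {
    case OP_READ_10:
    case OP_READ_CD:
    case OP_TEST_UNIT_READY:
        return XFER_ADMA;
    case OP_MODE_SENSE_6:
    case OP_MODE_SENSE_10:
    default:
        return XFER_PIO;
    }
}
```

The DSC polling step after an ADMA command completes is not modelled here.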

Cheers


Re: DMA mapping on SCSI device?

2008-01-30 Thread Robert Hancock

Mark Lord wrote:

Robert Hancock wrote:

Luben Tuikov wrote:

--- On Mon, 1/28/08, Robert Hancock [EMAIL PROTECTED] wrote:

The trick is that if an ATAPI device is connected, we (as
far as I'm aware) can't use ADMA mode, so we have to switch that
port into legacy mode.


Can you double check this with the HW architect of the
HW DMA engine of the ASIC?


Will do so. However, previous statements from NVIDIA fairly clearly 
indicate that this is the case.





This means it's only capable of 32-bit DMA.
However the other port on the controller may be connected to a hard 
drive and therefore still capable of 64-bit DMA.


If this is indeed the case as you've presented it here,
it sounds like a HW shortcoming.  I cannot see how the device
type (or protocol) dictate how the DMA engine operates.
They live in two different domains.


Well, there is an indirect link. The ADMA interface (which supports 
64-bit DMA) cannot be used to issue ATAPI commands, so if an ATAPI 
device is connected we have to go to legacy mode, which supports only 
32-bit DMA.


I'm not sure why ADMA mode doesn't support ATAPI. The only reason I 
can think of is that there's issues since ATAPI commands can 
potentially be of unpredictable transfer size. The real ADMA spec 
that the NVIDIA implementation is loosely based on does have some 
special ignore excess controls that don't seem to be in the NVIDIA 
version (or at least not to the knowledge I have on this hardware).

..

The original Pacific Digital ADMA cores *do* support most ATAPI commands
in ADMA mode, including READ_CD, READ_10, etc..  With the caveat that if
DSC completion state is required, the driver has to drop out of ADMA
and poll for it after the ADMA command completes.

Commands which were not ADMA compatible (eg. MODE_SENSE, 
TEST_UNIT_READY, ..)

were simply handled with PIO (in the driver) rather than any form of DMA,
which is okay because those commands are relatively infrequent.

Note that Pacific Digital officially said no ATAPI for the ADMA design,
but I implemented it regardless (for Linux) and it worked rather well.
We could burn DVDs and back-up to tape simultaneously, with the burner
and the tape unit sharing a single IDE cable/channel.


I'm told that the ADMA hardware does have some support for issuing ATAPI 
commands, however according to Allen Martin of NVIDIA, "The ATAPI 
support in the ADMA hardware had some serious problem that forced us to 
turn it off in the Windows driver." So it looks like ATAPI in ADMA mode 
is likely a non-starter.



Re: DMA mapping on SCSI device?

2008-01-30 Thread Matthew Wilcox
On Tue, Jan 29, 2008 at 02:09:52PM -0800, Luben Tuikov wrote:
  The ideal solution would be to do mapping against a different struct
  device for each port, so that we could maintain the proper DMA mask for
  each of them at all times. However I'm not sure if that's possible. The
  thought of using the SCSI struct device for DMA mapping was brought up
  at one point.. any thoughts on that?
 
 The reason for this is that the object that a struct scsi_dev
 represents has nothing to do with HW DMA engines.

It really would work, once the few remaining architectures move away
from asserting that the 'struct device' passed in is a pci device.
It seems like the best way forward to me.

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.


Re: DMA mapping on SCSI device?

2008-01-29 Thread James Bottomley

On Tue, 2008-01-29 at 05:28 +0100, Andi Kleen wrote:
  The ideal solution would be to do mapping against a different struct
  device for each port, so that we could maintain the proper DMA mask for
  each of them at all times. However I'm not sure if that's possible.
 
 I cannot imagine why it should be that difficult. The PCI subsystem
 could offer a pci_clone_device() or similar function.   For all complicated
 purposes (sysfs etc)  the original device could be used, so it would
 be hopefully not that difficult.

I know it works for parisc ... all we care about for DMA mapping is the
mask in the actual device and the location of the iommu.  For the
latter, we just go up device-parent until we find it, so as long as
manufactured devices are properly parented we have no problems with
mapping them.
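
A minimal user-space sketch of that parent walk, in C. The struct and
function names below (device_model, iommu_model, find_iommu) are
hypothetical stand-ins for illustration only; they are not the parisc
code or the kernel's struct device.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a platform IOMMU. */
struct iommu_model {
    const char *name;
};

/* Hypothetical stand-in for struct device: parent link, mask, IOMMU. */
struct device_model {
    struct device_model *parent;
    uint64_t dma_mask;
    struct iommu_model *iommu;   /* non-NULL only on the node that owns it */
};

/*
 * Walk up the parent chain until a node with an IOMMU is found.
 * A manufactured per-port device works as long as it is parented
 * under the real device; an unparented one finds nothing.
 */
static struct iommu_model *find_iommu(const struct device_model *dev)
{
    for (; dev != NULL; dev = dev->parent) {
        if (dev->iommu != NULL)
            return dev->iommu;
    }
    return NULL;
}
```

In this model the per-port device carries its own dma_mask, while the
IOMMU is still located through the original device's ancestry.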

The concern Matthew has is this code in asm-generic/dma-mapping.h:

static inline void *
dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *dma_handle,
		   gfp_t flag)
{
	BUG_ON(dev->bus != &pci_bus_type);

	return pci_alloc_consistent(to_pci_dev(dev), size, dma_handle);
}

The manufactured devices wouldn't be PCI devices (otherwise they'd show
up in PCI and cause all sorts of confusion), so any architectures which
haven't converted to using the dma_ functions internally will BUG here.

However, a quick audit shows that to be just m68k, v850 and sparc (not
sparc64), so probably none of them are architectures the driver cares about.

 The alternative would be to add a new family of PCI mapping
 functions that take an explicit mask. Disadvantage would be changing 
 all architectures, but on the other hand the interface could be phased
 in one by one (and nF4 primarily only works on x86 anyways)

I suppose it would allow us to clean dma_mask and dma_coherent_mask out
of the device structures ... on the other hand, the mask isn't simply
what the device wants, it's also what the platform allows you to set, so
it would have to be stored somewhere anyway.

 I suspect the latter would be a little cleaner, although they don't
 make much difference.

James




Re: DMA mapping on SCSI device?

2008-01-29 Thread Luben Tuikov
--- On Mon, 1/28/08, Andi Kleen [EMAIL PROTECTED] wrote:
  The ideal solution would be to do mapping against a different struct
  device for each port, so that we could maintain the proper DMA mask for
  each of them at all times. However I'm not sure if that's possible.
 
 I cannot imagine why it should be that difficult. The PCI subsystem
 could offer a pci_clone_device() or similar function.   For all complicated
 purposes (sysfs etc)  the original device could be used, so it would
 be hopefully not that difficult.
 
 The alternative would be to add a new family of PCI mapping
 functions that take an explicit mask. Disadvantage would be changing
 all architectures, but on the other hand the interface could be phased
 in one by one (and nF4 primarily only works on x86 anyways)
 
 I suspect the latter would be a little cleaner, although they don't
 make much difference.

Yes, I guess, that's certainly doable.

The current PCI abstraction is clean: HW DMA engine(s) implementation
is a property of the PCI function.

Marrying different behaviour of the HW DMA engine of the ASIC
depending on the SCSI end device at the PCI device abstraction doesn't
sound good. (An extreme design is a single DMA engine servicing
the ASIC.)

Although, the effect that Rob wants could be cleanly implemented
at a higher level, pci_map_sg() and such, or fixing 
blk_queue_bounce_limit() in x86_64 to that effect.

Luben



Re: DMA mapping on SCSI device?

2008-01-29 Thread Luben Tuikov
--- On Mon, 1/28/08, Robert Hancock [EMAIL PROTECTED] wrote:
 The trick is that if an ATAPI device is connected, we (as far as I'm
 aware) can't use ADMA mode, so we have to switch that port into legacy
 mode.

Can you double check this with the HW architect of the
HW DMA engine of the ASIC?

 This means it's only capable of 32-bit DMA. However the other port
 on the controller may be connected to a hard drive and therefore still
 capable of 64-bit DMA.

If this is indeed the case as you've presented it here,
it sounds like a HW shortcoming.  I cannot see how the device
type (or protocol) dictates how the DMA engine operates.
They live in two different domains.

 The ideal solution would be to do mapping against a different struct
 device for each port, so that we could maintain the proper DMA mask for
 each of them at all times. However I'm not sure if that's possible. The
 thought of using the SCSI struct device for DMA mapping was brought up
 at one point.. any thoughts on that?

The reason for this is that the object that a struct scsi_dev
represents has nothing to do with HW DMA engines.

It looks like your current solution is correct and
x86_64's blk_queue_bounce_limit needs work.

Luben



Re: DMA mapping on SCSI device?

2008-01-29 Thread Robert Hancock

Luben Tuikov wrote:

--- On Mon, 1/28/08, Robert Hancock [EMAIL PROTECTED] wrote:

The trick is that if an ATAPI device is connected, we (as far as I'm
aware) can't use ADMA mode, so we have to switch that port into legacy
mode.


Can you double check this with the HW architect of the
HW DMA engine of the ASIC?


Will do so. However, previous statements from NVIDIA fairly clearly 
indicate that this is the case.





This means it's only capable of 32-bit DMA. However the other port
on the controller may be connected to a hard drive and therefore still
capable of 64-bit DMA.


If this is indeed the case as you've presented it here,
it sounds like a HW shortcoming.  I cannot see how the device
type (or protocol) dictate how the DMA engine operates.
They live in two different domains.


Well, there is an indirect link. The ADMA interface (which supports 
64-bit DMA) cannot be used to issue ATAPI commands, so if an ATAPI 
device is connected we have to go to legacy mode, which supports only 
32-bit DMA.


I'm not sure why ADMA mode doesn't support ATAPI. The only reason I can 
think of is that there are issues because ATAPI commands can potentially be 
of unpredictable transfer size. The real ADMA spec that the NVIDIA 
implementation is loosely based on does have some special "ignore 
excess" controls that don't seem to be in the NVIDIA version (or at 
least not as far as I know about this hardware).


And yes, it is a rather unfortunate hardware shortcoming (presuming that 
it is entirely true).





The ideal solution would be to do mapping against a different struct
device for each port, so that we could maintain the proper DMA mask for
each of them at all times. However I'm not sure if that's possible. The
thought of using the SCSI struct device for DMA mapping was brought up
at one point.. any thoughts on that?


The reason for this is that the object that a struct scsi_dev
represents has nothing to do with HW DMA engines.

It looks like your current solution is correct and
x86_64's blk_queue_bounce_limit needs work.

Luben





DMA mapping on SCSI device?

2008-01-28 Thread Robert Hancock
We've got a bit of a problem with the sata_nv driver that I'm trying to 
figure out a decent solution to (hence all the lists CCed). This is the 
situation:


The nForce4 ADMA hardware has 2 modes: legacy mode, where it acts like a 
normal ATA controller with 32-bit DMA limits, and ADMA mode where it can 
access all of 64-bit memory. Each PCI device has 2 SATA ports, and the 
legacy/ADMA mode can be controlled independently on both of them.


The trick is that if an ATAPI device is connected, we (as far as I'm 
aware) can't use ADMA mode, so we have to switch that port into legacy 
mode. This means it's only capable of 32-bit DMA. However the other port 
on the controller may be connected to a hard drive and therefore still 
capable of 64-bit DMA. (To make things more complicated, devices can be 
hotplugged and so this can change dynamically.) Since the device that 
libata is doing DMA mapping against is attached to the PCI device and 
not the port, it creates a problem here. If we change the mask on one it 
affects the other one as well.
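
To make the constraint concrete, here is a minimal user-space model in
C. It is a sketch only: nv_host_model, nv_port_model and the mask
constants are hypothetical names for this example, not sata_nv or libata
structures. It assumes, as described above, that an ATAPI port is
limited to 32-bit DMA and that both ports share one DMA mask.

```c
#include <stdbool.h>
#include <stdint.h>

#define DMA_MASK_32 0xffffffffULL
#define DMA_MASK_64 0xffffffffffffffffULL
#define NV_PORTS    2

/* Hypothetical model of one SATA port. */
struct nv_port_model {
    bool atapi;   /* true if an ATAPI device is attached (legacy mode) */
};

/* Hypothetical model of one PCI function with two ports. */
struct nv_host_model {
    struct nv_port_model port[NV_PORTS];
};

/* Mask a port could use if it had a struct device of its own. */
static uint64_t port_dma_mask(const struct nv_port_model *p)
{
    return p->atapi ? DMA_MASK_32 : DMA_MASK_64;
}

/*
 * Mask that has to go on the single shared device: the narrowest
 * mask of any port, so one ATAPI device limits both ports.
 */
static uint64_t shared_dma_mask(const struct nv_host_model *h)
{
    uint64_t mask = DMA_MASK_64;

    for (int i = 0; i < NV_PORTS; i++) {
        uint64_t m = port_dma_mask(&h->port[i]);

        if (m < mask)
            mask = m;
    }
    return mask;
}
```

With a hard drive on port 0 and an ATAPI device on port 1,
port_dma_mask() for port 0 is still 64-bit, but shared_dma_mask() drops
to 32-bit for both.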


The original solution used by the driver was to leave the DMA mask at 
64-bit and use blk_queue_bounce_limit to try to force the block layer 
not to send any requests with DMA addresses over 4GB into the driver. 
However it seems on x86_64 this doesn't work, since it pushes high 
addresses through anyway and expects the IOMMU to take care of it (which 
it doesn't because of the 64-bit mask).
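
For reference, the check the driver was hoping the block layer would
apply can be sketched as follows. This is a simplified user-space model
in C, not the block layer's implementation; needs_bounce() and
BOUNCE_LIMIT_4GB are names invented for the example, and the x86_64
behaviour described above is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Highest physical address reachable with 32-bit DMA. */
#define BOUNCE_LIMIT_4GB 0xffffffffULL

/*
 * Return true if any byte of [phys_addr, phys_addr + len) lies above
 * the limit, i.e. the buffer would have to be copied ("bounced") into
 * low memory before a 32-bit-only port could DMA to or from it.
 */
static bool needs_bounce(uint64_t phys_addr, size_t len, uint64_t limit)
{
    if (len == 0)
        return false;
    if (phys_addr > limit)
        return true;
    /* Compare without computing phys_addr + len, which could overflow. */
    return (uint64_t)(len - 1) > limit - phys_addr;
}
```

A buffer that ends exactly at the 4GB boundary does not need bouncing in
this model; one that straddles it does.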


The last solution I tried was to set the DMA mask on both ports to 
32-bit in slave_configure when an ATAPI device is connected. However, 
this runs into complications as well. slave_configure runs during 
initialization, and when it tries to set the other port into 32-bit DMA, 
that port may not be initialized yet. Plus, it forces the port with a 
hard drive on it into 32-bit DMA needlessly.


The ideal solution would be to do mapping against a different struct 
device for each port, so that we could maintain the proper DMA mask for 
each of them at all times. However I'm not sure if that's possible. The 
thought of using the SCSI struct device for DMA mapping was brought up 
at one point.. any thoughts on that?



Re: DMA mapping on SCSI device?

2008-01-28 Thread Grant Grundler
On Jan 29, 2008 11:08 AM, Robert Hancock [EMAIL PROTECTED] wrote:
...
 The last solution I tried was to set the DMA mask on both ports to
 32-bit on slave_configure when an ATAPI device is connected. However,
 this runs into complications as well. This is run on initialization and
 when trying to set the other port into 32-bit DMA, it may not be
 initialized yet. Plus, it forces the port with a hard drive on it into
 32-bit DMA needlessly.

Have you measured the impact of setting the PCI dma mask to 32-bit?

Last time Alex Williamson (HP) measured this on IA64, we deliberately
forced pci_map_sg() to use the IOMMU even for devices that were 64-bit
capable. We got 3-5% better throughput since the device had fewer
entries to retrieve and the devices (at the time) weren't that good at
processing SG lists.


 The ideal solution would be to do mapping against a different struct
 device for each port, so that we could maintain the proper DMA mask for
 each of them at all times. However I'm not sure if that's possible. The
 thought of using the SCSI struct device for DMA mapping was brought up
 at one point.. any thoughts on that?

I'm pretty sure that's not possible (using two PCI dev structs). I'm
skeptical it's worth converting DMA services to use SCSI devs since
that's an extremely invasive change for a marginal benefit.

hth,
grant




Re: DMA mapping on SCSI device?

2008-01-28 Thread Matthew Wilcox
On Mon, Jan 28, 2008 at 06:08:44PM -0600, Robert Hancock wrote:
 The 
 thought of using the SCSI struct device for DMA mapping was brought up 
 at one point.. any thoughts on that?

I believe this will work on some architectures and not others.
Anything that uses include/asm-generic/dma-mapping.h will break, for
example.  It would be nice for those architectures to get fixed ...

-- 
Intel are signing my paycheques ... these opinions are still mine
Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step.


Re: DMA mapping on SCSI device?

2008-01-28 Thread Andi Kleen

 The ideal solution would be to do mapping against a different struct
 device for each port, so that we could maintain the proper DMA mask for
 each of them at all times. However I'm not sure if that's possible.

I cannot imagine why it should be that difficult. The PCI subsystem
could offer a pci_clone_device() or similar function.   For all complicated
purposes (sysfs etc)  the original device could be used, so it would
hopefully not be that difficult.

The alternative would be to add a new family of PCI mapping
functions that take an explicit mask. Disadvantage would be changing 
all architectures, but on the other hand the interface could be phased
in one by one (and nF4 primarily only works on x86 anyways)

I suspect the latter would be a little cleaner, although they don't
make much difference.
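
A rough sketch of what a mapping call with an explicit mask might look
like, as a user-space model in C. No such interface is defined in this
thread; map_single_masked() and enum map_result are hypothetical names,
and a real implementation would hand the "needs remap" case to the IOMMU
or a bounce buffer rather than just reporting it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of a masked mapping request. */
enum map_result { MAP_DIRECT, MAP_NEEDS_REMAP };

/*
 * Map a buffer against a caller-supplied mask instead of the mask
 * stored in the device. If the whole buffer is addressable under the
 * mask, the bus address is the physical address; otherwise the caller
 * is told a remap (IOMMU or bounce) is required.
 */
static enum map_result map_single_masked(uint64_t phys, size_t len,
                                         uint64_t mask, uint64_t *dma_addr)
{
    uint64_t last = phys + (len ? (uint64_t)len - 1 : 0);

    if (last < phys)             /* range wraps around the address space */
        return MAP_NEEDS_REMAP;
    if ((last & ~mask) != 0)     /* some byte is above the mask */
        return MAP_NEEDS_REMAP;

    *dma_addr = phys;
    return MAP_DIRECT;
}
```

The driver would then pass a 32-bit mask for a port in legacy mode and a
64-bit mask for a port in ADMA mode, without touching the other port.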

-Andi