Re: [PATCH] fsldma: use PCI Read Multiple command
On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyderi...@ovro.caltech.edu wrote: On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote: On Apr 27, 2009, at 3:49 PM, Dan Williams wrote: On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. I'm going through patches for .31.. Should I still pick this up? Going forward should I pick up fsldma patches? I'm fine with that, but you should probably talk to Li Yang (added to CC). He's gotten in contact with me a few times recently. I am fine with both ways for this patch as it is only related to Freescale register details. But in general I think patches should go through functional subsystem, as they usually would need insight of the subsystem architecture. I prefer the way that the patch acked or signed-off by Freescale guys and push upstream through Dan's tree as most other subsystems did. Unless Dan prefers to ack the subsystem architectural part of each patch and have them pushed other way. - Leo ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Jun 12, 2009, at 4:23 AM, Li Yang wrote: On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyderi...@ovro.caltech.edu wrote: On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote: On Apr 27, 2009, at 3:49 PM, Dan Williams wrote: On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. I'm going through patches for .31.. Should I still pick this up? Going forward should I pick up fsldma patches? I'm fine with that, but you should probably talk to Li Yang (added to CC). He's gotten in contact with me a few times recently. I am fine with both ways for this patch as it is only related to Freescale register details. But in general I think patches should go through functional subsystem, as they usually would need insight of the subsystem architecture. I prefer the way that the patch acked or signed-off by Freescale guys and push upstream through Dan's tree as most other subsystems did. Unless Dan prefers to ack the subsystem architectural part of each patch and have them pushed other way. I agree w/this and just wanting to see what Dan's preference is. - k ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Kumar Gala wrote: On Jun 12, 2009, at 4:23 AM, Li Yang wrote: On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyderi...@ovro.caltech.edu wrote: On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote: On Apr 27, 2009, at 3:49 PM, Dan Williams wrote: On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. I'm going through patches for .31.. Should I still pick this up? Going forward should I pick up fsldma patches? I'm fine with that, but you should probably talk to Li Yang (added to CC). He's gotten in contact with me a few times recently. I am fine with both ways for this patch as it is only related to Freescale register details. But in general I think patches should go through functional subsystem, as they usually would need insight of the subsystem architecture. I prefer the way that the patch acked or signed-off by Freescale guys and push upstream through Dan's tree as most other subsystems did. Unless Dan prefers to ack the subsystem architectural part of each patch and have them pushed other way. I agree w/this and just wanting to see what Dan's preference is. I'll take fsldma patches through the dmaengine tree with Leo's ack/sign-off. That last request was a one-off because I had nothing else to push and the discussion was very architecture specific. Thanks, Dan ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Jun 12, 2009, at 12:38 PM, Dan Williams wrote: Kumar Gala wrote: On Jun 12, 2009, at 4:23 AM, Li Yang wrote: On Thu, Jun 11, 2009 at 11:17 PM, Ira Snyderi...@ovro.caltech.edu wrote: On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote: On Apr 27, 2009, at 3:49 PM, Dan Williams wrote: On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. I'm going through patches for .31.. Should I still pick this up? Going forward should I pick up fsldma patches? I'm fine with that, but you should probably talk to Li Yang (added to CC). He's gotten in contact with me a few times recently. I am fine with both ways for this patch as it is only related to Freescale register details. But in general I think patches should go through functional subsystem, as they usually would need insight of the subsystem architecture. I prefer the way that the patch acked or signed-off by Freescale guys and push upstream through Dan's tree as most other subsystems did. Unless Dan prefers to ack the subsystem architectural part of each patch and have them pushed other way. I agree w/this and just wanting to see what Dan's preference is. I'll take fsldma patches through the dmaengine tree with Leo's ack/ sign-off. That last request was a one-off because I had nothing else to push and the discussion was very architecture specific. Sounds good to me. I expect you to pick up this patch for .31 - k ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Wed, Jun 10, 2009 at 09:45:26PM -0500, Kumar Gala wrote: On Apr 27, 2009, at 3:49 PM, Dan Williams wrote: On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. I'm going through patches for .31.. Should I still pick this up? Going forward should I pick up fsldma patches? I'm fine with that, but you should probably talk to Li Yang (added to CC). He's gotten in contact with me a few times recently. In addition to the PCI Read Multiple patch, I've posted a few more fsldma patches to ppcdev recently: fsldma: enable external start for the 83xx controller fsldma: do not clear bandwidth control bits on the 83xx controller fsldma: Add DMA_SLAVE support The first two are very straightforward. The third probably needs some review before it can be picked up, though. It is well tested, I've been using it for a few weeks without issues. Thanks, Ira ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Apr 27, 2009, at 3:49 PM, Dan Williams wrote: On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. I'm going through patches for .31.. Should I still pick this up? Going forward should I pick up fsldma patches? - k ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
You are assuming the PCI memory space is prefetchable( no side effect) for DMA. Is it possible that DMA is from non-prefetchable memory space? This should be a safe assumption for this driver. Remember, this driver just does offload memcpy, from one region to another. So the PCI memory that you are reading from should be just a buffer of data, and there should be side-effect of reading it. How about one FIFO buffer? ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
David Hawkins wrote: How about FIFO RAM case? If the FIFO has a fixed address, then according to the user guide, the DMA controller won't generate a burst transaction to it. We can try confirming this if you'd like. Like I said earlier, this driver does not support copying data to/from a single register address. It's just a off-loaded memcpy. So we don't need to worry about what happens to a fixed address. -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Sat, Apr 25, 2009 at 2:35 AM, Ira Snyder i...@ovro.caltech.edu wrote: By default, the Freescale 83xx DMA controller uses the PCI Read Line command when reading data over the PCI bus. Setting the controller to use the PCI Read Multiple command instead allows the controller to read much larger bursts of data, which provides a drastic speed increase. The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI bridge was between the devices trying to communicate. A simple test driver showed an increase from 4MB/sec to 116MB/sec when performing DMA over the PCI bus. Using DMA to transfer between blocks of local SDRAM showed no change in performance with this patch. The dmatest driver was also used to verify the correctness of the transfers, and showed no errors. Signed-off-by: Ira W. Snyder i...@ovro.caltech.edu Acked-by: Li Yang le...@freescale.com Thanks, Leo ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
RE: [PATCH] fsldma: use PCI Read Multiple command
By default, the Freescale 83xx DMA controller uses the PCI Read Line command when reading data over the PCI bus. Setting the controller to use the PCI Read Multiple command instead allows the controller to read much larger bursts of data, which provides a drastic speed increase. IIRC, the default for 83xx DMA controller uses the PCI mem read command, not mem read line. You are assuming the PCI memory space is prefetchable( no side effect) for DMA. Is it possible that DMA is from non-prefetchable memory space? The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI bridge was between the devices trying to communicate. A simple test driver showed an increase from 4MB/sec to 116MB/sec when performing DMA over the PCI bus. Using DMA to transfer between blocks of local SDRAM showed no change in performance with this patch. The dmatest driver was also used to verify the correctness of the transfers, and showed no errors. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Mon, Apr 27, 2009 at 5:09 PM, Liu Dave-R63238 dave...@freescale.com wrote: By default, the Freescale 83xx DMA controller uses the PCI Read Line command when reading data over the PCI bus. Setting the controller to use the PCI Read Multiple command instead allows the controller to read much larger bursts of data, which provides a drastic speed increase. IIRC, the default for 83xx DMA controller uses the PCI mem read command, not mem read line. You are assuming the PCI memory space is prefetchable( no side effect) for DMA. Is it possible that DMA is from non-prefetchable memory space? I guess it's not reasonable to use DMA from non-prefetchable memory. So it's up to the driver which uses the DMA engine to prevent from using DMA API on non-prefetchable memory. - Leo ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Hi all, You are assuming the PCI memory space is prefetchable( no side effect) for DMA. Is it possible that DMA is from non-prefetchable memory space? This should be a safe assumption for this driver. Remember, this driver just does offload memcpy, from one region to another. So the PCI memory that you are reading from should be just a buffer of data, and there should be side-effect of reading it. However, I would like to see a comment at the top of the file warning people that copying from PCI memory will result in prefetched reads. Here's a few results from DMA tests between two MPC8349EA boards housed in a CPCI chassis. 1. DMA mode register (DMAMRn) PCI read command (PRC, bits 11:10) a) DMAMRn[PRC] = 00 = PCI Read The PCI read command is 6h on the PCI bus. For DMA lengths of less than 1 cache line (32-bytes) the DMA controller will generate a PCI 6h command. However, for lengths of 32-bytes and higher, the DMA controller actually generates a PCI Read Line (Eh) command. Freescale indicated that this 'change of PCI command code' functionality is an undocumented 'feature', there to improve performance for longer read lengths. b) DMAMRn[PRC] = 01 = PCI Read Line Generated the PCI command code for PCI read line (Eh), regardless of DMA length. c) DMAMRn[PRC] = 10 = PCI Read Multiple Generated the PCI command code for PCI Read Multiple (Ch), regardless of DMA length. 2. DMA from areas marked as non-prefetchable. We setup two test cases: a) Two boards in the same PCI segment with no intervening bridges. b) Two boards in separate PCI segments with intervening bridges (in this case, there was two bridges on SBS CPCI-to-CPI backplane bridging cards). The PCI window to the IMMR registers on the boards is marked as non-prefetchable. DMA from that area (reads) appeared on the PCI bus as single 32-bit read transactions, i.e., the target device effectively ignores the PCI read command, and the target returns read data as single reads, i.e., the target acts as as non-prefetchable memory. We tested with the PCI Read Multiple command, and no data was ever prefetched from the target. The bridges did not prefetch and discard data, eg. we tried 36-bytes, and there were 9 separate 4-byte transactions (we were using a 32-bit PCI host for this test). Hence, at least for this example target device, there are no side effects to using PCI read multiple on the bus master, to a target PCI memory region marked as non-prefetchable. So even though the bridges were seeing transactions that indicated read multiple, only single read cycles were seen to be used. Would you like some sort of summary of this info for a commit message? Would you like us to check any other transaction/register combos? Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
David Hawkins wrote: Would you like some sort of summary of this info for a commit message? That's probably overkill. I just want a sentence or two that tells someone looking at the code casually that the behavior of reading PCI memory might be different than what they expect. Would you like us to check any other transaction/register combos? Yes, could you try this on non-PCI memory? -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Would you like some sort of summary of this info for a commit message? That's probably overkill. I just want a sentence or two that tells someone looking at the code casually that the behavior of reading PCI memory might be different than what they expect. Ok, will-do. Would you like us to check any other transaction/register combos? Yes, could you try this on non-PCI memory? We've been using it to DMA between the x86 host main memory and the MPC8349EA boards (PCI targets). The reason we changed to Read Multiple was that it had a dramatic improvement in efficiency through bridges. However, the x86 host memory is prefetchable, so is consistent with the use of Read Multiple. Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Apr 27, 2009, at 2:48 PM, David Hawkins wrote: Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) You can mark the pci inbound window on the 83xx as non-prefetchable (assuming 83xx is host). On a x86 host I doubt there is any easy way to get non-prefetchable memory. - k ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) You can mark the pci inbound window on the 83xx as non-prefetchable (assuming 83xx is host). On a x86 host I doubt there is any easy way to get non-prefetchable memory. Yep, we were going to do that, but chose to use the 1MB region already setup for the IMMRs since its already marked as non-prefetchable. We were only doing reads, so it wasn't going to hurt anything. I doubt that marking one of the other BAR regions as non-prefetchable will give a different result. However, we're more than happy to double-check if you'd like. Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
You can mark the pci inbound window on the 83xx as non-prefetchable (assuming 83xx is host). On a x86 host I doubt there is any easy way to get non-prefetchable memory. One more note; we don't have access to a host-mode MPC8349EA, our boards are all targets. Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Apr 27, 2009, at 3:00 PM, David Hawkins wrote: Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) You can mark the pci inbound window on the 83xx as non-prefetchable (assuming 83xx is host). On a x86 host I doubt there is any easy way to get non-prefetchable memory. Yep, we were going to do that, but chose to use the 1MB region already setup for the IMMRs since its already marked as non-prefetchable. We were only doing reads, so it wasn't going to hurt anything. I doubt that marking one of the other BAR regions as non-prefetchable will give a different result. However, we're more than happy to double-check if you'd like. Its possible you'll get a different result since IMMR is a register space internal and thats normally a completely different bus than memory would be (internal to the 83xx). I'd suggest double-checking w/ a BAR marked non-prefetch pointing to real memory. - k ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
David Hawkins wrote: Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) H I was going to say any SOC device in the IMMR, but I don't see anything there that would constitute a memory buffer. I test this change on an 8610 and DMA to a register I/O, where this bit isn't even defined, and it made no difference. So I guess this change is okay. Acked-by: Timur Tabi ti...@freescale.com -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) You can mark the pci inbound window on the 83xx as non-prefetchable (assuming 83xx is host). On a x86 host I doubt there is any easy way to get non-prefetchable memory. Yep, we were going to do that, but chose to use the 1MB region already setup for the IMMRs since its already marked as non-prefetchable. We were only doing reads, so it wasn't going to hurt anything. I doubt that marking one of the other BAR regions as non-prefetchable will give a different result. However, we're more than happy to double-check if you'd like. Its possible you'll get a different result since IMMR is a register space internal and thats normally a completely different bus than memory would be (internal to the 83xx). I'd suggest double-checking w/a BAR marked non-prefetch pointing to real memory. We had a 4k BAR1 setup to point to DDR memory. With prefetchable set, a 36-byte transfer generated a burst-of-8 32-bit words followed by a single transaction. With non-prefetchable set, the transfers were all single. So it works like we'd expect. Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Hi Timur, Would you like some sort of summary of this info for a commit message? That's probably overkill. I just want a sentence or two that tells someone looking at the code casually that the behavior of reading PCI memory might be different than what they expect. Could you help us with the wording you'd like to see in the code. Did you want to see something in the header comments, or something near the register settings? How about something like this in place of the existing PCI_RM comment: PRC_RM - PCI read multiple The default PCI read command used by the DMA controller is PCI Read (PCI command 6h). When the burst length is 32-bytes or longer, PCI Read Line (PCI command Eh) is used (undocumented feature of the controller). Using PCI read multiple (PCI command Ch) results in high-performance across PCI bridges. DMA transfers to non-prefetchable PCI registers should not result in prefetched reads, even when using the PCI read multiple command. Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
David Hawkins wrote: PRC_RM - PCI read multiple The default PCI read command used by the DMA controller is PCI Read (PCI command 6h). When the burst length is 32-bytes or longer, PCI Read Line (PCI command Eh) is used (undocumented feature of the controller). Using PCI read multiple (PCI command Ch) results in high-performance across PCI bridges. DMA transfers to non-prefetchable PCI registers should not result in prefetched reads, even when using the PCI read multiple command. I was thinking more along the lines of: This driver tells the DMA controller to use the PCI Read Multiple command, instead of the PCI Read Line command, for PCI read operations. Please be aware that this setting may result in read pre-fetching on some platforms. -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Hi Timur, PRC_RM - PCI read multiple The default PCI read command used by the DMA controller is PCI Read (PCI command 6h). When the burst length is 32-bytes or longer, PCI Read Line (PCI command Eh) is used (undocumented feature of the controller). Using PCI read multiple (PCI command Ch) results in high-performance across PCI bridges. DMA transfers to non-prefetchable PCI registers should not result in prefetched reads, even when using the PCI read multiple command. I was thinking more along the lines of: This driver tells the DMA controller to use the PCI Read Multiple command, instead of the PCI Read Line command, for PCI read operations. Please be aware that this setting may result in read pre-fetching on some platforms. Ok, thanks. Ira will add your comment to the body of the code near the PRC_RM command and submit a new patch. Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Mon, Apr 27, 2009 at 03:26:36PM -0500, Timur Tabi wrote: David Hawkins wrote: PRC_RM - PCI read multiple The default PCI read command used by the DMA controller is PCI Read (PCI command 6h). When the burst length is 32-bytes or longer, PCI Read Line (PCI command Eh) is used (undocumented feature of the controller). Using PCI read multiple (PCI command Ch) results in high-performance across PCI bridges. DMA transfers to non-prefetchable PCI registers should not result in prefetched reads, even when using the PCI read multiple command. I was thinking more along the lines of: This driver tells the DMA controller to use the PCI Read Multiple command, instead of the PCI Read Line command, for PCI read operations. Please be aware that this setting may result in read pre-fetching on some platforms. How about the following revised patch? Is it ok to inline it this way, or should I send another email to the list containing just this patch? From 73e42fa58c93de8d4d429ba8e069b60c42037b58 Mon Sep 17 00:00:00 2001 From: Ira W. Snyder i...@ovro.caltech.edu Date: Thu, 23 Apr 2009 16:17:54 -0700 Subject: [PATCH] fsldma: use PCI Read Multiple command By default, the Freescale 83xx DMA controller uses the PCI Read Line command when reading data over the PCI bus. Setting the controller to use the PCI Read Multiple command instead allows the controller to read much larger bursts of data, which provides a drastic speed increase. The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI bridge was between the devices trying to communicate. A simple test driver showed an increase from 4MB/sec to 116MB/sec when performing DMA over the PCI bus. Using DMA to transfer between blocks of local SDRAM showed no change in performance with this patch. The dmatest driver was also used to verify the correctness of the transfers, and showed no errors. Signed-off-by: Ira W. Snyder i...@ovro.caltech.edu --- v1 - v2: * Added comment warning about possible prefetching drivers/dma/fsldma.c | 10 -- drivers/dma/fsldma.h |1 + 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c index da8a8ed..5943ca8 100644 --- a/drivers/dma/fsldma.c +++ b/drivers/dma/fsldma.c @@ -12,6 +12,11 @@ * also fit for MPC8560, MPC8555, MPC8548, MPC8641, and etc. * The support for MPC8349 DMA contorller is also added. * + * This driver instructs the DMA controller to issue the PCI Read Multiple + * command for PCI read operations, instead of using the default PCI Read Line + * command. Please be aware that this setting may result in read pre-fetching + * on some platforms. + * * This is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 2 of the License, or @@ -49,9 +54,10 @@ static void dma_init(struct fsl_dma_chan *fsl_chan) case FSL_DMA_IP_83XX: /* Set the channel to below modes: * EOTIE - End-of-transfer interrupt enable +* PRC_RM - PCI read multiple */ - DMA_OUT(fsl_chan, fsl_chan-reg_base-mr, FSL_DMA_MR_EOTIE, - 32); + DMA_OUT(fsl_chan, fsl_chan-reg_base-mr, FSL_DMA_MR_EOTIE + | FSL_DMA_MR_PRC_RM, 32); break; } diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h index 4f21a51..dc7f268 100644 --- a/drivers/dma/fsldma.h +++ b/drivers/dma/fsldma.h @@ -38,6 +38,7 @@ /* Special MR definition for MPC8349 */ #define FSL_DMA_MR_EOTIE 0x0080 +#define FSL_DMA_MR_PRC_RM 0x0800 #define FSL_DMA_SR_CH 0x0020 #define FSL_DMA_SR_PE 0x0010 -- 1.5.4.3 ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
David Hawkins wrote: Ira will add your comment to the body of the code near the PRC_RM command and submit a new patch. I'd rather have it near the top where people can see it. -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Timur Tabi wrote: David Hawkins wrote: Ira will add your comment to the body of the code near the PRC_RM command and submit a new patch. I'd rather have it near the top where people can see it. Looks like Ira had the same thought :) Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Adding Kumar to the CC: list, since he might pick up the patch. Ira Snyder wrote: From 73e42fa58c93de8d4d429ba8e069b60c42037b58 Mon Sep 17 00:00:00 2001 From: Ira W. Snyder i...@ovro.caltech.edu Date: Thu, 23 Apr 2009 16:17:54 -0700 Subject: [PATCH] fsldma: use PCI Read Multiple command By default, the Freescale 83xx DMA controller uses the PCI Read Line command when reading data over the PCI bus. Setting the controller to use the PCI Read Multiple command instead allows the controller to read much larger bursts of data, which provides a drastic speed increase. The slowdown due to using PCI Read Line was only observed when a PCI-to-PCI bridge was between the devices trying to communicate. A simple test driver showed an increase from 4MB/sec to 116MB/sec when performing DMA over the PCI bus. Using DMA to transfer between blocks of local SDRAM showed no change in performance with this patch. The dmatest driver was also used to verify the correctness of the transfers, and showed no errors. Signed-off-by: Ira W. Snyder i...@ovro.caltech.edu Acked-by: Timur Tabi ti...@freescale.com -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Mon, Apr 27, 2009 at 03:04:49PM -0500, Timur Tabi wrote: David Hawkins wrote: Can you give me an example of non-PCI memory that would be non-prefetchable that you'd like us to try? We can see if our host CPUs have an area like that ... we just need to know what device to look for first :) H I was going to say any SOC device in the IMMR, but I don't see anything there that would constitute a memory buffer. I test this change on an 8610 and DMA to a register I/O, where this bit isn't even defined, and it made no difference. So I guess this change is okay. I thought the driver only used the bit if the device tree indicated it was an 83xx-era DMA controller? That said, the bits are documented as specifically for PCI, so it would be surprising if it had any effect elsewhere. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Scott Wood wrote: I thought the driver only used the bit if the device tree indicated it was an 83xx-era DMA controller? I just wanted to make sure it didn't do anything weird. It was the only test I could think of that didn't involve PCI. That said, the bits are documented as specifically for PCI, so it would be surprising if it had any effect elsewhere. Surely you wouldn't really be surprised by incorrect documentation of our parts. :-) -- Timur Tabi Linux kernel developer at Freescale ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
On Mon, Apr 27, 2009 at 1:47 PM, Timur Tabi ti...@freescale.com wrote: Adding Kumar to the CC: list, since he might pick up the patch. Acked-by: Dan Williams dan.j.willi...@intel.com I agree with taking this through Kumar's tree. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
RE: [PATCH] fsldma: use PCI Read Multiple command
You are assuming the PCI memory space is prefetchable( no side effect) for DMA. Is it possible that DMA is from non-prefetchable memory space? This should be a safe assumption for this driver. Remember, this driver just does offload memcpy, from one region to another. So the PCI memory that you are reading from should be just a buffer of data, and there should be side-effect of reading it. However, I would like to see a comment at the top of the file warning people that copying from PCI memory will result in prefetched reads. How about FIFO RAM case? ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
How about FIFO RAM case? If the FIFO has a fixed address, then according to the user guide, the DMA controller won't generate a burst transaction to it. We can try confirming this if you'd like. Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
RE: [PATCH] fsldma: use PCI Read Multiple command
Here's a few results from DMA tests between two MPC8349EA boards housed in a CPCI chassis. 1. DMA mode register (DMAMRn) PCI read command (PRC, bits 11:10) a) DMAMRn[PRC] = 00 = PCI Read The PCI read command is 6h on the PCI bus. For DMA lengths of less than 1 cache line (32-bytes) the DMA controller will generate a PCI 6h command. However, for lengths of 32-bytes and higher, the DMA controller actually generates a PCI Read Line (Eh) command. Freescale indicated that this 'change of PCI command code' functionality is an undocumented 'feature', there to improve performance for longer read lengths. b) DMAMRn[PRC] = 01 = PCI Read Line Generated the PCI command code for PCI read line (Eh), regardless of DMA length. c) DMAMRn[PRC] = 10 = PCI Read Multiple Generated the PCI command code for PCI Read Multiple (Ch), regardless of DMA length. Good summary! For the DMA PCI read/line/multi-line is outbound transaction. So according to your experiment, the 8349 PCI controller(as master) attemp to streaming/combining the outbound transaction(treated as prefetchable space). IIRC, the early 8349EUM has the bit[2]-SE in the POCMRn register, and is removed now. Not sure if it does function. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
RE: [PATCH] fsldma: use PCI Read Multiple command
You can mark the pci inbound window on the 83xx as non-prefetchable(assuming 83xx is host). On a x86 host I doubt there is any easy way to get non-prefetchable memory. One more note; we don't have access to a host-mode MPC8349EA, our boards are all targets. Actually we also can use the inbound window(w/wo prefetchable) in the agent mode. It doesn't care if it is host or agent. ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: [PATCH] fsldma: use PCI Read Multiple command
Hi Dave, For the DMA PCI read/line/multi-line is outbound transaction. So according to your experiment, the 8349 PCI controller(as master) attemp to streaming/combining the outbound transaction(treated as prefetchable space). Yep, with the MPC8349EA configured as a PCI Target, and operating as a Bus Master, DMA transfers between two MPC8349EA targets to prefetchable memory on the slave will burst at pretty much full-speed over the PCI bus. The same goes for DMA to the host memory. However, reading from the host is slower, as the bursts get chopped up more than they do between two target boards. At some point I'll get a bunch of screen captures and put them in a document. IIRC, the early 8349EUM has the bit[2]-SE in the POCMRn register, and is removed now. Not sure if it does function. Hey yeah, I looked in the 8349E manual, and it is defined. I'm not sure why it would be defined there though. I can't think of why the master would want to disable streaming based on a bit setting; it should burst until the IOS says its full. Anyway, the bit is gone now, so we'll just ignore the fact it existed :) Cheers, Dave ___ Linuxppc-dev mailing list Linuxppc-dev@ozlabs.org https://ozlabs.org/mailman/listinfo/linuxppc-dev