On Thu, Feb 6, 2014 at 12:40 AM, Martyn Welch <martyn.we...@ge.com> wrote:
>
>
> On 05/02/14 23:21, Michael Kenney wrote:
>>
>> On Wed, Feb 5, 2014 at 1:38 PM, Michael Kenney <mfken...@gmail.com> wrote:
>>>
>>> On Wed, Feb 5, 2014 at 1:22 PM, Martyn Welch <mar...@welchs.me.uk> wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On 5 February 2014 17:41, Greg KH <gre...@linuxfoundation.org> wrote:
>>>>>
>>>>>
>>>>> On Tue, Feb 04, 2014 at 06:34:13PM +0000, Martyn Welch wrote:
>>>>>>
>>>>>> On 04/02/14 16:34, Michael Kenney wrote:
>>>>>>>
>>>>>>> Hi Martyn,
>>>>>>>
>>>>>>> On Tue, Feb 4, 2014 at 7:28 AM, Martyn Welch <martyn.we...@ge.com
>>>>>>> <mailto:martyn.we...@ge.com>> wrote:
>>>>>>>
>>>>>>>     On 28/12/13 00:34, Michael Kenney wrote:
>>>>>>>
>>>>>>>         Hi Martyn,
>>>>>>>
>>>>>>>         On Fri, Dec 27, 2013 at 4:23 PM, Martyn Welch
>>>>>>>         <mar...@welchs.me.uk <mailto:mar...@welchs.me.uk>> wrote:
>>>>>>>
>>>>>>>             On 27/12/13 20:15, Michael Kenney wrote:
>>>>>>>
>>>>>>>
>>>>>>>                 We are using the vme_tsi148 bridge driver along with
>>>>>>> the
>>>>>>>                 vme_user
>>>>>>>                 driver to access the VME boards. The A/D board
>>>>>>> requires
>>>>>>>                 D32 bus cycles
>>>>>>>                 and the VME master window is configured accordingly,
>>>>>>>                 however, when
>>>>>>>                 monitoring the bus cycles with a logic analyzer, we
>>>>>>>                 noticed that the
>>>>>>>                 CPU is transferring one byte at a time (i.e. four D8
>>>>>>>                 transfers rather
>>>>>>>                 than one D32).
>>>>>>>
>>>>>>>                 Is this the expected behavior of the tsi148 driver?
>>>>>>>
>>>>>>>
>>>>>>>             Hi Mike,
>>>>>>>
>>>>>>>             This is certainly not the expected behaviour - if the
>>>>>>> window
>>>>>>>             is configured
>>>>>>>             for D32 then it should do 32 bit transfers where
>>>>>>> possible.
>>>>>>>
>>>>>>>             I've heard of this happening recently, but haven't yet
>>>>>>> been
>>>>>>>             able to
>>>>>>>             replicate it. Which VME board are you running Linux on
>>>>>>> and
>>>>>>>             which flavour of
>>>>>>>             Linux?
>>>>>>>
>>>>>>>
>>>>>>>         I'm running Debian 7.2 with kernel 3.2 on a Fastwel CPC600
>>>>>>>         (Pentium M
>>>>>>>         based CPU board).
>>>>>>>
>>>>>>>
>>>>>>>     I haven't forgotten about this, still not sure exactly what is
>>>>>>>     happening.
>>>>>>>
>>>>>>>     Is your install/kernel 32 or 64 bit?
>>>>>>>
>>>>>>>     Are you doing single 32-bit transfers, or are you seeing this on
>>>>>>>     longer transfers (i.e. copying a buffer full of data)?
>>>>>>>
>>>>>>>
>>>>>>> Thanks for getting back to me.
>>>>>>>
>>>>>>> I'm running a 32-bit kernel and I see this behavior on all transfers
>>>>>>> regardless of buffer size.
>>>>>>>
>>>>>>
>>>>>> Gah! Thought I could see what may be causing it in 64-bit kernels, but
>>>>>> not
>>>>>> 32-bit (my x86 asm is not particularly hot).
>>>>>>
>>>>>> I think we /may/ be hitting issues with how the memcpy function gets
>>>>>> implemented on specific architectures. The tsi148 is a PCI/X to VME
>>>>>> bridge,
>>>>>> if it receives a series of 8-bit reads or writes, it translates these
>>>>>> to
>>>>>> 8-bit reads or writes on the VME bus, which is not necessarily what we
>>>>>> want.
>>>>>>
>>>>>> According to the data sheet, the card you are using has an Intel
>>>>>> 6300ESB
>>>>>> ICH, which seems to be PCI/X (so we can rule out PCIe to PCI/X bridges
>>>>>> or
>>>>>> something like that doing nasty things).
>>>>>>
>>>>>> I think (if I follow everything correctly) then the memcpy for 32-bit
>>>>>> is
>>>>>> handled by __memcpy in arch/x86/include/asm/string_32.h:
>>>>>>
>>>>>> static __always_inline void *__memcpy(void *to, const void *from,
>>>>>> size_t
>>>>>> n)
>>>>>> {
>>>>>>          int d0, d1, d2;
>>>>>>          asm volatile("rep ; movsl\n\t"
>>>>>>                       "movl %4,%%ecx\n\t"
>>>>>>                       "andl $3,%%ecx\n\t"
>>>>>>                       "jz 1f\n\t"
>>>>>>                       "rep ; movsb\n\t"
>>>>>>                       "1:"
>>>>>>                       : "=&c" (d0), "=&D" (d1), "=&S" (d2)
>>>>>>                       : "0" (n / 4), "g" (n), "1" ((long)to), "2"
>>>>>> ((long)from)
>>>>>>                       : "memory");
>>>>>>          return to;
>>>>>> }
>>>>>>
>>>>>> I'd expected this function to use movl (32-bit moves) where possible,
>>>>>> but
>>>>>> movsb to get to naturally aligned moves (which is something that we
>>>>>> deal
>>>>>> with in the VME code already to use 16-bit reads where we can).
>>>>>>
>>>>>> Greg (as co-maintainer of the VME subsystem :-) ), Am I reading this
>>>>>> right?
>>>>>
>>>>>
>>>>> Yes, but why would it matter if we copy by bytes, bits, or dwords to a
>>>>> memory-backed bus?  By definition, that shouldn't matter.  If it does
>>>>> matter, then we shouldn't be using memcpy, or relying on the compiler
>>>>> to
>>>>> do the copying, but rather we need to use a write() function instead as
>>>>> this would be IO memory that does care about this type of thing.
>>>>>
>>>>
>>>> The drivers are using the functions memcpy_toio and memcpy_fromio for
>>>> the
>>>> bulk of the transfer, these seem to equate to memcpy on x86.
>>>>
>>>> In this instance it does matter - some devices can only accept certain
>>>> width
>>>> accesses and the VME bridge will translate a 8-bit PCI transfer to a
>>>> 8-bit
>>>> transfer on the VME bus.
>>>>
>>>> So (and I haven't even compile tested this yet), this?:
>>>>
>>>> diff --git a/drivers/vme/bridges/vme_ca91cx42.c
>>>> b/drivers/vme/bridges/vme_ca91cx42.c
>>>> index 1425d22c..0d87ebd 100644
>>>> --- a/drivers/vme/bridges/vme_ca91cx42.c
>>>> +++ b/drivers/vme/bridges/vme_ca91cx42.c
>>>> @@ -894,9 +894,9 @@ static ssize_t ca91cx42_master_read(struct
>>>> vme_master_resource *image,
>>>>          }
>>>>
>>>>          count32 = (count - done) & ~0x3;
>>>> -       if (count32 > 0) {
>>>> -               memcpy_fromio(buf + done, addr + done, (unsigned
>>>> int)count);
>>>> -               done += count32;
>>>> +       while (done < count32) {
>>>> +               *(u32 *)(buf + done) = ioread32(addr + done);
>>>> +               done += 4;
>>>>          }
>>>>
>>>>          if ((count - done) & 0x2) {
>>>> @@ -948,9 +948,9 @@ static ssize_t ca91cx42_master_write(struct
>>>> vme_master_resource *image,
>>>>          }
>>>>
>>>>          count32 = (count - done) & ~0x3;
>>>> -       if (count32 > 0) {
>>>> -               memcpy_toio(addr + done, buf + done, count32);
>>>> -               done += count32;
>>>> +       while (done < count32) {
>>>> +               iowrite32(*(u32 *)(buf + done), addr + done);
>>>> +               done += 4;
>>>>          }
>>>>
>>>>          if ((count - done) & 0x2) {
>>>> diff --git a/drivers/vme/bridges/vme_tsi148.c
>>>> b/drivers/vme/bridges/vme_tsi148.c
>>>> index 5fbd08f..bc82991 100644
>>>> --- a/drivers/vme/bridges/vme_tsi148.c
>>>> +++ b/drivers/vme/bridges/vme_tsi148.c
>>>> @@ -1297,9 +1297,9 @@ static ssize_t tsi148_master_read(struct
>>>> vme_master_resource *image, void *buf,
>>>>          }
>>>>
>>>>          count32 = (count - done) & ~0x3;
>>>> -       if (count32 > 0) {
>>>> -               memcpy_fromio(buf + done, addr + done, count32);
>>>> -               done += count32;
>>>> +       if (done < count32) {
>>>> +               *(u32 *)(buf + done) = ioread32(addr + done);
>>>> +               done += 4;
>>>>          }
>>>>
>>>>          if ((count - done) & 0x2) {
>>>> @@ -1379,9 +1379,9 @@ static ssize_t tsi148_master_write(struct
>>>> vme_master_resource *image, void *buf,
>>>>          }
>>>>
>>>>          count32 = (count - done) & ~0x3;
>>>> -       if (count32 > 0) {
>>>> -               memcpy_toio(addr + done, buf + done, count32);
>>>> -               done += count32;
>>>> +       if (done < count32) {
>>>> +               iowrite32(*(u32 *)(buf + done), addr + done);
>>>> +               done += 4;
>>>>          }
>>>>
>>>>          if ((count - done) & 0x2) {
>>>>
>>>>
>>>>>
>>>>> Does that help?
>>>>>
>>>>
>>>> Yep :-)
>>>>
>>>> Thanks Greg.
>>>
>>>
>>> That certainly looks promising. I can give that patch a try later this
>>> week.
>>>
>>> Thanks.
>>
>>
>> Good news Martyn.
>>
>> I had a bit of time today to test your patch and am happy to say it
>> worked! No more bus errors.
>>
>> Now I can work on getting the actual data acquisition working :-)
>>
>
> Thanks Mike.
>
> Guess you didn't spot the little bug then :-)

Oops, I sure didn't.  I was so happy to not see a bus error on my
simple one-word read/write test that I hadn't gone any further yet
:-).

>
> For the tsi148 I forgot to change the conditional from "if" to "while".
>
> I'll submit a proper patch in a minute,

I'll give that a try today. Thanks.

--Mike
_______________________________________________
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

Reply via email to