Re: [OMPI devel] New ARM patch

2013-01-28 Thread Leif Lindholm

On 26/01/13 00:05, Jeff Squyres (jsquyres) wrote:

Here's what I have done:

1. Committed your patch to v1.6.  George's patch was not committed to
v1.6.


Many thanks.


2. I opened https://svn.open-mpi.org/trac/ompi/ticket/3481 to track
your proposal of re-implementing/revamping the ARM ASM code.

Do you have a timeline for when that can be done?


As I have mentioned to you off list, I have (very) recently been
seconded into the Linaro Enterprise Group, focusing on improving the
ARM server software ecosystem.
As such, I am potentially in a bit of legal limbo, until Linaro signs
a contribution agreement. This is however in the works.
But giving some flexibility for roadblocks, can we say "this quarter"?


3. Since no one is currently MTT testing Open MPI on ARM, I added the
following statement in the v1.6 README file under "Other systems have
been lightly (but not fully tested):"

- ARM4, ARM5, ARM6, ARM7 (when using non-inline assembly; only ARM7
is fully supported when -DOMPI_DISABLE_INLINE_ASM is used).

--> Is this correct?


Apart from our *cough* convoluted architecture vs. processor naming 
scheme... It should be ARMv4, ARMv5, ARMv6 and ARMv7.
(since ARM4 and ARM5 were skipped and ARM6 and ARM7 were processors 
implementing the ARMv3 and ARMv4 architectures :)



--> Do you think you'll be able to setup some MTT on ARM platforms?


I hope so.


4. I also added the following to v1.6 NEWS:

- Automatically provide compiler flags that compile properly on some
types of ARM systems.


Sounds good.

/
Leif




Re: [OMPI devel] New ARM patch

2013-01-25 Thread Leif Lindholm

On 24/01/13 22:12, Jeff Squyres (jsquyres) wrote:

On Jan 24, 2013, at 8:18 AM, Leif Lindholm <leif.lindh...@arm.com> wrote:

I tested this patch in v1.6 and v1.7 on my Pi, and it seems to work
just fine.  "make check" passes all the ASM tests.


Just to be perfectly clear: it wouldn't on ARMv5 though, and the ARMv6
ASM test executed with NOPs for barriers, although it would correctly
pass all other tests.


Mmm.  Ok.  So is this a correct list of what is supported right now (i.e., in
v1.6 with your patch)

ARM4: no
ARM5: no
ARM6: sorta (not multi-core, or anywhere we would need barriers)
ARM7: yes

?


Correct, that is what is supported with out-of-line assembler functions
- i.e. when explicitly building with -DOMPI_DISABLE_INLINE_ASM.
They are all supported (and correctly using barriers) otherwise.


How would George's patch have changed that list?


ARM4: no
ARM5: maybe, unvalidated
ARM6: yes
ARM7: yes

/
Leif

-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.




Re: [OMPI devel] New ARM patch

2013-01-24 Thread Leif Lindholm

On 24/01/13 02:54, Jeff Squyres (jsquyres) wrote:

[snip] Basic point is - this is an insufficiently validated patch
referred to as "an ugly kludge" by the original author (Jon
Masters@Red Hat), who created it to be able to include it in the
Fedora ARMv5 port. I have previously provided suggestions for
improvements, but it has still been submitted to the Open MPI
users list without any of those suggestions being acted on.

I admit to being slightly miffed with it being accepted and
applied without ever being mentioned on the Open MPI developers
list


It was done by one of the core committers (George); it's in our
community's culture to go commit without discussion on the devel
list for many kinds of things.


OK. In which case I probably _should_ be on that list.
*cough* might I however suggest that a statement to that effect is added
to http://www.open-mpi.org/community/lists/ompi.php ?


FWIW: Since we all know each other pretty well, we do a lot of
communication via IM and telephone in addition to the public mailing
list discussions.  This is not because we're discussing secret
things -- it's just that you can get a lot more accomplished in a 10
minute phone call than 15 back-n-forth, 10-page, highly detailed
emails.


Sure.


A list to which I now find myself subscribed without having
asked for it or being told about it - miffed again.


Sorry about that; this was my fault.  I interpreted your off-post
mails to me about not being able to post to the users list as an ask
to be subscribed (since we don't allow posts from unsubscribed
users).


Understandable - apologies for overreacting.


Rather than unsubscribe you, though, I just marked you as "nomail"
on the users' list.  So you won't receive any further mail from that
list, but you're still subscribed, so you can post.


Thanks.


I tested this patch in v1.6 and v1.7 on my Pi, and it seems to work
just fine.  "make check" passes all the ASM tests.


Just to be perfectly clear: it wouldn't on ARMv5 though, and the ARMv6
ASM test executed with NOPs for barriers, although it would correctly
pass all other tests.


To be clear: I consider you to be the primary author and maintainer
of this code, and you're certainly more of an ARM expert than any of
us.  George may not have realized that someone from ARM was still an
active part of the community; I'm not sure.


I'm certainly not very visible :)
But I do try to pay attention.


But I, too, vote that we should back out his changes from the trunk
and put your suggested patch (his patch did not make it over to v1.6
or v1.7, because I was waiting for your response).

We actually do try to get consensus for these kinds of things, so
let's give George a little time to respond before backing it out.


Sure.

Regards,

Leif





Re: [OMPI devel] New ARM patch

2013-01-23 Thread Leif Lindholm
Hi Jeff,

To summarize the out-of-line assembler changes of this patch:
- The patch is functionally correct for ARMv7 (which we know, because the
  code doesn't change from the existing sources, it simply renames the file).
- It also appears to be functionally correct for ARMv6, given reports of
  people testing it. The fact that the source is a direct copy of the ARMv7
  file with the barrier operations substituted also suggests that it would
  work. However, this duplication of functionally identical code seems
  suboptimal to me.
- It *might* be functionally correct for ARMv5, although I have seen no
  reports of it actually being tested on ARMv5 - only of being tested on
  ARMv6/ARMv7 when successfully built for ARMv5.
- It is not functionally correct on ARMv4.

In addition to this, it effectively adds an additional build system layer,
by copying source files around at configuration time, on top of an already
very modular build system.

Now, the ARMv4/ARMv5 out-of-line code didn't exist at all before this, and
was only supported through the inline assembly, so this code would be useful
to keep, fix, and incorporate properly.

Basic point is - this is an insufficiently validated patch referred to as
"an ugly kludge" by the original author (Jon Masters@Red Hat), who created
it to be able to include it in the Fedora ARMv5 port. I have previously
provided suggestions for improvements, but it has still been submitted to
the Open MPI users list without any of those suggestions being acted on.

I admit to being slightly miffed with it being accepted and applied without
ever being mentioned on the Open MPI developers list - only on the users
list. A list to which I now find myself subscribed without having asked
for it or being told about it - miffed again.

If the main purpose of accepting this patch is to provide a stopgap measure
for something better, I would much prefer simply incorporating your CCASFLAGS
workaround into the configure script - removing the out-of-line asm
implementations of the atomics, but still providing a functional library for
the most common use-cases.

Something like:

Index: config/opal_config_asm.m4
===
--- config/opal_config_asm.m4   (revision 27881)
+++ config/opal_config_asm.m4   (working copy)
@@ -820,6 +820,7 @@
 ompi_cv_asm_arch="ARM"
 OPAL_ASM_SUPPORT_64BIT=0
 OPAL_ASM_ARM_VERSION=6
+CCASFLAGS+=" -march=armv7-a"
 AC_DEFINE_UNQUOTED([OPAL_ASM_ARM_VERSION],
[$OPAL_ASM_ARM_VERSION],
[What ARM assembly version to use])
 OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "="(ret)'
@@ -830,6 +831,7 @@
 ompi_cv_asm_arch="ARM"
 OPAL_ASM_SUPPORT_64BIT=0
 OPAL_ASM_ARM_VERSION=5
+CCASFLAGS+=" -march=armv7-a"
 AC_DEFINE_UNQUOTED([OPAL_ASM_ARM_VERSION],
[$OPAL_ASM_ARM_VERSION],
[What ARM assembly version to use])
 OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "="(ret)'

Then we can get on with rewriting this code properly with less urgency.

Regards,

Leif

> -Original Message-
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Jeff Squyres (jsquyres)
> Sent: 22 January 2013 16:41
> To: Open MPI Developers
> Subject: Re: [OMPI devel] New ARM patch
> 
> Leif --
> 
> We talked about this a bit on our weekly call today.
> 
> Just to be sure: are you saying that George's patches are *functionally
> correct* for ARM5/6/7 (and broken for ARM 4), but it would be better to
> organize the code a bit better?
> 
> If that is correct, was ARM4 working before?
> 
> If ARM4 was working before, how important is it?  I.e., would it be ok
> to accept George's stuff for 1.7.0, and then accept any
> improvements/reshuffle/etc. from you for 1.7.1?
> 
> 
> 
> On Jan 21, 2013, at 12:15 PM, Leif Lindholm <leif.lindh...@arm.com>
> wrote:
> 
> > Hi George,
> >
> > Any chance of r27882 being reverted?
> >
> > As I told the Fedora guys when that patch originally surfaced[1],
> > I'm not overly fond of
> > - copying source files around as part of the configure step
> > - having separate source files for ARMv6 and ARMv7, when those
> differences
> >  should be easily separated through macros (and would be reusable for
> 32-bit
> >  ARMv8).
> >
> > Also, I might have mentioned that bit only on a separate thread on
> the Fedora list, but the ARMv4 support isn't actually correct (the ASM
> uses ARMv5-only operations).
> >
> > My alternate solution, the basic idea of which I posted over there
> [2] was to separate ARMv5 and earlier from ARM. Effectively separatin

[OMPI devel] New ARM patch

2013-01-21 Thread Leif Lindholm
Hi George,

Any chance of r27882 being reverted?

As I told the Fedora guys when that patch originally surfaced[1],
I'm not overly fond of
- copying source files around as part of the configure step
- having separate source files for ARMv6 and ARMv7, when those differences
  should be easily separated through macros (and would be reusable for 32-bit
  ARMv8).

Also, I might have mentioned that bit only on a separate thread on the Fedora 
list, but the ARMv4 support isn't actually correct (the ASM uses ARMv5-only 
operations).

My alternate solution, the basic idea of which I posted over there [2], was to
separate ARMv5 and earlier from ARM, effectively separating the atomics
implementation at the boundary where the ARM architecture got
load-linked/store-conditional, rather than having a separate source file for
every architecture version.
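
The split described above can be sketched roughly as follows. This is an
illustration only, not the code from either patch: the sketch_ names are
hypothetical, the compiler-provided macro __ARM_ARCH stands in for the
OPAL_ASM_ARM_VERSION value that configure defines, and the last branch exists
only so that the example builds on non-ARM hosts. The helper address and
signature in the middle branch are taken from the Linux kernel user helpers
documentation.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 6)
/* ARMv6 and later (A-profile assumed): a single implementation built on
 * the load-linked/store-conditional pair LDREX/STREX. */
static inline int sketch_cmpset_32(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
    int32_t prev, failed;
    __asm__ __volatile__ (
        "1: ldrex   %0, [%2]      \n"  /* load current value exclusively   */
        "   cmp     %0, %3        \n"  /* does it match oldval?            */
        "   bne     2f            \n"  /* no: give up                      */
        "   strex   %1, %4, [%2]  \n"  /* yes: try to store newval         */
        "   cmp     %1, #0        \n"  /* did the exclusive store succeed? */
        "   bne     1b            \n"  /* no: retry                        */
        "2:                       \n"
        : "=&r" (prev), "=&r" (failed)
        : "r" (addr), "r" (oldval), "r" (newval)
        : "cc", "memory");
    return prev == oldval;
}
#elif defined(__arm__) && defined(__linux__)
/* ARMv5 and earlier on Linux: no LL/SC, so call the kernel-provided
 * helper at its fixed address. It returns zero when *ptr was changed. */
typedef int (sketch_kuser_cmpxchg_t)(int oldval, int newval,
                                     volatile int *ptr);
#define sketch_kuser_cmpxchg (*(sketch_kuser_cmpxchg_t *) 0xffff0fc0)

static inline int sketch_cmpset_32(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
    return sketch_kuser_cmpxchg(oldval, newval, (volatile int *) addr) == 0;
}
#else
/* Not ARM: fall back to the compiler builtin so the sketch runs anywhere. */
static inline int sketch_cmpset_32(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
    return __sync_bool_compare_and_swap(addr, oldval, newval);
}
#endif
```

With this layout, ARMv6, ARMv7 and 32-bit ARMv8 would share the first branch,
and only the barrier instruction would need a per-version macro.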

[1] https://lists.fedoraproject.org/pipermail/arm/2012-November/004434.html
[2] https://lists.fedoraproject.org/pipermail/arm/2012-November/004460.html

Best Regards,

Leif





Re: [OMPI devel] [PATCH] Open MPI on ARMv5

2012-05-04 Thread Leif Lindholm
> > It causes no problems on/for the systems supported by trunk before
> the patch went in - it just means that in some situations it doesn't
> work on/for the systems enabled by that patch.
>
> I'm having trouble parsing that.  :-)

Sorry :)

> Does this mean that we put in the patch, but it will cause no
> appreciable difference because, for target machine type X, it still
> won't work?

That's the worst case scenario, yes.

/
Leif





Re: [OMPI devel] [PATCH] Open MPI on ARMv5

2012-05-04 Thread Leif Lindholm
It causes no problems on/for the systems supported by trunk before the patch 
went in - it just means that in some situations it doesn't work on/for the 
systems enabled by that patch.

> -Original Message-
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: 04 May 2012 11:51
> To: Evan Clinton
> Cc: Peter Robinson; Leif Lindholm; Open MPI Developers
> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
>
> What I need to know in the immediate future is: does this affect the
> new patch that we just put in the 1.6rc2 tarball?
>
> http://www.open-mpi.org/community/lists/devel/2012/05/10968.php
>
> Meaning: I want to release 1.6 in the immediate future.  Is this a
> blocker?  If so, how fast can we get a fix?
>
>
> On May 2, 2012, at 6:07 PM, Evan Clinton wrote:
>
> > What I mean to say is that, as far as I can tell, in Open MPI's
> > configure stuff there's a switch based on what it detects the host
> > processor as (and this detection could be overridden by specifying
> the
> > --host= thing); this is probably not the best way to do it.
> >
> > (sorry for the double-post again, dangit)
> >
> > On Wed, May 2, 2012 at 5:52 PM, Peter Robinson <pbrobin...@gmail.com>
> wrote:
> >> On Wed, May 2, 2012 at 10:38 PM, Evan Clinton
> <evan.m.clin...@gmail.com> wrote:
> >>>> The Fedora guys are having trouble building the armv5tel variant
> (well, they did before this patch too, but... :)
> >>>>
> http://arm.koji.fedoraproject.org/koji/getfile?taskID=790343=build
> .log
> >>>
> >>> Ah, I think the problem is that the build system is not playing
> nicely
> >>> with cross-compiles (which it looks like that's doing
> (specifically,
> >>> in that case, compiling for armv5 on an armv7 box)).  I think an
> >>> immediate workaround would be to do ./configure
> >>> --host=armv5tel-unknown-linux-gnueabi or similar (in addition to
> >>> specifying the target -march).  I think you'd need to specify the
> >>> --host in a similar manner for any cross-compile of Open MPI?
> >>
> >> It's not cross compiling, it's native compile although it might be
> >> underlying armv7 device but it's running a armv5tel userspace.
> >> Ultimately for distribution compile platforms it should be paying
> >> attention to what the build system is telling it to compile for not
> >> the underlying device because it's built once and could be run on
> any
> >> number of devices.
> >>
> >> Peter
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>






Re: [OMPI devel] [PATCH] Open MPI on ARMv5

2012-05-02 Thread Leif Lindholm
Hi Evan, Jeff,

The Fedora guys are having trouble building the armv5tel variant (well, they 
did before this patch too, but... :)

It appears the lack of out-of-line assembly causes issues in that it still 
attempts to assemble the armv7 file:
http://arm.koji.fedoraproject.org/koji/getfile?taskID=790343=build.log

Evan - how are you building this?

/
Leif

> -Original Message-
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Leif Lindholm
> Sent: 30 April 2012 22:05
> To: Jeff Squyres
> Cc: Open MPI Developers; Evan Clinton
> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
>
> My only question mark was with regards to the lack of out-of-line
> assembly implementations for the older architecture versions (as in "I
> don't know whether people would care about that or not").
>
> It does apply cleanly-ish (non-interactively) on top of 1.5.5, but I
> don't know if any further drops off 1.5 are planned?
>
> /
> Leif
> 
> From: Jeff Squyres [jsquy...@cisco.com]
> Sent: 30 April 2012 21:19
> To: Leif Lindholm
> Cc: Evan Clinton; Open MPI Developers
> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
>
> I'm good with it as long as you guys are.
>
> Re the "...for now" comment; does this imply that you're going to do
> more later?
>
> BTW, I assume this is for trunk and the v1.6 branch, right?
>
>
> On Apr 30, 2012, at 9:47 AM, Leif Lindholm wrote:
>
> > Thanks,
> >
> > I'm happy for this to go in - Jeff?
> >
> > /
> >Leif
> >
> >> -Original Message-
> >> From: nave.notn...@gmail.com [mailto:nave.notn...@gmail.com] On
> Behalf
> >> Of Evan Clinton
> >> Sent: 30 April 2012 05:12
> >> To: Leif Lindholm
> >> Cc: Open MPI Developers; Jeffrey Squyres
> >> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
> >>
> >> Thanks again for the comments.
> >>
> >>>> Quote the documentation, __kuser_cmpxchg "already includes memory
> >>>> barriers as needed," so the code using it shouldn't need anything
> >>>> extra.
> >>>
> >>> Fair enough - but could you put a comment to this effect into the
> >> patch?
> >> Comment added.
> >>
> >>> But I'm still not too happy about even the very unlikely risk of
> >>> something executing "random stuff" depending on kernel version.
> >>> For now, could you update the comments to that effect that:
> >>>
> >>> "These kernel functions are available on kernel versions 2.6.15 and
> >>> greater" + ", and running this software on earlier versions will
> >> result
> >>> in undefined behaviour."
> >> Comment added.
> >>
> >>> What I'm suggesting is not to parse information out of the build
> >> host,
> >>> but rather using the configured toolchain and options and try to
> >>> assemble the 64-bit atomic instructions. This would make it easy to
> >>> rebuild a generic ARMv6 package to support 64-bit atomics by just
> >> adding
> >>> -march=armv6zk to CFLAGS.
> >> Ah, I get it.  I may see if I can come up with a nice test in the
> near
> >> future.
> >>
> >> For now, revised patch attached.
> >
> >
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] [PATCH] Open MPI on ARMv5

2012-04-30 Thread Leif Lindholm
My only question mark was with regards to the lack of out-of-line assembly 
implementations for the older architecture versions (as in "I don't know 
whether people would care about that or not").

It does apply cleanly-ish (non-interactively) on top of 1.5.5, but I don't know 
if any further drops off 1.5 are planned?

/
Leif

From: Jeff Squyres [jsquy...@cisco.com]
Sent: 30 April 2012 21:19
To: Leif Lindholm
Cc: Evan Clinton; Open MPI Developers
Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5

I'm good with it as long as you guys are.

Re the "...for now" comment; does this imply that you're going to do more later?

BTW, I assume this is for trunk and the v1.6 branch, right?


On Apr 30, 2012, at 9:47 AM, Leif Lindholm wrote:

> Thanks,
>
> I'm happy for this to go in - Jeff?
>
> /
>Leif
>
>> -Original Message-
>> From: nave.notn...@gmail.com [mailto:nave.notn...@gmail.com] On Behalf
>> Of Evan Clinton
>> Sent: 30 April 2012 05:12
>> To: Leif Lindholm
>> Cc: Open MPI Developers; Jeffrey Squyres
>> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
>>
>> Thanks again for the comments.
>>
>>>> Quote the documentation, __kuser_cmpxchg "already includes memory
>>>> barriers as needed," so the code using it shouldn't need anything
>>>> extra.
>>>
>>> Fair enough - but could you put a comment to this effect into the
>> patch?
>> Comment added.
>>
>>> But I'm still not too happy about even the very unlikely risk of
>>> something executing "random stuff" depending on kernel version.
>>> For now, could you update the comments to that effect that:
>>>
>>> "These kernel functions are available on kernel versions 2.6.15 and
>>> greater" + ", and running this software on earlier versions will
>> result
>>> in undefined behaviour."
>> Comment added.
>>
>>> What I'm suggesting is not to parse information out of the build
>> host,
>>> but rather using the configured toolchain and options and try to
>>> assemble the 64-bit atomic instructions. This would make it easy to
>>> rebuild a generic ARMv6 package to support 64-bit atomics by just
>> adding
>>> -march=armv6zk to CFLAGS.
>> Ah, I get it.  I may see if I can come up with a nice test in the near
>> future.
>>
>> For now, revised patch attached.
>
>


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/







Re: [OMPI devel] [PATCH] Open MPI on ARMv5

2012-04-24 Thread Leif Lindholm
Hi Evan,

And just to add to the delay, I've been off sick...

First of all - thanks for the patch!

The patch adds support for currently unsupported platforms, without actually
changing any code paths for currently supported platforms. So from that
perspective, I would not object strongly to it being applied as is.

However, I do have a few minor comments on the code:
- The v5 code doesn't actually make use of the kuser helper barriers
  in its versions of opal_atomic_cmpset_acq/rel.
- The line
  +#if (OMPI_GCC_INLINE_ASSEMBLY || (OPAL_ASM_ARM_VERSION < 6))
  is correct, but does my head in. Could another define like
  OPAL_ARM_KUSER_BARRIERS or similar be added by the barrier definition
  and used here instead, to improve readability?
- Could you change the line
  +#if (OPAL_ASM_ARM_VERSION >= 6 && OMPI_GCC_INLINE_ASSEMBLY)
  to
  +#if (OMPI_GCC_INLINE_ASSEMBLY && (OPAL_ASM_ARM_VERSION >= 6))
  again for readability?

And a few higher-level comments/suggestions:
- The patch does not do a runtime test for kernel helper version. While
  normally not a problem, this could cause tricky-to-debug issues if
  running on top of old kernels (as in "executing uninitialized
  memory" tricky).
  Not sure what the best way to hook such a test in would be though.
- This patch does not include out-of-line assembly versions
  (in opal/asm/base) for the atomic operations. However I am not sure
  if this causes any practical problems.
- If 64-bit atomics are desirable, these are actually possible on most
  ARMv6 platforms (including the Raspberry PI), so a configure test on
  whether LDREXD assembles without errors could be used to detect this.
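
One possible shape for the runtime test mentioned in the first point is
sketched below. It assumes the helper version word that the kernel user
helpers documentation places at a fixed address; the sketch_ names are
hypothetical, and the minimum versions (2 for __kuser_cmpxchg, 3 for
__kuser_memory_barrier) are an assumption based on that document and should
be checked against the kernel tree being targeted. Where such a check would
be hooked into Open MPI is left open, as it is in the text above.

```c
#include <stdint.h>

/* Assumed minimum helper versions, per kernel_user_helpers.txt. */
#define SKETCH_KUSER_MIN_CMPXCHG 2
#define SKETCH_KUSER_MIN_BARRIER 3

/* Returns the helper version reported by the kernel on 32-bit ARM Linux,
 * or -1 on platforms where the fixed-address helper page does not exist. */
static inline int32_t sketch_kuser_helper_version(void)
{
#if defined(__arm__) && defined(__linux__)
    return *(volatile const int32_t *) 0xffff0ffc;
#else
    return -1;
#endif
}

/* Nonzero if both helpers used by the ARMv5 code path are available for
 * the given helper version. */
static inline int sketch_kuser_helpers_usable(int32_t version)
{
    return version >= SKETCH_KUSER_MIN_CMPXCHG
        && version >= SKETCH_KUSER_MIN_BARRIER;
}
```

A caller could evaluate
sketch_kuser_helpers_usable(sketch_kuser_helper_version()) once at startup and
refuse to continue when it returns zero, instead of jumping to a helper
address that the running kernel never populated.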

Longer-term, I'd like to migrate to using the new GCC __atomic* intrinsics
(in 4.7 - http://gcc.gnu.org/wiki/Atomic/GCCMM). However, the old __sync*
intrinsics were a bit heavyweight, so until 4.7 is ubiquitous I prefer to keep
the inline asm, and now the kuser calls.
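
For comparison, a compare-and-set built on the GCC 4.7 __atomic builtins might
look like the sketch below. The function names are hypothetical and only
mirror the acq/rel naming used above; this is not what the patch does. The
point of the newer builtins is that the memory ordering is stated per
operation, whereas the __sync* family always implies a full barrier.

```c
#include <stdint.h>

/* Acquire-ordered 32-bit compare-and-set. Returns 1 if *addr held oldval
 * and was replaced by newval, 0 otherwise. */
static inline int sketch_cmpset_acq_32(volatile int32_t *addr,
                                       int32_t oldval, int32_t newval)
{
    return __atomic_compare_exchange_n(addr, &oldval, newval,
                                       0 /* strong variant */,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Release-ordered variant of the same operation. */
static inline int sketch_cmpset_rel_32(volatile int32_t *addr,
                                       int32_t oldval, int32_t newval)
{
    return __atomic_compare_exchange_n(addr, &oldval, newval,
                                       0 /* strong variant */,
                                       __ATOMIC_RELEASE, __ATOMIC_RELAXED);
}
```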

/
Leif

> -Original Message-
> From: Jeffrey Squyres [mailto:jsquy...@cisco.com]
> Sent: 19 April 2012 16:21
> To: Open MPI Developers; Leif Lindholm
> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
> 
> Thanks Evan!
> 
> (sorry for the delay in replying -- I was on vacation all last week and
> I'm *still* catching up...)
> 
> Leif -- does this look good to you?
> 
> 
> On Apr 13, 2012, at 11:13 PM, Evan Clinton wrote:
> 
> > At present Open MPI only supports ARMv7 processors.  Attached is a
> > patch against current trunk (r26270) that extends the atomic
> > operations and memory barriers code to work with ARMv5 and ARMv6
> ones,
> > too.
> >
> > For v6, the only changes were to use "mcr p15, 0, r0, c7, c10, 5"
> > instead of the unavailable DMB instruction, and to disable the 64 bit
> > compare-exchange function (which I understand is not vital for Open
> > MPI on 32 bit platforms?).  For v5, it was a bit trickier; the
> > processor lacks nice memory barrier instructions or proper atomic
> > operations.  Fortunately, the Linux kernel offers several helper
> > functions on ARM, and I've used those here.
> >
> > The changes build and pass all of the assembly-related tests in the
> > test folder and the hello world examples run on my "armv5tel" box
> > running Debian with Linux 2.6.32-5.  It should also run fine on ARMv6
> > boxes, and presumably v4, but I don't have either to test on.
> >
> > Documentation for the Linux kernel helper functions:
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=Documentation/arm/kernel_user_helpers.txt
> >
> > I've sent in a contributor agreement so there should be no IP
> problems.
> >
> > Hopefully this is useful,
> > Evan Clinton
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 






Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu arm6l

2012-02-28 Thread Leif Lindholm
We'd need a few ifdefs, effectively.

One on the dmb/mcr and one on the 64-bit, depending on v6k or higher.
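
A rough sketch of the second ifdef (the 64-bit one) follows. It relies on
architecture macros that GCC predefines from -march (__ARM_ARCH_6K__,
__ARM_ARCH_6ZK__, __ARM_ARCH_6KZ__, __ARM_ARCH); the SKETCH_/sketch_ names are
hypothetical and A-profile cores are assumed. To keep the example short and
buildable on any host, the body uses a compiler builtin where a real ARM
implementation would carry an LDREXD/STREXD loop.

```c
#include <stdint.h>

#if defined(__arm__) && \
    (defined(__ARM_ARCH_6K__) || defined(__ARM_ARCH_6ZK__) || \
     defined(__ARM_ARCH_6KZ__) || \
     (defined(__ARM_ARCH) && (__ARM_ARCH >= 7)))
#define SKETCH_HAVE_64BIT_CMPSET 1  /* ARMv6K or later: LDREXD/STREXD exist */
#elif defined(__arm__)
#define SKETCH_HAVE_64BIT_CMPSET 0  /* base ARMv6 or older: leave it out    */
#else
#define SKETCH_HAVE_64BIT_CMPSET 1  /* non-ARM host: builtin is sufficient  */
#endif

#if SKETCH_HAVE_64BIT_CMPSET
/* Returns 1 if *addr held oldval and was replaced by newval, 0 otherwise. */
static inline int sketch_cmpset_64(volatile int64_t *addr,
                                   int64_t oldval, int64_t newval)
{
    return __sync_bool_compare_and_swap(addr, oldval, newval);
}
#endif
```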

This would provide ARMv6 support only though - ARMv5 or earlier (like debian 
"armel") will still miss out.

> -Original Message-
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Jeffrey Squyres
> Sent: 28 February 2012 14:10
> To: Leif Lindholm
> Cc: Open MPI Developers; Ron Broberg
> Subject: Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu
> arm6l
>
> Are there any changes we need to make to OMPI?
>
>
> On Feb 28, 2012, at 7:50 AM, Leif Lindholm wrote:
>
> > Hi Ron,
> >
> > Excellent work! Indeed - simply dropping the DMBs can lead to memory
> consistency issues even on ARMv6.
> >
> > The architectural semantics for memory barriers exist in ARMv6 though
> - they just weren't given dedicated mnemonics.
> > What you could do is to simply replace the inline "dmb" sequences
> with inline cp15 operations:
> > - "MCR p15, 0, r0, c7, c10, 5"
> >  (the 'r0' is an encoding artefact and doesn't affect the register
> >  contents)
> >
> > LDREXD/STREXD weren't part of the ARMv6 base architecture, although
> they are supported by the 1176 which is used in the Raspberry PI. If
> your tools support detecting/building for extension subarchitecture
> ARMv6k (supported by 1176), you can actually keep the 64-bit atomics
> in.
> >
> > Best Regards,
> >
> > Leif
> >
> > References:
> >
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/Babfdddg.html
> >
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/Babhejba.html
> >
> >> -Original Message-
> >> From: Jeffrey Squyres [mailto:jsquy...@cisco.com]
> >> Sent: 28 February 2012 12:30
> >> To: Ron Broberg; Open MPI Developers
> >> Cc: Leif Lindholm
> >> Subject: Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu
> >> arm6l
> >>
> >> Ron -- Many thanks!
> >>
> >> Leif -- can you comment on this?  (yes, I'm passing the buck to our
> ARM
> >> Open MPI representative :-) )
> >>
> >>
> >> On Feb 26, 2012, at 1:22 PM, Ron Broberg wrote:
> >>
> >>> I would like to report the following information regarding
> compiling
> >> OpenMPI on Debian ARMv6. I won't submit this as a patch because I
> don't
> >> believe that "delete all 'dmb' instructions" can be considered a
> well
> >> developed patch. But this information may be of use to someone down
> the
> >> line.
> >>>
> >>> I was able to compile the upstream openmpi-1.5.4 distribution on a
> >> Debian6 armv6l qemu emulation.
> >>> http://www.open-mpi.org/software/ompi/v1.4/downloads/openmpi-
> >> 1.4.5.tar.bz2
> >>>
> >>>
> >>> You have to make 3 changes to the package
> >>> 1) Delete all references to the RISC instruction 'dmb'
> >>> 2) Modify the 'configure' file to include an 'armv6' option
> >>> 3) Compile with CFLAGS=-march=armv6
> >>>
> >>> ## 1) make the following edits to these three files
> >>> ./opal/asm/generated/atomic-local.s
> >>>delete all dmb instructions
> >>> ./opal/asm/base/ARM.asm
> >>>delete all dmb instructions
> >>> ./opal/include/opal/sys/arm/atomic.h
> >>>change the lines:
> >>> #if OPAL_WANT_SMP_LOCKS
> >>> #define MB()  __asm__ __volatile__ ("dmb" : : : "memory")
> >>> #define RMB() __asm__ __volatile__ ("dmb" : : : "memory")
> >>> #define WMB() __asm__ __volatile__ ("dmb" : : : "memory")
> >>> #else
> >>> #define MB()
> >>> #define RMB()
> >>> #define WMB()
> >>> #endif
> >>>
> >>> to read:
> >>> #define MB()
> >>> #define RMB()
> >>> #define WMB()
> >>>
> >>> ## 2) add the following to the 'configure' file at line 26946 of
> >> 171183
> >>> goto line 26946, there should be an 'alpha-' section above and an
> >> 'armv7' below
> >>> insert the following
> >>>armv6*)
> >>>ompi_cv_asm_arch="ARM"
> >>>OPAL_ASM_SUPPORT_64BIT=0
> >>>OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "="(ret)'
> >>>;;
> >>>
> >>> ## 3) co

Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu arm6l

2012-02-28 Thread Leif Lindholm
Hi Ron,

Excellent work! Indeed - simply dropping the DMBs can lead to memory 
consistency issues even on ARMv6.

The architectural semantics for memory barriers exist in ARMv6 though - they 
just weren't given dedicated mnemonics.
What you could do is to simply replace the inline "dmb" sequences with inline 
cp15 operations:
- "MCR p15, 0, r0, c7, c10, 5"
  (the 'r0' is an encoding artefact and doesn't affect the register
  contents)
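
A minimal sketch of that substitution, assuming the compiler-provided
__ARM_ARCH macro is used to pick the variant (the actual sources key off a
configure-time define instead) and using hypothetical SKETCH_/sketch_ names.
The last branch is there only so that the example builds on other hosts.

```c
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
/* ARMv7 and later: dedicated mnemonic. */
#define SKETCH_MB() __asm__ __volatile__ ("dmb" : : : "memory")
#elif defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH == 6)
/* ARMv6: the same barrier semantics through the CP15 operation. The
 * register operand is not modified by it; zero is passed here. */
#define SKETCH_MB() \
    __asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 5" : : "r" (0) : "memory")
#else
/* Anything else, including non-ARM hosts: GCC full-barrier builtin. */
#define SKETCH_MB() __sync_synchronize()
#endif

/* Publish a value: write the payload, issue the barrier, then set the
 * flag, so a reader that observes the flag also observes the payload. */
static inline void sketch_publish(volatile int *payload, volatile int *flag,
                                  int value)
{
    *payload = value;
    SKETCH_MB();
    *flag = 1;
}
```

RMB() and WMB() could be defined the same way, so the inline and out-of-line
code would keep their barriers on ARMv6 rather than losing them.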

LDREXD/STREXD weren't part of the ARMv6 base architecture, although they are 
supported by the ARM1176, which is used in the Raspberry Pi. If your tools support 
detecting/building for the ARMv6K extension subarchitecture (supported by the ARM1176), 
you can actually keep the 64-bit atomics in.
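
As a rough illustration of that decision, assuming the architecture is known as a GCC-style -march string. The function name is hypothetical, and the list is deliberately conservative: any name it does not recognise is treated as lacking LDREXD/STREXD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the named architecture provides LDREXD/STREXD, so the
 * 64-bit atomics can stay enabled, and 0 otherwise.
 */
static int sketch_arm_has_64bit_exclusives(const char *march)
{
    /* -march spellings assumed to include the doubleword exclusives. */
    static const char *const with_ldrexd[] = {
        "armv6k", "armv6zk", "armv6kz", "armv7-a"
    };
    size_t i;

    assert(march != NULL);
    for (i = 0; i < sizeof(with_ldrexd) / sizeof(with_ldrexd[0]); i++) {
        if (strcmp(march, with_ldrexd[i]) == 0) {
            return 1;
        }
    }
    return 0; /* base ARMv6 and anything unrecognised */
}
```

The result corresponds to the value one would give OPAL_ASM_SUPPORT_64BIT in the quoted configure fragment: 0 for plain armv6, 1 when building for armv6k.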

Best Regards,

Leif

References:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/Babfdddg.html
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0301h/Babhejba.html

> -----Original Message-----
> From: Jeffrey Squyres [mailto:jsquy...@cisco.com]
> Sent: 28 February 2012 12:30
> To: Ron Broberg; Open MPI Developers
> Cc: Leif Lindholm
> Subject: Re: [OMPI devel] Compiling OpenMPI 1.5.4 on Debian 6 qemu arm6l
>
> Ron -- Many thanks!
>
> Leif -- can you comment on this?  (yes, I'm passing the buck to our ARM
> Open MPI representative :-) )
>
>
> On Feb 26, 2012, at 1:22 PM, Ron Broberg wrote:
>
> > I would like to report the following information regarding compiling
> > OpenMPI on Debian ARMv6. I won't submit this as a patch because I don't
> > believe that "delete all 'dmb' instructions" can be considered a well
> > developed patch. But this information may be of use to someone down the
> > line.
> >
> > I was able to compile the upstream openmpi-1.5.4 distribution on a
> > Debian6 armv6l qemu emulation.
> > http://www.open-mpi.org/software/ompi/v1.4/downloads/openmpi-1.4.5.tar.bz2
> >
> >
> > You have to make 3 changes to the package
> >  1) Delete all references to the RISC instruction 'dmb'
> >  2) Modify the 'configure' file to include an 'armv6' option
> >  3) Compile with CFLAGS=-march=armv6
> >
> > ## 1) make the following edits to these three files
> > ./opal/asm/generated/atomic-local.s
> > delete all dmb instructions
> > ./opal/asm/base/ARM.asm
> > delete all dmb instructions
> > ./opal/include/opal/sys/arm/atomic.h
> > change the lines:
> > #if OPAL_WANT_SMP_LOCKS
> > #define MB()  __asm__ __volatile__ ("dmb" : : : "memory")
> > #define RMB() __asm__ __volatile__ ("dmb" : : : "memory")
> > #define WMB() __asm__ __volatile__ ("dmb" : : : "memory")
> > #else
> > #define MB()
> > #define RMB()
> > #define WMB()
> > #endif
> >
> > to read:
> > #define MB()
> > #define RMB()
> > #define WMB()
> >
> > ## 2) add the following to the 'configure' file at line 26946 of 171183
> > goto line 26946, there should be an 'alpha-' section above and an 'armv7' below
> > insert the following
> > armv6*)
> > ompi_cv_asm_arch="ARM"
> > OPAL_ASM_SUPPORT_64BIT=0
> > OMPI_GCC_INLINE_ASSIGN='"mov %0, #0" : "=&r"(ret)'
> > ;;
> >
> > ## 3) compile and install with the following CFLAGS
> > CFLAGS=-march=armv6
> > ./configure CFLAGS=-march=armv6
> > make
> > sudo make install
> >
> > more information about my build at
> > http://rhinohide.wordpress.com/2012/02/26/openmpi-on-raspberry-pi/
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>


-- IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium.  Thank you.




[OMPI devel] [patch] add explicit IT instructions in ARM assembly

2011-08-02 Thread Leif Lindholm
In the Linaro 11.05 Linux metadistribution, and hence most likely in 
Ubuntu 11.04, the GNU assembler no longer defaults to 
-mimplicit-it=thumb, causing the build to fail when compiling on this 
platform.


The attached trivial patch adds explicit IT instructions where required, 
permitting a successful build. Patch created against r24936, but 
validated also against subsequent ones up to and including r24961.
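
For readers unfamiliar with the construct, here is a plain C model of the sequence the patch touches (cmp, it eq, cmpeq, bne). It is only a sketch of the semantics, with hypothetical function names. In the real code the second comparison is a conditional instruction, and in Thumb-2 the IT instruction is what makes it conditional.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: cmp r4, r2 ; it eq ; cmpeq r5, r3 ; bne <fail> */
static int sketch_pair_equal(uint32_t lo, uint32_t hi,
                             uint32_t exp_lo, uint32_t exp_hi)
{
    int z = (lo == exp_lo);   /* cmp   r4, r2: sets the Z flag         */
    if (z) {                  /* it eq: next instruction runs only if Z */
        z = (hi == exp_hi);   /* cmpeq r5, r3: updates the Z flag       */
    }
    return z;                 /* bne is taken when this is 0            */
}

/* Usage: the two-step test agrees with a single 64-bit comparison. */
static void sketch_it_self_check(void)
{
    const uint64_t a = 0x0000000100000002ULL;
    const uint64_t b = 0x0000000100000003ULL;

    assert(sketch_pair_equal((uint32_t)a, (uint32_t)(a >> 32),
                             (uint32_t)a, (uint32_t)(a >> 32)) == (a == a));
    assert(sketch_pair_equal((uint32_t)a, (uint32_t)(a >> 32),
                             (uint32_t)b, (uint32_t)(b >> 32)) == (a == b));
}
```

When the low words differ, the high-word comparison is skipped and the flags from the first cmp decide the branch.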


/
Leif

p.s.
In case someone is curious about what I'm talking about above, there is 
a pretty decent description of the IT instruction here:

http://infocenter.arm.com/help/topic/com.arm.doc.dui0489d/Cjabicci.html

Index: opal/asm/base/ARM.asm
===
--- opal/asm/base/ARM.asm	(revision 24936)
+++ opal/asm/base/ARM.asm	(working copy)
@@ -73,6 +73,7 @@
    LSYM(7)
    ldrexd  r4, r5, [r0]
    cmp r4, r2
+   it  eq
    cmpeq   r5, r3
    bne REFLSYM(8)
    strexd  r1, r6, r7, [r0]
@@ -91,6 +92,7 @@
    LSYM(9)
    ldrexd  r4, r5, [r0]
    cmp r4, r2
+   it  eq
    cmpeq   r5, r3
    bne REFLSYM(10)
    strexd  r1, r6, r7, [r0]
@@ -111,6 +113,7 @@
    LSYM(11)
    ldrexd  r4, r5, [r0]
    cmp r4, r2
+   it  eq
    cmpeq   r5, r3
    bne REFLSYM(12)
    dmb
Index: opal/include/opal/sys/arm/atomic.h
===
--- opal/include/opal/sys/arm/atomic.h	(revision 24936)
+++ opal/include/opal/sys/arm/atomic.h	(working copy)
@@ -142,6 +142,7 @@
__asm__ __volatile__ (
  "1:  ldrexd  %0, %H0, [%2]   \n"
  "cmp %0, %3  \n"
+ "it  eq  \n"
  "cmpeq   %H0, %H3\n"
  "bne 2f  \n"
  "strexd  %1, %4, %H4, [%2]   \n"

Re: [OMPI devel] [Patch] Add support for ARMv7-A architecture

2011-01-07 Thread Leif Lindholm
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Jeff Squyres
> 
> On Jan 7, 2011, at 12:39 PM, Leif Lindholm wrote:
 
> > Could I submit it as an attachment instead?
> 
> Please do.  We usually use attachments here for exactly this reason.

Please find it attached.

Regards,

Leif

armv7-a.patch
Description: Binary data


Re: [OMPI devel] [Patch] Add support for ARMv7-A architecture

2011-01-07 Thread Leif Lindholm
> -----Original Message-----
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Jeff Squyres
>
> Many thanks for this contribution!

You're welcome :)

> A few points:
>
> 1. This is a lengthy contribution; it's a bit more than a "trivial" patch
> that we could include in the mainline without worrying about intellectual
> property.  :-(  Can you officially release this code under the BSD license,
> and/or sign the Open MPI 3rd party contribution agreement?  (I know that
> this is a major hassle, but we have to do it :-( )

This has been resolved off-list, agreement is in place.

> 2. Where did you generate this patch from?  Based on your patch filenames,
> I tried to apply it to the SVN trunk@24210 but failed:

It was meant to be against whatever was the trunk on 24 Dec.
I tried applying it to a clean checkout today (svnversion on the result says 
24210M), and I had no failures here.
Is it possible my email client is somehow corrupting the patch file, and 
de-corrupting it on input (I copy-pasted my patch from the original email when 
trying today)?

Could I submit it as an attachment instead?

Regards,

Leif





[OMPI devel] [Patch] Add support for ARMv7-A architecture

2010-12-24 Thread Leif Lindholm
Hi,

The following patch adds support for the ARMv7-A architecture to opal.
This includes current processors such as Cortex-A8 and Cortex-A9, as
well as upcoming Cortex-A5 and Cortex-A15.

It has been validated on Ubuntu Lucid (10.04) and Maverick (10.10),
although the former might require some package updates to build from
checkout.

The opal/include/opal/sys/arm directory was cloned from powerpc.

I apologise for what I had to do to generate-asm.pl to get it to build.
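
The generate-asm.pl hunk below switches the section type from @progbits to %progbits for ARM. The GNU assembler for ARM treats '@' as a comment character, so the '%' spelling is needed there. A small C sketch of the same selection logic follows. The function name is hypothetical, and "ARM" is the architecture string used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the .note.GNU-stack directive for the given architecture name
 * into buf and returns snprintf's result (the length that was needed).
 */
static int sketch_gnu_stack_directive(const char *asmarch,
                                      char *buf, size_t len)
{
    char prefix;

    assert(asmarch != NULL && buf != NULL);
    /* '@' starts a comment in ARM assembly, so ARM uses '%' instead. */
    prefix = (strcmp(asmarch, "ARM") == 0) ? '%' : '@';
    return snprintf(buf, len,
                    "\n\t.section\t.note.GNU-stack,\"\",%cprogbits\n",
                    prefix);
}
```

Any architecture string other than "ARM" keeps the existing @progbits form, matching the else branch of the patch.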

Signed-off-by: leif.lindh...@arm.com

Index: ompi-trunk/opal/asm/generate-asm.pl
===
--- ompi-trunk/opal/asm/generate-asm.pl (revision 24191)
+++ ompi-trunk/opal/asm/generate-asm.pl (working copy)
@@ -103,7 +103,11 @@
 }

 if ($GNU_STACK == 1) {
-print OUTPUT "\n\t.section\t.note.GNU-stack,\"\",\@progbits\n";
+if ($asmarch eq "ARM") {
+print OUTPUT "\n\t.section\t.note.GNU-stack,\"\",\%progbits\n";
+} else {
+print OUTPUT "\n\t.section\t.note.GNU-stack,\"\",\@progbits\n";
+}
 }

 close(INPUT);
Index: ompi-trunk/opal/asm/asm-data.txt
===
--- ompi-trunk/opal/asm/asm-data.txt (revision 24191)
+++ ompi-trunk/opal/asm/asm-data.txt (working copy)
@@ -48,6 +48,15 @@

 ##
 #
+# ARM (ARMv7 and later)
+#
+##
+
+ARM    default-.text-.globl-:--.L-#-1-1-1-1-1  arm-linux
+
+
+##
+#
 # Intel Pentium Class
 #
 ##
Index: ompi-trunk/opal/asm/base/ARM.asm
===
--- ompi-trunk/opal/asm/base/ARM.asm (revision 0)
+++ ompi-trunk/opal/asm/base/ARM.asm (revision 0)
@@ -0,0 +1,150 @@
+START_FILE
+   TEXT
+
+   ALIGN(4)
+START_FUNC(opal_atomic_mb)
+   dmb
+   bx  lr
+END_FUNC(opal_atomic_mb)
+
+
+START_FUNC(opal_atomic_rmb)
+   dmb
+   bx  lr
+END_FUNC(opal_atomic_rmb)
+
+
+START_FUNC(opal_atomic_wmb)
+   dmb
+   bx  lr
+END_FUNC(opal_atomic_wmb)
+
+
+START_FUNC(opal_atomic_cmpset_32)
+   LSYM(1)
+   ldrex   r3, [r0]
+   cmp r1, r3
+   bne REFLSYM(2)
+   strex   r12, r2, [r0]
+   cmp r12, #0
+   bne REFLSYM(1)
+   mov r0, #1
+   LSYM(2)
+   movne   r0, #0
+   bx  lr
+END_FUNC(opal_atomic_cmpset_32)
+
+
+START_FUNC(opal_atomic_cmpset_acq_32)
+   LSYM(3)
+   ldrex   r3, [r0]
+   cmp r1, r3
+   bne REFLSYM(4)
+   strex   r12, r2, [r0]
+   cmp r12, #0
+   bne REFLSYM(3)
+   dmb
+   mov r0, #1
+   LSYM(4)
+   movne   r0, #0
+   bx  lr
+END_FUNC(opal_atomic_cmpset_acq_32)
+
+
+START_FUNC(opal_atomic_cmpset_rel_32)
+   LSYM(5)
+   ldrex   r3, [r0]
+   cmp r1, r3
+   bne REFLSYM(6)
+   dmb
+   strex   r12, r2, [r0]
+   cmp r12, #0
+   bne REFLSYM(5)
+   mov r0, #1
+   LSYM(6)
+   movne   r0, #0
+   bx  lr
+END_FUNC(opal_atomic_cmpset_rel_32)
+
+#START_64BIT
+START_FUNC(opal_atomic_cmpset_64)
+   push    {r4-r7}
+   ldrd    r6, r7, [sp, #16]
+   LSYM(7)
+   ldrexd  r4, r5, [r0]
+   cmp r4, r2
+   cmpeq   r5, r3
+   bne REFLSYM(8)
+   strexd  r1, r6, r7, [r0]
+   cmp r1, #0
+   bne REFLSYM(7)
+   mov r0, #1
+   LSYM(8)
+   movne   r0, #0
+   pop {r4-r7}
+   bx  lr
+END_FUNC(opal_atomic_cmpset_64)
+
+START_FUNC(opal_atomic_cmpset_acq_64)
+   push    {r4-r7}
+   ldrd    r6, r7, [sp, #16]
+   LSYM(9)
+   ldrexd  r4, r5, [r0]
+   cmp r4, r2
+   cmpeq   r5, r3
+   bne REFLSYM(10)
+   strexd  r1, r6, r7, [r0]
+   cmp r1, #0
+   bne REFLSYM(9)
+   dmb
+   mov r0, #1
+   LSYM(10)
+   movne   r0, #0
+   pop {r4-r7}
+   bx  lr
+END_FUNC(opal_atomic_cmpset_acq_64)
+
+
+START_FUNC(opal_atomic_cmpset_rel_64)
+   push    {r4-r7}
+   ldrd    r6, r7, [sp, #16]
+   LSYM(11)
+   ldrexd  r4, r5, [r0]
+   cmp r4, r2
+   cmpeq   r5, r3
+   bne REFLSYM(12)
+   dmb
+   strexd  r1, r6, r7, [r0]
+   cmp r1, #0
+   bne REFLSYM(11)
+   mov r0, #1
+   LSYM(12)
+   movne   r0, #0
+   pop {r4-r7}
+   bx  lr
+END_FUNC(opal_atomic_cmpset_rel_64)
+#END_64BIT
+
+
+START_FUNC(opal_atomic_add_32)
+   LSYM(13)
+   ldrex   r2, [r0]
+   add r2, r2, r1
+   strex   r3, r2, [r0]
+   cmp r3, #0
+   bne REFLSYM(13)
+   mov r0, r2
+   bx  lr
+END_FUNC(opal_atomic_add_32)
+
+
+START_FUNC(opal_atomic_sub_32)
+   LSYM(14)
+   ldrex   r2, [r0]
+