RE: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-10-13 Thread Travis Vitek

Liviu,

I built the library and tests before and after your change with the 64-bit 
flag, and I saw no differences in the number of failed tests between the 
builds. I've attached the output of 'gmake -k runall' before and after your 
change to STDCXX-1072 in case you want to look over them.

I wrote up the following test case to prove that I could reproduce the issue 
using the compile/link options used to build the stdcxx tests. The code is 
below..

  [vitek@andromeda] 138 % cat t.cpp
  #include thread.h
  #include synch.h
  #include malloc.h

  int main ()
  {
int *ip;
mutex_t *mp;

ip = (int*)malloc(sizeof (int) + sizeof (mutex_t));
mp = (mutex_t*)(ip + 1);

mutex_init(mp, USYNC_THREAD | LOCK_ROBUST, 0);
mutex_lock(mp);
mutex_unlock(mp);
mutex_destroy(mp);

free(ip);
  }
  [vitek@andromeda] 139 % gmake t
  CC -c -D_RWSTDDEBUG -mt -I/amd/homes/vitek/tmp/stdcxx-4.2.x/include \
-I/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/include \
-I/amd/homes/vitek/tmp/stdcxx-4.2.x/tests/include \
-library=%none -g  -m64 +w -errtags -erroff=hidef   t.cpp
  CC t.o -o t -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/rwtest \
-lrwtest15S -library=%none -mt -m64 \
-L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/lib \
-lstd15S  -lm
  Bus Error (core dumped)
  [vitek@andromeda] 140 %

I also tested with POSIX mutexes and saw the same behavior.

Travis

 From: Liviu Nicoara 
 Sent: Thursday, October 11, 2012 5:28 AM
 To: dev@stdcxx.apache.org
 Subject: Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements
 
 I applied the patch on 4.2.x. If someone with access to a SPARC machine 
 could give it a runall and post the results here it would be awesome. I 
 will postpone closing the incident until then.
 
 Thanks!
 
 Liviu




Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-10-13 Thread Liviu Nicoara

Thanks, Travis!

On 10/13/12 15:36, Travis Vitek wrote:


Liviu,

I built the library and tests before and after your change with the 64-bit 
flag, and I saw no differences in the number of failed tests between the 
builds. I've attached the output of 'gmake -k runall' before and after your 
change to STDCXX-1072 in case you want to look over them.

I wrote up the following test case to prove that I could reproduce the issue 
using the compile/link options used to build the stdcxx tests. The code is 
below..

   [vitek@andromeda] 138 % cat t.cpp
   #include thread.h
   #include synch.h
   #include malloc.h

   int main ()
   {
 int *ip;
 mutex_t *mp;

 ip = (int*)malloc(sizeof (int) + sizeof (mutex_t));
 mp = (mutex_t*)(ip + 1);

 mutex_init(mp, USYNC_THREAD | LOCK_ROBUST, 0);
 mutex_lock(mp);
 mutex_unlock(mp);
 mutex_destroy(mp);

 free(ip);
   }
   [vitek@andromeda] 139 % gmake t
   CC -c -D_RWSTDDEBUG -mt -I/amd/homes/vitek/tmp/stdcxx-4.2.x/include \
 -I/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/include \
 -I/amd/homes/vitek/tmp/stdcxx-4.2.x/tests/include \
 -library=%none -g  -m64 +w -errtags -erroff=hidef   t.cpp
   CC t.o -o t -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/rwtest 
\
 -lrwtest15S -library=%none -mt -m64 \
 -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/lib \
 -lstd15S  -lm
   Bus Error (core dumped)
   [vitek@andromeda] 140 %

I also tested with POSIX mutexes and saw the same behavior.

Travis


From: Liviu Nicoara
Sent: Thursday, October 11, 2012 5:28 AM
To: dev@stdcxx.apache.org
Subject: Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements

I applied the patch on 4.2.x. If someone with access to a SPARC machine
could give it a runall and post the results here it would be awesome. I
will postpone closing the incident until then.

Thanks!

Liviu







Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-10-11 Thread Liviu Nicoara
I applied the patch on 4.2.x. If someone with access to a SPARC machine 
could give it a runall and post the results here it would be awesome. I 
will postpone closing the incident until then.


Thanks!

Liviu

On 10/06/12 16:56, Liviu Nicoara wrote:

On 09/29/12 15:33, Liviu Nicoara wrote:

On 9/28/12 11:32 AM, Travis Vitek wrote:


[...] I think an even cleaner solution is to switch
to using __rw_aligned_buffer instead. It gives us a single point of
failure for alignment issues like this, and it makes the code
self-documenting and easier to read.


I am attaching another patch here, which makes use of the
__rw_aligned_buffer, slightly more verbose but the code is slightly
cleaner.


If there are no objections, I would check it in sometime after this
week-end.

Liviu




[PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-29 Thread Liviu Nicoara

On 9/28/12 11:32 AM, Travis Vitek wrote:



-Original Message-
From: Liviu Nicoara
Sent: Friday, September 28, 2012 5:29 AM
[...]

The patch assumes the type `long double' exists on every platform. While I do 
believe that it is available everywhere, we have lots of conditional code 
guarding its use in the library now. If we are going to use `long double' in 
this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even 
cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us 
a single point of failure for alignment issues like this, and it makes the code 
self-documenting and easier to read.


I am attaching another patch here, which makes use of the __rw_aligned_buffer, 
slightly more verbose but the code is slightly cleaner.


Thanks!

Liviu


Index: src/messages.cpp
===
--- src/messages.cpp(revision 1390953)
+++ src/messages.cpp(working copy)
@@ -38,6 +38,7 @@
 
 #include rw/_mutex.h
 
+#include podarray.h
 
 #if !defined (_RWSTD_NO_NL_TYPES_H)  !defined (_RWSTD_NO_CATOPEN_CATGETS)
 #  include nl_types.h
@@ -64,10 +65,7 @@ _RWSTD_NAMESPACE (__rw) {
 struct __rw_open_cat_data
 {
 nl_catd catd;
-union {
-void *_C_align;
-char _C_data [sizeof (_STD::locale)];
-} loc;
+__rw_aligned_buffer_STD::locale loc;
 };
 
 struct __rw_open_cat_page
@@ -159,7 +157,8 @@ __rw_manage_cat_data (int cat,  __rw_op
 cat = int (n_catalogs);
 
 catalogs [cat]-catd = pcat_data-catd;
-memcpy (catalogs [cat]-loc, pcat_data-loc,
+memcpy (catalogs [cat]-loc._C_store (), 
+pcat_data-loc._C_store (),
 sizeof (_STD::locale));
 
 if (size_t (cat)  largest_cat)
@@ -175,7 +174,8 @@ __rw_manage_cat_data (int cat,  __rw_op
 }
 
 catalogs [cat]-catd = pcat_data-catd;
-memcpy (catalogs [cat]-loc, pcat_data-loc,
+memcpy (catalogs [cat]-loc._C_store (),
+pcat_data-loc._C_store (),
 sizeof (_STD::locale));
 
 if (size_t (cat)  largest_cat)
@@ -258,8 +258,9 @@ int __rw_cat_open (const _STD::string c
 return -1;
 
 __rw_open_cat_data cat_data;
+
 cat_data.catd = catd;
-new (cat_data.loc) _STD::locale (loc);
+new (cat_data.loc._C_store ()) _STD::locale (loc);
 
 int cat = -1;
 __rw_manage_cat_data (cat, cat_data);
@@ -307,7 +308,7 @@ const _STD::locale __rw_get_locale (int
 
 _RWSTD_ASSERT (0 != pcat_data);
 
-return *(_RWSTD_REINTERPRET_CAST (_STD::locale*, (pcat_data-loc)));
+return *pcat_data-loc._C_data ();
 }
 
 
@@ -329,8 +330,7 @@ void __rw_cat_close (int cat)
 
 catclose (pcat_data-catd);
 
-_STD::locale* const ploc =
-_RWSTD_REINTERPRET_CAST (_STD::locale*, pcat_data-loc);
+_STD::locale* const ploc = pcat_data-loc._C_data ();
 
 ploc-~locale ();
 
Index: src/locale_body.cpp
===
--- src/locale_body.cpp (revision 1390953)
+++ src/locale_body.cpp (working copy)
@@ -1024,14 +1024,12 @@ _C_manage (__rw_locale *plocale, const c
 
 // the classic C locale is statically allocated
 // and not destroyed during the lifetime of the process
-static union {
-void* _C_align;
-char  _C_buf [sizeof (__rw_locale)];
-} classic_body;
+static __rw_aligned_buffer__rw_locale classic_body;
 
 // construct a locale body in place
 // with the initial reference count of 1
-classic = new (classic_body) __rw_locale (locname);
+classic = new (classic_body._C_store ()) 
+__rw_locale (locname);
 
 _RWSTD_ASSERT (1 == classic-_C_ref);
 }
Index: src/use_facet.h
===
--- src/use_facet.h (revision 1390953)
+++ src/use_facet.h (working copy)
@@ -37,6 +37,7 @@
 #include rw/_defs.h
 
 #include access.h
+#include podarray.h
 
 
 // helper macro _RWSTD_DEFINE_FACET_FACTORY() defines a facet factory
@@ -63,12 +64,9 @@
}   \
else {  \
/* construct an ordinary facet in static memory */  \
-   static union {  \
-   void *align_;   \
-   char  data_ [sizeof (__rw_ ## fid ## _facet)];  \
-   } f;\
+   

Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Liviu Nicoara

On 09/28/12 08:29, Liviu Nicoara wrote:

I have created the above and linked it to the closed STDCXX-1066.
[...]
IMO, the patch I attached does not break binary compatibility.


Scratch this, I haven't thought it through.

Thanks,
Liviu


Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Liviu Nicoara

On 09/28/12 08:45, Liviu Nicoara wrote:

On 09/28/12 08:29, Liviu Nicoara wrote:

I have created the above and linked it to the closed STDCXX-1066.
[...]
IMO, the patch I attached does not break binary compatibility.


Scratch this, I haven't thought it through.


Actually, after more thought, I believe that the patch itself is not causing 
binary incompatibility. The library built on a kernel-patched system will be 
binary incompatible with a library built on a previous system, regardless of 
the patch I presented earlier, because new mutex alignment requirements will 
have changed the size and layout of library objects. Comments are welcome.

Thanks,
Liviu



RE: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Travis Vitek

 -Original Message-
 From: Liviu Nicoara
 Sent: Friday, September 28, 2012 5:29 AM
 

 
 In short, my reading about this issue is that the kernel patch changed
 the alignment of the userland mutex objects from a machine word to a
 double-word boundary. No changes are required of the users who use such
 objects in their programs unless users create mutex objects in buffers
 which may not be aligned on a proper boundary.

Your reading of this is consistent with my previous understanding of the 
problem, so that is good.

 
 I still don't have access to a SPARC machine. Any feed-back and/or
 SPARC build results are more than welcome!
 

I can provide build results for SPARCV9 if we want them, but I thought that the 
problem only came up on 32-bit SPARCV8 builds.

The patch assumes the type `long double' exists on every platform. While I do 
believe that it is available everywhere, we have lots of conditional code 
guarding its use in the library now. If we are going to use `long double' in 
this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even 
cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us 
a single point of failure for alignment issues like this, and it makes the code 
self-documenting and easier to read.

As for your concerns about binary compatibility, I think that everything should 
be okay. We aren't changing the size of anything that is being passed around, 
we're just changing its alignment. I could be wrong, but I'm pretty sure that 
we're safe.



Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Martin Sebor

The patch looks reasonable to me, except for the missing guard
for _RWSTD_NO_LONG_DOUBLE. For C++ 11 compilers, we might want
to replace the union with the alignas features. Of course, that
will require another configuration test and macro, and most
likely won't help the current Sun Studio compiler (unless it
already implements alignas). Although it's possible that the
compiler supports the GCC aligned attribute (we might want to
use it with Linux compilers versions that don't yet implement
C++ 11 alignas).

Martin

On 09/28/2012 06:29 AM, Liviu Nicoara wrote:

I have created the above and linked it to the closed STDCXX-1066.

In short, my reading about this issue is that the kernel patch changed
the alignment of the userland mutex objects from a machine word to a
double-word boundary. No changes are required of the users who use such
objects in their programs unless users create mutex objects in buffers
which may not be aligned on a proper boundary. E.g., the following are
safe:

mutex_t lock;

struct S {
char misalign;
mutex_t lock;
};

whereas the following is not:

union {
void* align;
char buf [sizeof mutex_t];
} u;

new (u) mutex_t;

because the alignment requirements for void pointer are less strict than
for mutex_t. A few places in the library use the latter for all sorts of
static objects (mostly local statics). I looked esp. for places where we
build objects that contain mutex sub-objects inside a union-aligned buffer:

struct S {
char c;
mutex_t m;
};

...

union {
void* align; // - incorrect
char buf [sizeof (S)];
} u;

new (u) S ();

The alignment must be changed to a value equal or greater than the mutex
alignment requirements.

IMO, the patch I attached does not break binary compatibility. It uses a
one size fits all long double for alignment -- like we use in
rw/_mutex.h -- but in doing so dispenses with all sorts of preprocessor
conditionals.

I still don't have access to a SPARC machine. Any feed-back and/or SPARC
build results are more than welcome!

Thanks!

Liviu





RE: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Travis Vitek
Liviu,

Sorry I didn't look until just now, but it appears that I could have re-opened 
STDCXX-1066. I only see the 'Reopen Issue' button for STDCXX issues, but it is 
most definitely there. Perhaps there is some sort of permission issue for you?

Also, STDCXX-1066 appears to have been a duplicate of STDCXX-1040.

Travis

 -Original Message-
 From: Liviu Nicoara
 Sent: Friday, September 28, 2012 5:29 AM
 
 I have created the above and linked it to the closed STDCXX-1066.


Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Martin Sebor

On 09/28/2012 09:32 AM, Travis Vitek wrote:



-Original Message-
From: Liviu Nicoara
Sent: Friday, September 28, 2012 5:29 AM





In short, my reading about this issue is that the kernel patch changed
the alignment of the userland mutex objects from a machine word to a
double-word boundary. No changes are required of the users who use such
objects in their programs unless users create mutex objects in buffers
which may not be aligned on a proper boundary.


Your reading of this is consistent with my previous understanding of the 
problem, so that is good.



I still don't have access to a SPARC machine. Any feed-back and/or
SPARC build results are more than welcome!



I can provide build results for SPARCV9 if we want them, but I thought that the 
problem only came up on 32-bit SPARCV8 builds.

The patch assumes the type `long double' exists on every platform. While I do 
believe that it is available everywhere, we have lots of conditional code 
guarding its use in the library now. If we are going to use `long double' in 
this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even 
cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us 
a single point of failure for alignment issues like this, and it makes the code 
self-documenting and easier to read.

As for your concerns about binary compatibility, I think that everything should 
be okay. We aren't changing the size of anything that is being passed around, 
we're just changing its alignment. I could be wrong, but I'm pretty sure that 
we're safe.


Alignment does matter for binary compatibility but since the patch
increases it, and since the actual alignment for an object is
guaranteed to be at least as strict as the requirement for its
type, I think the change is both backward and forward compatible.
The latter only for correctly written programs that aren't relying
on the actual alignment being greater than required. Since this is
an internal helper used only by stdcxx classes, I don't think we
need to worry about such programs.

Martin


Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Liviu Nicoara

On 09/28/12 11:45, Travis Vitek wrote:

Liviu,

Sorry I didn't look until just now, but it appears that I could have re-opened 
STDCXX-1066. I only see the 'Reopen Issue' button for STDCXX issues, but it is 
most definitely there. Perhaps there is some sort of permission issue for you?


It's ok, I think it's somewhat cleaner to create a new one and link it to the 
old one. Even if clean was not a concern, it was within Stefan's options to 
close the incident. I don't know.



Also, STDCXX-1066 appears to have been a duplicate of STDCXX-1040.


Yep, forgot about it, I am thinking about linking that one, too.

Thanks.


Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Liviu Nicoara

On 09/28/12 11:32, Travis Vitek wrote:



-Original Message-
From: Liviu Nicoara
Sent: Friday, September 28, 2012 5:29 AM





In short, my reading about this issue is that the kernel patch changed
the alignment of the userland mutex objects from a machine word to a
double-word boundary. No changes are required of the users who use such
objects in their programs unless users create mutex objects in buffers
which may not be aligned on a proper boundary.


Your reading of this is consistent with my previous understanding of the 
problem, so that is good.



I still don't have access to a SPARC machine. Any feed-back and/or
SPARC build results are more than welcome!



I can provide build results for SPARCV9 if we want them, but I thought that the 
problem only came up on 32-bit SPARCV8 builds.


I guess a -xarch=sparcv8 build will do. It is quite unlikely anybody has a real 
32-bit SPARC hardware.



The patch assumes the type `long double' exists on every platform. While I do 
believe that it is available everywhere, we have lots of conditional code 
guarding its use in the library now. If we are going to use `long double' in 
this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even 
cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us 
a single point of failure for alignment issues like this, and it makes the code 
self-documenting and easier to read.


I took a cue from an alignment in the rw/_mutex.h code there. I'll do better.



As for your concerns about binary compatibility, I think that everything should 
be okay. We aren't changing the size of anything that is being passed around, 
we're just changing its alignment. I could be wrong, but I'm pretty sure that 
we're safe.



The library object sizes and layouts will change with or without our patch. 
Before the kernel patch our objects will have one layout and size and different 
ones afterwards. E.g.:

struct locale {
int c;
pthread_mutex_t lock;
};

before kernel patching you'd have no padding between the members. That's what I 
thought when firing that second post about compatibility.

Liviu


Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Martin Sebor

On 09/28/2012 11:27 AM, Liviu Nicoara wrote:

On 09/28/12 11:45, Travis Vitek wrote:

Liviu,

Sorry I didn't look until just now, but it appears that I could have
re-opened STDCXX-1066. I only see the 'Reopen Issue' button for STDCXX
issues, but it is most definitely there. Perhaps there is some sort of
permission issue for you?


It's ok, I think it's somewhat cleaner to create a new one and link it
to the old one. Even if clean was not a concern, it was within Stefan's
options to close the incident. I don't know.


Jira issues are ours to decide what to do with, just like stdcxx
code. If there's consensus that the bug should stay open we keep
it open (or reopen it). Likewise, if we all agree to close it we
close it.

FWIW, if we're going to fix the problem noted in STDCXX-1066
(regardless of how) it would probably make sense to change the
resolution of the issue from Won't Fix. Otherwise it could be
confusing.

One other comment: I would suggest choosing subjects for bug
reports that reflect the problem rather than a fix for it or
a rationale for it. For  STDCXX-1066 I think something like
Library mutex objects misaligned on SPARCV8 would better
capture the problem than the current title. (It's also up
to us to rename an issue if we find it more descriptive
than the original.)

Martin





Also, STDCXX-1066 appears to have been a duplicate of STDCXX-1040.


Yep, forgot about it, I am thinking about linking that one, too.

Thanks.




Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Liviu Nicoara

On 09/28/12 13:51, Martin Sebor wrote:

[...]
One other comment: I would suggest choosing subjects for bug
reports that reflect the problem rather than a fix for it or
a rationale for it. For  STDCXX-1066 I think something like
Library mutex objects misaligned on SPARCV8 would better
capture the problem than the current title. (It's also up
to us to rename an issue if we find it more descriptive
than the original.)


I can't do that myself. I looked at that and 1056 and there is no button for me 
to reopen, or to edit stuff which is not mine.

Needless to say, I like my issues better. But I have no problems with the 
re-opening of the closed ones.

Thanks,
Liviu



Re: STDCXX-1072 SPARC V8 mutex alignment requirements

2012-09-28 Thread Martin Sebor

On 09/28/2012 11:55 AM, Liviu Nicoara wrote:

On 09/28/12 13:51, Martin Sebor wrote:

[...]
One other comment: I would suggest choosing subjects for bug
reports that reflect the problem rather than a fix for it or
a rationale for it. For STDCXX-1066 I think something like
Library mutex objects misaligned on SPARCV8 would better
capture the problem than the current title. (It's also up
to us to rename an issue if we find it more descriptive
than the original.)


I can't do that myself. I looked at that and 1056 and there is no button
for me to reopen, or to edit stuff which is not mine.


What you can do with an issue is determined by your Jira role
and the Jira permission scheme for the project. Your roles are
Committer and PMC member.

Both Committers and PMC members (and pretty much all other
roles) have the Resolve Issues Permission:

  Ability to resolve and reopen issues. This includes the ability
  to set a fix version.

https://issues.apache.org/jira/plugins/servlet/project-config/STDCXX/permissions

So you should be able to reopen it. The top of the STDCXX-1066
page should look similar to what's in the attachment screenshot.
I.e., you should see a Reopen Issue button near the top, just
to the right of the |Comment|Voters|Watch Issue|More Actions|
tabs. If that's not what you see, we should figure out why.
One reason would be if you were logged in with a different user
id, e.g., if you had more than one account. I checked and you
do seem to have two accounts. One with your old Rogue Wave
address, and another @hates.ms. I couldn't tell which address
was used on the People admin page (Jira just shows the user
name), so I removed and re-added you with the current email
address. Let me know if that didn't clear things up.

Martin



Needless to say, I like my issues better. But I have no problems with
the re-opening of the closed ones.

Thanks,
Liviu