RE: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements
Liviu, I built the library and tests before and after your change with the 64-bit flag, and I saw no differences in the number of failed tests between the builds. I've attached the output of 'gmake -k runall' before and after your change to STDCXX-1072 in case you want to look over them. I wrote up the following test case to prove that I could reproduce the issue using the compile/link options used to build the stdcxx tests. The code is below.. [vitek@andromeda] 138 % cat t.cpp #include thread.h #include synch.h #include malloc.h int main () { int *ip; mutex_t *mp; ip = (int*)malloc(sizeof (int) + sizeof (mutex_t)); mp = (mutex_t*)(ip + 1); mutex_init(mp, USYNC_THREAD | LOCK_ROBUST, 0); mutex_lock(mp); mutex_unlock(mp); mutex_destroy(mp); free(ip); } [vitek@andromeda] 139 % gmake t CC -c -D_RWSTDDEBUG -mt -I/amd/homes/vitek/tmp/stdcxx-4.2.x/include \ -I/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/include \ -I/amd/homes/vitek/tmp/stdcxx-4.2.x/tests/include \ -library=%none -g -m64 +w -errtags -erroff=hidef t.cpp CC t.o -o t -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/rwtest \ -lrwtest15S -library=%none -mt -m64 \ -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/lib \ -lstd15S -lm Bus Error (core dumped) [vitek@andromeda] 140 % I also tested with POSIX mutexes and saw the same behavior. Travis From: Liviu Nicoara Sent: Thursday, October 11, 2012 5:28 AM To: dev@stdcxx.apache.org Subject: Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements I applied the patch on 4.2.x. If someone with access to a SPARC machine could give it a runall and post the results here it would be awesome. I will postpone closing the incident until then. Thanks! Liviu
Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements
Thanks, Travis! On 10/13/12 15:36, Travis Vitek wrote: Liviu, I built the library and tests before and after your change with the 64-bit flag, and I saw no differences in the number of failed tests between the builds. I've attached the output of 'gmake -k runall' before and after your change to STDCXX-1072 in case you want to look over them. I wrote up the following test case to prove that I could reproduce the issue using the compile/link options used to build the stdcxx tests. The code is below.. [vitek@andromeda] 138 % cat t.cpp #include thread.h #include synch.h #include malloc.h int main () { int *ip; mutex_t *mp; ip = (int*)malloc(sizeof (int) + sizeof (mutex_t)); mp = (mutex_t*)(ip + 1); mutex_init(mp, USYNC_THREAD | LOCK_ROBUST, 0); mutex_lock(mp); mutex_unlock(mp); mutex_destroy(mp); free(ip); } [vitek@andromeda] 139 % gmake t CC -c -D_RWSTDDEBUG -mt -I/amd/homes/vitek/tmp/stdcxx-4.2.x/include \ -I/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/include \ -I/amd/homes/vitek/tmp/stdcxx-4.2.x/tests/include \ -library=%none -g -m64 +w -errtags -erroff=hidef t.cpp CC t.o -o t -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/rwtest \ -lrwtest15S -library=%none -mt -m64 \ -L/build/vitek/stdcxx-4.2.x_sunpro-5.8_sunos-5.10-patched/lib \ -lstd15S -lm Bus Error (core dumped) [vitek@andromeda] 140 % I also tested with POSIX mutexes and saw the same behavior. Travis From: Liviu Nicoara Sent: Thursday, October 11, 2012 5:28 AM To: dev@stdcxx.apache.org Subject: Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements I applied the patch on 4.2.x. If someone with access to a SPARC machine could give it a runall and post the results here it would be awesome. I will postpone closing the incident until then. Thanks! Liviu
Re: [PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements
I applied the patch on 4.2.x. If someone with access to a SPARC machine could give it a runall and post the results here it would be awesome. I will postpone closing the incident until then. Thanks! Liviu On 10/06/12 16:56, Liviu Nicoara wrote: On 09/29/12 15:33, Liviu Nicoara wrote: On 9/28/12 11:32 AM, Travis Vitek wrote: [...] I think an even cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us a single point of failure for alignment issues like this, and it makes the code self-documenting and easier to read. I am attaching another patch here, which makes use of the __rw_aligned_buffer, slightly more verbose but the code is slightly cleaner. If there are no objections, I would check it in sometime after this week-end. Liviu
[PATCH] Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 9/28/12 11:32 AM, Travis Vitek wrote: -Original Message- From: Liviu Nicoara Sent: Friday, September 28, 2012 5:29 AM [...] The patch assumes the type `long double' exists on every platform. While I do believe that it is available everywhere, we have lots of conditional code guarding its use in the library now. If we are going to use `long double' in this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us a single point of failure for alignment issues like this, and it makes the code self-documenting and easier to read. I am attaching another patch here, which makes use of the __rw_aligned_buffer, slightly more verbose but the code is slightly cleaner. Thanks! Liviu Index: src/messages.cpp === --- src/messages.cpp(revision 1390953) +++ src/messages.cpp(working copy) @@ -38,6 +38,7 @@ #include rw/_mutex.h +#include podarray.h #if !defined (_RWSTD_NO_NL_TYPES_H) !defined (_RWSTD_NO_CATOPEN_CATGETS) # include nl_types.h @@ -64,10 +65,7 @@ _RWSTD_NAMESPACE (__rw) { struct __rw_open_cat_data { nl_catd catd; -union { -void *_C_align; -char _C_data [sizeof (_STD::locale)]; -} loc; +__rw_aligned_buffer_STD::locale loc; }; struct __rw_open_cat_page @@ -159,7 +157,8 @@ __rw_manage_cat_data (int cat, __rw_op cat = int (n_catalogs); catalogs [cat]-catd = pcat_data-catd; -memcpy (catalogs [cat]-loc, pcat_data-loc, +memcpy (catalogs [cat]-loc._C_store (), +pcat_data-loc._C_store (), sizeof (_STD::locale)); if (size_t (cat) largest_cat) @@ -175,7 +174,8 @@ __rw_manage_cat_data (int cat, __rw_op } catalogs [cat]-catd = pcat_data-catd; -memcpy (catalogs [cat]-loc, pcat_data-loc, +memcpy (catalogs [cat]-loc._C_store (), +pcat_data-loc._C_store (), sizeof (_STD::locale)); if (size_t (cat) largest_cat) @@ -258,8 +258,9 @@ int __rw_cat_open (const _STD::string c return -1; __rw_open_cat_data cat_data; + cat_data.catd = catd; -new (cat_data.loc) _STD::locale (loc); +new (cat_data.loc._C_store ()) _STD::locale (loc); int cat = -1; __rw_manage_cat_data (cat, cat_data); @@ -307,7 +308,7 @@ const _STD::locale __rw_get_locale (int _RWSTD_ASSERT (0 != pcat_data); -return *(_RWSTD_REINTERPRET_CAST (_STD::locale*, (pcat_data-loc))); +return *pcat_data-loc._C_data (); } @@ -329,8 +330,7 @@ void __rw_cat_close (int cat) catclose (pcat_data-catd); -_STD::locale* const ploc = -_RWSTD_REINTERPRET_CAST (_STD::locale*, pcat_data-loc); +_STD::locale* const ploc = pcat_data-loc._C_data (); ploc-~locale (); Index: src/locale_body.cpp === --- src/locale_body.cpp (revision 1390953) +++ src/locale_body.cpp (working copy) @@ -1024,14 +1024,12 @@ _C_manage (__rw_locale *plocale, const c // the classic C locale is statically allocated // and not destroyed during the lifetime of the process -static union { -void* _C_align; -char _C_buf [sizeof (__rw_locale)]; -} classic_body; +static __rw_aligned_buffer__rw_locale classic_body; // construct a locale body in place // with the initial reference count of 1 -classic = new (classic_body) __rw_locale (locname); +classic = new (classic_body._C_store ()) +__rw_locale (locname); _RWSTD_ASSERT (1 == classic-_C_ref); } Index: src/use_facet.h === --- src/use_facet.h (revision 1390953) +++ src/use_facet.h (working copy) @@ -37,6 +37,7 @@ #include rw/_defs.h #include access.h +#include podarray.h // helper macro _RWSTD_DEFINE_FACET_FACTORY() defines a facet factory @@ -63,12 +64,9 @@ } \ else { \ /* construct an ordinary facet in static memory */ \ - static union { \ - void *align_; \ - char data_ [sizeof (__rw_ ## fid ## _facet)]; \ - } f;\ +
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/12 08:29, Liviu Nicoara wrote: I have created the above and linked it to the closed STDCXX-1066. [...] IMO, the patch I attached does not break binary compatibility. Scratch this, I haven't thought it through. Thanks, Liviu
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/12 08:45, Liviu Nicoara wrote: On 09/28/12 08:29, Liviu Nicoara wrote: I have created the above and linked it to the closed STDCXX-1066. [...] IMO, the patch I attached does not break binary compatibility. Scratch this, I haven't thought it through. Actually, after more thought, I believe that the patch itself is not causing binary incompatibility. The library built on a kernel-patched system will be binary incompatible with a library built on a previous system, regardless of the patch I presented earlier, because new mutex alignment requirements will have changed the size and layout of library objects. Comments are welcome. Thanks, Liviu
RE: STDCXX-1072 SPARC V8 mutex alignment requirements
-Original Message- From: Liviu Nicoara Sent: Friday, September 28, 2012 5:29 AM In short, my reading about this issue is that the kernel patch changed the alignment of the userland mutex objects from a machine word to a double-word boundary. No changes are required of the users who use such objects in their programs unless users create mutex objects in buffers which may not be aligned on a proper boundary. Your reading of this is consistent with my previous understanding of the problem, so that is good. I still don't have access to a SPARC machine. Any feed-back and/or SPARC build results are more than welcome! I can provide build results for SPARCV9 if we want them, but I thought that the problem only came up on 32-bit SPARCV8 builds. The patch assumes the type `long double' exists on every platform. While I do believe that it is available everywhere, we have lots of conditional code guarding its use in the library now. If we are going to use `long double' in this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us a single point of failure for alignment issues like this, and it makes the code self-documenting and easier to read. As for your concerns about binary compatibility, I think that everything should be okay. We aren't changing the size of anything that is being passed around, we're just changing its alignment. I could be wrong, but I'm pretty sure that we're safe.
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
The patch looks reasonable to me, except for the missing guard for _RWSTD_NO_LONG_DOUBLE. For C++ 11 compilers, we might want to replace the union with the alignas features. Of course, that will require another configuration test and macro, and most likely won't help the current Sun Studio compiler (unless it already implements alignas). Although it's possible that the compiler supports the GCC aligned attribute (we might want to use it with Linux compilers versions that don't yet implement C++ 11 alignas). Martin On 09/28/2012 06:29 AM, Liviu Nicoara wrote: I have created the above and linked it to the closed STDCXX-1066. In short, my reading about this issue is that the kernel patch changed the alignment of the userland mutex objects from a machine word to a double-word boundary. No changes are required of the users who use such objects in their programs unless users create mutex objects in buffers which may not be aligned on a proper boundary. E.g., the following are safe: mutex_t lock; struct S { char misalign; mutex_t lock; }; whereas the following is not: union { void* align; char buf [sizeof mutex_t]; } u; new (u) mutex_t; because the alignment requirements for void pointer are less strict than for mutex_t. A few places in the library use the latter for all sorts of static objects (mostly local statics). I looked esp. for places where we build objects that contain mutex sub-objects inside a union-aligned buffer: struct S { char c; mutex_t m; }; ... union { void* align; // - incorrect char buf [sizeof (S)]; } u; new (u) S (); The alignment must be changed to a value equal or greater than the mutex alignment requirements. IMO, the patch I attached does not break binary compatibility. It uses a one size fits all long double for alignment -- like we use in rw/_mutex.h -- but in doing so dispenses with all sorts of preprocessor conditionals. I still don't have access to a SPARC machine. Any feed-back and/or SPARC build results are more than welcome! Thanks! Liviu
RE: STDCXX-1072 SPARC V8 mutex alignment requirements
Liviu, Sorry I didn't look until just now, but it appears that I could have re-opened STDCXX-1066. I only see the 'Reopen Issue' button for STDCXX issues, but it is most definitely there. Perhaps there is some sort of permission issue for you? Also, STDCXX-1066 appears to have been a duplicate of STDCXX-1040. Travis -Original Message- From: Liviu Nicoara Sent: Friday, September 28, 2012 5:29 AM I have created the above and linked it to the closed STDCXX-1066.
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/2012 09:32 AM, Travis Vitek wrote: -Original Message- From: Liviu Nicoara Sent: Friday, September 28, 2012 5:29 AM In short, my reading about this issue is that the kernel patch changed the alignment of the userland mutex objects from a machine word to a double-word boundary. No changes are required of the users who use such objects in their programs unless users create mutex objects in buffers which may not be aligned on a proper boundary. Your reading of this is consistent with my previous understanding of the problem, so that is good. I still don't have access to a SPARC machine. Any feed-back and/or SPARC build results are more than welcome! I can provide build results for SPARCV9 if we want them, but I thought that the problem only came up on 32-bit SPARCV8 builds. The patch assumes the type `long double' exists on every platform. While I do believe that it is available everywhere, we have lots of conditional code guarding its use in the library now. If we are going to use `long double' in this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us a single point of failure for alignment issues like this, and it makes the code self-documenting and easier to read. As for your concerns about binary compatibility, I think that everything should be okay. We aren't changing the size of anything that is being passed around, we're just changing its alignment. I could be wrong, but I'm pretty sure that we're safe. Alignment does matter for binary compatibility but since the patch increases it, and since the actual alignment for an object is guaranteed to be at least as strict as the requirement for its type, I think the change is both backward and forward compatible. The latter only for correctly written programs that aren't relying on the actual alignment being greater than required. Since this is an internal helper used only by stdcxx classes, I don't think we need to worry about such programs. Martin
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/12 11:45, Travis Vitek wrote: Liviu, Sorry I didn't look until just now, but it appears that I could have re-opened STDCXX-1066. I only see the 'Reopen Issue' button for STDCXX issues, but it is most definitely there. Perhaps there is some sort of permission issue for you? It's ok, I think it's somewhat cleaner to create a new one and link it to the old one. Even if clean was not a concern, it was within Stefan's options to close the incident. I don't know. Also, STDCXX-1066 appears to have been a duplicate of STDCXX-1040. Yep, forgot about it, I am thinking about linking that one, too. Thanks.
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/12 11:32, Travis Vitek wrote: -Original Message- From: Liviu Nicoara Sent: Friday, September 28, 2012 5:29 AM In short, my reading about this issue is that the kernel patch changed the alignment of the userland mutex objects from a machine word to a double-word boundary. No changes are required of the users who use such objects in their programs unless users create mutex objects in buffers which may not be aligned on a proper boundary. Your reading of this is consistent with my previous understanding of the problem, so that is good. I still don't have access to a SPARC machine. Any feed-back and/or SPARC build results are more than welcome! I can provide build results for SPARCV9 if we want them, but I thought that the problem only came up on 32-bit SPARCV8 builds. I guess a -xarch=sparcv8 build will do. It is quite unlikely anybody has a real 32-bit SPARC hardware. The patch assumes the type `long double' exists on every platform. While I do believe that it is available everywhere, we have lots of conditional code guarding its use in the library now. If we are going to use `long double' in this context, we should guard it with _RWSTD_NO_LONG_DOUBLE. I think an even cleaner solution is to switch to using __rw_aligned_buffer instead. It gives us a single point of failure for alignment issues like this, and it makes the code self-documenting and easier to read. I took a cue from an alignment in the rw/_mutex.h code there. I'll do better. As for your concerns about binary compatibility, I think that everything should be okay. We aren't changing the size of anything that is being passed around, we're just changing its alignment. I could be wrong, but I'm pretty sure that we're safe. The library object sizes and layouts will change with or without our patch. Before the kernel patch our objects will have one layout and size and different ones afterwards. E.g.: struct locale { int c; pthread_mutex_t lock; }; before kernel patching you'd have no padding between the members. That's what I thought when firing that second post about compatibility. Liviu
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/2012 11:27 AM, Liviu Nicoara wrote: On 09/28/12 11:45, Travis Vitek wrote: Liviu, Sorry I didn't look until just now, but it appears that I could have re-opened STDCXX-1066. I only see the 'Reopen Issue' button for STDCXX issues, but it is most definitely there. Perhaps there is some sort of permission issue for you? It's ok, I think it's somewhat cleaner to create a new one and link it to the old one. Even if clean was not a concern, it was within Stefan's options to close the incident. I don't know. Jira issues are ours to decide what to do with, just like stdcxx code. If there's consensus that the bug should stay open we keep it open (or reopen it). Likewise, if we all agree to close it we close it. FWIW, if we're going to fix the problem noted in STDCXX-1066 (regardless of how) it would probably make sense to change the resolution of the issue from Won't Fix. Otherwise it could be confusing. One other comment: I would suggest choosing subjects for bug reports that reflect the problem rather than a fix for it or a rationale for it. For STDCXX-1066 I think something like Library mutex objects misaligned on SPARCV8 would better capture the problem than the current title. (It's also up to us to rename an issue if we find it more descriptive than the original.) Martin Also, STDCXX-1066 appears to have been a duplicate of STDCXX-1040. Yep, forgot about it, I am thinking about linking that one, too. Thanks.
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/12 13:51, Martin Sebor wrote: [...] One other comment: I would suggest choosing subjects for bug reports that reflect the problem rather than a fix for it or a rationale for it. For STDCXX-1066 I think something like Library mutex objects misaligned on SPARCV8 would better capture the problem than the current title. (It's also up to us to rename an issue if we find it more descriptive than the original.) I can't do that myself. I looked at that and 1056 and there is no button for me to reopen, or to edit stuff which is not mine. Needless to say, I like my issues better. But I have no problems with the re-opening of the closed ones. Thanks, Liviu
Re: STDCXX-1072 SPARC V8 mutex alignment requirements
On 09/28/2012 11:55 AM, Liviu Nicoara wrote: On 09/28/12 13:51, Martin Sebor wrote: [...] One other comment: I would suggest choosing subjects for bug reports that reflect the problem rather than a fix for it or a rationale for it. For STDCXX-1066 I think something like Library mutex objects misaligned on SPARCV8 would better capture the problem than the current title. (It's also up to us to rename an issue if we find it more descriptive than the original.) I can't do that myself. I looked at that and 1056 and there is no button for me to reopen, or to edit stuff which is not mine. What you can do with an issue is determined by your Jira role and the Jira permission scheme for the project. Your roles are Committer and PMC member. Both Committers and PMC members (and pretty much all other roles) have the Resolve Issues Permission: Ability to resolve and reopen issues. This includes the ability to set a fix version. https://issues.apache.org/jira/plugins/servlet/project-config/STDCXX/permissions So you should be able to reopen it. The top of the STDCXX-1066 page should look similar to what's in the attachment screenshot. I.e., you should see a Reopen Issue button near the top, just to the right of the |Comment|Voters|Watch Issue|More Actions| tabs. If that's not what you see, we should figure out why. One reason would be if you were logged in with a different user id, e.g., if you had more than one account. I checked and you do seem to have two accounts. One with your old Rogue Wave address, and another @hates.ms. I couldn't tell which address was used on the People admin page (Jira just shows the user name), so I removed and re-added you with the current email address. Let me know if that didn't clear things up. Martin Needless to say, I like my issues better. But I have no problems with the re-opening of the closed ones. Thanks, Liviu