On 13/01/2014 15:02, Kostya Serebryany wrote:
Alp, others,

Now that there is no urgent issue I'd like to return to the discussion about names with "__" prefix in sanitizer run-times.

We had __lsan_is_turned_off: a user of the tool has to define such symbol to communicate some information to the tool. Now there is an alternative name for this thing, but we still have a few similar names with similar usage model elsewhere.

Examples:
compiler-rt/include/sanitizer/asan_interface.h
  const char* __asan_default_options();
  void __asan_malloc_hook(void *ptr, size_t size);
  void __asan_free_hook(void *ptr);

compiler-rt/include/sanitizer/msan_interface.h
  const char* __msan_default_options();
  void __msan_malloc_hook(const volatile void *ptr, size_t size);
  void __msan_free_hook(const volatile void *ptr);

compiler-rt/lib/lsan/lsan_common.h (should be moved to compiler-rt/include/sanitizer)
  const char *__lsan_default_suppressions();

I may have missed a couple more.
These all have consistent naming with the rest of sanitizer interfaces.
This naming scheme is the only scheme known to me to that allows to avoid clashes with user names. And I still don't buy the arguments that defining functions with such names in user code is illegal.

It isn't subjective. It's a fact that the names have undefined behaviour in non-sanitizer builds.

I appreciate it's important for your internal schedule to have this up and running ASAP but that shouldn't come at the cost of pushing it onto others who aren't part of your internal roadmap -- especially when two alternative technical solutions have been proposed that would let you achieve your goals without delay, and without introducing invalid names.

If you want to define these functions in non-sanitizer build configurations, they shouldn't use a reserved name. It's that simple -- this has been discussed and other contributors, maintainers share my sentiment.

If you're OK to conditionalize the code so it's only defined in sanitizer builds, that's fine and as Sean pointed out there's already plenty of precedent -- it's what every software project does to define "__" prefixed weak link functions like libc malloc hooks.

There are varying degrees of undefined behaviour, but extern "C" names in the global namespace are the most problematic kinds that are known to cause trouble and conflicts.

Given that your function definitions have C linkage in user code, the underscore naming eats into the reserved namespace not only of the compiler but the whole implementation stack, from the standard C/C++ library down to the linker, resolver and runtime environment. That is a remarkably bad idea.

My understanding, is that the standard calls this "undefined behavior" and we explicitly define this behavior for these particular function names.

Who is "we"? MSVC, gcc and clang in non-sanitizer mode do not explicitly define behaviour for these function names.


It is very unlikely we will want to rename these existing functions to something else, and it is actually quite likely that we will want __lsan_is_turned_off (or some such) back for consistency.

The consistency argument is flawed. "__" is reserved for the implementation and you're trying to introduce the names in user code in non-sanitizer builds.

So we need to find a way to silence your tool for these names. Please advise.

This isn't about a tool or coding style. It's about making the effort to follow the C++ standard and avoiding reserved namespace conflicts which we know are problematic.

BTW, does the tool have a name and when will it become public?

it's a clang diagnostic implementing C and C++ rules for reserved names under the flags: -Wreserved, -Wreserved-macros, -Wreserved-identifiers. It'll be ready when it's ready, and you don't need it because the problem is right there in plain sight.

This isn't about silencing a tool but about doing the obviously correct thing, which is to either (1) conditionalize your function definitions or (2) rename them so they can be legally defined outside of sanitizer builds.

Alp.





--kcc




On Fri, Jan 10, 2014 at 12:11 PM, Kostya Serebryany <[email protected] <mailto:[email protected]>> wrote:

    Done at r198922.


    On Fri, Jan 10, 2014 at 8:47 AM, Kostya Serebryany <[email protected]
    <mailto:[email protected]>> wrote:

        Let me rename __lsan_is_turned_off to
        LeakSanitizerIsTurnedOffForTheCurrentProcess and recommit the
        patch with the new naming.
        __lsan_is_turned_off will remain there as deprecated, so that
        we don't break the existing uses. Will do this later today.
        I don't buy Alp's reasons for revert, but the problem is just
        not worth such a lengthy discussion.
        Let's do something useful instead :)

        --kcc


        On Fri, Jan 10, 2014 at 1:24 AM, Alp Toker <[email protected]
        <mailto:[email protected]>> wrote:


            On 09/01/2014 21:12, Sean Silva wrote:




                On Thu, Jan 9, 2014 at 12:49 PM, Alp Toker
                <[email protected] <mailto:[email protected]>
                <mailto:[email protected] <mailto:[email protected]>>> wrote:

                    OK, sorry I only just got round to this. Backed
                out in r198884 and
                    r198885.

                    In principle it's OK to re-land this with the
                ifdef Jordan
                    suggested, but I think it'd be more sustainable to
                try using
                    non-reserved identifiers for the library part of
                the sanitizer
                    interface if you have time to look into that.


                I'm not sure if you're aware of this Alp, but using a
                reserved "weak" symbol is a classic and widely used
                way to expose "hooks" into runtime
                libraries/instrumentation.

                
http://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html
                http://msdn.microsoft.com/en-us/library/c63a9b7h.aspx
                __libc_malloc_dispatch in bionic libc

                The alternative is usually a function called by the
                user code in main() or whatever, which takes a
                function pointer.


            Yes, we discussed this earlier.

            That use case falls under permissible undefined behaviour,
            if and only if the implementation (of which the C library
            is a part) permits it.

            Not just bionic libc but glibc and others system
            components including the sanitizer control function in
            discussion have weak symbols, often prefixed "__" like
            this exposed as hooks which user code is free to define
            and override.

            What's not OK is to define the reserved names when a
            _different_ C library or implementation is in use, and the
            wording in the standard makes it clear the intention is to
            prevent implementations stepping on one another.

            Alp.




                -- Sean Silva


                    Cheers
                    Alp.



                    On 09/01/2014 17:52, Jordan Rose wrote:


                        On Jan 9, 2014, at 9:42 , Alp Toker
                <[email protected] <mailto:[email protected]>
                        <mailto:[email protected]
                <mailto:[email protected]>> <mailto:[email protected]
                <mailto:[email protected]>

                        <mailto:[email protected]
                <mailto:[email protected]>>>> wrote:


                            On 09/01/2014 17:30, Jordan Rose wrote:


                                On Jan 9, 2014, at 6:57 , Alp Toker
                <[email protected] <mailto:[email protected]>
                                <mailto:[email protected]
                <mailto:[email protected]>> <mailto:[email protected]
                <mailto:[email protected]>
                                <mailto:[email protected]
                <mailto:[email protected]>>><mailto:[email protected]
                <mailto:[email protected]>

                                <mailto:[email protected]
                <mailto:[email protected]>>>> wrote:

                                    I'm not making an assertion. The
                standard is very
                                    clear on this:


                                       *17.6.4.3 Reserved names
                [reserved.names]*

                                       If a program declares or
                defines a name in a
                                    context where it is
                                       reserved, other than as
                explicitly allowed by
                                    this Clause, its
                                       *behavior is undefined*.

                                       *17.6.4.3.2 Global names
                [global.names]*

                                       Certain sets of names and
                function signatures
                                    are always reserved
                                       to the implementation:

                                         * *Each name that contains a
                double
                                    underscore __*or begins
                                           with an underscore followed
                by an uppercase
                                    letter (2.12) *is
                 reserved to the implementation for any use*.
                                         * Each name that begins with
                an underscore is
                                    reserved to the
                 implementation for use as a name in the
                                    global namespace.



                                I know I shouldn't be getting into
                this, but...

                                   *1.3.24 undefined behavior
                [defns.undefined]*
                                   behavior for which this
                International Standard
                                imposes no requirements
                                   /[ Note: Undefined behavior may be
                expected when this
                                   International Standard omits any
                explicit
                                definition of behavior
                                   or when a program uses an erroneous
                construct or
                                erroneous data.
                                   *Permissible undefined behavior*
                ranges from
                                ignoring the
                                   situation completely with
                unpredictable results,
                                *to behaving
                                   during translation or program
                execution in a
                                documented manner
                                   characteristic of the environment*
                (with or without
                                the issuance
                                   of a diagnostic message), to
                terminating a
                                translation or
                                   execution (with the issuance of a
                diagnostic
                                message). Many
                                   erroneous program constructs do not
                engender undefined
                                   behavior; they are required to be
                diagnosed. — end
                                note ]/


                                (emphasis mine)

                                As I read this, a valid interpretation
                of this program
                                is that when LEAK_SANITIZER is
                defined, the program
                                contains undefined behavior, and
                therefore it should
                                only be set in a context when the
                particular
                                implementation is known to do
                something sensible for
                                this particular undefined behavior
                (that is, use the
                                function at runtime to disable leak
                checking).

                                I don't see this as abstractly
                different from the
                                standard-specified practice of
                replacing the global
                                operator new, so I don't think it's
                inherently an
                                anti-pattern. I think everyone agrees
                on this since
                                you've said already you'd have no
                objections if the
                                name weren't one of the restricted
                [global.names] names.

                                Would it help if the function were
                pre-declared in a
                                system header, and then just
                implemented or not
                                implemented in user code?


                            Hi Jordan,

                            That's the current situation -- as long as
                sanitizer
                            headers are available and in use the part
                of the spec you
                            highlighted means it's acceptable.

                            The problem arises when sanitizer headers
                aren't available.


                        I just don't think the program is illegal when
                LEAK_SANITIZER
                        is not defined. The tokens within the #ifdef
                are skipped
                        completely, so they don't refer to names and
                certainly don't
                        declare anything.

                        I'm not sure we should care about the case where
                        LEAK_SANITIZER is defined in an environment
                that doesn't
                        specify what defining this particular name
                will do. The user
                        has full control over this. (And in fact, IIRC
                being able to
                        define macros on the command line isn't at all
                specified by
                        the standard, so the program by itself will
                /always/ skip the
                        LEAK_SANITIZER block.)

                        Jordan


                    -- http://www.nuanti.com
                    the browser experts

                    _______________________________________________
                    cfe-commits mailing list
                [email protected]
                <mailto:[email protected]>
                <mailto:[email protected]
                <mailto:[email protected]>>
                http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits



-- http://www.nuanti.com
            the browser experts





--
http://www.nuanti.com
the browser experts

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to