Re: PATCH: fix clang to emit correct addrspacecast for CUDA

Justin Holewinski Mon, 24 Mar 2014 11:50:23 -0700

I don't have anything against making this a target-independent IR-level pass, as long as no one complains about it being a core pass. Perhaps the pass could only execute if a target explicitly enables a flag, something like "preferNonGenericPointers". The default could be 'false', and the pass would only modify the IR if the target sets it to 'true'. Of course, this also assumes address space 0 is generic. This is currently true for the in-tree targets and CUDA/OpenCL support in Clang, but I don't believe it's a set rule anywhere.
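
To make the flag idea concrete, here is a rough sketch written as a toy C++ model rather than real LLVM code. Everything in it is hypothetical: TargetHooks, Access, and rewriteToSpecificAS are made-up names for illustration, and only "preferNonGenericPointers" comes from the suggestion above. It also hard-codes the assumption that address space 0 is generic.

```cpp
#include <vector>

// Assumption: address space 0 is the generic address space.
constexpr unsigned kGenericAS = 0;

// Hypothetical per-target hook; not an existing LLVM interface.
struct TargetHooks {
  // Off by default, so targets that do not opt in see no IR changes.
  bool preferNonGenericPointers = false;
};

// Toy stand-in for a load/store whose pointer was cast from another
// address space before the access.
struct Access {
  unsigned accessAS;  // address space the load/store currently uses
  unsigned sourceAS;  // address space of the pointer before the cast
};

// Rewrites generic accesses to use the specific address space of the
// underlying pointer. Returns the number of accesses rewritten.
unsigned rewriteToSpecificAS(std::vector<Access> &accesses,
                             const TargetHooks &target) {
  // The pass is a no-op unless the target explicitly enables the flag.
  if (!target.preferNonGenericPointers)
    return 0;

  unsigned changed = 0;
  for (Access &a : accesses) {
    // Only touch accesses that go through the generic space but whose
    // pointer is known to come from a non-generic space.
    if (a.accessAS == kGenericAS && a.sourceAS != kGenericAS) {
      a.accessAS = a.sourceAS;
      ++changed;
    }
  }
  return changed;
}
```

With the flag left at its default the function leaves every access alone. With it set, only accesses that reach a non-generic pointer through the generic space are rewritten.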


On 03/24/2014 02:28 PM, Jingyue Wu wrote:

I agree with your concern. However, both CUDA and OpenCL (the two most popular users of addrspacecast, I believe) support a generic address space and could benefit from this optimization. Would we end up with duplicated code (at least one copy for CUDA and one for OpenCL) if we put it in the back-end?


Jingyue

On Mon, Mar 24, 2014 at 11:22 AM, Justin Holewinski <[email protected]> wrote:


    The hard part would be making this optimization general enough to
    be target-independent.  Optimizing to non-zero address spaces may
    not make sense for all targets (or even all future versions of
    PTX).  I agree that there should be an IR-level optimization for
    this, but perhaps it's too target-specific and should actually live
    in the back-end.


    On 03/24/2014 01:05 PM, Jingyue Wu wrote:

    Right. We are aware of this issue, and think it should be
    addressed in the IR optimizer (similar to InstCombineLoadCast and
    InstCombineStoreToCast) instead of clang. Do you think this is an
    appropriate approach? Is this optimization general enough to stay
    in the IR optimizer or target-dependent?

    Jingyue


    On Mon, Mar 24, 2014 at 4:54 AM, Justin Holewinski
    <[email protected]> wrote:

        Hi Jingyue,

        I committed the addrspacecast isel patterns to NVPTX.  Also,
        I wanted to point out that your changes in the last test case
        in this patch (address-spaces.cu)
        represent changes that may lead to performance degradation.
        Specific address spaces should be used whenever possible for
        loads/stores.  Casting everything to a generic address is
        still correct, but may lead to additional indirections for
        the hardware.


        On Fri, Mar 21, 2014 at 2:25 PM, Justin Holewinski
        <[email protected]> wrote:

            addrspacecast support in NVPTX is on my todo list.  I'll
            try to put something together in the next few days.


            On 3/21/14, 2:20 PM, Jingyue Wu wrote:

            Hi,

            Static local variables in CUDA can be declared with
            address space qualifiers, such as __shared__. Therefore,
            the codegen needs to potentially addrspacecast a static
            local variable to the type expected by its declaration.
            Peter did something similar for global variables in
            r157167.

            All clang tests passed.

            Justin: The NVPTX backend support for addrspacecast does
            not seem complete. We can send you follow-up patches
            once this one gets in.

            Jingyue

--

            Thanks,

            Justin Holewinski

            

--

        Thanks,

        Justin Holewinski

