JonChesterfield added inline comments.
================ Comment at: clang/test/OpenMP/barrier_codegen.cpp:22 +// CLANGCG-NOT: readonly +// IRBUILDER: ; Function Attrs: nofree nosync nounwind readonly +// IRBUILDER-NEXT: declare i32 @__kmpc_global_thread_num(%struct.ident_t*) ---------------- ABataev wrote: > jdoerfert wrote: > > ABataev wrote: > > > jdoerfert wrote: > > > > ABataev wrote: > > > > > jdoerfert wrote: > > > > > > ABataev wrote: > > > > > > > jdoerfert wrote: > > > > > > > > ABataev wrote: > > > > > > > > > Not sure about correct use of `nosync` and `readonly` > > > > > > > > > attributes. OpenMP runtime uses lazy initialization of the > > > > > > > > > runtime library and when any runtime function is called, the > > > > > > > > > inner parts of the OpenMP runtime are initialized > > > > > > > > > automatically. It may use some sync primitives and may modify > > > > > > > > > memory, I assume. Same about `nofree`. > > > > > > > > There are two versions of these functions, host and device. I > > > > > > > > assume host functions are not inlined but device functions > > > > > > > > might be. This is basically all the modes we support right now. > > > > > > > > > > > > > > > > If we do not inline the function (host) we don't necessarily > > > > > > > > care what they do but what effect the user can expect. > > > > > > > > The user can not expect to synchronize through > > > > > > > > `__kmpc_global_thread_num` calls in a defined way, thus > > > > > > > > `nosync`. > > > > > > > > Similarly, from the users perspective there is no way to > > > > > > > > determine if something was written or freed, no matter how many > > > > > > > > of these calls I issue and under which control conditions, all > > > > > > > > I see is the number as a result. Thus, `readonly` and `nofree`. > > > > > > > > I believe `readnone` is even fine here but it might not work > > > > > > > > for the device version (see below) so I removed it. > > > > > > > > > > > > > > > > If we do inline the function (device) we need to make sure the > > > > > > > > attributes are compatible with the inlined code to not cause > > > > > > > > UB. The code of `__kmpc_global_thread_num` at least does not > > > > > > > > write anything and only reads stuff (as far as I can tell). > > > > > > > > > > > > > > > > Please correct me if I overlooked something. > > > > > > > But someone may try to inline the host-based runtime or try to > > > > > > > use LTO with it. > > > > > > > The question is not about the user expectations but about the > > > > > > > optimizations which can be triggered with these attributes. > > > > > > This is our runtime and we have supported and unsupported usage > > > > > > models. > > > > > Hmm, I don't think this the right approach. Plus, you still did not > > > > > answer about optimizations. Maybe, currently, these attributes won't > > > > > trigger dangerous optimizations but they can do this in the future > > > > > and it may lead to unpredictable results. I would use the pessimistic > > > > > model here rather than over-optimistic. > > > > I did (try to) describe why there cannot be any problems wrt. > > > > optimizations: > > > > The specified behavior of the runtime call is _as if_ it is `readonly`, > > > > `nofree`, and `nosync`. > > > > That is, from the perspective of the compiler this is true and > > > > optimizations are allowed to use that fact. > > > > > > > > The fact that the first ever runtime call initializes the runtime is > > > > neither part of the specification nor of the observable behavior. If we > > > > change the order between two call to `__kmpc_global_thread_num`, or > > > > similar calls, we cannot observe if/which one initialized the runtime > > > > and which read only stuff. > > > Here is the code of this function from the libomp: > > > ``` > > > int gtid; > > > > > > if (!__kmp_init_serial) { > > > gtid = KMP_GTID_DNE; > > > } else > > > #ifdef KMP_TDATA_GTID > > > if (TCR_4(__kmp_gtid_mode) >= 3) { > > > KA_TRACE(1000, ("*** __kmp_get_global_thread_id_reg: using TDATA\n")); > > > gtid = __kmp_gtid; > > > } else > > > #endif > > > if (TCR_4(__kmp_gtid_mode) >= 2) { > > > KA_TRACE(1000, ("*** __kmp_get_global_thread_id_reg: using keyed > > > TLS\n")); > > > gtid = __kmp_gtid_get_specific(); > > > } else { > > > KA_TRACE(1000, > > > ("*** __kmp_get_global_thread_id_reg: using internal > > > alg.\n")); > > > gtid = __kmp_get_global_thread_id(); > > > } > > > > > > /* we must be a new uber master sibling thread */ > > > if (gtid == KMP_GTID_DNE) { > > > KA_TRACE(10, > > > ("__kmp_get_global_thread_id_reg: Encountered new root > > > thread. " > > > "Registering a new gtid.\n")); > > > __kmp_acquire_bootstrap_lock(&__kmp_initz_lock); > > > if (!__kmp_init_serial) { > > > __kmp_do_serial_initialize(); > > > gtid = __kmp_gtid_get_specific(); > > > } else { > > > gtid = __kmp_register_root(FALSE); > > > } > > > __kmp_release_bootstrap_lock(&__kmp_initz_lock); > > > /*__kmp_printf( "+++ %d\n", gtid ); */ /* GROO */ > > > } > > > > > > KMP_DEBUG_ASSERT(gtid >= 0); > > > > > > return gtid; > > > > > > ``` > > > > > > As you can see, it is very complex. It has locks, in the debug mode it > > > writes something, etc. It is not only about the initialization, there are > > > some other calls (like `__kmp_get_global_thread_id()`, which really > > > writes something). Do you really think it is safe to mark this as > > > readonly and nosync? > > > Do you really think it is safe to mark this as readonly and nosync? > > > > Yes. I am arguing why in the last few comments. > > > > The central question is not what the function does but how it behaves (as > > long as we do not inline it, then what it does leaks). > I still don't think the right way to do it. Instead I would suggest to think > about platform dependent attributes. Say, in general we use some pessimistic > set of attributes but for some platforms (like NVPTX or AMDGCN) we can define > more optimistic set of attributes. Properties that are true globally, in the as-if sense, but violated locally, if inlined, are a seriously complicated proposal. I don't want to block this patch on a satisfactory resolution to that so suggest we go with conservative. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69922/new/ https://reviews.llvm.org/D69922 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits