jfb added a comment. In D79279#2235085 <https://reviews.llvm.org/D79279#2235085>, @rsmith wrote:
> Thanks, I'm happy with this approach. > > If I understand correctly, the primary (perhaps sole) purpose of > `__builtin_memcpy_sized` is to provide a primitive from which an atomic > operation can be produced. That being the case, I wonder if the name is > emphasizing the wrong thing, and a name that contains `atomic` would be more > obvious. As a concrete alternative, `__atomic_unordered_memcpy` is not much > longer and seems to have the right kinds of implications. WDYT? Kinda, that's the motivating from Hans' paper which I'm following. One other use case (and the reason I assume Azul folks want it too) is when there's a GC that looks at objects. With this it knows it won't see torn objects when the GC is concurrent. It's similar, but generally `atomic` also implies an ordering, and here it's explicitly unordered (i.e. Bring Your Own Fences). So I don't have a strong opinion on the name, but `atomic` seem slightly wrong. Follow-ups I suggest in comments: - Make "must be trivially copyable" a warning throughout (not just here, but for atomics too). - Implement that diagnostic for `mem*` functions. ================ Comment at: clang/docs/LanguageExtensions.rst:2403-2406 +Constant evaluation support is only provided when the source and destination are +pointers to arrays with the same trivially copyable element type, and the given +size is an exact multiple of the element size that is no greater than the number +of elements accessible through the source and destination operands. ---------------- rsmith wrote: > Is this true in general, or only for `[w]mem{cpy,move}`? I thought for the > other cases, we required an array of `char` / `wchar_t`? This is just moving documentation that was below. Phab is confused with the diff. ================ Comment at: clang/docs/LanguageExtensions.rst:2409 +Support for constant expression evaluation for the above builtins can be detected +with ``__has_feature(cxx_constexpr_string_builtins)``. + ---------------- rsmith wrote: > I think this only applies to the above list minus the five functions you > added to it. Given this and the previous comment, I'm not sure that merging > the documentation on string builtins and memory builtins is working out well > -- they seem to have more differences than similarities. > > (`memset` is an outlier here that should be called out -- we don't seem to > provide any constant evaluation support for it whatsoever.) [w]memset are indeed the odd ones, update says so. ================ Comment at: clang/lib/AST/ExprConstant.cpp:8870 + return false; + if (N.urem(ElSz.getLimitedValue()) != 0) { + Info.FFDiag(E, diag::note_constexpr_mem_sized_bad_size) ---------------- rsmith wrote: > `getLimitedValue()` here seems unnecessary; `urem` can take an `APInt`. Their bitwidth doesn't always match, and that asserts out. ================ Comment at: clang/lib/AST/ExprConstant.cpp:8872 + Info.FFDiag(E, diag::note_constexpr_mem_sized_bad_size) + << (int)N.getLimitedValue() << (int)ElSz.getLimitedValue(); + return false; ---------------- rsmith wrote: > Consider using `toString` instead of truncating each `APSInt` to `uint64_t` > then to `int`. The size might reliably fit into `uint64_t`, but I don't think > we can assume that `int` is large enough. OK I updated 2 other places as well. ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:635 + return ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType(); + return ArgTy->castAs<clang::PointerType>()->getPointeeType(); +} ---------------- rsmith wrote: > I'm not sure this `castAs` is necessarily correct. If the operand is C++11 > `nullptr`, we could perform a null-to-pointer implicit conversion, and > `ArgTy` could be `NullPtrTy` after stripping that back off here. > > It seems like maybe what we want to do is strip off implicit conversions > until we hit a non-pointer type, and take the pointee type we found > immediately before that? Ah good catch! The new functions I'm adding just disallow nullptr, but the older ones allow it. I've modified the code accordingly and added a test in CodeGen for nullptr. ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:2639-2640 case Builtin::BI__builtin_mempcpy: { + QualType DestTy = getPtrArgType(CGM, E, 0); + QualType SrcTy = getPtrArgType(CGM, E, 1); Address Dest = EmitPointerWithAlignment(E->getArg(0)); ---------------- rsmith wrote: > jfb wrote: > > rsmith wrote: > > > jfb wrote: > > > > rsmith wrote: > > > > > Looking through implicit conversions in `getPtrArgType` here will > > > > > change the code we generate for cases like: > > > > > > > > > > ``` > > > > > void f(volatile void *p, volatile void *q) { > > > > > memcpy(p, q, 4); > > > > > } > > > > > ``` > > > > > > > > > > ... (in C, where we permit such implicit conversions) to use a > > > > > volatile memcpy intrinsic. Is that an intentional change? > > > > I'm confused... what's the difference that this makes for the > > > > pre-existing builtins? My intent was to get the `QualType` > > > > unconditionally, but I can conditionalize it if needed... However this > > > > ought to make no difference: > > > > ``` > > > > static QualType getPtrArgType(CodeGenModule &CGM, const CallExpr *E, > > > > unsigned ArgNo) { > > > > QualType ArgTy = E->getArg(ArgNo)->IgnoreImpCasts()->getType(); > > > > if (ArgTy->isArrayType()) > > > > return CGM.getContext().getAsArrayType(ArgTy)->getElementType(); > > > > if (ArgTy->isObjCObjectPointerType()) > > > > return > > > > ArgTy->castAs<clang::ObjCObjectPointerType>()->getPointeeType(); > > > > return ArgTy->castAs<clang::PointerType>()->getPointeeType(); > > > > } > > > > ``` > > > > and indeed I can't see the example you provided change in IR from one > > > > to the other. The issue I'm working around is that getting it > > > > unconditionally would make ObjC code sad when `id` is passed in as I > > > > outlined above. > > > The example I gave should produce a non-volatile memcpy, and used to do > > > so (we passed `false` as the fourth parameter to `CreateMemCpy`). With > > > this patch, `getPtrArgType`will strip off the implicit conversion from > > > `volatile void*` to `void*` in the argument type, so `isVolatile` below > > > will be `true`, so I think it will now create a volatile memcpy for the > > > same testcase. If that's not what's happening, then I'd like to > > > understand why not :) > > > > > > I'm not saying that's necessarily a bad change, but it is a change, and > > > it's one we should make intentionally if we make it at all. > > Oh yes, sorry I thought you were talking about something that > > `getPtrArgType` did implicitly! Indeed the C code behaves differently in > > that it doesn't just strip `volatile` anymore. > > > > I'm not super thrilled by the default C behavior, and I think this new > > behavior removes a gotcha, and is in fact what I was going for in the first > > iteration of the patch. Now that I've separated the builtin I agree that > > it's a bit odd... but it seems like the right thing to do anyways? But it > > no longer matches the C library function to do so. > > > > FWIW, this currently "works as you'd expect": > > ``` > > void f(__attribute__((address_space(32))) void *dst, const void *src, int > > sz) { > > __builtin_memcpy(dst, src, sz); > > } > > ``` > > https://godbolt.org/z/dcWxcK > > > > and I think that's completely accidental because the C library function > > doesn't (and, as John pointed out earlier, the builtin is meant to act like > > the C function). > > FWIW, this currently "works as you'd expect": > > > > ``` > > void f(__attribute__((address_space(32))) void *dst, const void *src, int > > sz) { > > __builtin_memcpy(dst, src, sz); > > } > > ``` > > The same is true even if you remove the `__builtin_` (and add a suitable > include), and that seems like a bug to me. It looks like we have special > logic that treats *all* builtins taking pointers as being overloaded on > address space, which seems wrong at least for lib builtins. C TR 18037:2008 > is quite explicit about this. Section 5.1.4 says: > > """ > The standard C library (ISO/IEC 9899:1999 clause 7 - Libraries) is unchanged; > the library's > functions and objects continue to be declared only with regard to the generic > address space. One > consequence is that pointers into named address spaces cannot be passed as > arguments to library > functions except in the special case that the named address spaces are > subsets of the generic > address space. > """ > > We could retain that special rule for `__builtin_`-spelled variants of lib > builtins. If we do, then maybe we shouldn't be adding > `__builtin_memcpy_overloaded` at all and should only extend the behavior of > `__builtin_memcpy` to also propagate volatility (and add a new builtin for > the atomic case). > > Regarding volatile, consider: > > ``` > void maybe_volatile_memcpy(volatile void *dst, const volatile void *src, int > sz, _Bool is_volatile) { > if (is_volatile) { > #ifdef __clang__ > __builtin_memcpy_overloaded(dst, src, sz); > #elif __GNUC__ > // ... > #else > // volatile char copy loop > #endif > } > memcpy(dst, src, sz); > } > ``` > > With this patch, the above code will always perform a volatile `memcpy`. I > think that's a surprise. A call to `memcpy` should follow the C semantics, > even if we choose to change the semantics of `__builtin_memcpy`. Yes, I believe that this is a pre-existing inconsistency with the non-`__builtin_` variants. ================ Comment at: clang/lib/Sema/SemaChecking.cpp:5609 + return ExprError(Diag(TheCall->getBeginLoc(), + PDiag(diag::err_atomic_op_needs_trivial_copy)) + << DstValTy << DstOp->getSourceRange()); ---------------- rsmith wrote: > Generally, I'm a little uncomfortable about producing an error if a type is > complete but allowing the construct if the type is incomplete -- that seems > like a situation where a warning would be more appropriate to me. It's > surprising and largely unprecedented that providing more information about a > type would change the program from valid to invalid. > > Do we really need the protection of an error here rather than an > enabled-by-default warning? Moreover, don't we already have a warning for > `memcpy` of a non-trivially-copyable object somewhere? If not, then I think > we should add such a thing that also applies to the real `memcpy`, rather > than only warning on the builtin. That rationale makes sense, but it's pre-existing behavior for atomic. I can change all of them in a follow-up if that's OK? We don't have such a check for other builtins. I can do a second follow-up to then adopt these warnings for them too? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79279/new/ https://reviews.llvm.org/D79279 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits