Thanks. That’s true, but there are cases (like making BLAS calls) where I have to nest more than four `withUnsafeMutable…` closures. It’s safe but really clumsy. I just wish there were a cleaner way, something like do-notation in Haskell or for-comprehensions in Scala.
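To illustrate the nesting, and one possible way to flatten it, here is a minimal sketch. Both `withUnsafeMutablePointers` and `addInPlace` are hypothetical names made up for this example; neither is part of the standard library or of any BLAS binding. The helper only wraps the nested standard `withUnsafeMutablePointer(to:_:)` calls, so the pointers are still valid only inside the closure.

```swift
// Hypothetical helper, not a standard library API: scopes two pointers with
// a single closure by nesting the standard calls internally.
func withUnsafeMutablePointers<A, B, Result>(
    _ a: inout A, _ b: inout B,
    _ body: (UnsafeMutablePointer<A>, UnsafeMutablePointer<B>) throws -> Result
) rethrows -> Result {
    return try withUnsafeMutablePointer(to: &a) { pa in
        try withUnsafeMutablePointer(to: &b) { pb in
            try body(pa, pb)
        }
    }
}

// Hypothetical stand-in for a C routine that takes two pointers.
func addInPlace(_ dst: UnsafeMutablePointer<Double>,
                _ src: UnsafeMutablePointer<Double>) {
    dst.pointee += src.pointee
}

var x = 1.5
var y = 2.0

// Nested form: one closure level per pointer.
withUnsafeMutablePointer(to: &x) { px in
    withUnsafeMutablePointer(to: &y) { py in
        addInPlace(px, py)
    }
}

// Flattened form: the same call with a single closure level.
withUnsafeMutablePointers(&x, &y) { px, py in
    addInPlace(px, py)
}
```

This only hides the nesting for a fixed number of arguments; a separate overload would be needed for each arity, which is part of why it still falls short of something like do-notation.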
-Richard

> On Dec 16, 2016, at 14:16, Joe Groff <jgr...@apple.com> wrote:
>
>>
>> On Dec 16, 2016, at 12:10 PM, Richard Wei <rxr...@gmail.com> wrote:
>>
>> `sync` is not escaping. Shadow copy causes the device memory to make a copy,
>> which can’t be a solution either. I’ll file a radar.
>>
>>> Note that, independently, this part looks fishy:
>>>
>>>> try! fill<<<(blockSize, blockCount)>>>[
>>>> .pointer(to: &self)
>>>
>>> UnsafePointers formed by passing an argument inout are not valid after the
>>> called function returns, so if this function is forming and returning a
>>> pointer, it will likely result in undefined behavior. You should use
>>> `withUnsafePointer(&self) { p in ... }` to get a usable pointer.
>>
>> This part is a semi-"type safe” wrapper to CUDA kernel launcher. The purpose
>> is to make explicit which argument gets passed to the kernel as a pointer;
>> in this case, self is a `DeviceArray`. `.pointer` is a factory method under
>> `KernelArgument`. Since most arguments to the kernel are passed in by
>> pointers, IMO using a bunch of `withUnsafe...` clauses would only make it
>> look unnecessarily clumsy.
>
> Unfortunately, that's the only defined way to write this code. The pointer
> you get as an argument from `&self` is only good until pointer(to:) returns,
> so it won't be guaranteed to be valid by the time you use it, and the
> compiler will assume that `self` doesn't change afterward, so any mutation
> done to `self` through that pointer will lead to undefined behavior.
> Rewriting this code to use withUnsafePointer should also work around the
> capture bug without requiring a shadow copy:
>
>     let blockSize = min(512, count)
>     let blockCount = (count+blockSize-1)/blockSize
>     withUnsafePointer(to: &self) { selfPtr in
>       device.sync { // Launch CUDA kernel
>         try! fill<<<(blockSize, blockCount)>>>[
>           .pointer(to: selfPtr), .value(value), .value(Int64(count))
>         ]
>       }
>     }
>
> -Joe
_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users