Since we cannot post issues right now (reported here: https://forum.gitlab.com/t/creating-new-issue-gives-cannot-create-issue-getting-whoops-something-went-wrong-on-our-end/41966?u=bsmith), here is my issue so I don't forget it.
I think the pattern err = WaitForCUDA();CHKERRCUDA(err); ierr = PetscLogGpuTimeEnd();CHKERRQ(ierr); should be changed so that the WaitForCUDA() (actually WaitForDevice()) call lives inside PetscLogGpuTimeEnd(). Currently the WaitForCUDA() is missing in a few places, which results in bad timings. Also, some _SeqCUDA() routines don't call PetscLogGpuTimeEnd() at all and need to be fixed. The current model is a maintenance nightmare.

Does anyone see a problem with making this change?

Thanks,
Barry
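
A rough sketch of the shape of the proposed change, not actual PETSc code. Every name below (MockWaitForDevice(), MockLogGpuTimeBegin(), MockLogGpuTimeEnd(), MockKernel_SeqCUDA()) is a made-up stand-in, and the device wait is simulated with a counter. The only point is that once the "End" routine does the wait itself, callers can no longer forget it.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration only; NOT the real PETSc API. */
typedef int ErrorCode;

static int sync_calls    = 0; /* number of simulated device synchronizations */
static int timer_running = 0; /* 1 between the Begin and End calls */

/* Simulates waiting for all outstanding device work to finish. */
static ErrorCode MockWaitForDevice(void)
{
  sync_calls++;
  return 0;
}

static ErrorCode MockLogGpuTimeBegin(void)
{
  timer_running = 1;
  return 0;
}

/* Proposed shape: End waits for the device itself before stopping the
   timer, so no caller has to remember a separate wait call. */
static ErrorCode MockLogGpuTimeEnd(void)
{
  ErrorCode err;

  assert(timer_running); /* End without a matching Begin is a caller bug */
  err = MockWaitForDevice();
  if (err) return err;
  timer_running = 0;
  return 0;
}

/* A caller in the style of a _SeqCUDA() routine: only Begin/End appear. */
static ErrorCode MockKernel_SeqCUDA(void)
{
  ErrorCode err;

  err = MockLogGpuTimeBegin();
  if (err) return err;
  /* ... an asynchronous kernel launch would go here ... */
  err = MockLogGpuTimeEnd();
  if (err) return err;
  return 0;
}
```

With this layout, each _SeqCUDA()-style routine needs only the Begin/End pair, and the wait happens exactly once per timed region.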
