Matt,

So what's an example of "doing a bunch of iterations to make sending the initial data down worth it"? Is there a correlation between that and arithmetic intensity, where an application is likely to be more compute-bound than memory-bandwidth bound?

Thanks,
Justin

On Thu, Mar 10, 2016 at 2:50 PM, Matthew Knepley <[email protected]> wrote:

> On Thu, Mar 10, 2016 at 12:29 PM, Justin Chang <[email protected]> wrote:
>
>> Hi all,
>>
>> When would I ever use GPU computing for a finite element simulation where
>> the limiting factor of performance is the memory bandwidth bound? Say I
>> want to run problems similar to SNES ex12 and 62. I understand that there
>> is an additional bandwidth associated with offloading data from the CPU to
>> GPU but is there more to it? I recall reading through some email threads
>> about GPU's potentially giving you a speed up of 3x that on a CPU but the
>> gain in performance may not be worth the increase in time moving data
>> around.
>
> The main use case is if you are being forced to use a machine which has
> GPUs. Then you can indeed get some benefit
> from the larger bandwidth. You need a problem where you are doing a bunch
> of iterations to make sending the initial data
> down worth it.
>
> It would certainly be better if you are computing the action of your
> operator directly on the GPU, but that is much more
> disruptive to the code right now.
>
> Matt
>
>> Thanks,
>> Justin
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
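
To make the "bunch of iterations" point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes every iteration is purely bandwidth-bound and moves the same number of bytes, and that the initial data is sent to the GPU once. The function names and all hardware numbers are hypothetical placeholders, not measurements from any real machine or anything PETSc reports.

```python
import math


def break_even_iterations(data_bytes, traffic_bytes_per_iter,
                          bw_cpu, bw_gpu, bw_link):
    """Smallest iteration count n for which
    transfer + n * t_gpu < n * t_cpu, or None if the GPU never wins.
    All bandwidths are in bytes per second (assumed, not measured)."""
    t_transfer = data_bytes / bw_link          # one-time host-to-device copy
    t_cpu = traffic_bytes_per_iter / bw_cpu    # per-iteration time on the CPU
    t_gpu = traffic_bytes_per_iter / bw_gpu    # per-iteration time on the GPU
    if t_gpu >= t_cpu:
        return None                            # no per-iteration gain to amortize
    return math.floor(t_transfer / (t_cpu - t_gpu)) + 1


def is_bandwidth_bound(flops_per_byte, peak_flops, bandwidth):
    """Roofline-style check: a kernel is bandwidth-bound when its arithmetic
    intensity is below the machine balance peak_flops / bandwidth."""
    return flops_per_byte < peak_flops / bandwidth


if __name__ == "__main__":
    # Hypothetical numbers: 1 GB of data, 1 GB of memory traffic per iteration,
    # 50 GB/s CPU memory, 200 GB/s GPU memory, 10 GB/s host-device link.
    n = break_even_iterations(1e9, 1e9, 50e9, 200e9, 10e9)
    print("break-even iterations:", n)
    # Hypothetical kernel at 0.25 flop/byte on a 1 Tflop/s, 200 GB/s device.
    print("bandwidth-bound:", is_bandwidth_bound(0.25, 1e12, 200e9))
```

Under this model the break-even count depends only on the bandwidth ratios, so arithmetic intensity enters mainly by deciding whether the per-iteration times are set by bandwidth at all; a compute-bound kernel would need peak flop rates in place of the bandwidths.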
