Hey Matt,

Do you have any guidance or ideas regarding how large the subdomains should be to offset the cost of this copy?
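In case it helps frame the question, below is a minimal sketch of how one could time the assembly stage with a registered PETSc log event, so its cost shows up in -log_summary next to the GPU copy events, and then rerun at a few system sizes. The event name "MyAssembly" and the 1-D Laplacian fill are purely illustrative, not the actual application code:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscInt       i, rstart, rend, ncols, col[3];
  PetscInt       n = 100000;            /* vary this to study overhead vs. system size */
  PetscScalar    v[3];
  PetscLogEvent  ASSEMBLY_EVENT;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, (char *)0, NULL);CHKERRQ(ierr);
  ierr = PetscLogEventRegister("MyAssembly", MAT_CLASSID, &ASSEMBLY_EVENT);CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);   /* e.g. -mat_type aijcusp for the GPU case */
  ierr = MatSeqAIJSetPreallocation(A, 3, NULL);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A, 3, NULL, 2, NULL);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);

  /* Time the matrix fill and assembly under the registered event */
  ierr = PetscLogEventBegin(ASSEMBLY_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {            /* simple 1-D Laplacian rows */
    ncols = 0;
    if (i > 0)     { col[ncols] = i - 1; v[ncols] = -1.0; ncols++; }
    col[ncols] = i; v[ncols] = 2.0; ncols++;
    if (i < n - 1) { col[ncols] = i + 1; v[ncols] = -1.0; ncols++; }
    ierr = MatSetValues(A, 1, &i, ncols, col, v, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscLogEventEnd(ASSEMBLY_EVENT, 0, 0, 0, 0);CHKERRQ(ierr);

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

Running this with -log_summary (and, say, -mat_type aijcusp to use the GPU) should report the "MyAssembly" event alongside the copy events Matt mentions below, so the crossover size can be read off directly.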
Cheers,
Dave

On 13 March 2012 15:03, Matthew Knepley <knepley at gmail.com> wrote:
> On Tue, Mar 13, 2012 at 8:59 AM, Xiangze Zeng <zengshixiangze at 163.com> wrote:
>>
>> Hi, Jed.
>> At the beginning and end of the code for setting the matrix values, I add "printf" and compute the time of this period. It is much longer than when I don't use the GPU. I just guess the time is used for copying data. My PCTYPE is sor, and 2000 iterations. Do you have any suggestion about this?
>
> 1) You do not have to guess. Use -log_summary; there are explicit events for copying to the GPU.
>
> 2) GPUs only really become effective for large systems due to this overhead. I suggest looking at the performance and overhead as a function of system size.
>
>    Matt
>
>> Zeng
>>
>> At 2012-03-13 20:12:09, "Jed Brown" <jedbrown at mcs.anl.gov> wrote:
>>
>> 2012/3/13 Xiangze Zeng <zengshixiangze at 163.com>
>>>
>>> After I configure PETSc using --with-precision=single, I can run both ex19 and my own code. Good news! But it seems lots of time is used for copying the data from CPU to GPU.
>>
>> How are you measuring? What preconditioner are you using and how many iterations are typically required?
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
