On Oct 6, 2010, at 8:00 PM, Jed Brown wrote:

> Alright, same model as MATCRL and CSRPERM. Would be nice to eventually avoid
> the copy, but there's no wasted code to do that later if the performance is
> clearly worth it.

   If you are going to force each new row to be 16-byte aligned, you are going to need a copy anyway, aren't you?

   Barry

>
> Jed
>
>
>> On Oct 7, 2010 2:50 AM, "Barry Smith" <bsmith at mcs.anl.gov> wrote:
>>
>>
>> Make a whole new subclass of SeqAIJ (parallel to the Inode) that does all
>> this cool stuff and copies into the new aligned data structures (rather than
>> keeping the data in the same data structure, as the current inode does).
>> We'll just have to get the factorization stuff to work eventually once you
>> show good performance gain for MatMult_SeqAIJ_AlignedInode().
>>
>>
>> Barry
>>
>> On Oct 6, 2010, at 7:04 PM, Jed Brown wrote:
>>
>> > Looking at assembly generated from the Inode kernel...
>> >
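
   To make the point about the copy concrete, here is a rough sketch of why per-row 16-byte alignment implies one. With 8-byte doubles, a row can only start on a 16-byte boundary if its starting index is even, so any row with an odd number of nonzeros has to be padded, and the padded layout no longer matches the packed CSR arrays. Everything below (AlignedCSR, AlignedCSRCreate, AlignedCSRMult, AlignedCSRDestroy) is a hypothetical stand-alone illustration, not PETSc code and not the proposed MatMult_SeqAIJ_AlignedInode(); it assumes plain int indices, double values, and C11 aligned_alloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for illustration only; not a PETSc type. */
typedef struct {
  int     m;   /* number of rows */
  int    *ai;  /* padded row starts, length m+1, every entry even */
  int    *aj;  /* column indices, padded to match aa */
  double *aa;  /* values; each row starts on a 16-byte boundary */
} AlignedCSR;

static void AlignedCSRDestroy(AlignedCSR *A)
{
  free(A->ai); free(A->aj); free(A->aa);
  A->ai = NULL; A->aj = NULL; A->aa = NULL;
}

/* Copy packed CSR (ai, aj, aa) into the padded, aligned layout.
   Returns 0 on success, 1 on allocation failure. */
static int AlignedCSRCreate(int m, const int *ai, const int *aj,
                            const double *aa, AlignedCSR *out)
{
  int i, k, total;

  out->m  = m;
  out->aj = NULL;
  out->aa = NULL;
  out->ai = malloc((size_t)(m + 1) * sizeof(int));
  if (!out->ai) return 1;

  /* Round each row length up to an even count so that every row start
     is a multiple of 2 doubles, i.e. 16 bytes. */
  out->ai[0] = 0;
  for (i = 0; i < m; i++) {
    int nz = ai[i + 1] - ai[i];
    assert(nz >= 0);
    out->ai[i + 1] = out->ai[i] + ((nz + 1) & ~1);
  }
  total = out->ai[m] > 0 ? out->ai[m] : 2;

  /* total is even, so the byte count is a multiple of 16 as
     aligned_alloc() requires. */
  out->aa = aligned_alloc(16, (size_t)total * sizeof(double));
  out->aj = malloc((size_t)total * sizeof(int));
  if (!out->aa || !out->aj) { AlignedCSRDestroy(out); return 1; }
  assert(((uintptr_t)out->aa & 15) == 0);

  for (i = 0; i < m; i++) {
    int nz = ai[i + 1] - ai[i];
    int dst = out->ai[i];
    for (k = 0; k < nz; k++) {
      out->aa[dst + k] = aa[ai[i] + k];
      out->aj[dst + k] = aj[ai[i] + k];
    }
    /* Pad odd rows with an explicit zero that reuses the last valid
       column, so the multiply needs no special case. */
    if (nz & 1) {
      out->aa[dst + nz] = 0.0;
      out->aj[dst + nz] = aj[ai[i] + nz - 1];
    }
  }
  return 0;
}

/* y = A*x over the padded layout; padding entries contribute 0*x[j]. */
static void AlignedCSRMult(const AlignedCSR *A, const double *x, double *y)
{
  int i, k;
  for (i = 0; i < A->m; i++) {
    double sum = 0.0;
    for (k = A->ai[i]; k < A->ai[i + 1]; k++) sum += A->aa[k] * x[A->aj[k]];
    y[i] = sum;
  }
}
```

   The padding is the reason the original arrays cannot simply be reused: as soon as one row has an odd number of nonzeros, every later row shifts. A real kernel would presumably replace the inner loop with SSE2 loads on the aligned rows; the scalar loop here only shows that the padded layout gives the same product.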
