Hi Barry,
Currently the ELLPACK format only supports matrix-vector products, which limits
its usefulness with real preconditioning or requires keeping two copies of the matrix.
Yeah, that's the problem with those 'optimized' data formats for
MatMult: they do one thing very well, but fail on many other fronts.
Since it's a data format issue and not an implementation issue, we can't
fix it by merely devoting more time to it.
Best regards,
Karli
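For readers unfamiliar with the format under discussion, here is a minimal sketch of ELLPACK-style storage and its matrix-vector product. It is written in plain Python for illustration only and is not the code in the hongzh/add-ell-format branch; the function names csr_to_ell and ell_matvec are hypothetical. Every row is padded to the width of the longest row, so the product does identical work per row (which is what makes it SIMD friendly), while operations such as triangular solves for preconditioning have no natural counterpart in this layout.

```python
def csr_to_ell(row_ptr, col_idx, vals):
    """Convert CSR arrays to padded ELLPACK-style arrays (hypothetical helper).

    Each row is padded to the length of the longest row; padded slots get
    column index 0 and value 0.0, so they contribute nothing to a product.
    """
    n = len(row_ptr) - 1
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n)), default=0)
    ell_cols = [[0] * width for _ in range(n)]
    ell_vals = [[0.0] * width for _ in range(n)]
    for i in range(n):
        for k, p in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_cols[i][k] = col_idx[p]
            ell_vals[i][k] = float(vals[p])
    return ell_cols, ell_vals, width


def ell_matvec(ell_cols, ell_vals, x):
    """y = A x: every row performs exactly `width` multiply-adds."""
    return [sum(v * x[c] for c, v in zip(cols, vs))
            for cols, vs in zip(ell_cols, ell_vals)]


# 3x3 example matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]] in CSR form
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 3.0, 2.0, 5.0]

ell_cols, ell_vals, width = csr_to_ell(row_ptr, col_idx, vals)
y = ell_matvec(ell_cols, ell_vals, [1.0, 2.0, 3.0])
padded = len(ell_vals) * width - len(vals)  # explicit zeros stored as padding
```

In this small example the middle row has a single nonzero, so one explicit zero is stored to pad it to width 2.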
On Jul 24, 2017, at 2:42 PM, Richard Tran Mills <[email protected]> wrote:
Barry,
I agree that perhaps we don't need to try to get this in the release, but I think that the processors
this is targeting (those with SIMD lanes wide enough for this to matter) are becoming more common, and
that therefore this is becoming less special purpose. The recently released ("Skylake")
Intel Xeon server chips have introduced 512-bit vector instructions ("AVX512"), so I
consider this to have gone mainstream at this point -- it's no longer restricted to the relatively
small market that Intel's Knights Landing chips were targeting. (Also, it will likely be supported
by Intel basically forever -- they *never* remove support for instructions that have been
introduced for their mainstream CPUs.) I also think it likely that other CPU manufacturers will
follow suit.
I don't think we need to delay a release to get the ELL format in, but I do think that we should
get this merged into master sooner rather than later. I think having things like this in the main
line of PETSc development is 1) actually useful to some people trying to do computing on
cutting-edge CPUs and 2) helps dispel the myth that "PETSc doesn't care about novel
architectures" that I sometimes see propagated (by people pointing to things like our
rejection of threading within PETSc as "evidence").
--Richard
On Mon, Jul 24, 2017 at 11:09 AM, Smith, Barry F. <[email protected]> wrote:
The ELLPACK stuff is kind of special purpose; I don't think it needs to be in
the release. Anyone who needs it for experimentation can just get the branch.
Barry
On Jul 24, 2017, at 12:07 PM, Zhang, Hong <[email protected]> wrote:
I will create a pull request today. Presumably the review process will take a
long time for this monster, so it is fine if it cannot go in the release.
Thanks,
Hong (Mr.)
On Jul 24, 2017, at 11:23 AM, Richard Tran Mills <[email protected]> wrote:
On Sun, Jul 23, 2017 at 3:08 PM, Smith, Barry F. <[email protected]> wrote:
Anything stopping us from making a PETSc release?
Barry
I'd really like to get my rmills/add-aijmkl branch in a form that can go in.
Should be able to have this ready for everyone's review in a day or two (have a
deadline for camera-ready copy for a conference paper today, so maybe not
today).
Not my branch, but I am wondering what Hong thinks his hongzh/add-ell-format
branch needs to be ready to go in, and if he could use some help from me or
anyone else on it. One thing I notice is that it looks like some of the Intel
intrinsics stuff in there needs some additional configure tests to establish
that the correct AVX512 instructions are supported. I'd like to add an
option to use the same sort of row permutation used in AIJPERM to eliminate the
need for most explicit storage of zero elements.
--Richard
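To illustrate why a row permutation can remove most of the padding, here is a rough sketch in plain Python. It is not the AIJPERM implementation: it assumes, for illustration only, that rows are padded per group of slice_height consecutive rows (a hypothetical parameter, e.g. the number of values per vector register) and that the permutation simply sorts rows by their number of nonzeros. The names padded_zeros and sort_rows_by_nnz are made up for this example.

```python
def padded_zeros(row_nnz, slice_height, order=None):
    """Count the explicit zeros stored when each group of `slice_height`
    consecutive rows (taken in `order`) is padded to its longest row."""
    if order is None:
        order = range(len(row_nnz))
    counts = [row_nnz[i] for i in order]
    total = 0
    for start in range(0, len(counts), slice_height):
        chunk = counts[start:start + slice_height]
        total += sum(max(chunk) - c for c in chunk)
    return total


def sort_rows_by_nnz(row_nnz):
    """Return a row permutation with the longest rows first (stable sort),
    so that rows of similar length end up in the same group."""
    return sorted(range(len(row_nnz)), key=lambda i: -row_nnz[i])


# Nonzeros per row for a small made-up matrix with very uneven rows
row_nnz = [1, 7, 2, 7, 1, 2, 7, 1]

perm = sort_rows_by_nnz(row_nnz)
before = padded_zeros(row_nnz, slice_height=4)
after = padded_zeros(row_nnz, slice_height=4, order=perm)
```

For this made-up row distribution the padding drops from 28 explicit zeros to 8 once rows of similar length are grouped together. A real implementation would also have to apply the permutation to the output vector.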