Some discussion came up today about the use of "factories" in C++ (and other
OO language) code, and I thought it would be useful to relate it to how PETSc
handles the same issue, since some numerical libraries (the next generation of
Trilinos's ML and Chombo) are using factories extensively and recklessly (and
will, I am confident, alienate a lot of users; I love it when our competitors
make the mistake of chasing every new idea down the wrong road).
"In object-oriented computer programming, a factory is an object for
creating other objects"; "it deals with the problem of creating objects
(products) without specifying the exact class of object that will be created."
http://en.wikipedia.org/wiki/Factory_method_pattern
http://en.wikipedia.org/wiki/Factory_(software_concept)
So for example if in my code I need a KSP solver object I could do
something like
    void myroutine(...) {
      KSPObject *kspobject = new KSPGMRESObject;
where KSPObject is my abstract class and KSPGMRESObject is a specific
implementation someone has written. Now in the code I can go off and use the
KSPObject to solve something, and the rest of my code does not need to "know"
that the KSPObject is actually a KSPGMRESObject. But I cannot instantiate an
abstract class, only a specific implementation, so the line that creates the
object has hardwired KSPGMRESObject into my code. Factories are a way of
"avoiding" this hardwiring. I can introduce a new class KSPObjectFactory that
has a method newKSPObject() and then reorganize my code as
    void myroutine(KSPObjectFactory *factory, ...) {
      KSPObject *kspobject = factory->newKSPObject();
and I've removed the explicit use of a particular implementation constructor
from my routine. The actual decision of what type of KSPObject to create is
"pushed up higher in the code" and involves the factory. So for example I
could have a KSPObjectFactory() that produces a particular implementation of a
KSPObject by setting a string name into the factory. So if
BarrysKSPObjectFactory is a particular implementation of KSPObjectFactory then
I could write "higher up in my code"
    BarrysKSPObjectFactory *kspobjectfactory = new BarrysKSPObjectFactory;
    kspobjectfactory->setimplementationbyname("gmres");
    myroutine(kspobjectfactory);
One could also consider a method on BarrysKSPObjectFactory,
setimplementationbyCommandLineArgs(argsc,args); now I can "push up" the
decision of what KSPObject to actually use all the way to runtime, as a
command-line option.
A drawback of factories is that they can easily double the number of
different classes that users have to deal with, and many people (at least Mark
and I) find them cumbersome.
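To make the fragments above concrete, here is a minimal self-contained sketch of the explicit-factory approach. The class and method names follow the ones used in the text; KSPCGObject, the name() method, and the example() function are hypothetical additions for illustration only, and none of this is code from any real library.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Abstract solver interface (the KSPObject of the text above).
class KSPObject {
public:
    virtual ~KSPObject() = default;
    virtual std::string name() const = 0;  // hypothetical method, for illustration
};

class KSPGMRESObject : public KSPObject {
public:
    std::string name() const override { return "gmres"; }
};

// Hypothetical second implementation so that there is something to choose between.
class KSPCGObject : public KSPObject {
public:
    std::string name() const override { return "cg"; }
};

// Abstract factory: callers ask it for a KSPObject without naming a concrete class.
class KSPObjectFactory {
public:
    virtual ~KSPObjectFactory() = default;
    virtual std::unique_ptr<KSPObject> newKSPObject() const = 0;
};

// Concrete factory whose product is selected by a string name.
class BarrysKSPObjectFactory : public KSPObjectFactory {
public:
    void setimplementationbyname(const std::string &n) { impl_ = n; }
    std::unique_ptr<KSPObject> newKSPObject() const override {
        if (impl_ == "gmres") return std::unique_ptr<KSPObject>(new KSPGMRESObject);
        if (impl_ == "cg") return std::unique_ptr<KSPObject>(new KSPCGObject);
        throw std::invalid_argument("unknown KSP implementation: " + impl_);
    }
private:
    std::string impl_ = "gmres";
};

// The routine depends only on the abstract factory and the abstract product.
std::string myroutine(const KSPObjectFactory &factory) {
    std::unique_ptr<KSPObject> kspobject = factory.newKSPObject();
    return kspobject->name();
}

// "Higher up in my code": the caller decides which implementation is used.
void example() {
    BarrysKSPObjectFactory factory;
    factory.setimplementationbyname("cg");
    assert(myroutine(factory) == "cg");
}
```

Note how even this tiny example needs two class hierarchies (products and factories) that the user has to know about, which is the drawback mentioned above.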
PETSc factories 101
----------------------------
In PETSc because all objects are essentially delegator objects ("the
delegation pattern is a design pattern in object-oriented programming where an
object, instead of performing one of its stated tasks, delegates that task to
an associated helper object" http://en.wikipedia.org/wiki/Delegation_pattern)
when we "create" a PETSc object we have not yet actually created the delegated
object and thus do not need traditional factories for the purpose listed above.
For example
    KSP ksp;
    KSPCreate(comm,&ksp);
gives me a KSP solver object that I can pass around to other code, keep
references to, and even set options on but it does not have a specific
implementation of a solver wired to it yet. When I call
    KSPSetType(ksp,"gmres"); or KSPSetFromOptions(ksp);
what happens is the KSP object looks for the "gmres factory" that has been
registered with KSPRegister("gmres",KSPCreate_GMRES,..) and calls
KSPCreate_GMRES() to generate the delegate that will actually do the solving.
Since the delegate is completely encapsulated inside the ksp object I can
change the delegate at a later time in the code to have a different
implementation by just calling
    KSPSetType(ksp,"cg");
The old delegate is freed and the new solver implementation is put in
place. And all references to the ksp object continue to work (just using the
new solver).
So you see the design of PETSc allows "pushing up" the specific choice of
implementations of classes in essentially the same way as factory objects do
but without the need for users to explicitly create and manipulate the
factories.
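The mechanism described in this section can be modelled in a few lines of C++. This is only a toy sketch of the delegation idea, not PETSc's actual implementation: Solver, SolverImpl, registerType(), setType(), and example() are all hypothetical names standing in for the KSP object, its delegate, KSPRegister(), and KSPSetType().

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// The delegate interface: the part that actually does the solving.
struct SolverImpl {
    virtual ~SolverImpl() = default;
    virtual std::string solve() const = 0;
};
struct GMRESImpl : SolverImpl {
    std::string solve() const override { return "solved with gmres"; }
};
struct CGImpl : SolverImpl {
    std::string solve() const override { return "solved with cg"; }
};

// The user-visible object. It owns a delegate that can be replaced at any time.
class Solver {
public:
    using Creator = std::function<std::unique_ptr<SolverImpl>()>;

    // Register a creation routine under a name; this is the hidden "factory".
    static void registerType(const std::string &name, Creator c) {
        registry()[name] = std::move(c);
    }
    // Look up the creation routine and install a fresh delegate.
    void setType(const std::string &name) {
        auto it = registry().find(name);
        if (it == registry().end()) throw std::invalid_argument("unknown type: " + name);
        impl_ = it->second();  // the old delegate (if any) is freed here
    }
    std::string solve() const {
        if (!impl_) throw std::logic_error("solver type not set");
        return impl_->solve();
    }
private:
    static std::map<std::string, Creator> &registry() {
        static std::map<std::string, Creator> r;
        return r;
    }
    std::unique_ptr<SolverImpl> impl_;
};

std::string example() {
    Solver::registerType("gmres", [] { return std::unique_ptr<SolverImpl>(new GMRESImpl); });
    Solver::registerType("cg", [] { return std::unique_ptr<SolverImpl>(new CGImpl); });

    Solver solver;           // created with no implementation attached yet
    Solver &alias = solver;  // a reference held somewhere else in the code
    solver.setType("gmres");
    std::string first = alias.solve();
    assert(first == "solved with gmres");
    solver.setType("cg");    // swap the delegate; the alias remains valid
    return first + "; " + alias.solve();
}
```

The user only ever touches Solver; the registry and the creation routines play the role of the factories but stay out of sight.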
PETSc factories 102
---------------------------
The other place PETSc uses factories is to allow "mesh information" to
determine algebraic objects that are created within algebraic solvers. This is
done with the DM abstract class which you can think of as a factory for Vecs
and Mats (though it does other things as well).
Consider the nonlinear solver SNES in PETSc. I can use
    SNESSetJacobian(snes,J,J,func,ctx);
to provide the matrix that Newton's method is going to use. But this means
that my code that creates the SNES object and sets its various parameters has to
explicitly know how to create the J Mat object. If I am using a complicated
meshing package that generated some grid and is going to use finite elements to
compute the Jacobian J I'd like the figuring out of the size and sparsity
pattern of the J to be handled by the meshing package. Thus I would make an
implementation of the DM abstract class (say MattsDM) that does all this yucky
figuring out. Then I could call
    DMCreateMatrix(dm,&J); /* now my solver code sees nothing of the yuckiness
                              of particular mesh details */
and then pass the J to
    SNESSetJacobian(snes,J,J,func,ctx);
Similarly I can do the same thing to create vectors.
We can take this one step further. So far I've been assuming that the
application programmer is explicitly creating the algebraic objects (Mats and
Vecs) and passing them to the solver object.
Once solver objects start getting complicated, for example with multigrid
methods, it is painful for the user to create ALL the vectors and matrices
needed for the multiple levels and provide them all to PCMG (though it is
possible and we provide interfaces for doing that).
Instead we can create DMs that can generate appropriately sized vectors and
matrices and give those DMs to the solver object and the solver then calls the
methods to get new vectors and matrices wherever it needs them inside the
solver. PCMG would still need several DMs, one for each level. But rather
than requiring the user to create these several DMs the user can create a
single DM and the DM objects have methods in them that generate coarser or
finer DMs that can be used to generate the vectors and matrices on the other
levels. This is why a simple call of SNESSetDM(snes,dm) allows the nonlinear
and linear solver objects (including all the levels of multigrid) to create all
their various needed vectors and matrices.
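The idea of handing the solver a single DM and letting it build everything else can be sketched with a toy model. Everything here is hypothetical and greatly simplified: ToyDM stands in for a DM on a 1D mesh, Mat records only its shape (no sparsity pattern), and the coarsening rule of keeping every other point is just an assumption for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
struct Mat { std::size_t rows, cols; };  // only the shape is modelled here

// Hypothetical stand-in for a DM: it knows the mesh, so it can size algebraic objects.
class ToyDM {
public:
    explicit ToyDM(std::size_t npoints) : n_(npoints) {}
    Vec createVector() const { return Vec(n_, 0.0); }
    Mat createMatrix() const { return Mat{n_, n_}; }
    // Produce the DM for the next coarser mesh (every other point is kept).
    ToyDM coarsen() const { return ToyDM((n_ + 1) / 2); }
private:
    std::size_t n_;
};

struct Level { Mat A; Vec x, b; };

// The "solver" receives one DM and creates what it needs on every level itself.
std::vector<Level> setUpMultigrid(const ToyDM &fine, int nlevels) {
    std::vector<Level> levels;
    ToyDM dm = fine;
    for (int l = 0; l < nlevels; ++l) {
        levels.push_back(Level{dm.createMatrix(), dm.createVector(), dm.createVector()});
        dm = dm.coarsen();
    }
    return levels;
}

void example() {
    // The user supplies only the fine-level DM (17 points) and the number of levels.
    std::vector<Level> levels = setUpMultigrid(ToyDM(17), 3);
    assert(levels[0].A.rows == 17);
    assert(levels[1].x.size() == 9);
    assert(levels[2].b.size() == 5);
}
```

The calling code never computes a size or allocates a coarse-level object; that knowledge lives entirely in the DM.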
When composing complicated solvers this approach can be extremely powerful. One
can envision DMs being able to generate sub-DMs that represent just pieces of
the physics, and using those DMs to generate the vectors and matrices for the
solver associated with the sub-physics (in PCFieldSplit). Thus we can generate
all the algebraic pieces for very complicated nested solvers, with multigrid
inside fieldsplit and fieldsplit inside multigrid, etc., for any number of levels
with one simple paradigm.
In conclusion, you can think of PETSc as having one important visible factory
class, the DM, and then one factory completely transparent to the user for each
abstract PETSc object: IS, Vec, Mat, KSP, SNES, and yes, even DM. IMHO factories
are a powerful and useful tool for large libraries, but they should be used
sparingly, and most of them, through thoughtful design, need never be seen by
the users (because every factory the users do see steepens the learning curve a
great deal).
Questions, clarifications?
Barry
We actually have another factory in PETSc, MatGetVecs(), which returns
appropriately sized vectors for a given matrix. I don't have some grand
philosophical
reason for it to be around, but it is a very useful utility since often when
you have a matrix you need vectors to perform operations with it and it is
cumbersome to have to pass some vector through several layers of routines to
get it to where it is needed to interact with the matrix.
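As a rough illustration of why this is convenient, here is a toy analogue in C++. The Mat struct, matGetVecs(), matMult(), and deepRoutine() are hypothetical stand-ins written for this sketch, not PETSc code; the only thing taken from PETSc is the convention that the "right" vector is one the matrix can be multiplied against and the "left" vector is one that can hold the product.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Toy dense matrix with just enough structure to show the idea.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> a;  // row-major entries
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), a(r * c, 0.0) {}
};

// Hypothetical analogue of MatGetVecs(): size "right" to the number of columns
// and "left" to the number of rows. Either pointer may be null.
void matGetVecs(const Mat &A, Vec *right, Vec *left) {
    if (right) right->assign(A.cols, 0.0);
    if (left) left->assign(A.rows, 0.0);
}

// y = A x
void matMult(const Mat &A, const Vec &x, Vec &y) {
    for (std::size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < A.cols; ++j) sum += A.a[i * A.cols + j] * x[j];
        y[i] = sum;
    }
}

// A routine deep in the call stack needs only the matrix to obtain work vectors;
// nobody has to pass a template vector down through the layers above it.
double deepRoutine(const Mat &A) {
    Vec x, y;
    matGetVecs(A, &x, &y);
    assert(x.size() == A.cols && y.size() == A.rows);
    x.assign(x.size(), 1.0);  // multiply by the vector of all ones
    matMult(A, x, y);
    return y[0];              // sum of the first row
}
```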