Re: [Math] Synchronization

Gilles Sadowski Sun, 22 Jul 2012 04:05:04 -0700

Hello.

> On 22/07/2012 02:43, Gilles Sadowski wrote:
> > On Sat, Jul 21, 2012 at 11:44:43AM -0700, Ted Dunning wrote:
> >> Synchronization in low level code is generally really bad.  This is the
> >> whole point of why Vector was essentially thrown away in Java collections
> >> in favor of ArrayList.
> > 
> > I might be wrong, but I don't think that's the reason.
> > At the beginning there was the "Vector"... and nothing else. In singly
> > threaded code or as a thread local variable, synchronization is an
> > unnecessary waste of time. Since most code falls into this category,
> > "ArrayList" is indeed much more useful. That doesn't imply that there is
> > never a need for synchronization in low-level code.
> > 
> > In the present case, AFAICT the synchronization is a performance gain during
> > the creation of the data (to be cached). Once that part is done, user code
> > do not call any "synchronized" methods, so no loss there.
> > 
> >>
> >> Better to synchronize in the caller who knows about the multi-threading.
> >>  Or simply build a synchronized wrapper in an inner class.
> 
> Note that there are also other mechanisms apart from synchronization
> that could be used. The lazy holder idiom can be used to ensure single
> execution and both locks and atomic variables from the concurrency
> package can be used for thread safety (I strongly prefer them to
> synchronization).

AIUI, this amounts to "manually" replicating what "synchronized" probides.
[Well, not in all cases, I know; but here, using "synchronized" seems right
to the point, in that it solves the problem (as demonstrated by the unit
test).]

As I pointed out previously, this use of "synchronized" is an
implementation detail which API users don't have to worry about.[1]
This "improvement" of the initial code makes it possible for a class like
the new "IterativeLegendreGaussIntegrator" to use a singleton factory (also
an implementation detail), which is a huge advantage when a application will
instantitate many such integrators:
 - The singleton is nice because rules are stored once for the whole
   application instead of in each "IterativeLegendreGaussIntegrator"
   instance.
 - The singleton is problematic in a multi-threaded environment since the
   rules will be computed several times; "synchronized" solves that issue.

> One important point too is that regardless of the fact we use
> multi-threading in [math] or not by ourselves, we should take care to
> allow upper level application to use our classes in multi-threaded
> environments. This is simple for small container classes (3D vectors,
> complex numbers ...) that can be made immutable (at least the original
> class we provide, not the derived class users may create). This is much
> more complex or can even be impossible for long algorithms with an API
> providing lots of different functions (core engine run, setters,
> getters, retrieval of partial results ...). Documenting everything about
> thread safety would be a daunting task, but it would be a nice thing to
> have, at least for some classes or methods.

I agree and didn't mean that CM should be entirely thread-safe.
[However I think that the "IterativeLegendreGaussIntegrator" case shows that
there is a potential gain when certain part are coded with this in mind.]

Also, I'm not in favour of introducing complex code for the sake of a
thread safety, especially when that could be implemented at a higher
level.[2]
But there are cases where there is an inherently "simple" parallelization,
and where users would probably expect that a math library would take
advantage of it. That was mainly the reason for the post about
multi-threading.

I think that whenever there is an algorithm where parallelization provides
a huge gain (in these times of mutiple cores), we should include it in the
implementation of that algorithm, or refrain from providing a crippled
feature (as maintaining it will drain our scarce human resources for
something that will be superseded by a more performant implementation).

In that respect, I agree with Ted (in the other ML thread): Currently CM is
probably not as good as Atlas wrappers for large matrices. But, since we
rule out depending on other projects, especially if they wrap JNI calls),
the conclusion I draw is that we should not provide a feature set related to
large matrices. [This of course points towards the bugs related to sparse
matrices, recently uncovered by Sébastien: Better (IMHO) to provide an
enhanced feature set for dense matrices rather than spend time to fix
features that are rarely used, if ever, by CM users.]

Best regards,
Gilles

[1] Unless I got it wrong of course! :-}
[2] In the case of the "IterativeLegendreGaussIntegrator" I think that it is
    not possible to do it at a higher level: the concerned field is
    "private". And the fix is not complex, but just a single keyword.

> 
> > 
> > Please show how this fits with the current case, and improves the
> > implementation.
> > 
> > 
> > Thanks,
> > Gilles
> > 
> >>
> >> On Sat, Jul 21, 2012 at 6:08 AM, Gilles Sadowski <
> >> gil...@harfang.homelinux.org> wrote:
> >>
> >>>>
> >>>> Note that there is a performance price for the synchronization,
> >>>> which may outweigh the benefit of adding it.
> >>>
> >>> I thought of adding that because one might want to parallelize the
> >>> integration computations while the quadrature rules factory is shared.
> >>> [See e.g. MATH-827, line 41 in the posted code.]
> >>> Without "synchronized", all threads will compute all the rules.
> >>>
> >>> On my machine (4 cores), in the absence of synchronization, the above test
> >>> indicated that the "computeRule" method was called 18 times!
> >>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Re: [Math] Synchronization

Reply via email to