[ 
https://issues.apache.org/jira/browse/FELIX-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496885#comment-14496885
 ] 

Thomas Watson commented on FELIX-4656:
--------------------------------------

I also appreciate Guillaume's work on this defect.  I need to understand the 
fix a bit more, though.  I am the owner of Equinox, and before I can assess 
the risk of re-consuming the latest Felix resolver I would like to understand 
the fix.  I have spent some more time looking at it and have some 
questions/suggestions, just to make sure I understand the approach.  Keep in 
mind my suggestions are mostly superficial changes that I think would make the 
code more understandable.

You say in an earlier comment "Basically, the path structure is the difference 
between the original candidate map and the current one, i.e. it contains 
capabilities that have been excluded because they could not be resolved.  It is 
supposed to identify a Candidates object."

You then do a check in ResolverImpl to ensure multiple Candidates (or 
permutations) with the same 'identity' are not processed more than once.  This 
seems like a very good thing to do.  I do not see any reason to process the 
'same' Candidates permutation more than once.  What threw me off the first 
time I glanced at the changes was the term 'path'.  You describe it as the 
difference from the original Candidates object, which is a more accurate 
description to me than 'path'.  I think we should change the usage of the term 
'path' here to 'delta' (change getPaths() -> getDelta()), and the variable 
'donePaths' should be renamed to something like 'processedDeltas'.
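To make the suggestion concrete, here is a minimal sketch (not the actual 
Felix code; class and method names are illustrative) of using the delta -- the 
set of capabilities removed relative to the original Candidates map -- as an 
identity key so the same permutation is never processed twice:

```java
import java.util.*;

// Hypothetical sketch: deltas stand in for Candidates identities.
// Strings stand in for the removed capabilities.
class PermutationSession {
    // Deltas of permutations already processed (the suggested
    // 'processedDeltas' naming).
    private final Set<Set<String>> processedDeltas = new HashSet<>();

    // Returns true only the first time a given delta is seen;
    // Set.add returns false when the element was already present,
    // so duplicates are skipped in O(1) expected time.
    boolean markProcessed(Set<String> delta) {
        return processedDeltas.add(new HashSet<>(delta));
    }
}
```

The key point is that two Candidates objects reached by different search 
paths but with identical deltas hash to the same key and are only explored 
once.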

Since you have now optimized the loop in ResolverImpl to avoid processing the 
same candidates multiple times, I am wondering if the check in 
Candidates.permutateIfNeeded(Requirement, List<Candidates>) is still valid or 
needed.  That check has always seemed a little fishy to me, because it checks 
whether an existing permutation has a different candidate (zero) for the req 
than the one currently being processed.  If no existing permutation is found 
with a different candidate (zero), then the current Candidates object is 
permuted.  It seems this could potentially throw out some valid permutations, 
but maybe that is not an issue we should address now.  I would probably wait 
until we get a problem set that proves a valid solution set was thrown out 
that would have allowed resolution.
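For clarity, the kind of check described above could be sketched like this 
(simplified and hypothetical, not the real permutateIfNeeded; a "permutation" 
here is just a map from requirement name to its ordered candidate list):

```java
import java.util.*;

// Illustrative stand-in for the check being questioned: permute only
// when no queued permutation already offers a different first
// (index-zero) candidate for the requirement.
class PermutationCheck {
    static boolean needsPermutation(
            String req,
            String currentZero,
            List<Map<String, List<String>>> permutations) {
        for (Map<String, List<String>> perm : permutations) {
            List<String> cands = perm.get(req);
            // An existing permutation with a different zero candidate
            // means this requirement is already being explored, so the
            // current candidates are NOT permuted -- the behavior that
            // may discard valid permutations.
            if (cands != null && !cands.isEmpty()
                    && !cands.get(0).equals(currentZero)) {
                return false;
            }
        }
        return true;
    }
}
```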

Currently ResolveSession.m_usesPermutations and 
ResolveSession.m_importPermutations are both simple List<Candidates>, which 
seems to imply that the 'same' Candidates object could get added to the two 
lists multiple times.  Your changes address this concern by keeping a record 
of the processed 'deltas' so that duplicates are ignored.  Is this something 
we should also optimize, or would it add more complication than it is worth?
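One possible shape of that optimization, sketched here purely as a hedged 
illustration (Strings stand in for Candidates objects, and the delta serves 
as the dedup key), would be to replace the plain list with an 
insertion-ordered map so duplicates are rejected at enqueue time:

```java
import java.util.*;

// Minimal sketch of deduplicating the permutation queues up front
// instead of filtering duplicates during processing.
class PermutationQueue {
    // LinkedHashMap preserves enqueue order while keying on the delta.
    private final Map<Set<String>, String> queue = new LinkedHashMap<>();

    // Queue a permutation unless one with the same delta is pending;
    // putIfAbsent returns null only when no mapping existed.
    boolean offer(Set<String> delta, String candidates) {
        return queue.putIfAbsent(new HashSet<>(delta), candidates) == null;
    }

    int size() {
        return queue.size();
    }
}
```

The trade-off is exactly the one raised above: the queue structure gets more 
complicated in exchange for never storing the 'same' Candidates twice.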

> Improve memory usage and speed of the resolver
> ----------------------------------------------
>
>                 Key: FELIX-4656
>                 URL: https://issues.apache.org/jira/browse/FELIX-4656
>             Project: Felix
>          Issue Type: Improvement
>          Components: Resolver
>            Reporter: Guillaume Nodet
>            Assignee: Guillaume Nodet
>             Fix For: resolver-1.2.0
>
>
> During big resolutions (> 100 bundles), the memory consumption can become 
> very large, mostly from keeping many copies of the Candidates object.
> I want to lower the memory requirements of the resolver without touching the 
> algorithm at all (which would be a different improvement).
> This can be done by:
>   * using less memory-intensive collections
>   * doing smart copies of those collections (which would only actually copy 
> the data when modified)
> The second item is slightly more difficult to achieve, as the maps in the 
> Candidates objects contain Sets and Lists, which means those must be 
> copied too.  So it could actually be complementary, if achievable.
> For the first one, HashMap and HashSet are very memory intensive.  I'll 
> introduce two new collections which will lower the requirements.
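The copy-on-write idea from the description above could be sketched roughly 
as follows (a hedged illustration only, not the actual resolver collections): 
the map shares its backing storage on copy() and only clones it on the first 
subsequent write.

```java
import java.util.*;

// Copy-on-write map sketch: copy() is O(1); the real copy happens
// lazily on the first put() after a copy.
class CowMap<K, V> {
    private Map<K, V> map;
    private boolean shared;

    CowMap() {
        this.map = new HashMap<>();
        this.shared = false;
    }

    private CowMap(Map<K, V> backing) {
        this.map = backing;
        this.shared = true;
    }

    // Both instances share the backing map until one of them writes.
    CowMap<K, V> copy() {
        this.shared = true;
        return new CowMap<>(this.map);
    }

    V put(K key, V value) {
        if (shared) {
            // First write after a copy: clone the backing map now.
            map = new HashMap<>(map);
            shared = false;
        }
        return map.put(key, value);
    }

    V get(K key) {
        return map.get(key);
    }
}
```

As the description notes, this only pays off fully if the Sets and Lists held 
inside the map values get the same lazy-copy treatment.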



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
