GitHub user tstupka opened a pull request:

    https://github.com/apache/maven-indexer/pull/12

    resolve performance loss due to lucene 4.8.1 - upgrade to lucene 5.5.3 and 
additional fixes

    Since Lucene was upgraded to 4.8.1, the indexer takes 2.5x longer than with 
Lucene 3.6. This seems to be a cumulative effect of partial performance 
reductions introduced in particular Lucene releases after 3.6 - see also 
https://issues.apache.org/jira/browse/MINDEXER-99
    
    See the individual commits in this request. Each addresses a suggestion for 
a specific improvement. When all are applied, the resulting performance is 
comparable to the performance before the above-mentioned upgrade.
    
    #0bb9484 - upgrading lucene from 4.8.1 to 5.5.3
    Performance was improved in Lucene 5.x; with 5.5.3 the indexer works 
significantly faster than with 4.8.1.
    
    #3cfa430 - avoid rebuilding groups after reading index
    After the index is generated, it has to be re-read one more time to extract 
the distinct lists of allGroups and rootGroups, even though that information 
had already been available but was thrown away.
    
    #8b98a49 - improve reading from zip file
    
    #4062146 - do not unnecessarily force merge on index writer
    A merge is very expensive - let's trust Lucene to merge when it sees fit. 
    The final index size without force merges was 910 MB, compared to 900 MB 
with force merges (about 1% larger).
    The time improvement is approx. 30%.
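
As an illustration of the idea behind #3cfa430, here is a minimal sketch of collecting both group lists in the same pass that reads the records, so that no second read of the index is needed. It is not the code from the commit: `GroupCollector` and its methods are hypothetical names, and the rule that a root group is the part of the groupId before the first dot is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of maven-indexer: gathers the distinct
// group lists while records are being read, instead of re-reading later.
class GroupCollector {
    private final Set<String> allGroups = new TreeSet<>();
    private final Set<String> rootGroups = new TreeSet<>();

    // Call once per record as it is read; duplicates are ignored by the sets.
    void accept(String groupId) {
        if (groupId == null || groupId.isEmpty()) {
            return;
        }
        allGroups.add(groupId);
        // Assumption: the root group is the segment before the first '.'.
        int dot = groupId.indexOf('.');
        rootGroups.add(dot < 0 ? groupId : groupId.substring(0, dot));
    }

    Set<String> getAllGroups() {
        return allGroups;
    }

    Set<String> getRootGroups() {
        return rootGroups;
    }

    public static void main(String[] args) {
        GroupCollector collector = new GroupCollector();
        List<String> groupIds = Arrays.asList(
                "org.apache.maven", "org.apache.lucene", "junit", "org.apache.maven");
        for (String groupId : groupIds) {
            collector.accept(groupId);
        }
        System.out.println(collector.getAllGroups());  // [junit, org.apache.lucene, org.apache.maven]
        System.out.println(collector.getRootGroups()); // [junit, org]
    }
}
```

The sets stay small compared to the index itself, so keeping them in memory during the read is likely cheaper than a second pass over all documents.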

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tstupka/maven-indexer lucene_553

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/maven-indexer/pull/12.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #12
    
----
commit 0bb9484e6eaea3e7974d7e5c9b5ab3d6802780e9
Author: Tomas Stupka <[email protected]>
Date:   2016-10-24T15:25:46Z

    upgrading lucene from 4.8.1 to 5.5.3

commit 3cfa430d71a8d58a0454966c0dd183e37f5fb067
Author: Tomas Stupka <[email protected]>
Date:   2016-10-25T08:47:19Z

    avoid rebuilding groups after reading index

commit 8b98a495186cafe20ee6494719185e74813ea15e
Author: Tomas Stupka <[email protected]>
Date:   2016-10-25T09:01:18Z

    improve reading from zip file

commit 40621465f3ebf14a89961d07ded0d17a4d2d61bc
Author: Tomas Stupka <[email protected]>
Date:   2016-10-25T09:50:29Z

    do not unnecessarily force merge on index writer

----


