Re: maven-indexer / Lucene

Olivier Lamy Tue, 04 Jul 2017 16:20:07 -0700

Maybe one option is to use shading?
I'm thinking of shading the Lucene packages used by Maven Indexer. I can easily
provide a build for that.
WDYT?
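
For illustration only, a minimal sketch of what such a relocation could look
like with the maven-shade-plugin. The target package name
org.apache.maven.index.shaded.lucene is an assumed placeholder, not something
agreed in this thread, and the plugin version is assumed to come from the
build's plugin management. Lucene discovers codecs through META-INF/services
files, so the sketch also adds the services transformer; whether that is
sufficient for every Lucene version involved would still need to be checked.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Move the Lucene classes bundled with the indexer into a private package -->
            <pattern>org.apache.lucene</pattern>
            <!-- Hypothetical target package, chosen only for this example -->
            <shadedPattern>org.apache.maven.index.shaded.lucene</shadedPattern>
          </relocation>
        </relocations>
        <transformers>
          <!-- Merges META-INF/services entries, which Lucene uses to look up codecs -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the indexer would carry its own private copy of
Lucene, so the version pulled in by Oak or Jackrabbit would no longer clash
with it on the classpath.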



On 26 June 2017 at 11:49, Olivier Lamy <[email protected]> wrote:

> Hi
> Graph/document storage could be convenient (but neo4j is not an option, as
> it is GPL-licensed [1]).
> Well, we could add Solr as an additional webapp in our Jetty distribution,
> but this would be a pain for users who want to use Tomcat or any other
> servlet container...
> We still need to investigate a new storage model :-)
>
> Olivier
> [1] https://neo4j.com/licensing/
>
> On 25 June 2017 at 06:26, Martin <[email protected]> wrote:
>
>> Yes, you are right. The Lucene dependency causes a lot of trouble and will
>> cause headaches with each version change of one of the dependencies.
>> What are the requirements for a replacement?
>> - Do we want to store hierarchical data?
>> - Do we want to store metadata for nodes?
>> - Fulltext search (only for metadata, or for artifacts too?)
>> - Blob / artifact storage (I don't think so, but I'm not so familiar with
>>   the Archiva artifact model)?
>>
>> Maybe a graph database would be an alternative. I don't know whether the
>> neo4j license is compatible with the Apache license, and I think it brings
>> in Lucene as a dependency too. I will have a look.
>> The problem is that, if fulltext search is needed, I think most of the
>> frameworks will give us a Lucene dependency if they are embedded.
>>
>> Other alternatives:
>> - Implement fulltext search on our own (an index of the metadata stored via
>>   the Archiva API) and use the Lucene dependency that comes with
>>   maven-indexer.
>> - JCR Oak with Solr. Solr is not embedded; it must run as its own
>>   application (war).
>>
>> Greetings
>>
>> Martin
>>
>>
>>
>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
>> > Well, this is going to be a pain.
>> > IMHO we need to find an alternative to JCR Oak,
>> > and something not using Lucene, as it's a real pain to have different
>> > libraries using Lucene when they do not upgrade at the same time (and
>> > Lucene breaks backward compatibility so quickly...).
>> > Any ideas? I'd like to have something embedded (but with a possible
>> > external server configuration).
>> > There is currently a Cassandra implementation. I was not satisfied with its
>> > performance, but I did that about 4 years ago, so it can surely be
>> > improved :-)
>> > Maybe OrientDB?
>> > What else?
>> >
>> > On 24 June 2017 at 09:50, Olivier Lamy <[email protected]> wrote:
>> > > Well, the issue is incompatible versions of Lucene between Maven Indexer
>> > > and Oak (I can try to push a patch to Oak for upgrading...).
>> > >
>> > > On 24 June 2017 at 08:41, Olivier Lamy <[email protected]> wrote:
>> > >> Hi
>> > >> Maven Indexer 6.0-SNAPSHOT no longer needs the Plexus bridge.
>> > >> I'm working on it in the branch (feature/jcr_oak).
>> > >> Not sure why, but I have intermittent failures with the store-jcr module.
>> > >> I definitely agree on the upgrade.
>> > >> We can simply detect that it's not Oak compatible and schedule a full
>> > >> reindex (maybe with a message in the logs and UI?).
>> > >> But we need to be sure we can still read the central index, and I'm not
>> > >> sure about a possible Lucene conflict between Oak and Maven Indexer.
>> > >> Can we work on this branch? (I created a Jenkins job for it:
>> > >> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
>> > >> If you prefer master, no worries either.
>> > >> Something else to look at is upgrading maven-core etc...
>> > >> Anyway
>> > >> Cheers
>> > >> Olivier
>> > >>
>> > >> On 22 June 2017 at 19:16, Martin <[email protected]> wrote:
>> > >>> Hi,
>> > >>>
>> > >>> upgrading the maven indexer leads to some major changes.
>> > >>> Lucene is used by maven-indexer and also by Jackrabbit. Jackrabbit
>> > >>> sticks to the old 3.x version and, as I see it, they will not move to a
>> > >>> newer version.
>> > >>> There is Jackrabbit Oak as an alternative.
>> > >>> I tried a proof of concept and could replace the Jackrabbit
>> > >>> implementation of metadata-store-jcr with an Oak implementation. At
>> > >>> least I got all the unit tests of this module to pass.
>> > >>> But switching to Oak has some drawbacks:
>> > >>> - The repository format changed and we must provide a way to migrate
>> > >>>   (either migrate the existing repository or create a new one by
>> > >>>   reindexing).
>> > >>> - The Lucene version used is newer but does not match the version from
>> > >>>   the maven-indexer dependencies. Some incompatibilities may come up
>> > >>>   that cannot be solved without using a modified version of one of the
>> > >>>   two. Alternatively, it may be possible to switch to Solr (as a
>> > >>>   separate component) and get rid of the Lucene dependencies for JCR
>> > >>>   inside the Archiva project.
>> > >>>
>> > >>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
>> > >>> - The Plexus-Sisu-Bridge does not work as before.
>> > >>> - We must migrate from the NexusIndexer to the indexer API.
>> > >>>
>> > >>> So switching to the new indexer and Oak means more work than expected
>> > >>> and some risk of new incompatibility problems. And I think this cannot
>> > >>> be done without broken master builds for some period of time.
>> > >>>
>> > >>> So, what should we do? I think maven indexer is one of the core
>> > >>> components of Archiva, and we should use the 3.x version to migrate to
>> > >>> the new indexer version, even if this means switching to JCR Oak.
>> > >>> Otherwise it would mean sticking to the old version for the next years.
>> > >>> @Olivier, regarding the maven-indexer / Sisu bridge API changes, I hope
>> > >>> you can provide some useful help.
>> > >>>
>> > >>> I committed the PoC to the branch feature/jcr_oak. There are some
>> > >>> modules where the tests do not pass (mainly because of the indexer API
>> > >>> changes).
>> > >>>
>> > >>> Any comments?
>> > >>>
>> > >>> Cheers
>> > >>>
>> > >>> Martin
>> > >>>
>> > >>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
>> > >>> > forget it but we need to ensure we can read maven index files....
>> > >>> >
>> > >>> > On 13 June 2017 at 17:06, Olivier Lamy <[email protected]> wrote:
>> > >>> > > Hi,
>> > >>> > > Remember Jackrabbit depends on Lucene as well, so upgrading Lucene
>> > >>> > > can be a problem here.
>> > >>> > > Regarding maven-indexer: yes, we can depend on a snapshot until the
>> > >>> > > release.
>> > >>> > > I can release it ;-)
>> > >>> > >
>> > >>> > > On 13 June 2017 at 06:06, Martin <[email protected]> wrote:
>> > >>> > >> Hi,
>> > >>> > >>
>> > >>> > >> the Lucene version depends on the maven indexer. But I'm not sure
>> > >>> > >> about the current state of maven-indexer. The version has not
>> > >>> > >> changed since around 2013.
>> > >>> > >>
>> > >>> > >> There have been commits on the master branch since then, and the
>> > >>> > >> Lucene version has been changed too, but no releases were tagged.
>> > >>> > >> Does it make sense to switch to maven-indexer 6.0-SNAPSHOT?
>> > >>> > >>
>> > >>> > >> As far as I know, newer Lucene versions come with new compact index
>> > >>> > >> formats, but I'm not sure whether this is relevant for the maven
>> > >>> > >> indexes.
>> > >>> > >>
>> > >>> > >> Cheers
>> > >>> > >>
>> > >>> > >> Martin



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy
