Tomorrow I will try to share the work I have done so far in a branch.

On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer <marti...@apache.org> wrote:
> We have different (incompatible) lucene dependencies that prevent us from
> updating the maven indexer and/or jackrabbit. And this will happen again with
> each upgrade of one of these two packages in the future.
> So it would be really good if we could find a solution that removes one of the
> lucene dependencies.
>
> Greetings
>
> Martin
>
>
> On 6 July 2017 at 09:36:06 CEST, Chris Graham <chrisgw...@gmail.com> wrote:
> >Can I please ask an obvious/stupid question?
> >
> >What is driving this need for change?
> >
> >From a quick read of the thread above, all of the options appear to
> >introduce a lot of breaking changes, and a whole lot more uncertainty.
> >
> >So, what is so broken that it is driving these changes?
> >
> >Sent from my iPhone
> >
> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <ol...@apache.org> wrote:
> >>
> >> Yup.
> >> The idea is to have an extra jar produced by the maven-indexer with a shaded
> >> lucene version.
> >> So the lucene classes (the version used by Maven indexer) will be relocated into
> >> a package called org.apache.maven.index.shaded.lucene (such as
> >> org.apache.maven.index.shaded.lucene.search.BooleanClause).
> >> Then you exclude the lucene dependencies used by maven indexer and voila.
> >> The voila is a bit optimistic and it's not so easy, but anyway I'm working on it
> >> ATM.
> >>
> >>
> >>> On 6 July 2017 at 07:08, Martin <marti...@apache.org> wrote:
> >>>
> >>> What do you mean exactly by shading? Moving to another package name?
> >>>
> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> >>>> maybe an option is to use some shading?
> >>>> I'm thinking of shading the lucene packages used by maven indexer. I can easily
> >>>> provide a build for that.
> >>>> WDYT?
> >>>>
> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <ol...@apache.org> wrote:
> >>>>> Hi
> >>>>> graph/document storage could be convenient (but not possible with neo4j as
> >>>>> it's GPL licensed [1])
> >>>>> well we can add solr as an additional webapp with our jetty distribution
> >>>>> but this will be a pain for users who want to use tomcat or any other
> >>>>> servlet container...
> >>>>> we still need to investigate a new storage model :-)
> >>>>>
> >>>>> Olivier
> >>>>> [1] https://neo4j.com/licensing/
> >>>>>
> >>>>>> On 25 June 2017 at 06:26, Martin <marti...@apache.org> wrote:
> >>>>>> Yes, you are right. The lucene dependency causes a lot of trouble and will
> >>>>>> cause headaches with each version change of one of the dependencies.
> >>>>>> What are the requirements for a replacement?
> >>>>>> - We want to store hierarchical data?
> >>>>>> - We want to store metadata for nodes?
> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> >>>>>> - Blob / Artifact storage (I don't think so, but I'm not so familiar with the
> >>>>>> archiva artifact model)?
> >>>>>>
> >>>>>> Maybe some graph database may be an alternative. I don't know if the license of
> >>>>>> neo4j is compatible with the apache license, and I think it brings lucene as a
> >>>>>> dependency too. I will have a look.
> >>>>>> The problem is, if fulltext search is needed, I think that with most of the
> >>>>>> frameworks we get a lucene dependency, if it's embedded.
> >>>>>>
> >>>>>> Other alternatives:
> >>>>>> - Implement fulltext search on our own (index of the metadata stored via the
> >>>>>> archiva api) and use the lucene dependency that comes from the
> >>>>>> maven-indexer
> >>>>>> - Jcr Oak with Solr. Solr is not embedded, must run as its own application
> >>>>>> (war).
> >>>>>>
> >>>>>> Greetings
> >>>>>>
> >>>>>> Martin
> >>>>>>
> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> >>>>>>> well this is gonna be a pain.
> >>>>>>> IMHO we need to find a new alternative to jcr oak.
> >>>>>>> And something not using Lucene as it's a real pain to have different
> >>>>>>> libraries using lucene as they do not update at the same time (and Lucene
> >>>>>>> breaks backward compat so quickly...)
> >>>>>>> Any ideas? I'd like to have something embedded (but with a possible
> >>>>>>> external server configuration).
> >>>>>>> There is currently a Cassandra implementation. I was not satisfied with the
> >>>>>>> performance but I guess I did that 4 years ago so it can be improved for sure :-)
> >>>>>>> Maybe orientdb?
> >>>>>>> What else?
> >>>>>>>
> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy <ol...@apache.org> wrote:
> >>>>>>>> well the issue is non compatible versions of Lucene for Maven Indexer and
> >>>>>>>> Oak (well I can try to push a patch to Oak for upgrading...)
> >>>>>>>>
> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy <ol...@apache.org> wrote:
> >>>>>>>>> Hi
> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT doesn't need the plexus bridge anymore.
> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
> >>>>>>>>> Not sure why but I have intermittent failures with the store-jcr module.
> >>>>>>>>> I definitely agree on the upgrade.
> >>>>>>>>> Well we can simply detect it's not oak compatible and schedule a full
> >>>>>>>>> reindex (maybe with a message in logs and ui?)
> >>>>>>>>> But we need to be sure we can still read the central index, and I'm not sure
> >>>>>>>>> about a possible lucene conflict between oak and maven indexer.
> >>>>>>>>> We can work on this branch?
> >>>>>>>>> (I created a Jenkins job for it:
> >>>>>>>>> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> >>>>>>>>> If you prefer master I would say no worries either.
> >>>>>>>>> Something else to look at is upgrading maven-core etc...
> >>>>>>>>> Anyway
> >>>>>>>>> Cheers
> >>>>>>>>> Olivier
> >>>>>>>>>
> >>>>>>>>>> On 22 June 2017 at 19:16, Martin <marti...@apache.org> wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> upgrading the maven indexer leads to some major changes.
> >>>>>>>>>> Lucene is used by maven-indexer and also by jackrabbit. Jackrabbit sticks to
> >>>>>>>>>> the old 3.x version and, as I see it, they will not move to a newer
> >>>>>>>>>> version.
> >>>>>>>>>> There is Jackrabbit Oak as an alternative.
> >>>>>>>>>> I tried a proof of concept and could replace the jackrabbit implementation of
> >>>>>>>>>> metadata-store-jcr with an oak implementation. At least I got the unit tests of
> >>>>>>>>>> this module all to pass.
> >>>>>>>>>> But switching to Oak has some drawbacks:
> >>>>>>>>>> - The repository format changed and we must provide a way to migrate (either
> >>>>>>>>>> migrate the existing repository or create a new one by reindexing)
> >>>>>>>>>> - The lucene version used is newer but does not match the version from the
> >>>>>>>>>> maven-indexer dependencies. Some incompatibilities may come up that are
> >>>>>>>>>> not solvable without using a modified version of one of the two. Or there may
> >>>>>>>>>> be the possibility to switch to solr (as a separate component) and get rid of
> >>>>>>>>>> the lucene dependencies for jcr inside the archiva project.
> >>>>>>>>>>
> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some changes too:
> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> >>>>>>>>>> - We must migrate from the NexusIndexer to the indexer API.
> >>>>>>>>>>
> >>>>>>>>>> So switching to the new indexer and oak means more work than expected and some
> >>>>>>>>>> risks regarding new incompatibility problems. And I think this cannot be done
> >>>>>>>>>> without broken master builds for some time period.
> >>>>>>>>>>
> >>>>>>>>>> So, what should we do? I think maven indexer is one of the core components of
> >>>>>>>>>> archiva, and we should utilize the 3.x-version to migrate to the new indexer
> >>>>>>>>>> version, even if this means switching to jcr oak. Otherwise it would mean
> >>>>>>>>>> sticking to the old version for the next years.
> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge API changes, I hope you
> >>>>>>>>>> can provide useful help.
> >>>>>>>>>>
> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak. There are some modules
> >>>>>>>>>> where the tests do not pass (mainly because of the indexer API changes).
> >>>>>>>>>> Any comments?
> >>>>>>>>>>
> >>>>>>>>>> Cheers
> >>>>>>>>>>
> >>>>>>>>>> Martin
> >>>>>>>>>>
> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> >>>>>>>>>>> forget it but we need to ensure we can read maven index files....
> >>>>>>>>>>>
> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy <ol...@apache.org> wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well so upgrading Lucene can be a
> >>>>>>>>>>>> problem here.
> >>>>>>>>>>>> Regarding maven-indexer yes we can depend on a snapshot until the release.
> >>>>>>>>>>>> I can release it ;-)
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin <marti...@apache.org> wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> the lucene version depends on the maven indexer. But I'm not sure about the
> >>>>>>>>>>>>> current state of maven-indexer. The version has not changed since sometime in
> >>>>>>>>>>>>> 2013.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> There are commits on the master branch since then, and the lucene version has
> >>>>>>>>>>>>> been changed too, but no releases were tagged.
> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
> >>>>>>>>>>>>> 6.0-SNAPSHOT?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> As far as I know there are new compact index formats with new lucene versions
> >>>>>>>>>>>>> but I'm not sure if this is relevant for the maven indexes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Martin
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> Olivier Lamy
> >>>>>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> >
> --
> This message was sent from my Android device with K-9 Mail.
-- Olivier Lamy http://twitter.com/olamy | http://linkedin.com/in/olamy
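
For readers following the shading proposal quoted above (relocating the Lucene classes used by the Maven indexer into org.apache.maven.index.shaded.lucene), here is a minimal Java sketch of the name mapping that relocation implies. It is an illustration only, not the maven-indexer build: the class ShadeRelocationSketch and the helper relocate are hypothetical names, and a real relocation would presumably be done on the bytecode by a build tool such as the Maven Shade Plugin rather than on strings.

```java
/**
 * Illustrative sketch only: mimics the class-name mapping that package
 * relocation ("shading") performs. Not part of maven-indexer or Archiva.
 */
public class ShadeRelocationSketch {

    // Base package of the original Lucene classes.
    static final String ORIGINAL_PREFIX = "org.apache.lucene";

    // Relocated base package named in the thread.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene";

    /** Hypothetical helper: returns the relocated name of a class. */
    static String relocate(String className) {
        boolean inLucene = className.equals(ORIGINAL_PREFIX)
                || className.startsWith(ORIGINAL_PREFIX + ".");
        if (!inLucene) {
            return className; // classes outside Lucene keep their name
        }
        // Swap the package prefix, keep the rest of the name unchanged.
        return SHADED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.lucene.search.BooleanClause",
            "org.apache.jackrabbit.core.RepositoryImpl" // example non-Lucene name
        };
        for (String name : names) {
            System.out.println(name + " -> " + relocate(name));
        }
    }
}
```

Because the relocated classes have different fully qualified names, the indexer's copy of Lucene and the copy pulled in by jackrabbit or Oak would no longer collide on the classpath, which is the point of excluding the indexer's original Lucene dependencies afterwards.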