Hi,

It took a bit of time, but I finally got the branch working :-)
Branch: feature/jcr_oak
Let me know what you think.

I guess there are still some optimisations to do for JCR Oak; I can see warnings like these in the logs:

21:02:39.559 [1071] [main] WARN oak.query.QueryImpl - Traversal query (query without index): SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */; consider creating an index
21:02:39.563 [328] [main] WARN plugins.index.Cursors$TraversingCursor - Traversed 1000 nodes with filter Filter(query=SELECT * FROM [nt:base] WHERE [jcr:uuid] = $id /* oak-internal */, path=*, property=[jcr:uuid=[21232f29-7a57-35a7-8389-4a0e4a801fc3]]); consider creating an index or changing the query
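For anyone following the shading discussion quoted below, here is a rough sketch of what the relocation does to class names. It is only an illustration: ShadedNames and relocate are hypothetical names, not part of maven-indexer, and I am assuming that org.apache.lucene is the only package prefix that gets relocated. The real work is done at build time by the shading step, not by code like this.

```java
// Illustration only: ShadedNames is a hypothetical helper, not part of maven-indexer.
final class ShadedNames {
    // Assumption: this is the only package prefix that gets relocated.
    static final String ORIGINAL_PREFIX = "org.apache.lucene";
    // Target package named in this thread.
    static final String SHADED_PREFIX = "org.apache.maven.index.shaded.lucene";

    // Returns the name a class would have after relocation;
    // names outside the Lucene package are returned unchanged.
    static String relocate(String className) {
        if (className.equals(ORIGINAL_PREFIX)
                || className.startsWith(ORIGINAL_PREFIX + ".")) {
            return SHADED_PREFIX + className.substring(ORIGINAL_PREFIX.length());
        }
        return className;
    }

    public static void main(String[] args) {
        // A Lucene class moves under the shaded package.
        System.out.println(relocate("org.apache.lucene.search.BooleanClause"));
        // Anything else keeps its name.
        System.out.println(relocate("java.util.List"));
    }
}
```

Since only the indexer's copy of Lucene is renamed, Archiva would reference the relocated names when talking to the indexer, while the other Lucene version on the classpath keeps its usual package.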

On 8 July 2017 at 06:22, Martin <[email protected]> wrote:

> Hi Olivier,
>
> great!
> For my understanding: the dependency on lucene in the pom of
> indexer-core is still there, but the lucene packages are moved to the
> ...maven.index.shaded... package? You develop indexer-core with the
> standard lucene packages and the shading is executed during the build
> of the indexer package?
>
> I think that may solve our dependency problem.
>
> I still got errors in the maven-indexer module, but I think the status
> is still "work in progress". I don't want to interfere too much with
> your changes.
>
> I'm not sure whether we should keep JCR Oak as the metadata
> implementation. I think OrientDB may be a feasible alternative:
> embeddable, graph database, Lucene index optional and may be omitted,
> Apache License. And with JCR Oak we also have to convert the existing
> metadata index.
>
> But one step after the other. If we agree that the shaded indexer
> works, we should merge only the maven indexer changes to the master
> branch without the JCR/lucene update, and change JCR and/or lucene
> afterwards.
>
> Greetings
>
> Martin
>
> On Friday, 7 July 2017, 09:23:24 CEST, Olivier Lamy wrote:
> > So the repo contains a branch feature/jar_shaded_lucene here
> > https://git1-us-west.apache.org/repos/asf?p=maven-indexer.git;a=summary
> > and I pushed what I started for Archiva in the branch called
> > feature/jcr_oak
> > So in order to test it you first need to build maven-indexer from the
> > branch feature/jar_shaded_lucene
> >
> > On 6 July 2017 at 22:31, Olivier Lamy <[email protected]> wrote:
> > > I will try to share the work I did tomorrow in a branch
> > >
> > > On Thu, 6 Jul 2017 at 7:48 pm, Martin Stockhammer <[email protected]>
> > > wrote:
> > >> We have different (incompatible) lucene dependencies that prevent
> > >> us from updating the maven indexer and/or jackrabbit.
> > >> And this will happen again with each upgrade of one of these two
> > >> packages in the future.
> > >> So it would be really good if we could find a solution that removes
> > >> one of the lucene dependencies.
> > >>
> > >> Greetings
> > >>
> > >> Martin
> > >>
> > >> On 6 July 2017, 09:36:06 CEST, Chris Graham <[email protected]> wrote:
> > >> >Can I please ask an obvious/stupid question?
> > >> >
> > >> >What is driving this need for change?
> > >> >
> > >> >From a quick read of the thread above, all of the options appear to
> > >> >introduce a lot of breaking changes, and a whole lot more
> > >> >uncertainty.
> > >> >
> > >> >So, what is so broken that it is driving these changes?
> > >> >
> > >> >Sent from my iPhone
> > >> >
> > >> >> On 6 Jul 2017, at 12:39 pm, Olivier Lamy <[email protected]> wrote:
> > >> >>
> > >> >> Yup.
> > >> >> The idea is to have an extra jar produced by the maven-indexer
> > >> >> with a shaded lucene version.
> > >> >> So the lucene classes (version used by Maven indexer) will be
> > >> >> relocated to a package called org.apache.maven.index.shaded.lucene
> > >> >> (such as org.apache.maven.index.shaded.lucene.search.BooleanClause)
> > >> >> Then you exclude the lucene dependencies used by maven indexer
> > >> >> and voila.
> > >> >> The voila is a bit optimistic and not so easy, but anyway I'm
> > >> >> working on it ATM.
> > >> >>
> > >> >>> On 6 July 2017 at 07:08, Martin <[email protected]> wrote:
> > >> >>>
> > >> >>> What do you mean exactly by shading? Moving to another package
> > >> >>> name?
> > >> >>>
> > >> >>> On Wednesday, 5 July 2017, 01:19:17 CEST, Olivier Lamy wrote:
> > >> >>>> maybe an option is to use some shading?
> > >> >>>> I'm thinking of shading the lucene packages used by maven
> > >> >>>> indexer. I can easily provide a build for that.
> > >> >>>> WDYT?
> > >> >>>>
> > >> >>>>> On 26 June 2017 at 11:49, Olivier Lamy <[email protected]> wrote:
> > >> >>>>> Hi
> > >> >>>>> graph/document storage could be convenient (but not possible
> > >> >>>>> with neo4j as it's GPL licensed [1])
> > >> >>>>> well, we can add solr as an additional webapp with our jetty
> > >> >>>>> distribution, but this will be a pain for users who want to
> > >> >>>>> use tomcat or any other servlet container...
> > >> >>>>> we still need to investigate a new storage model :-)
> > >> >>>>>
> > >> >>>>> Olivier
> > >> >>>>> [1] https://neo4j.com/licensing/
> > >> >>>>>
> > >> >>>>>> On 25 June 2017 at 06:26, Martin <[email protected]> wrote:
> > >> >>>>>> Yes, you are right. The lucene dependency causes a lot of
> > >> >>>>>> trouble and will cause headaches with each version change
> > >> >>>>>> of one of the dependencies.
> > >> >>>>>> What are the requirements for a replacement?
> > >> >>>>>> - We want to store hierarchical data?
> > >> >>>>>> - We want to store metadata for nodes?
> > >> >>>>>> - Fulltext search (only metadata or for artifacts too?)
> > >> >>>>>> - Blob / artifact storage (I don't think so, but I'm not so
> > >> >>>>>> familiar with the archiva artifact model)?
> > >> >>>>>>
> > >> >>>>>> Maybe some graph database may be an alternative. I don't
> > >> >>>>>> know if the license of neo4j is compatible with the apache
> > >> >>>>>> license, and I think it brings lucene as a dependency too.
> > >> >>>>>> I will have a look.
> > >> >>>>>> The problem is, if fulltext search is needed, I think that
> > >> >>>>>> for most of the frameworks we get a lucene dependency, if
> > >> >>>>>> it's embedded.
> > >> >>>>>>
> > >> >>>>>> Other alternatives:
> > >> >>>>>> - Implement fulltext search on our own (index of the
> > >> >>>>>> metadata stored via the archiva api) and use the lucene
> > >> >>>>>> dependency that comes from the maven-indexer
> > >> >>>>>> - Jcr Oak with Solr. Solr is not embedded, it must run as
> > >> >>>>>> its own application (war).
> > >> >>>>>>
> > >> >>>>>> Greetings
> > >> >>>>>>
> > >> >>>>>> Martin
> > >> >>>>>>
> > >> >>>>>> On Saturday, 24 June 2017, 14:05:26 CEST, Olivier Lamy wrote:
> > >> >>>>>>> well, this is going to be a pain.
> > >> >>>>>>> IMHO we need to find a new alternative to jcr oak.
> > >> >>>>>>> And something not using Lucene, as it's a real pain to
> > >> >>>>>>> have different libraries using lucene when they do not
> > >> >>>>>>> update at the same time (and Lucene breaks backward
> > >> >>>>>>> compat so quickly...)
> > >> >>>>>>> Any ideas? I'd like to have something embedded (but with
> > >> >>>>>>> a possible external server configuration).
> > >> >>>>>>> There is currently a Cassandra implementation. I was not
> > >> >>>>>>> satisfied with the performance, but I guess I did that 4
> > >> >>>>>>> years ago so it can be improved for sure :-)
> > >> >>>>>>> Maybe orientdb?
> > >> >>>>>>> What else?
> > >> >>>>>>>
> > >> >>>>>>>> On 24 June 2017 at 09:50, Olivier Lamy <[email protected]> wrote:
> > >> >>>>>>>> well, the issue is incompatible versions of Lucene for
> > >> >>>>>>>> Maven Indexer and Oak (well, I can try to push a patch
> > >> >>>>>>>> to Oak for upgrading...)
> > >> >>>>>>>>
> > >> >>>>>>>>> On 24 June 2017 at 08:41, Olivier Lamy <[email protected]> wrote:
> > >> >>>>>>>>> Hi
> > >> >>>>>>>>> Maven Indexer 6.0-SNAPSHOT no longer needs the plexus
> > >> >>>>>>>>> bridge.
> > >> >>>>>>>>> I'm working on it in the branch ( feature/jcr_oak )
> > >> >>>>>>>>> Not sure why, but I have intermittent failures with the
> > >> >>>>>>>>> store-jcr module.
> > >> >>>>>>>>> I definitely agree on the upgrade.
> > >> >>>>>>>>> Well, we can simply detect it's not oak compatible and
> > >> >>>>>>>>> schedule a full reindex (maybe with a message in logs
> > >> >>>>>>>>> and ui?)
> > >> >>>>>>>>> But we need to be sure we can still read the central
> > >> >>>>>>>>> index, and I'm not sure about a possible lucene
> > >> >>>>>>>>> conflict between oak and maven indexer.
> > >> >>>>>>>>> Can we work on this branch? (I created a Jenkins job
> > >> >>>>>>>>> for it
> > >> >>>>>>>>> https://builds.apache.org/view/A-D/view/Archiva/job/archiva-jcr-oak-branch/)
> > >> >>>>>>>>> If you prefer master I would say no worries either.
> > >> >>>>>>>>> Something else to look at is upgrading maven-core etc...
> > >> >>>>>>>>> Anyway
> > >> >>>>>>>>> Cheers
> > >> >>>>>>>>> Olivier
> > >> >>>>>>>>>
> > >> >>>>>>>>>> On 22 June 2017 at 19:16, Martin <[email protected]> wrote:
> > >> >>>>>>>>>> Hi,
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> upgrading the maven indexer leads to some major
> > >> >>>>>>>>>> changes.
> > >> >>>>>>>>>> Lucene is used by maven-indexer and also by
> > >> >>>>>>>>>> jackrabbit. Jackrabbit sticks to the old 3.x version
> > >> >>>>>>>>>> and, as I see it, they will not move to a newer
> > >> >>>>>>>>>> version.
> > >> >>>>>>>>>> There is Jackrabbit Oak as an alternative.
> > >> >>>>>>>>>> I tried a proof of concept and could replace the
> > >> >>>>>>>>>> jackrabbit implementation of metadata-store-jcr with
> > >> >>>>>>>>>> an Oak implementation. At least I got the unit tests
> > >> >>>>>>>>>> of this module all to pass.
> > >> >>>>>>>>>> But switching to Oak has some drawbacks:
> > >> >>>>>>>>>> - The repository format changed and we must provide a
> > >> >>>>>>>>>> way to migrate (either migrate the existing
> > >> >>>>>>>>>> repository or create a new one by reindexing)
> > >> >>>>>>>>>> - The lucene version used is newer but does not match
> > >> >>>>>>>>>> the version from the maven-indexer dependencies.
> > >> >>>>>>>>>> There may come up some incompatibilities that are not
> > >> >>>>>>>>>> solvable without using a modified version of one of
> > >> >>>>>>>>>> the two. Or there may be the possibility to switch to
> > >> >>>>>>>>>> solr (as a separate component) and get rid of the
> > >> >>>>>>>>>> lucene dependencies for jcr inside the archiva
> > >> >>>>>>>>>> project.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Switching to maven-indexer 6.0-SNAPSHOT means some
> > >> >>>>>>>>>> changes too:
> > >> >>>>>>>>>> - The Plexus-Sisu-Bridge does not work as before.
> > >> >>>>>>>>>> - We must migrate from the NexusIndexer to the
> > >> >>>>>>>>>> indexer API.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> So switching to the new indexer and oak means more
> > >> >>>>>>>>>> work than expected and some risks regarding new
> > >> >>>>>>>>>> incompatibility problems.
> > >> >>>>>>>>>> And I think this cannot be done without broken master
> > >> >>>>>>>>>> builds for some time period.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> So, what should we do? I think maven indexer is one
> > >> >>>>>>>>>> of the core components of archiva, and we should use
> > >> >>>>>>>>>> the 3.x version to migrate to the new indexer
> > >> >>>>>>>>>> version, even if this means switching to jcr oak.
> > >> >>>>>>>>>> Otherwise it would mean sticking to the old version
> > >> >>>>>>>>>> for the next years.
> > >> >>>>>>>>>> @Olivier, regarding the maven-indexer / sisu-Bridge
> > >> >>>>>>>>>> API changes, I hope you can provide useful help.
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> I committed the PoC to the branch feature/jcr_oak.
> > >> >>>>>>>>>> There are some modules where the tests do not pass
> > >> >>>>>>>>>> (mainly because of the indexer API changes).
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Any comments?
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Cheers
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> Martin
> > >> >>>>>>>>>>
> > >> >>>>>>>>>> On Tuesday, 13 June 2017, 09:07:35 CEST, Olivier Lamy wrote:
> > >> >>>>>>>>>>> forget it, but we need to ensure we can read maven
> > >> >>>>>>>>>>> index files....
> > >> >>>>>>>>>>>
> > >> >>>>>>>>>>> On 13 June 2017 at 17:06, Olivier Lamy <[email protected]> wrote:
> > >> >>>>>>>>>>>> Hi,
> > >> >>>>>>>>>>>> Remember jackrabbit depends on Lucene as well, so
> > >> >>>>>>>>>>>> upgrading Lucene can be a problem here.
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> Regarding maven-indexer: yes, we can depend on a
> > >> >>>>>>>>>>>> snapshot until the release.
> > >> >>>>>>>>>>>> I can release it ;-)
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> On 13 June 2017 at 06:06, Martin <[email protected]> wrote:
> > >> >>>>>>>>>>>>> Hi,
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> the lucene version depends on the maven indexer.
> > >> >>>>>>>>>>>>> But I'm not sure about the current state of
> > >> >>>>>>>>>>>>> maven-indexer. The version has not changed since
> > >> >>>>>>>>>>>>> some time in 2013.
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> There are commits on the master branch since
> > >> >>>>>>>>>>>>> then, and the lucene version has been changed
> > >> >>>>>>>>>>>>> too, but no releases were tagged.
> > >> >>>>>>>>>>>>> Does it make sense to switch to the maven-indexer
> > >> >>>>>>>>>>>>> 6.0-SNAPSHOT?
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> As far as I know there are new compact index
> > >> >>>>>>>>>>>>> formats with new lucene versions, but I'm not
> > >> >>>>>>>>>>>>> sure if this is relevant for the maven indexes.
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> Cheers
> > >> >>>>>>>>>>>>>
> > >> >>>>>>>>>>>>> Martin
> > >> >>>>>>>>>>>>
> > >> >>>>>>>>>>>> --
> > >> >>>>>>>>>>>> Olivier Lamy
> > >> >>>>>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>>>>>>>>
> > >> >>>>>>>>> --
> > >> >>>>>>>>> Olivier Lamy
> > >> >>>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>>>>>>>
> > >> >>>>>>>> --
> > >> >>>>>>>> Olivier Lamy
> > >> >>>>>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>>>>
> > >> >>>>> --
> > >> >>>>> Olivier Lamy
> > >> >>>>> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >> >>
> > >> >> --
> > >> >> Olivier Lamy
> > >> >> http://twitter.com/olamy | http://linkedin.com/in/olamy
> > >>
> > >> --
> > >> This message was sent from my Android device with K-9 Mail.
> > >
> > > --
> > > Olivier Lamy
> > > http://twitter.com/olamy | http://linkedin.com/in/olamy
> >
>

--
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy
