Hi Uwe, Thanks for taking the time out of your Easter break to respond. I've created the following Jira tickets for preparatory tasks:
LUCENE-9317 <https://issues.apache.org/jira/browse/LUCENE-9317> Resolve package name conflicts for StandardAnalyzer to allow Java module system support LUCENE-9318 <https://issues.apache.org/jira/browse/LUCENE-9318> Fix Codec API to not rely on package-private classes as part of changes to support java module system LUCENE-9319 <https://issues.apache.org/jira/browse/LUCENE-9319> Clean up Sandbox project by retiring/delete functionality or move it to Lucene core I've also created the following for adding module-info.java: LUCENE-9320 <https://issues.apache.org/jira/browse/LUCENE-9320> Enable support the Java module system by adding module-info to projects I've checked out the master branch and starting to look at setting up the development environment using Gradle to see if I can assist with a few items. I notice Gradle hasn't been set up for Eclipse yet (maybe something I can help with). My first roadblock to getting setup is understanding the cyclic dependencies between core, codecs and the test-framework. Given IRC is not really used from what I can see, and the slack channel is private to people with @apache.org emails. Is there any good place to get basic help on getting started with the codebase without spamming this email list? Thanks, David. On Sat, Apr 11, 2020 at 9:14 PM Uwe Schindler <u...@thetaphi.de> wrote: > Hi, > > > > I am currently on Easter vacation, but just wanted to give some feedback: > > They wy you are doing it works at the moment, because you don’t have > anything to do with service providers. You don’t use custom codecs (outside > lucene-core.jar) and you just ignored the META-IF/services folder. Your > example works, as Lucene finds everything that is needed in the core.jar, > so the “legacy” service loading works (lucene-core.jar find its own > services in lucene-core.jar). As soon as you would add > lucene-backwards-codecs.jar, it would break unfixable, unless you copy the > whole backwards codecs into core. This is completely unfixable in Lucene > 8.x, as the new service-loader interface requires Java 9+. > > > > But you may not yet have noticed that I already started to do preparatory > work and migrated in master (Lucene 9.0) the service provider to use the > Java Runtime classes and we have thrown away our own service loader (which > cannot cross module boundaries). The fix was merged not long ago: > https://issues.apache.org/jira/browse/LUCENE-9281 - this cannot be > backported to Java 8 / Lucene 8, as the reason for the home-made > SPIClassIterator was exactly the missing features (and some classpath > ordering issues in 3rd party JVMs like IBM J9). Those issues are fixed in > Java 11 (also allowing to instantiate the classes on your own). When you > are testing your “hack patch” with Lucene’s master branch you may succeed > with also loading classes from the backwards codecs – no guarantees yet!) > > > > About StandardAnalyzer: Unfortunately I aggressively complained a while > back when Mike McCandless wanted to move standard analyzer out of the > analysis package into core (“for convenience”). This was a bad step, and > IMHO we should revert that or completely rename the packages and > everything. The problem here is: As the analysis services are only part of > lucene-analyzers, we had to leave the factory classes there, but move the > implementation classes in core. The package has to be the same. The only > way around that is to move the analysis factory framework also to core (I > would not be against that). This would include all factory base classes and > the service loading stuff. Then we can move standard analyzer and some of > the filters/tokenizers including their factories to core an that problem > would be solved. > > > > My plan to move to modules is the following: > > > > (1) Do preparatory work: > > - Retire SPIClassIterator (DONE). > - Add some preparatory issues to cleanup class hierarchy: Move > Analysis SPI to core / remove StandardAnalyzer and related classes out of > core back to anaysis > - Fix Codec API to no rely on package-private classes, so we can have > a completely public API with abstract classes for codecs, so stuff in > backwards-codecs does not need to have access to package private stuff in > core. > - Cleanup sandbox to prefix all classes there with “sandbox” package > and where needed remove package-private access. If it’s needed for internal > access, WTF: Just move the stuff to core! We have a new version 9.0, so > either retire/delete Sandbox stuff or make it part of Lucene core. > > (2) Wait for the Gradle Build to be finalized, because including Module > stuff into the current Ant build won’t work > > - Ant build must be retired! > > (3) Make Lucene real modules > > - As we are on Java 11 already, add module-info.java everywhere > - Fix gradle build to create and test modules (Latest Gradle needed) > - Migrate all META-INF/services/* to module-info.java (before doing > this, figure out of the META-INF files must stays for non-module usage, or > if Java is clever enough to also look into module descriptor to find > services). We may need all services at both locations (for module or > classpath usage; we need a build helper to check that it’s in-line) > > > > I don’t want Lucene work with automodules in Java 11, it should be fully > modularized. > > > > If you want to help and express interest: Please open an issue for the > preparatory work listing the cases of same-package in different jar files. > I would split this up as described above: Analysis issues with > standardanalyzer, codecs pkg-private apis, sandbox. > > > > Uwe > > > > ----- > > Uwe Schindler > > Achterdiek 19, D-28357 Bremen > > https://www.thetaphi.de > > eMail: u...@thetaphi.de > > > > *From:* David Ryan <da...@livemedia.com.au> > *Sent:* Saturday, April 11, 2020 9:30 AM > *To:* dev@lucene.apache.org > *Subject:* Re: Lucene 9.0 Java module system support > > > > > > Thanks, I had probably missed some of what Uwe was saying in regards to > the limitation of what would be possible even if the suggested changes were > made. Given your points and the fact that it would take a while to have any > of the changes filter into a release version, I decided to develop a > patch-lucene.sh script to validate if the changes would allow me to access > the required parts of Lucene. While it may not provide the full > functionality provided by SPI including analysis chains and codecs, I > worked out it will allow the use of the basics. > > > > The patch script below adds Automatic module names to the four Lucene > libraries I needed (not strictly required, but I was interested to check if > it would work). For now, any duplicate packages I identified have been > moved into the core library (as identified in suggested changes). Given, > I'm not using the SPI functionality, I simply deleted the standard > analysers from the services list (required or the module system complains > they are missing). Thankfully Gradle doesn't validate jars in the cache so > the patch script only needs to be run once. > > > > Once the patch script was applied, I was able to use Ngram analysers and > the query parser without issue. This will provide the essentials of what we > needed for an embedded solution (location indexing, short text indexing > with ngrams tokenizer and lower case filter, searching by location with > nearest, bounding box and radius search, and partial text searches using > the query parser). Of course, I've only scratched the surface of the APIs > and right now it isn't clear if/when the SPI functionality might cause > further issues or be required. > > > > So, I've validated that the suggested changes would allow basic > functionality to work under the Java module system. If there's anything > else I can do to help progress the suggested changes please let me know. > > > > Regards, > > David. > > > > > > --------- patch-lucene.sh -- works on osx ------ > > mkdir -p patch-lucene/core > mkdir -p patch-lucene/analyzers > mkdir -p patch-lucene/sandbox > mkdir -p patch-lucene/queryparser > cd patch-lucene > > # copy the jars from the gradle cache. > cp > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar > . > cp > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar > . > cp > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar > . > cp > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar > . > > # Add automatic module name to core. > cd core > jar -xf ../lucene-core-8.5.0.jar > sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: > org.apache.lucene.core'$'\n' META-INF/MANIFEST.MF > > # Add automiatc module name, move standard analysis classes into core. > Remove any classes in standard from service lists. > cd ../analyzers > jar -xf ../lucene-analyzers-common-8.5.0.jar > sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: > org.apache.lucene.analyzers.common'$'\n' META-INF/MANIFEST.MF > sed -i '' -e '/standard/d' META-INF/services/* > mv org/apache/lucene/analysis/standard/* > ../core/org/apache/lucene/analysis/standard > rmdir org/apache/lucene/analysis/standard > jar -cfm ../lucene-analyzers-common-8.5.0-fix.jar META-INF/MANIFEST.MF . > > # Add automatic module name, move search and document packages into core. > move Java 9 specific search into core. > cd ../sandbox > jar -xf ../lucene-sandbox-8.5.0.jar > sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: > org.apache.lucene.sandbox'$'\n' META-INF/MANIFEST.MF > mv org/apache/lucene/search/* ../core/org/apache/lucene/search > rmdir org/apache/lucene/search > mv org/apache/lucene/document/* ../core/org/apache/lucene/document > rmdir org/apache/lucene/document > mv META-INF/versions/9/org/apache/lucene/search/* > ../core/META-INF/versions/9/org/apache/lucene/search > rm -rf META-INF/versions > jar -cfm ../lucene-sandbox-8.5.0-fix.jar META-INF/MANIFEST.MF . > > # Package up core with changes. > cd ../core > jar -cfm ../lucene-core-8.5.0-fix.jar META-INF/MANIFEST.MF . > > # Add automatic module name. > cd ../queryparser > jar -xf ../lucene-queryparser-8.5.0.jar > sed -i '' -e '/Created-By:/a\'$'\n''Automatic-Module-Name: > org.apache.lucene.queryparser'$'\n' META-INF/MANIFEST.MF > jar -cfm ../lucene-queryparser-8.5.0-fix.jar META-INF/MANIFEST.MF . > > # Copy the fixed versions back into gradle cache. > cd .. > cp lucene-core-8.5.0-fix.jar > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-core/8.5.0/3f9ea85fff4fc3f7c83869dddb9b0ef7818c0cae/lucene-core-8.5.0.jar > cp lucene-analyzers-common-8.5.0-fix.jar > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-analyzers-common/8.5.0/7156f2e545fd6e7faaee4781d15eb60cf5f07646/lucene-analyzers-common-8.5.0.jar > cp lucene-sandbox-8.5.0-fix.jar > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-sandbox/8.5.0/2b275921f2fd92b15b4f1a2a565467c3fa221ef9/lucene-sandbox-8.5.0.jar > cp lucene-queryparser-8.5.0-fix.jar > ~/.gradle/caches/modules-2/files-2.1/org.apache.lucene/lucene-queryparser/8.5.0/13c38f39b1a7d10c4749ba789fa95da5868d4885/lucene-queryparser-8.5.0.jar > > ------------ > > > > On Sat, Apr 11, 2020 at 11:21 AM Chris Hostetter <hossman_luc...@fucit.org> > wrote: > > > : If the changes I proposed are still viewed as having too many downstream > : impacts, my fallback position will be to patch the jars. This involves > : using the gradle import system to get the jars from Maven, then using a > : patch script to manually unzip the jars, move the offending classes into > : other jars which share the same package name and rezip. So far, I've been > > I'm no expert here, but i trust that Uwe is, and i feel like your followup > questions/suggests have still avoided his primary point about *why* > Lucene/Solr hasn't attempted jigsaw modulariation... > > : >>> There is currently some preparatory things to move forward with > modules, > : >>> so although you might be able to actually compile Lucene with module > system > : >>> (by limiting to a subset of JAR files), it currently won’t work > : >>> cross-module due to the way how it handles ServiceLoader interfaces > : >>> (codecs, postings formats, analyzers, see > : >>> https://issues.apache.org/jira/browse/LUCENE-9281). The only way to > : >>> make it work at runtime is to put all of Lucene into one module. > > ...so, IIUC, even if you patch the *current* jars, any Lucene code you use > that depends on SPI (like analysis chains, codecs, etc...) isn't going to > work unless follow Uwe's primary suggestion for folks who care about > modules... > > : >>> Th general recommendation is to combine all required Lucene libraries > : >>> into a separate JAR file during the maven / gradle build (e.g. using > the > : >>> Maven Shade plugin). Keep in mind that Lucene is also not suitable > for use > > > -Hoss > http://www.lucidworks.com/ > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >