I opened this issue to fix jar arrangements so that the OpenNLP integration could work. analysis-extras, opennlp, and uima share the same problem: they use lucene libraries and third-party dependencies.

Fixing license file problems is certainly helpful, but does not make deployment any easier. This issue was essentially hijacked.

Here is one way to make it easy to deploy items outside of the solr war file: repack dependent jars into all contrib dist/ jars. Just pack everything about analysis-extras into dist/analysis-extras.jar. Remove the contrib lucene libraries from the war file.

Add this to solr/contrib/analysis-extras/build.xml:

<target name="addjars">
  <zip destfile="../../dist/apache-solr-analysis-extras-4.0-SNAPSHOT.jar" update="true">
    <zipfileset src="../../../lucene/build/analysis/common/lucene-analyzers-common-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="../../../lucene/build/analysis/icu/lucene-analyzers-icu-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="../../../lucene/build/analysis/kuromoji/lucene-analyzers-kuromoji-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="../../../lucene/build/analysis/morfologik/lucene-analyzers-morfologik-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="../../../lucene/build/analysis/phonetic/lucene-analyzers-phonetic-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="../../../lucene/build/analysis/smartcn/lucene-analyzers-smartcn-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="../../../lucene/build/analysis/stempel/lucene-analyzers-stempel-4.0-SNAPSHOT.jar" excludes="META-INF/MANIFEST.MF" />

    <zipfileset src="lib/icu4j-49.1.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="lib/morfologik-fsa-1.5.3.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="lib/morfologik-polish-1.5.3.jar" excludes="META-INF/MANIFEST.MF" />
    <zipfileset src="lib/morfologik-stemming-1.5.3.jar" excludes="META-INF/MANIFEST.MF" />
  </zip>
</target>

Run 'ant dist addjars'. The dist jar grows from 20k (one file) to 21M (4035 files), but that 21M is a single deployable file. Everything is in one place!

Caveats:

  • This approach needs a little rearranging of the order of the build steps. There is no place visible to the contrib build.xml where the solr/build dist jar is finished, but not yet copied to dist/.
  • I don't know what to do about META-INF files in the absorbed libraries. This approach just drops each absorbed jar's META-INF/MANIFEST.MF, so the dist jar keeps its own manifest; any other META-INF entries are copied in as-is.
  • Redundant dependencies:
    • analysis-extras and extraction both use icu4j, which is a huge jar. Too bad.
    • dataimporter wants all of extraction. Stick with the current arrangement.

This design is appropriate for analysis-extras, uima and opennlp. All of these have lucene libraries and lib/ directories, and the current build arrangement just plain does not work. It is a convenience for clustering, dataimporthandler (-extras), extraction, langid, and velocity.

The build.xml file above needs macro-izing, and as mentioned the build sequence needs a point where the contrib build can repack the dist file inside solr/build.
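
As a rough illustration of the macro-izing mentioned above, here is a minimal sketch using Ant's standard <macrodef> task. The macro name absorb-jar, its attributes into and src, and the properties dist.jar and lucene.analysis.dir are hypothetical names made up for this example; none of them exist in the Solr build. It only factors out the repeated <zipfileset> and excludes; it does not solve the build-ordering problem.

```xml
<!-- Hypothetical properties; names are illustrative, not from the Solr build. -->
<property name="dist.jar" location="../../dist/apache-solr-analysis-extras-4.0-SNAPSHOT.jar"/>
<property name="lucene.analysis.dir" location="../../../lucene/build/analysis"/>

<!-- Hypothetical macro: merges the contents of one jar into an existing jar,
     skipping the absorbed jar's manifest so the target jar keeps its own. -->
<macrodef name="absorb-jar">
  <attribute name="into"/>
  <attribute name="src"/>
  <sequential>
    <zip destfile="@{into}" update="true">
      <zipfileset src="@{src}" excludes="META-INF/MANIFEST.MF"/>
    </zip>
  </sequential>
</macrodef>

<target name="addjars">
  <!-- One call per absorbed jar; only two are shown here. -->
  <absorb-jar into="${dist.jar}" src="${lucene.analysis.dir}/icu/lucene-analyzers-icu-4.0-SNAPSHOT.jar"/>
  <absorb-jar into="${dist.jar}" src="lib/icu4j-49.1.jar"/>
</target>
```

Each absorb-jar call updates the destination jar separately, so this is simpler to read but probably slower than a single <zip> with many <zipfileset> children. A shared macro in a common build file would let uima and opennlp reuse it.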

Change By: Lance Norskog (26/Aug/12 11:30)
Status: Reopened (was: Resolved)
Resolution: Fixed