Hello Tommaso,
thanks a lot for your reply :) I will follow the steps you gave me as
soon as I find some time for this.
Also thanks for the SolrCas hint. I think we already talked about this.
As far as I understood, Solrcas as well as the Solr-UIMA integration
lack some of the features offered by LuCas, for example the alignment of
TokenStreams, which allows you to merge multiple CAS indexes into a
single Lucene field where position increments are adjusted to stack
Lucene tokens with the same offsets. Please (!!) correct me if I'm wrong
here, as I am still working out my own way of using UIMA together with Solr.
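To make the stacking idea concrete, here is a minimal, self-contained sketch of how Lucene interprets position increments (all class and token names below are hypothetical, not LuCas code): each token advances the current position by its increment, so an increment of 0 places a token at the same position as the previous one, which is what lets annotations from different CAS layers stack in one field.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of Lucene's position-increment semantics (hypothetical names):
 * a token with increment 0 "stacks" on the previous token's position.
 */
public class PositionStacking {

    static class Tok {
        final String term;
        final int posIncr;
        Tok(String term, int posIncr) { this.term = term; this.posIncr = posIncr; }
    }

    /** Compute absolute positions the way Lucene does: position += increment. */
    static List<String> positions(List<Tok> stream) {
        List<String> out = new ArrayList<>();
        int pos = -1; // before the first token
        for (Tok t : stream) {
            pos += t.posIncr;
            out.add(t.term + "@" + pos);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two annotation layers merged into one stream: "car" from the token
        // layer and "VEHICLE" from a concept layer covering the same offsets.
        List<Tok> merged = List.of(
                new Tok("the", 1),
                new Tok("car", 1),
                new Tok("VEHICLE", 0), // increment 0: same position as "car"
                new Tok("stops", 1));
        System.out.println(positions(merged)); // [the@0, car@1, VEHICLE@1, stops@2]
    }
}
```

With "car" and "VEHICLE" sharing position 1, a phrase query such as "the VEHICLE stops" would match the original text, which is the point of aligning the merged streams.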
Thanks again and warm regards,
Erik
On 13.04.2011 14:16, Tommaso Teofili wrote:
Hello Erik,
that would be a very valuable contribution indeed!
The common way of contributing code is creating a patch file which contains
the differences between your current working copy and the latest revision
available in SVN; you can read more about how to do this at
http://www.apache.org/dev/contributors.html#patches .
Then you create a Jira issue under the UIMA project [1] and attach the
created file to the issue.
At that point a committer will review your patch and will commit it if
everything is fine :)
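The steps above can be sketched as follows (the file name and issue key are placeholders, not real values):

```shell
# From the root of the Sandbox working copy:
svn update                          # sync with the latest SVN revision
svn add src/main/java/NewClass.java # register any newly created files (example path)
svn diff > UIMA-XXXX.patch          # capture your changes as a unified diff
# Then create a Jira issue at https://issues.apache.org/jira/browse/UIMA
# and attach UIMA-XXXX.patch to it.
```

`svn diff` compares your working copy against its base revision, so make sure you update first to avoid a stale patch.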
As a side note, if you want to use Solr within a UIMA pipeline you might be
interested in Solrcas [2] or in the Solr-UIMA integration available in the
Solr 3.1.0 release [3].
Hope this helps,
Tommaso
[1] : https://issues.apache.org/jira/browse/UIMA
[2] : http://uima.apache.org/sandbox.html#solrcas.consumer
[3] : http://wiki.apache.org/solr/SolrUIMA
2011/4/13 Erik Fäßler<[email protected]>
Hey all,
back in January, I had the need to have the CAS Lucene indexer (LuCas, UIMA
Sandbox component) working with Lucene 2.9.x. So I checked it out from the
Sandbox SVN, updated the libraries and fixed the compiling bugs. The result
is a LuCas component working with Lucene 2.9.3. At least all tests are
working and I used the component (together with Solr which was why I needed
Lucene 2.9.x) successfully.
The changes needed were not too big, as I did not take the leap to Lucene
3.x. Some filters have been updated to the new Token API, and one or two
classes required a more or less complete rewrite before the tests would pass
again.
So, my question: Would it be desirable to commit these changes back to the
Sandbox SVN? Which steps would I have to take for this? Or should I
just send my sources to a developer? The component was originally created in
my lab, but its developer moved to another workplace quite a
while ago.
Best regards,
Erik