[
https://issues.apache.org/jira/browse/DERBY-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Knut Anders Hatlen updated DERBY-590:
-------------------------------------
Attachment: multifield.diff
Thanks, Rick. Those were the exact changes that were needed.
The attached patch [^multifield.diff] shows an example of how it could be used.
I made two small adjustments:
1) Instead of hard-coding the field names, I made LuceneSupport read them
dynamically from a database property (derby.tests.lucene.fields), so that I
could verify that the original Lucene tests still pass. (They do still pass, by
the way.) Also, the field names are stored in the Lucene index property file, so
that LuceneQueryVTI can find them too. This is of course just a temporary hack
until we figure out the correct API.
2) I made LuceneUtils.defaultQueryParser() always return a
MultiFieldQueryParser, since MultiFieldQueryParser seems to behave just like
QueryParser in the degenerate case with a single field.
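To illustrate the first adjustment, here is a minimal sketch of turning the value of derby.tests.lucene.fields into a list of field names. It is not the code in the patch. The comma-separated format, the class name FieldNameProperty, the fallback field name, and the use of java.util.Properties as a stand-in for Derby's database properties are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of multifield.diff.
class FieldNameProperty {
    // Property name taken from the comment above.
    static final String FIELDS_PROPERTY = "derby.tests.lucene.fields";

    // Returns the configured field names, or a single default field if the
    // property is unset or contains no names. Assumes a comma-separated value.
    static String[] fieldNames(Properties props, String defaultField) {
        String raw = props.getProperty(FIELDS_PROPERTY);
        List<String> names = new ArrayList<>();
        if (raw != null) {
            for (String part : raw.split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
        }
        if (names.isEmpty()) {
            names.add(defaultField);
        }
        return names.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(FIELDS_PROPERTY, "tags, text");
        for (String name : fieldNames(props, "defaultField")) {
            System.out.println(name);
        }
    }
}
```

With the property unset, the sketch falls back to a single field, which corresponds to the degenerate single-field case mentioned in 2).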
Since I didn't feel like writing a Java source file parser, I changed my
example use case to search in XML files, so that I could use the XML parser
that is in the JRE. I added a test case to LuceneSupportTest to verify that it
could be used for that.
The test case creates an index with two fields: tags and text. The tags field
contains only the XML tags, whereas the text field contains only the text
elements of the XML file. This way, you can use the index to search for data
and metadata separately in the XML documents stored in your table.
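The split between the two fields can be sketched with the SAX parser that ships with the JRE. This is only an illustration of the idea, not the analyzer from the patch: the class name XmlFieldSplitter and its two lists are hypothetical, and a real analyzer would feed these values into Lucene token streams rather than collect them in lists.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of multifield.diff: separates an XML
// document into the values destined for the "tags" and "text" fields.
class XmlFieldSplitter {
    final List<String> tags = new ArrayList<>();  // element names
    final List<String> text = new ArrayList<>();  // non-blank character data

    static XmlFieldSplitter split(String xml) throws Exception {
        XmlFieldSplitter result = new XmlFieldSplitter();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Each opening tag contributes its name to the tags field.
                result.tags.add(qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Character data goes to the text field; whitespace-only
                // chunks between elements are skipped.
                String chunk = new String(ch, start, length).trim();
                if (!chunk.isEmpty()) {
                    result.text.add(chunk);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        XmlFieldSplitter s =
                split("<doc><title>abc</title><body>def ghi</body></doc>");
        System.out.println("tags: " + s.tags);
        System.out.println("text: " + s.text);
    }
}
```

Note that a SAX parser is allowed to deliver one text node in several characters() calls, so a production version would buffer the chunks per element instead of adding each one directly.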
Now, while writing the test case, I found that you will most likely want to use
a custom query parser when you use it this way. The reason is that the default
query parser extracts tokens from the search terms with the same analyzer that
the index writer used. That means that if you, as in this case, use a custom
analyzer that parses XML documents, the query parser will also expect the terms
in the query to be XML documents. So you'll end up with rather silly-looking
queries.
For example, to search for documents that contain the text "abc", you cannot
make the query {{text:"abc"}}, but have to wrap it in dummy XML tags to make it
parsable: {{text:"<dummy>abc</dummy>"}}.
The custom query parser doesn't need to be very complex, though. The test case
in the patch shows one example in the method {{createXMLQueryParser()}}. That
method simply creates a MultiFieldQueryParser with a plain StandardAnalyzer.
With that parser, you can write queries like:
- {{text:abc}} to search for "abc" in the text elements of the XML
- {{tags:abc}} to search for XML tags called "abc"
- {{abc}} to search for "abc" in both text elements and tags
What do you think? Does it sound like a useful addition?
> How to integrate Derby with Lucene API?
> ---------------------------------------
>
> Key: DERBY-590
> URL: https://issues.apache.org/jira/browse/DERBY-590
> Project: Derby
> Issue Type: Improvement
> Components: Documentation, SQL
> Reporter: Abhijeet Mahesh
> Labels: derby_triage10_11
> Attachments: LucenePlugin.html, LucenePlugin.html, LucenePlugin.html,
> derby-590-01-ag-publicAccessToLuceneRoutines.diff,
> derby-590-01-ah-publicAccessToLuceneRoutines.diff,
> derby-590-01-am-publicAccessToLuceneRoutines.diff,
> derby-590-02-aa-cleanupFindbugsErrors.diff,
> derby-590-03-aa-removeTestingDiagnostic.diff,
> derby-590-04-aa-removeIDFromListIndexes.diff,
> derby-590-05-aa-accessDeclaredMembers.diff,
> derby-590-06-aa-suppressAccessChecks.diff,
> derby-590-07-aa-accessClassInPackage.sun.misc.diff,
> derby-590-08-aa-omitLuceneFlag.diff,
> derby-590-09-aa-localeSensitiveAnalysis.diff,
> derby-590-10-aa-fixLocaleTest.diff, derby-590-11-aa-moveCode.diff,
> derby-590-12-aa-newJar.diff, derby-590-13-aa-indexViews.diff,
> derby-590-14-aa-coarseGrainedAuthorization.diff,
> derby-590-15-aa-requireHardUpgrade.diff,
> derby-590-16-aa-adjustUpgradeTest.diff,
> derby-590-17-aa-closeInputStreamOnPropertiesFile.diff,
> derby-590-18-aa-cleanupAPI.diff, derby-590-19-aa-cleanupAPI2.diff,
> derby-590-20-aa-customQueryParser.diff, derby-590-21-aa-noTimeTravel.diff,
> derby-590-22-aa-cleanupPrivacy.diff, derby-590-23-aa-correctTestLocale.diff,
> derby-590-24-ad-luceneDirectory.diff, derby-590-26-ac-backupRestore.diff,
> derby-590-26-ad-backupRestoreEncryption.diff,
> derby-590-27-aa-publicAPILuceneUtils.diff,
> derby-590-28-renameLuceneJars.diff, derby-590-29-aa-useLucene_4.7.1.diff,
> derby-590-30-aa-nullableScoreCeiling.diff, exceptions.diff, lucene_demo.diff,
> lucene_demo_2.diff, multifield.diff, netbeans.diff, netbeans2.diff
>
>
> In order to use derby with lucene API what should be the steps to be taken?