[jira] [Commented] (OAK-319) Similar (rep:similar) support

Thomas Mueller (JIRA) Mon, 31 Mar 2014 08:35:34 -0700

    [ https://issues.apache.org/jira/browse/OAK-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955300#comment-13955300 ]


Thomas Mueller commented on OAK-319:
------------------------------------

You are right, the path is in the Lucene index. The problem was looking up the 
document by path: the Lucene MoreLikeThis tool searches for the document using 
a PhraseQuery, which doesn't work for paths. To work around this, I have a 
patch for the MoreLikeThisHelper (below) that looks up the document with a 
TermQuery on the path field instead. With the patch, the lookup by path works 
as expected. There is still a problem: the MoreLikeThis tool expects the 
content to be stored in the document, but as far as I can see we don't do that 
right now. The document only contains the path: 
"[stored,indexed,tokenized,omitNorms,indexOptions=DOCS_ONLY<:path:/test/a>]". I 
think we need to store the contents of the document in the document itself for 
rep:similar to work.

Patch:
{noformat}
#P oak-lucene
Index: src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/util/MoreLikeThisHelper.java
===================================================================
--- src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/util/MoreLikeThisHelper.java	(revision 1583237)
+++ src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/util/MoreLikeThisHelper.java	(working copy)
@@ -17,10 +17,17 @@
 package org.apache.jackrabbit.oak.plugins.index.lucene.util;
 
 import java.io.StringReader;
+
+import org.apache.jackrabbit.oak.plugins.index.lucene.FieldNames;
 import org.apache.lucene.analysis.Analyzer;
 import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.Term;
 import org.apache.lucene.queries.mlt.MoreLikeThis;
+import org.apache.lucene.search.IndexSearcher;
 import org.apache.lucene.search.Query;
+import org.apache.lucene.search.ScoreDoc;
+import org.apache.lucene.search.TermQuery;
+import org.apache.lucene.search.TopDocs;
 
 /**
  * Helper class for generating a {@link org.apache.lucene.queries.mlt.MoreLikeThisQuery} from the native query <code>String</code>
@@ -33,6 +40,7 @@
         mlt.setAnalyzer(analyzer);
         try {
             String text = null;
+            String[] fields = {};
             for (String param : mltQueryString.split("&")) {
                 String[] keyValuePair = param.split("=");
                 if (keyValuePair.length != 2 || keyValuePair[0] == null || keyValuePair[1] == null) {
@@ -41,7 +49,7 @@
                     if ("stream.body".equals(keyValuePair[0])) {
                         text = keyValuePair[1];
                     } else if ("mlt.fl".equals(keyValuePair[0])) {
-                        mlt.setFieldNames(keyValuePair[1].split(","));
+                        fields = keyValuePair[1].split(",");
                     } else if ("mlt.mindf".equals(keyValuePair[0])) {
                         mlt.setMinDocFreq(Integer.parseInt(keyValuePair[1]));
                     } else if ("mlt.mintf".equals(keyValuePair[0])) {
@@ -66,7 +74,21 @@
                 }
             }
             if (text != null) {
-                moreLikeThisQuery = mlt.like(new StringReader(text), mlt.getFieldNames()[0]);
+                if (FieldNames.PATH.equals(fields[0])) {
+                    IndexSearcher searcher = new IndexSearcher(reader);
+                    TermQuery q = new TermQuery(new Term(FieldNames.PATH, text));
+                    TopDocs top = searcher.search(q, 1);
+                    if (top.totalHits == 0) {
+                        mlt.setFieldNames(fields);
+                        moreLikeThisQuery = mlt.like(new StringReader(text), mlt.getFieldNames()[0]);
+                    } else{
+                        ScoreDoc d = top.scoreDocs[0];
+                        moreLikeThisQuery = mlt.like(d.doc);
+                    }
+                } else {
+                    mlt.setFieldNames(fields);
+                    moreLikeThisQuery = mlt.like(new StringReader(text), mlt.getFieldNames()[0]);
+                }
             }
             return moreLikeThisQuery;
         } catch (Exception e) {
{noformat}
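
The parameter handling that the patch changes can be tried out in isolation. The following is only a minimal sketch without any Lucene dependency, assuming a native query string such as "stream.body=/test/a&mlt.fl=:path". The class MltParams and its methods are hypothetical and not part of Oak, and the value ":path" for FieldNames.PATH is an assumption taken from the document dump above. The sketch also checks for a missing "mlt.fl" parameter, because reading fields[0] from an empty array would throw an ArrayIndexOutOfBoundsException.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Oak: mirrors the key=value parsing
// done in MoreLikeThisHelper, without depending on Lucene.
final class MltParams {

    // Assumed value of FieldNames.PATH, inferred from the document dump.
    static final String PATH_FIELD = ":path";

    // Splits "k1=v1&k2=v2" into a map; malformed pairs are skipped.
    static Map<String, String> parse(String mltQueryString) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String param : mltQueryString.split("&")) {
            String[] keyValuePair = param.split("=");
            if (keyValuePair.length == 2) {
                params.put(keyValuePair[0], keyValuePair[1]);
            }
        }
        return params;
    }

    // Returns the field names from "mlt.fl", or an empty array if absent.
    static String[] fields(Map<String, String> params) {
        String fl = params.get("mlt.fl");
        return fl == null ? new String[0] : fl.split(",");
    }

    // True if the query asks for documents similar to the one at a path,
    // i.e. the case where the patch uses a TermQuery lookup.
    static boolean isPathLookup(Map<String, String> params) {
        String[] fields = fields(params);
        return params.get("stream.body") != null
                && fields.length > 0
                && PATH_FIELD.equals(fields[0]);
    }

    public static void main(String[] args) {
        // prints "true": first field is the path field
        System.out.println(isPathLookup(
                parse("stream.body=/test/a&mlt.fl=:path&mlt.mindf=0")));
        // prints "false": no mlt.fl given, so no path lookup
        System.out.println(isPathLookup(parse("stream.body=some text")));
    }
}
```

With such a check, the case where no field list is given falls through to the plain text-based mlt.like(...) call instead of failing on the array access.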

> Similar (rep:similar) support
> -----------------------------
>
>                 Key: OAK-319
>                 URL: https://issues.apache.org/jira/browse/OAK-319
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: jcr, query
>            Reporter: Alex Parvulescu
>            Assignee: Thomas Mueller
>            Priority: Critical
>             Fix For: 0.20
>
>
> Test class is: SimilarQueryTest
> Trace:
> {noformat}
> Caused by: java.text.ParseException: Query:
> //*[rep:similar(.(*), '/testroot')]; expected: rep:similar is not supported
>       at org.apache.jackrabbit.oak.query.XPathToSQL2Converter.getSyntaxError(XPathToSQL2Converter.java:963)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
