[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2015-01-14 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14277640#comment-14277640
]

Anshum Gupta commented on SOLR-6248:

[~markus17] Thanks for the patch for 4.10 but that can't go in. It's a new
feature and will be released with 5.0 (sometime really soon).
I haven't looked at the patch yet but users who are running 4.10 and want to
use this patch are free to do so.

We can work on getting the bug fixes/tests into 5x now though. Can you provide
a patch for trunk/5x for the tests/fixes?

MoreLikeThis Query Parser
-

Key: SOLR-6248
URL: https://issues.apache.org/jira/browse/SOLR-6248
Project: Solr
Issue Type: New Feature
Components: query parsers
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Fix For: 5.0, Trunk

Attachments: SOLR-6248-4x.patch, SOLR-6248-4x.patch, SOLR-6248.patch,
SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch, SOLR-6248.patch,
SOLR-6248.patch

MLT Component doesn't let people highlight/paginate and the handler comes
with a cost of maintaining another piece in the config. Also, any changes to
the default (number of results to be fetched etc.) /select handler need to be
copied/synced with this handler too.
Having an MLT QParser would let users get back docs based on a query for them
to paginate, highlight etc. It would also give them the flexibility to use
this anywhere i.e. q,fq,bq etc.
A bit of history about MLT (thanks to Hoss)
MLT Handler pre-dates the existence of QParsers and was meant to take an
arbitrary query as input, find docs that match that
query, club them together to find interesting terms, and then use those
terms as if they were my main query to generate a main result set.
This result would then be used as the set to facet, highlight etc.
The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)
The MLT component on the other hand solved a very different purpose of
augmenting the main result set. It is used to get similar docs for each of
the docs in the main result set.
DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m)
The new approach:
All of this can be done better and cleaner (and makes more sense too) using
an MLT QParser.
An important thing to handle here is the case where the user doesn't have
TermVectors, in which case, it does what happens right now i.e. parsing
stored fields.
Also, in case the user doesn't have a field (to be used for MLT) indexed, the
field would need to be a TextField with an index analyzer defined. This
analyzer will then be used to extract terms for MLT.
In case of SolrCloud mode, '/get-termvectors' can be used after looking at
the schema (if TermVectors are enabled for the field). If not, a /get call
can be used to fetch the field and parse it.
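
The "Bag (terms) -> Query" step of the flow above can be sketched in a few
lines. This is only a toy illustration, not the Lucene MoreLikeThis
implementation: it ranks terms by raw frequency, whereas the real class has
its own term-selection logic. The class name, method names and the "body"
field are made up for the example, and the regex tokenizer stands in for a
real index analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative only: a toy version of the "Bag (terms) -> Query" step. */
class MltFlowSketch {

  /** Counts lowercase word tokens in the text (stand-in for an index analyzer). */
  static Map<String, Integer> bagOfTerms(String text) {
    Map<String, Integer> counts = new HashMap<>();
    for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
      if (!token.isEmpty()) {
        counts.merge(token, 1, Integer::sum);
      }
    }
    return counts;
  }

  /** Picks the maxTerms most frequent terms; ties are broken alphabetically. */
  static List<String> interestingTerms(Map<String, Integer> bag, int maxTerms) {
    List<Map.Entry<String, Integer>> entries = new ArrayList<>(bag.entrySet());
    entries.sort((a, b) -> {
      int byCount = Integer.compare(b.getValue(), a.getValue());
      return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
    });
    return entries.stream()
        .limit(maxTerms)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }

  /** Joins the terms into an OR query against a single (hypothetical) field. */
  static String toQuery(String field, List<String> terms) {
    return terms.stream()
        .map(term -> field + ":" + term)
        .collect(Collectors.joining(" OR "));
  }

  public static void main(String[] args) {
    Map<String, Integer> bag = bagOfTerms("solr search, solr cloud, search");
    System.out.println(toQuery("body", interestingTerms(bag, 2)));
  }
}
```

The resulting query string is what would then be run as the main query, so
its results can be paginated, highlighted or faceted like any other.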

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-11-04 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14197204#comment-14197204
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1636784 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1636784 ]

SOLR-6248: Fixing an exception in case of missing qf

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-11-04 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14197237#comment-14197237
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1636788 from [~anshumg] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1636788 ]

SOLR-6248: Fixing an exception in case of missing qf (merge from trunk)

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-29 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189236#comment-14189236
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1635329 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1635329 ]

SOLR-6248: Changing the format of mlt query parser

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-29 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189235#comment-14189235
]

Anshum Gupta commented on SOLR-6248:

After a discussion with Hoss, I'm changing the format of the query parser. It
wouldn't have an 'id' key in the request i.e. the new request would look like:
{quote}
{!mlt qf=fieldname}docId
{quote}

This would eliminate the need to document/maintain and track a new parameter
name.
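
For illustration, a request using the new format could be assembled as
below. This is a sketch under assumptions: the base URL, the "title" field
and the "doc42" id are placeholders, and only the {!mlt qf=...}docId shape
comes from the comment above. It uses URLEncoder.encode(String, Charset),
which needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative only: builds a /select URL carrying an mlt local-params query. */
class MltRequestSketch {

  /** Returns the raw query in the new format: {!mlt qf=field}docId. */
  static String mltQuery(String field, String docId) {
    return "{!mlt qf=" + field + "}" + docId;
  }

  /** URL-encodes the query and appends it as the q parameter of /select. */
  static String selectUrl(String baseUrl, String field, String docId) {
    String q = URLEncoder.encode(mltQuery(field, docId), StandardCharsets.UTF_8);
    return baseUrl + "/select?q=" + q;
  }

  public static void main(String[] args) {
    // Hypothetical host, collection, field and document id.
    System.out.println(selectUrl("http://localhost:8983/solr/collection1", "title", "doc42"));
  }
}
```

Because the parser is selected through local params, the same string could
equally be sent as an fq or bq value instead of q.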

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-29 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189268#comment-14189268
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1635336 from [~anshumg] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1635336 ]

SOLR-6248: Changing request format for mlt queryparser (merge from trunk)

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread Noble Paul (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186775#comment-14186775
]

Noble Paul commented on SOLR-6248:
--

Doesn't it make sense to put an example query in the description?

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread Noble Paul (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186848#comment-14186848
]

Noble Paul commented on SOLR-6248:
--

I guess it is good to go

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187205#comment-14187205
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1634937 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1634937 ]

SOLR-6248: MoreLikeThis QParser that works in standalone/cloud mode

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread Markus Jelsma (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187217#comment-14187217
]

Markus Jelsma commented on SOLR-6248:
-

Anshum, very cool stuff here!

bq. {!mlt id=docId qf=fieldNames}

I assume this is not the Lucene DocID but the document's UniqueKey field value?
Also, must we query the correct shard for it to work?

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187220#comment-14187220
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1634939 from [~anshumg] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1634939 ]

SOLR-6248: MoreLikeThis QParser that works in standalone/cloud mode (merge from
trunk)

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187234#comment-14187234
]

Anshum Gupta commented on SOLR-6248:

Thanks [~markus17]. This is indeed the document's unique key field value.
Also, I don't think you'd need to target the correct shard as, in case of Cloud
mode, it uses the /get handler.

This has a lot of room for improvements/enhancements but I thought this was a
good point to start with.
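
As a rough picture of the /get lookup mentioned above: Solr's real-time get
handler fetches a document by its unique key value through the id parameter,
so the caller does not have to pick a shard. The sketch below only builds
such a URL; the host, collection name and document id are placeholders, and
how the parser issues the call internally is not shown here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative only: builds a real-time get URL for a unique key value. */
class RealTimeGetSketch {

  /** Appends the URL-encoded unique key value as the id parameter of /get. */
  static String getUrl(String baseUrl, String uniqueKeyValue) {
    return baseUrl + "/get?id=" + URLEncoder.encode(uniqueKeyValue, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Hypothetical host, collection and document id.
    System.out.println(getUrl("http://localhost:8983/solr/collection1", "doc 42"));
  }
}
```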

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187241#comment-14187241
]

Anshum Gupta commented on SOLR-6248:

[~noble.paul] Thanks for looking at the patch.
I've added a sample query in there. Also, there's a basic description as part
of the package.html.

I'll also be adding the usage in the ref guide.

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187244#comment-14187244
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1634941 from [~anshumg] in branch 'dev/trunk'
[ https://svn.apache.org/r1634941 ]

SOLR-6248: Removing svn:keywords that got auto-added with the commit hook.

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-28 Thread ASF subversion and git services (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187247#comment-14187247
]

ASF subversion and git services commented on SOLR-6248:
---

Commit 1634942 from [~anshumg] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1634942 ]

SOLR-6248: Removing svn:keywords that got auto-added with the commit hook.
(merge from trunk)

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-27 Thread Erik Hatcher (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186323#comment-14186323
]

Erik Hatcher commented on SOLR-6248:

[~anshumg] - your latest patch does not have the new files (need to svn add
them). This new qparser, IMO, should be registered automatically in
QParserPlugin, so it doesn't need to be registered in solrconfig.xml manually.

Overall looks great (looking back at previous patches to see the new files)!
+1

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-10-27 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14186379#comment-14186379
]

Anshum Gupta commented on SOLR-6248:

[~ehatcher] Thanks for looking at it. I merged the changes into
MoreLikeThis.java instead of duplicating code (so the files are actually gone).
The patch has everything that's required but yes I'll have this automatically
registered in QParserPlugin.

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-09-20 Thread Upayavira (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14142063#comment-14142063
 ] 

Upayavira commented on SOLR-6248:
-

I also, after a conversation with Hoss, have knocked up a MLTQuery parser.

I very much doubt your query parser will allow you to pass in a stream.body, 
because the first few lines of 
o.a.s.handler.component.SearchHandler.handleRequestBody() say:

{code:java}
if (req.getContentStreams() != null
    && req.getContentStreams().iterator().hasNext()) {
  throw new SolrException(ErrorCode.BAD_REQUEST,
      "Search requests cannot accept content streams");
}
{code}

This needs to be removed for stream.body to be available to the query parser.

I can post my patch later if anyone is interested. It doesn't have any tests 
yet. My next task is to work out how to make it work across cores (recommend 
docs in one core based upon docs in another).

Regarding the patch in this ticket, I'm curious why you needed a SolrCloud 
specific query parser? Is it because the doc you are using might be in a 
different shard?

Also, it appears from a cursory look that LWMoreLikeThis is a fork of Lucene's 
MoreLikeThis class. Is there a reason that is needed, and if so, why isn't it 
still in Lucene?

I expect to be working on my own version this week, and if what I produce can 
be useful to others (via this ticket or otherwise), I'd be happy to contribute 
it.
Thx!



[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-25 Thread Steve Molloy (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074360#comment-14074360
]

Steve Molloy commented on SOLR-6248:

In this case it cannot replace the current MoreLikeThisHandler implementation
which can analyze incoming text (as opposed to searching for a matching
document in the index) in order to find similar documents in the index. Being
able to query by unique field and returning similar documents is already
covered by the MoreLikeThisComponent if you use rows=1 to get a single document
and its set of similar ones. The use case that forces the MoreLikeThisHandler
currently (at least that I know of) is really this on-the-fly analysis of text
that is nowhere in the index.

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-25 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074514#comment-14074514
]

Anshum Gupta commented on SOLR-6248:

My bad, this was my mistake. The last time I'd looked at this patch was about
10 months ago.

This works like a component but also lets you paginate and do other stuff with
it.
Let me check out if accepting text would make sense here (or if we could have
something on similar lines).

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-24 Thread Steve Molloy (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073275#comment-14073275
 ] 

Steve Molloy commented on SOLR-6248:


I'd like to give this a spin, but looking at the attached patch, it's unclear
how to pass in text. The parsers seem to be looking at an id parameter; I
haven't seen any reference to stream.body. What parameter would be used to pass
in text to be analyzed and for which to return similar documents?

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-24 Thread Vitaliy Zhovtyuk (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073671#comment-14073671
]

Vitaliy Zhovtyuk commented on SOLR-6248:

With the current implementation in the patch, the mlt qparser can match a
document by the unique field configured in the schema and find similar
documents from it. The parser syntax now looks like
{code}{!mlt id=17 qf=lowerfilt}lowerfilt:*{code} where id is the value of the
configured unique field (not an id column in the schema) and qf lists the
fields to match on.

As for passing text, this parser can be extended with a text parameter, search
for documents by this term and look for similar documents using the existing
implementation.

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-23 Thread Steve Molloy (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071645#comment-14071645
]

Steve Molloy commented on SOLR-6248:

I meant passing in text as parameter as opposed to finding it in the index.
With current MLT handler (not component), you can pass it in as body or
stream.body to get documents similar to the text you pass in. In our case, we
use it to find documents in one collection similar to a document found in
another, or to some text directly provided by user. So, I know that at some
point the SearchHandler started rejecting search requests with stream body,
which would prevent this unless it could be achieved in another way. That's why
I'm asking. :)

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-23 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071753#comment-14071753
]

Anshum Gupta commented on SOLR-6248:

I don't think this would really work across 2 collections straight out of the
box, but yes, as long as you have 'text' to pass, that is exactly what this
parser would take. In other words, for now, it would more or less maintain the
same mechanism of the handler (but in a manner that makes it work under
SolrCloud mode).

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-22 Thread Anshum Gupta (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070897#comment-14070897
]

Anshum Gupta commented on SOLR-6248:

What do you mean by text that isn't in the index? If you mean pseudo-random
text for which you want to find similar documents, then yes, it would handle
that.

[jira] [Commented] (SOLR-6248) MoreLikeThis Query Parser

2014-07-21 Thread Steve Molloy (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14068501#comment-14068501
]

Steve Molloy commented on SOLR-6248:

Would that approach also support sending in text that isn't in the index? This
is the main reason we're using the MLT handler, which we need to be distributed
(thus SOLR-5480). But if we can have a single approach for both, I agree that
not maintaining 2 configurations (and 2 handlers in the code) would be much
better. Let me know if I can help out.
