RE: MoreLikeThis supporting multiple document IDs as input?

2013-01-04 Thread David Parks
Aha! mlt=true, that was the key I hadn't worked out before (thought it was
qt=mlt that achieved that), things are looking rosy now, and these results
are a perfect fit for my needs. Thanks very much for your time to help
explain this!!

David



RE: MoreLikeThis supporting multiple document IDs as input?

2013-01-03 Thread David Parks
I'm not seeing the results I would expect. In the previous email below it's
stated that the MLT search component returns N results and K similar
documents per EACH of the N results.

If I'm not mistaken I access the MLT search component via a query to
/solr/select/?qt=mlt, such as this:

http://10.0.0.1:8080/solr/select/?qt=mlt&terms=true&q=shoes&rows=3

The query above for a simple term such as shoes can return many documents.
But I limited the results to 3, and I see 3 results, and the results don't
appear to me any different than doing this query:

http://107.23.102.164:8080/solr/select/?q=shoes&rows=3

So that suggests to me that solr maybe isn't handing things off to the MLT
component as expected (I don't know what results to expect so it's hard for
me to know where I'm trying to get to).

So I added a debugQuery=on parameter and I see this, possibly useful for
reference:

<str name="QParser">LuceneQParser</str>

It also appears that the MoreLikeThisComponent did indeed run:

<lst name="org.apache.solr.handler.component.MoreLikeThisComponent">

So maybe I should ask exactly what results I should be expecting here? 

Thanks very much!
David



Re: MoreLikeThis supporting multiple document IDs as input?

2013-01-03 Thread Jack Krupansky
The MLT search component is enabled using mlt=true and works on any normal 
Solr query. It gives a batch of similar documents for each search result of 
the original query, one batch per original query result. It uses the 
mlt.count=n parameter to control how many similar results to return for 
each original query result.


The MLT request handler is a standalone request handler that does a query, 
takes the first result, and then returns one batch of documents that are 
similar to that one document. You have to configure the handler yourself, 
but typically it would have the name /mlt, so you would write:


http://10.0.0.1:8080/solr/mlt/?q=shoes&rows=3

It will show you both the single document from the original query and then 
the batch of documents that are most similar to the top terms from that one 
original document.
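
For illustration, the two request styles described above can be built side by side. This is only a sketch: the host and port are the ones used in this thread, /mlt is the typical handler name mentioned above (it has to be registered in solrconfig.xml), and the mlt.fl field names (title, description) are hypothetical placeholders for whatever fields the schema actually has.

```python
from urllib.parse import urlencode

BASE = "http://10.0.0.1:8080/solr"  # host and port as used in this thread


def component_url(query, rows, count):
    # Search component: an ordinary /select query with mlt=true switched on.
    params = {
        "q": query,
        "rows": rows,
        "mlt": "true",
        "mlt.count": count,
        "mlt.fl": "title,description",  # hypothetical field names
    }
    return BASE + "/select/?" + urlencode(params)


def handler_url(query, rows):
    # Request handler: assumes a handler registered under the name /mlt.
    params = {"q": query, "rows": rows, "mlt.fl": "title,description"}
    return BASE + "/mlt/?" + urlencode(params)


print(component_url("shoes", 3, 5))
print(handler_url("shoes", 3))
```

The first URL asks for 3 ordinary results plus a batch of up to 5 similar documents for each of them; the second asks the standalone handler for documents similar to the top match only.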


Add debugQuery=true or debug=query or debug=results to see the terms that 
are used in the secondary queries that find the similar documents.


There are a bunch of parameters that you have to tune for either approach.

-- Jack Krupansky


Re: MoreLikeThis supporting multiple document IDs as input?

2012-12-28 Thread Jack Krupansky

Try a query that returns multiple results and you will see the difference.

MLT search component: n results, k similar documents per EACH of the n 
results


MLT request handler: only FIRST result is examined, so only k similar 
documents for that ONE (first) TOP search result.
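
To make the n-versus-one distinction concrete, here is a toy model of the shape of the two results. It is not Solr code: the similarity function (count of shared words), the function names, and the sample documents are all made up for illustration.

```python
def similar_to(doc, corpus, k):
    # Toy similarity: rank the other documents by number of shared words.
    words = set(doc.split())
    others = [d for d in corpus if d != doc]
    ranked = sorted(others, key=lambda d: len(words & set(d.split())), reverse=True)
    return ranked[:k]


def mlt_component(results, corpus, k):
    # One batch of k similar documents per base result, kept separate.
    return {doc: similar_to(doc, corpus, k) for doc in results}


def mlt_handler(results, corpus, k):
    # Only the first (top) result is examined.
    return similar_to(results[0], corpus, k)


corpus = [
    "red running shoes",
    "blue running shoes",
    "red leather boots",
    "blue suede shoes",
    "leather belt",
]
results = corpus[:3]  # pretend these are the n=3 hits of the original query

per_result = mlt_component(results, corpus, k=2)
top_only = mlt_handler(results, corpus, k=2)
print(len(per_result), "batches from the component")
print(top_only)
```

In this model the handler's single batch is the same as the component's batch for the first result; the component simply repeats the exercise for every result and keeps the batches apart.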


Are you really saying that you don't comprehend what the difference is, or 
simply that you don't LIKE the difference?! Or, maybe that you are wondering 
WHY they are different? That latter question I don't have the answer to.


-- Jack Krupansky


RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-27 Thread David Parks
I'm somewhat new to Solr (it's running, I've been through the books, but I'm
no master). What I hear you say is that MLT *can* accept, say, 5 documents
and provide results, but the results would essentially be the same as
running the query 5 times for each document?

If that's the case, I might accept it. I would just have to merge them
together at the end (perhaps I'd take the top 2 of each result, for
example).

Being somewhat new I'm a little confused by the difference between a Search
Component and a Handler. I've got the /mlt handler working and I'm using
that. But how's that different from a Search Component? Is that referring
to the default /solr/select?q=... style query?

And if what I said about multiple documents above is correct, what's the
syntax to try that out?

Thanks very much for the great help!
Dave


-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Wednesday, December 26, 2012 12:07 PM
To: solr-user@lucene.apache.org
Subject: Re: MoreLikeThis supporting multiple document IDs as input?

MLT has both a request handler and a search component.

The MLT handler returns similar documents only for the first document that
the query matches.

The MLT search component returns similar documents for each of the documents
in the search results, but processes each search result base document one at
a time and keeps its similar documents segregated by each of the base
documents.

It sounds like you wanted to merge the base search results and then find
documents similar to that merged super-document. Is that what you were
really seeking, as opposed to what the MLT component does? Unfortunately,
you can't do that with the components as they are.

You would have to manually merge the values from the base documents and then
you could POST that text back to the MLT handler and find similar documents
using the posted text rather than a query. Kind of messy, but in theory that
should work.
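
A rough sketch of that manual-merge idea follows. It only builds the request and does not send it. Everything specific here is an assumption: the /mlt handler path, the field name "title", the sample documents, and the helper name build_mlt_post. Whether the handler accepts a posted body as its input text also depends on the Solr version and on how content streams are configured.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_mlt_post(base_docs, fields, rows=10):
    # Concatenate the chosen field values of every base document into one text.
    merged = "\n".join(
        str(doc[f]) for doc in base_docs for f in fields if f in doc
    )
    params = urlencode({"mlt.fl": ",".join(fields), "rows": rows})
    url = "http://localhost:8080/solr/mlt?" + params  # assumed handler path
    # The merged text travels as the request body instead of a q= query.
    return Request(
        url,
        data=merged.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


docs = [
    {"id": "1001", "title": "Trail running shoes"},
    {"id": "1002", "title": "Waterproof hiking boots"},
]
req = build_mlt_post(docs, ["title"])
print(req.get_method(), req.full_url)
```

The base documents would first have to be fetched with a normal query (for example by their IDs) so that their stored field values are available to merge.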

-- Jack Krupansky

-----Original Message-----
From: David Parks
Sent: Tuesday, December 25, 2012 5:04 AM
To: solr-user@lucene.apache.org
Subject: MoreLikeThis supporting multiple document IDs as input?

I'm unclear on this point from the documentation. Is it possible to give
Solr X # of document IDs and tell it that I want documents similar to those
X documents?

Example:

  - The user is browsing 5 different articles
  - I send Solr the IDs of these 5 articles so I can present the user other
similar articles

I see this example for sending it 1 document ID:
http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10

But can I send it 2+ document IDs as the query? 



RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-27 Thread Otis Gospodnetic
Hi Dave,

Think of search components as a chain of Java classes that get executed
during each search request. If you open solrconfig.xml you will see how
they are defined and used.

HTH

Otis
Solr & ElasticSearch Support
http://sematext.com/

RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-27 Thread David Parks
So the Search Components are executed in series on _every_ request. I
presume then that they look at the request parameters and decide what and
whether to take action.

So in the case of the MLT component this was said:

> The MLT search component returns similar documents for each of the
> documents in the search results, but processes each search result base
> document one at a time and keeps its similar documents segregated by
> each of the base documents.

So what I think I understand is that the Query Component (presumably this
guy: org.apache.solr.handler.component.QueryComponent) takes the input from
the q parameter and returns a result (the q=id:123456 ensures that the
Query Component will return just this one document).

The MltComponent then looks at the result from the QueryComponent and
generates its results.
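
As a toy model of that understanding (not Solr's actual classes, which are Java and configured in solrconfig.xml), a chain where every component runs on every request and decides from the parameters whether to act could look like the sketch below. The class names only mirror the Solr ones for readability; the matching logic and the sample index are invented.

```python
class QueryComponent:
    def process(self, params, response, index):
        # Always acts: collects documents whose text contains the q term.
        term = params.get("q", "")
        rows = int(params.get("rows", 10))
        response["docs"] = [d for d in index if term in d][:rows]


class MoreLikeThisComponent:
    def process(self, params, response, index):
        # Runs on every request, but only acts when mlt=true is present.
        if params.get("mlt") != "true":
            return
        response["moreLikeThis"] = {
            doc: [d for d in index if d != doc and set(d.split()) & set(doc.split())]
            for doc in response["docs"]
        }


def handle(params, index, chain=(QueryComponent(), MoreLikeThisComponent())):
    response = {}
    for component in chain:  # executed in series, in the configured order
        component.process(params, response, index)
    return response


index = ["red shoes", "blue shoes", "red hat"]
plain = handle({"q": "shoes"}, index)
with_mlt = handle({"q": "shoes", "mlt": "true"}, index)
print(sorted(plain), sorted(with_mlt))
```

Without mlt=true the second component is still invoked but adds nothing to the response, which matches the observation that it shows up in the debug output of a plain query.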

The part that is still confusing is understanding the difference between
these two comments:

 - The MLT search component returns similar documents for each of the
documents in the search results
 - The MLT handler returns similar documents only for the first document
that the query matches.




Re: MoreLikeThis supporting multiple document IDs as input?

2012-12-26 Thread Roman Chyla
Jay Luker has written MoreLikeThese, which is probably what you want. You
may give it a try, though I am not sure if it works with Solr 4.0 at this
point (we didn't port it yet).

https://github.com/romanchyla/montysolr/blob/MLT/contrib/adsabs/src/java/org/apache/solr/handler/MoreLikeTheseHandler.java

roman

On Wed, Dec 26, 2012 at 12:06 AM, Jack Krupansky j...@basetechnology.comwrote:

 MLT has both a request handler and a search component.

 The MLT handler returns similar documents only for the first document that
 the query matches.

 The MLT search component returns similar documents for each of the
 documents in the search results, but processes each search result base
 document one at a time and keeps its similar documents segregated by each
 of the base documents.

 It sounds like you wanted to merge the base search results and then find
 documents similar to that merged super-document. Is that what you were
 really seeking, as opposed to what the MLT component does? Unfortunately,
 you can't do that with the components as they are.

 You would have to manually merge the values from the base documents and
 then you could POST that text back to the MLT handler and find similar
 documents using the posted text rather than a query. Kind of messy, but in
 theory that should work.

 -- Jack Krupansky

 -Original Message- From: David Parks
 Sent: Tuesday, December 25, 2012 5:04 AM
 To: solr-user@lucene.apache.org
 Subject: MoreLikeThis supporting multiple document IDs as input?


 I'm unclear on this point from the documentation. Is it possible to give
 Solr X # of document IDs and tell it that I want documents similar to those
 X documents?

 Example:

  - The user is browsing 5 different articles
  - I send Solr the IDs of these 5 articles so I can present the user other
 similar articles

 I see this example for sending it 1 document ID:
 http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10

 But can I send it 2+ document IDs as the query?



RE: MoreLikeThis supporting multiple document IDs as input?

2012-12-26 Thread David Parks
Someone else suggested this query: q=id:[1001 OR 1002], where the
numbers represent multiple IDs, but if I understand you correctly, these
ultimately get turned into just one document and we get documents similar to
just that one.

MoreLikeThese sounds promising. Is it in one of the development builds, or is
it just an add-on I need to install? I haven't done much customization of Solr
yet.

Thanks!
Dave


-Original Message-
From: Roman Chyla [mailto:roman.ch...@gmail.com] 
Sent: Wednesday, December 26, 2012 3:57 PM
To: solr-user@lucene.apache.org
Subject: Re: MoreLikeThis supporting multiple document IDs as input?

Jay Luker has written MoreLikeThese, which is probably what you want. You may
give it a try, though I am not sure whether it works with Solr 4.0 at this
point (we haven't ported it yet).

https://github.com/romanchyla/montysolr/blob/MLT/contrib/adsabs/src/java/org/apache/solr/handler/MoreLikeTheseHandler.java

roman

On Wed, Dec 26, 2012 at 12:06 AM, Jack Krupansky j...@basetechnology.com wrote:

 MLT has both a request handler and a search component.

 The MLT handler returns similar documents only for the first document 
 that the query matches.

 The MLT search component returns similar documents for each of the 
 documents in the search results, but processes each search result base 
 document one at a time and keeps its similar documents segregated by 
 each of the base documents.

 It sounds like you wanted to merge the base search results and then 
 find documents similar to that merged super-document. Is that what you 
 were really seeking, as opposed to what the MLT component does? 
 Unfortunately, you can't do that with the components as they are.

 You would have to manually merge the values from the base documents 
 and then you could POST that text back to the MLT handler and find 
 similar documents using the posted text rather than a query. Kind of 
 messy, but in theory that should work.

 -- Jack Krupansky

 -Original Message- From: David Parks
 Sent: Tuesday, December 25, 2012 5:04 AM
 To: solr-user@lucene.apache.org
 Subject: MoreLikeThis supporting multiple document IDs as input?


 I'm unclear on this point from the documentation. Is it possible to 
 give Solr X # of document IDs and tell it that I want documents 
 similar to those X documents?

 Example:

  - The user is browsing 5 different articles
  - I send Solr the IDs of these 5 articles so I can present the user 
 other similar articles

 I see this example for sending it 1 document ID:
 http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10

 But can I send it 2+ document IDs as the query?




MoreLikeThis supporting multiple document IDs as input?

2012-12-25 Thread David Parks
I'm unclear on this point from the documentation. Is it possible to give
Solr X # of document IDs and tell it that I want documents similar to those
X documents?

Example:

  - The user is browsing 5 different articles
  - I send Solr the IDs of these 5 articles so I can present the user other
similar articles

I see this example for sending it 1 document ID:
http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10

But can I send it 2+ document IDs as the query?



Re: MoreLikeThis supporting multiple document IDs as input?

2012-12-25 Thread Jack Krupansky

MLT has both a request handler and a search component.

The MLT handler returns similar documents only for the first document that 
the query matches.


The MLT search component returns similar documents for each of the documents 
in the search results, but processes each search result base document one at 
a time and keeps its similar documents segregated by each of the base 
documents.
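
As a rough illustration of the search component behaviour, here is a small Python sketch that only builds the request URL. The host, the field name "title", the IDs, and the helper name build_component_query are assumptions made up for the example; mlt=true, mlt.fl and mlt.count are the standard MLT parameters. Each ID matched by the base query would get its own separate batch of similar documents.

```python
from urllib.parse import urlencode


def build_component_query(select_url, ids, mlt_fields, per_doc=3):
    """Build a /select URL that turns on the MLT search component.

    Hypothetical helper: every document matched by the id clause is a
    base document, and Solr keeps each one's similar documents separate.
    """
    # Match several documents by ID with an ordinary boolean query.
    id_clause = "id:(" + " OR ".join(ids) + ")"
    params = {
        "q": id_clause,
        "mlt": "true",                    # enable the MLT search component
        "mlt.fl": ",".join(mlt_fields),   # fields used to judge similarity
        "mlt.count": per_doc,             # similar documents per base document
        "fl": "id",
        "rows": len(ids),
    }
    return select_url + "?" + urlencode(params)


url = build_component_query(
    "http://localhost:8080/solr/select", ["1001", "1002"], ["title"]
)
print(url)
```

Note that this still gives one batch per base document, not one merged list.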


It sounds like you wanted to merge the base search results and then find 
documents similar to that merged super-document. Is that what you were 
really seeking, as opposed to what the MLT component does? Unfortunately, 
you can't do that with the components as they are.


You would have to manually merge the values from the base documents and then 
you could POST that text back to the MLT handler and find similar documents 
using the posted text rather than a query. Kind of messy, but in theory that 
should work.
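
A minimal sketch of that manual merge, in Python, under several assumptions: the base documents have already been fetched as dicts, the MLT handler is registered at /mlt, and the field names and the helper names merge_field_text and build_mlt_request are made up for the example. It only prepares the URL and the text to post; it does not contact Solr.

```python
from urllib.parse import urlencode


def merge_field_text(docs, fields):
    """Concatenate the chosen field values of several documents into one text."""
    parts = []
    for doc in docs:
        for field in fields:
            value = doc.get(field)
            if value is None:
                continue
            # Multi-valued fields are assumed to arrive as lists.
            if isinstance(value, (list, tuple)):
                parts.extend(str(v) for v in value)
            else:
                parts.append(str(value))
    return " ".join(parts)


def build_mlt_request(handler_url, text, fields, rows=10):
    """Return (url, body) for a POST to an MLT handler assumed at handler_url."""
    params = {"mlt.fl": ",".join(fields), "fl": "id", "rows": rows, "wt": "json"}
    return handler_url + "?" + urlencode(params), text.encode("utf-8")


docs = [
    {"id": "1001", "title": "red shoes", "tags": ["a", "b"]},
    {"id": "1002", "title": "blue shoes"},
]
merged = merge_field_text(docs, ["title", "tags"])
url, body = build_mlt_request("http://localhost:8080/solr/mlt", merged, ["title", "tags"])
print(url)
print(body)
```

The body could then be sent with urllib.request.Request(url, data=body, headers={"Content-Type": "text/plain; charset=utf-8"}). Whether the handler accepts posted text this way depends on how it is configured, so treat this as a starting point rather than a tested recipe.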


-- Jack Krupansky

-Original Message- 
From: David Parks
Sent: Tuesday, December 25, 2012 5:04 AM
To: solr-user@lucene.apache.org
Subject: MoreLikeThis supporting multiple document IDs as input?

I'm unclear on this point from the documentation. Is it possible to give
Solr X # of document IDs and tell it that I want documents similar to those
X documents?

Example:

 - The user is browsing 5 different articles
 - I send Solr the IDs of these 5 articles so I can present the user other
similar articles

I see this example for sending it 1 document ID:
http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10

But can I send it 2+ document IDs as the query?