[jira] Commented: (SOLR-341) PHP Solr Client

2008-11-25 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650781#action_12650781
 ] 

Pieter Berkel commented on SOLR-341:


Thanks for the quick response.  Just wanted to check that you uploaded the 
updated class files?  I couldn't find the new setBoost() / setField() / 
setFieldBoost() methods in the Document class located in 
SolrPhpClient.2008-11-24.zip.

 PHP Solr Client
 ---

 Key: SOLR-341
 URL: https://issues.apache.org/jira/browse/SOLR-341
 Project: Solr
  Issue Type: New Feature
  Components: clients - php
Affects Versions: 1.2
 Environment: PHP >= 5.2.0 (or older with the JSON PECL extension or another 
 json_decode function implementation). Solr >= 1.2
Reporter: Donovan Jimenez
Priority: Trivial
 Fix For: 1.4

 Attachments: SolrPhpClient.2008-09-02.zip, 
 SolrPhpClient.2008-11-14.zip, SolrPhpClient.2008-11-24.zip, SolrPhpClient.zip


 Developed this client when the example PHP source didn't meet our needs.  The 
 company I work for agreed to release it under the terms of the Apache License.
 This version is slightly different from what I originally linked to on the 
 dev mailing list.  I've incorporated feedback from Yonik and hossman to 
 simplify the client and only accept one response format (JSON currently).
 When Solr 1.3 is released the client can be updated to use the PHP or 
 Serialized PHP response writer.
 Example usage from my original mailing list post:

 <?php
 require_once('Solr/Service.php');

 $start = microtime(true);
 $solr = new Solr_Service(); // or explicitly: new Solr_Service('localhost', 8180, '/solr');

 try
 {
     $response = $solr->search('solr', 0, 10,
         array(/* you can include other parameters here */));

     echo 'search returned with status = ',
         $response->responseHeader->status,
         ' and took ', microtime(true) - $start, ' seconds', "\n";

     // Here's how you would access results.
     // Notice that I've mapped the values by name into a tree of stdClass objects
     // and arrays (actually, most of this is done by json_decode).
     if ($response->response->numFound > 0)
     {
         $doc_number = $response->response->start;

         foreach ($response->response->docs as $doc)
         {
             $doc_number++;
             echo $doc_number, ': ', $doc->text, "\n";
         }
     }

     // For the purpose of seeing the available structure of the response:
     // NOTE: Solr_Response::_parsedData is lazy loaded, so a print_r on the response
     // before any values are accessed may result in different behavior (in case
     // anyone has some troubles debugging).
     // print_r($response);
 }
 catch (Exception $e)
 {
     echo $e->getMessage(), "\n";
 }
 ?>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (SOLR-341) PHP Solr Client

2008-11-23 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650095#action_12650095
 ] 

Pieter Berkel commented on SOLR-341:


Hi Donovan,

Great work on the PHP client library!  However, I noticed that there is no way 
to specify document- and/or field-level boost values when creating and indexing 
documents:

http://wiki.apache.org/solr/UpdateXmlMessages 

Perhaps Apache_Solr_Document could have a constructor method with an optional 
parameter for setting the document boost:

public function __construct($boost = '1.0') {

I'm not so sure how the field-level boost should be set; maybe add 
setFieldBoost($key) and getFieldBoost($key) methods to Apache_Solr_Document?

If necessary I can also submit code patches for these changes.

cheers,
Piete

 PHP Solr Client
 ---

 Key: SOLR-341
 URL: https://issues.apache.org/jira/browse/SOLR-341
 Project: Solr
  Issue Type: New Feature
  Components: clients - php
Affects Versions: 1.2
 Environment: PHP >= 5.2.0 (or older with the JSON PECL extension or another 
 json_decode function implementation). Solr >= 1.2
Reporter: Donovan Jimenez
Priority: Trivial
 Fix For: 1.4

 Attachments: SolrPhpClient.2008-09-02.zip, 
 SolrPhpClient.2008-11-14.zip, SolrPhpClient.zip


 Developed this client when the example PHP source didn't meet our needs.  The 
 company I work for agreed to release it under the terms of the Apache License.
 This version is slightly different from what I originally linked to on the 
 dev mailing list.  I've incorporated feedback from Yonik and hossman to 
 simplify the client and only accept one response format (JSON currently).
 When Solr 1.3 is released the client can be updated to use the PHP or 
 Serialized PHP response writer.
 Example usage from my original mailing list post:

 <?php
 require_once('Solr/Service.php');

 $start = microtime(true);
 $solr = new Solr_Service(); // or explicitly: new Solr_Service('localhost', 8180, '/solr');

 try
 {
     $response = $solr->search('solr', 0, 10,
         array(/* you can include other parameters here */));

     echo 'search returned with status = ',
         $response->responseHeader->status,
         ' and took ', microtime(true) - $start, ' seconds', "\n";

     // Here's how you would access results.
     // Notice that I've mapped the values by name into a tree of stdClass objects
     // and arrays (actually, most of this is done by json_decode).
     if ($response->response->numFound > 0)
     {
         $doc_number = $response->response->start;

         foreach ($response->response->docs as $doc)
         {
             $doc_number++;
             echo $doc_number, ': ', $doc->text, "\n";
         }
     }

     // For the purpose of seeing the available structure of the response:
     // NOTE: Solr_Response::_parsedData is lazy loaded, so a print_r on the response
     // before any values are accessed may result in different behavior (in case
     // anyone has some troubles debugging).
     // print_r($response);
 }
 catch (Exception $e)
 {
     echo $e->getMessage(), "\n";
 }
 ?>




[jira] Commented: (SOLR-379) KStem Token Filter

2008-06-15 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605169#action_12605169
 ] 

Pieter Berkel commented on SOLR-379:


As far as I'm aware, KStemFilterFactory.java was written by Harry Wagner, so if 
he's happy to grant the ASL it should be possible to include it in the repo.  
Everything in /src/java/org/apache/lucene/analysis has been copied from 
KStem.jar (originally downloaded from CIIR), so if that can be loaded on 
demand, it should be fairly straightforward to include support for this 
stemmer in Solr.


 KStem Token Filter
 --

 Key: SOLR-379
 URL: https://issues.apache.org/jira/browse/SOLR-379
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Pieter Berkel
Priority: Minor
 Attachments: KStemSolr.zip


 A Lucene / Solr implementation of the KStem stemmer.  Full credit goes to 
 Harry Wagner for adapting the Lucene version found here:
 http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi
 Background discussion to this stemmer (including licensing issues) can be 
 found in this thread:
 http://www.nabble.com/Embedded-about-50--faster-for-indexing-tf4325720.html#a12376295
 I've made some minor changes to KStemFilterFactory so that it compiles 
 cleanly against trunk:
 1) removed some unnecessary imports
 2) changed the init() method parameters introduced by SOLR-215
 3) moved KStemFilterFactory into package org.apache.solr.analysis
 Once compiled and included in your Solr war (or as a jar in your lib 
 directory), the KStem filter can be used in your schema very easily:
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
     <filter class="solr.StandardFilterFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.KStemFilterFactory" cacheSize="2"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>




[jira] Commented: (SOLR-380) There's no way to convert search results into page-level hits of a structured document.

2007-10-17 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535426
 ] 

Pieter Berkel commented on SOLR-380:


There was a recent discussion surrounding a similar problem on solr-user:
http://www.nabble.com/Structured-Lucene-documents-tf4234661.html#a12048390

The idea was to use dynamic fields (e.g. page_1, page_2, page_3 ... page_N) to 
store the text of each page in a single document.  The problem is that Solr 
does not currently support glob-style field expansion in query parameters 
(e.g. qf=page_*), so you would end up having to specify the entire list of 
page fields in your query, which is impractical.  There is already an open 
issue related to this particular problem (SOLR-247), but nobody has had time 
to look into it.

In terms of returning term position information, this seems somehow (albeit 
loosely) related to highlighting; is there any way you could use the existing 
functionality to achieve your goal? (It would definitely be a hack, though.)


 There's no way to convert search results into page-level hits of a 
 structured document.
 -

 Key: SOLR-380
 URL: https://issues.apache.org/jira/browse/SOLR-380
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Tricia Williams
Priority: Minor

 Paged-Text FieldType for Solr
 A chance to dig into the guts of Solr. The problem: If we index a monograph 
 in Solr, there's no way to convert search results into page-level hits. The 
 solution: have a paged-text fieldtype which keeps track of page divisions 
 as it indexes, and reports page-level hits in the search results.
 The input would contain page milestones: <page id="234"/>. As Solr processed 
 the tokens (using its standard tokenizers and filters), it would concurrently 
 build a structural map of the item, indicating which term position marked the 
 beginning of which page: <page id="234" firstterm="14324"/>. This map would 
 be stored in an unindexed field in some efficient format.
 At search time, Solr would retrieve term positions for all hits that are 
 returned in the current request, and use the stored map to determine page ids 
 for each term position. The results would imitate the results for 
 highlighting, something like:
 <lst name="pages">
   <lst name="doc1">
     <int name="pageid">234</int>
     <int name="pageid">236</int>
   </lst>
   <lst name="doc2">
     <int name="pageid">19</int>
   </lst>
 </lst>
 <lst name="hitpos">
   <lst name="doc1">
     <lst name="234">
       <int name="pos">14325</int>
     </lst>
   </lst>
   ...
 </lst>




[jira] Updated: (SOLR-377) speed increase for writers

2007-10-17 Thread Pieter Berkel (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieter Berkel updated SOLR-377:
---

Attachment: SOLR-377-phpresponsewriter.patch

Sorry I've been a bit slow catching up with this issue.  Please find attached a 
trivial patch to PHPResponseWriter.java that takes advantage of the new 
FastWriter code; it should provide speed improvements similar to the JSON 
writer (perhaps slightly less).

No FastWriter optimisation is necessary for PHPSerializedResponseWriter, as 
there is no need to escape strings before they are written.


 speed increase for writers
 --

 Key: SOLR-377
 URL: https://issues.apache.org/jira/browse/SOLR-377
 Project: Solr
  Issue Type: Improvement
Reporter: Yonik Seeley
 Attachments: fastwriter.patch, SOLR-377-phpresponsewriter.patch


 When solr is writing the response of large cached documents, the bottleneck 
 is string encoding.
 a buffered writer implementation that doesn't do any synchronization could 
 offer some good speedups.




[jira] Created: (SOLR-379) KStem Token Filter

2007-10-15 Thread Pieter Berkel (JIRA)
KStem Token Filter
--

 Key: SOLR-379
 URL: https://issues.apache.org/jira/browse/SOLR-379
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Pieter Berkel
Priority: Minor


A Lucene / Solr implementation of the KStem stemmer.  Full credit goes to Harry 
Wagner for adapting the Lucene version found here:
http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi

Background discussion to this stemmer (including licensing issues) can be found 
in this thread:
http://www.nabble.com/Embedded-about-50--faster-for-indexing-tf4325720.html#a12376295

I've made some minor changes to KStemFilterFactory so that it compiles cleanly 
against trunk:
1) removed some unnecessary imports
2) changed the init() method parameters introduced by SOLR-215
3) moved KStemFilterFactory into package org.apache.solr.analysis

Once compiled and included in your Solr war (or as a jar in your lib 
directory), the KStem filter can be used in your schema very easily:

  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KStemFilterFactory" cacheSize="2"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>





[jira] Updated: (SOLR-379) KStem Token Filter

2007-10-15 Thread Pieter Berkel (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieter Berkel updated SOLR-379:
---

Attachment: KStemSolr.zip

I've attached a zip file containing the KStem source rather than a patch, as 
I'm not sure how this code will eventually be integrated with Solr.

Since I did not write this code and am unsure of its legal status, I have not 
granted the ASF license, although recent discussion suggests the license 
included with KStem is compatible with the Apache License.

Hopefully we'll be able to resolve the above issues fairly quickly.


 KStem Token Filter
 --

 Key: SOLR-379
 URL: https://issues.apache.org/jira/browse/SOLR-379
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Pieter Berkel
Priority: Minor
 Attachments: KStemSolr.zip


 A Lucene / Solr implementation of the KStem stemmer.  Full credit goes to 
 Harry Wagner for adapting the Lucene version found here:
 http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi
 Background discussion to this stemmer (including licensing issues) can be 
 found in this thread:
 http://www.nabble.com/Embedded-about-50--faster-for-indexing-tf4325720.html#a12376295
 I've made some minor changes to KStemFilterFactory so that it compiles 
 cleanly against trunk:
 1) removed some unnecessary imports
 2) changed the init() method parameters introduced by SOLR-215
 3) moved KStemFilterFactory into package org.apache.solr.analysis
 Once compiled and included in your Solr war (or as a jar in your lib 
 directory), the KStem filter can be used in your schema very easily:
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
     <filter class="solr.StandardFilterFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.KStemFilterFactory" cacheSize="2"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>




[jira] Commented: (SOLR-281) Search Components (plugins)

2007-09-16 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527931
 ] 

Pieter Berkel commented on SOLR-281:


I'm having trouble applying the latest patch to trunk (r575809) again:

$ patch -p0 < ../SOLR-281-SearchComponents.patch 
...
patching file src/java/org/apache/solr/handler/StandardRequestHandler.java
Hunk #1 FAILED at 17.
Hunk #2 FAILED at 45.
2 out of 2 hunks FAILED -- saving rejects to file 
src/java/org/apache/solr/handler/StandardRequestHandler.java.rej
patching file src/java/org/apache/solr/handler/DisMaxRequestHandler.java
Hunk #2 FAILED at 118.
1 out of 2 hunks FAILED -- saving rejects to file 
src/java/org/apache/solr/handler/DisMaxRequestHandler.java.rej

It also looks like the additions to solrconfig.xml have not been included in 
the latest patch either.  I was also going to suggest that it might be a good 
idea to support class shorthand notation, so that 
org.apache.solr.handler.component.* can be written as solr.component.* in 
solrconfig.xml.
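A minimal sketch of how that shorthand could be expanded (illustrative only: this is not Solr's actual plugin-resolution code, and the component name is hypothetical):

```java
public class ClassShorthand {
    // Expand the proposed "solr.component." shorthand to the full package
    // name; anything else is assumed to be fully qualified already.
    static String expand(String name) {
        final String prefix = "solr.component.";
        if (name.startsWith(prefix)) {
            return "org.apache.solr.handler.component." + name.substring(prefix.length());
        }
        return name;
    }

    public static void main(String[] args) {
        // prints org.apache.solr.handler.component.FacetComponent
        System.out.println(expand("solr.component.FacetComponent"));
    }
}
```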


 Search Components (plugins)
 ---

 Key: SOLR-281
 URL: https://issues.apache.org/jira/browse/SOLR-281
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
 Attachments: SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch


 A request handler with pluggable search components for things like:
   - standard
   - dismax
   - more-like-this
   - highlighting
   - field collapsing 
 For more discussion, see:
 http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274




[jira] Commented: (SOLR-247) Allow facet.field=* to facet on all fields (without knowing what they are)

2007-08-23 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522345
 ] 

Pieter Berkel commented on SOLR-247:


Some recent discussion on this topic:

http://www.nabble.com/Structured-Lucene-documents-tf4234661.html

I get the impression that general wildcard syntax support for field list 
parameters (i.e. the reverse of dynamic fields) as described in the above 
thread would be far more useful than a simple '*' match-anything syntax (not 
only in faceting but in other cases like hl.fl and perhaps even mlt.fl).

I haven't really considered the performance implications of this approach, 
however, as it would involve checking each field supplied in the parameter for 
'*' and then expanding it into full field names on every query.

Given the above, the fact that it could be used across multiple response 
handlers and subhandlers like SimpleFacets & Highlighting, and that it would 
require access to IndexReader to call getFieldNames(), where might be the most 
sensible place to put this code?
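To make the idea concrete, here is a rough sketch of the '*' expansion step (a hypothetical helper; in Solr the candidate names would presumably come from IndexReader.getFieldNames()):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.regex.Pattern;

public class FieldGlob {
    // Expand a comma-separated field list that may contain '*' globs
    // (e.g. "page_*,title") against the known index field names.
    public static List<String> expand(String spec, Collection<String> indexFields) {
        List<String> out = new ArrayList<>();
        for (String field : spec.split(",")) {
            field = field.trim();
            if (field.indexOf('*') < 0) {
                out.add(field);            // plain field name: pass through
                continue;
            }
            // Turn the glob into a regex: quote everything literally,
            // then let each '*' match any run of characters.
            Pattern p = Pattern.compile(Pattern.quote(field).replace("*", "\\E.*\\Q"));
            for (String name : indexFields) {
                if (p.matcher(name).matches()) {
                    out.add(name);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("page_1", "page_2", "title", "body");
        System.out.println(expand("page_*,title", known)); // [page_1, page_2, title]
    }
}
```

The performance concern raised above shows up in the inner loop: every '*' field is matched against every index field name on each request, so caching the expansion might be worthwhile.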


 Allow facet.field=* to facet on all fields (without knowing what they are)
 --

 Key: SOLR-247
 URL: https://issues.apache.org/jira/browse/SOLR-247
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Priority: Minor
 Attachments: SOLR-247-FacetAllFields.patch


 I don't know if this is a good idea to include -- it is potentially a bad 
 idea to use it, but that can be ok.
 This came out of trying to use faceting for the LukeRequestHandler top term 
 collecting.
 http://www.nabble.com/Luke-request-handler-issue-tf3762155.html




[jira] Commented: (SOLR-281) Search Components (plugins)

2007-08-16 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520434
 ] 

Pieter Berkel commented on SOLR-281:


I just tried this patch on svn trunk (r566899) and got the following failures:

$ patch -p0 < ../SOLR-281-SearchComponents.patch
...
patching file src/java/org/apache/solr/handler/StandardRequestHandler.java
Hunk #1 succeeded at 17 with fuzz 1.
Hunk #2 FAILED at 45.
1 out of 2 hunks FAILED -- saving rejects to file 
src/java/org/apache/solr/handler/StandardRequestHandler.java.rej
...
patching file src/java/org/apache/solr/handler/DisMaxRequestHandler.java
Hunk #1 FAILED at 17.
1 out of 1 hunk FAILED -- saving rejects to file 
src/java/org/apache/solr/handler/DisMaxRequestHandler.java.rej

I suspect it is the changes made by SOLR-326 that are causing these 
problems; would it be possible for you to create a new patch?

thanks,
Piete

 Search Components (plugins)
 ---

 Key: SOLR-281
 URL: https://issues.apache.org/jira/browse/SOLR-281
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
 Attachments: SOLR-281-SearchComponents.patch, 
 SOLR-281-SearchComponents.patch, SOLR-281-SearchComponents.patch


 A request handler with pluggable search components for things like:
   - standard
   - dismax
   - more-like-this
   - highlighting
   - field collapsing 
 For more discussion, see:
 http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274




[jira] Commented: (SOLR-196) A PHP response writer for Solr

2007-08-10 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12519195
 ] 

Pieter Berkel commented on SOLR-196:


Great! I'll try to add some documentation to the wiki in the next few days.

Regarding the content type, I found it more useful to be able to actually see 
the result in a browser.
Is there a content type we can use for JSON that can achieve both goals, for 
Firefox and IE at least?

I couldn't find any suitable MIME types that would achieve this goal, so it's 
probably better to leave the content types unchanged for the moment.


 A PHP response writer for Solr
 --

 Key: SOLR-196
 URL: https://issues.apache.org/jira/browse/SOLR-196
 Project: Solr
  Issue Type: New Feature
  Components: clients - php, search
Reporter: Paul Borgermans
 Attachments: SOLR-192-php-responsewriter.patch, 
 SOLR-196-PHPResponseWriter.patch


 It would be useful to have a PHP response writer that returns an array to be 
 eval-ed directly. This is especially true for PHP4.x installs, where there is 
 no built in support for JSON.
 This issue attempts to address this.




[jira] Commented: (SOLR-196) A PHP response writer for Solr

2007-08-10 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12519196
 ] 

Pieter Berkel commented on SOLR-196:


Hmm, it doesn't look like the two new files from the patch were added properly 
during the latest commit:
/src/java/org/apache/solr/request/PHPResponseWriter.java
/src/java/org/apache/solr/request/PHPSerializedResponseWriter.java 
We won't get very far without those!

 A PHP response writer for Solr
 --

 Key: SOLR-196
 URL: https://issues.apache.org/jira/browse/SOLR-196
 Project: Solr
  Issue Type: New Feature
  Components: clients - php, search
Reporter: Paul Borgermans
 Attachments: SOLR-192-php-responsewriter.patch, 
 SOLR-196-PHPResponseWriter.patch


 It would be useful to have a PHP response writer that returns an array to be 
 eval-ed directly. This is especially true for PHP4.x installs, where there is 
 no built in support for JSON.
 This issue attempts to address this.




[jira] Updated: (SOLR-196) A PHP response writer for Solr

2007-08-02 Thread Pieter Berkel (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieter Berkel updated SOLR-196:
---

Attachment: SOLR-196-PHPResponseWriter.patch

This patch updates the PHPResponseWriter originally written by Paul Borgermans 
and integrates the serialized PHP response writer (renamed to 
PHPSerializedResponseWriter to avoid name clashes) originally authored by Nick 
Jenkin in SOLR-275.  See 
http://www.nabble.com/PHP-Response-Writer-for-Solr-tf4140580.html for some 
discussion of this implementation.

I've made minimal code changes to JSONWriter in order to reduce the amount of 
code duplication, specifically replacing all static writes of array and map 
structure tokens with methods:

  public void writeMapOpener(int size) throws IOException, IllegalArgumentException { writer.write('{'); }
  public void writeMapSeparator() throws IOException { writer.write(','); }
  public void writeMapCloser() throws IOException { writer.write('}'); }

  public void writeArrayOpener(int size) throws IOException, IllegalArgumentException { writer.write('['); }
  public void writeArraySeparator() throws IOException { writer.write(','); }
  public void writeArrayCloser() throws IOException { writer.write(']'); }

The size parameter has been introduced specifically for the PHPSerializedWriter 
(where the output format explicitly requires the size of the array / map to be 
set) and is currently ignored by all other response writers.  In cases where 
the size is not trivial to calculate (e.g. an Iterable object), it is set to 
-1.  Classes extending JSONWriter that require a valid (non-negative) size 
value must overload certain methods (i.e. writeArray() and writeDoc()) to 
calculate the size correctly.  It would also be a good idea to check for 
invalid size values in writeMapOpener() and writeArrayOpener() and throw an 
IllegalArgumentException if so.

Some other changes I've made to the PHPWriter code from SOLR-196:

1) Removed a lot of code duplicated from JSONWriter.
2) Updated writeStr() to use StringBuilder.

Some other changes I've made to the PHPSerializedWriter code from SOLR-275:

1) Removed some unnecessary duplicate code.
2) Changed the key type written by writeArray() from String to int (since they 
are supposed to be numeric indices).
3) Updated writeStr(): serialized PHP strings don't need to be escaped (the 
format seems to rely only on the specified string size value), and the size 
needs to be specified in bytes, not characters (some Unicode characters were 
causing problems when using String.length() to calculate the size; can someone 
please sanity-check this code?).
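The bytes-versus-characters point can be illustrated in isolation (a sketch of PHP's s:&lt;length&gt;:"...", encoding, not the actual writeStr() code):

```java
import java.nio.charset.StandardCharsets;

public class SerializedPhpString {
    // PHP's serialize() encodes a string as s:<length>:"<raw>"; where
    // <length> is the byte count of the raw data. Using String.length()
    // (a UTF-16 char count) under-counts multi-byte UTF-8 characters.
    public static String write(String s) {
        int bytes = s.getBytes(StandardCharsets.UTF_8).length;
        return "s:" + bytes + ":\"" + s + "\";";
    }

    public static void main(String[] args) {
        System.out.println(write("cafe")); // s:4:"cafe";
        System.out.println(write("café")); // s:5:"café"; -- 4 chars, but 5 UTF-8 bytes
    }
}
```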

I've tested both PHPWriter & PHPSerializedWriter and they both seem to output 
valid PHP data; it would be great if people could also test them to ensure they 
work in their environments.  JSONWriter also seems to be fine, and although I 
didn't test the Python or Ruby writers, I assume they are unaffected (can 
anyone confirm?).

Additionally, I've moved PythonWriter and RubyWriter from 
JSONResponseWriter.java to PythonResponseWriter.java and 
RubyResponseWriter.java respectively.

I noticed that while each writer specifies a content type value (e.g. 
CONTENT_TYPE_JSON_UTF8, CONTENT_TYPE_PYTHON_ASCII), the value returned by 
getContentType() is generally CONTENT_TYPE_TEXT_UTF8 or 
CONTENT_TYPE_TEXT_ASCII.  This is not a big deal, and I guess this allows the 
output to be easily displayed in a browser; however, it would be quite useful 
to have the actual content type value set so that client applications can 
determine the response format and encoding and process it accordingly without 
relying on access to the original wt query parameter.


 A PHP response writer for Solr
 --

 Key: SOLR-196
 URL: https://issues.apache.org/jira/browse/SOLR-196
 Project: Solr
  Issue Type: New Feature
  Components: clients - php, search
Reporter: Paul Borgermans
 Attachments: SOLR-192-php-responsewriter.patch, 
 SOLR-196-PHPResponseWriter.patch


 It would be useful to have a PHP response writer that returns an array to be 
 eval-ed directly. This is especially true for PHP4.x installs, where there is 
 no built in support for JSON.
 This issue attempts to address this.




[jira] Commented: (SOLR-301) Clean up param interface. Leave deprecated options in deprecated classes

2007-07-31 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516839
 ] 

Pieter Berkel commented on SOLR-301:


While you're in the process of cleaning up the Params interfaces, I wonder if 
it is worthwhile moving MoreLikeThisParams from o.a.s.common.util to 
o.a.s.common.params at the same time?  I made a note of this in my comments on 
SOLR-295.

 Clean up param interface.  Leave deprecated options in deprecated classes
 -

 Key: SOLR-301
 URL: https://issues.apache.org/jira/browse/SOLR-301
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 1.3

 Attachments: SOLR-301-ParamCleanup.patch, SOLR-301-ParamCleanup.patch


 In SOLR-135, we moved the parameter handling stuff to a new package: 
 o.a.s.common.params and left @deprecated classes in the old location.
 Classes in the new package should not contain any deprecated options. 
 Additionally, we should aim to separate parameter manipulation logic 
 (DefaultSolrParams, AppendedSolrParams, etc) from 'parameter' interface 
 classes: 'HighlightParams', 'UpdateParams'




[jira] Issue Comment Edited: (SOLR-301) Clean up param interface. Leave deprecated options in deprecated classes

2007-07-31 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516839
 ] 

Pieter Berkel edited comment on SOLR-301 at 7/31/07 5:39 PM:
-

While you're in the process of cleaning up the Params interfaces, I wonder if 
it is worthwhile moving MoreLikeThisParams from o.a.s.common.util to 
o.a.s.common.params at the same time?  I made a note of this in my comments on 
SOLR-295.


 was:
While you're in the process of cleaning up the Params interfaces, I wonder if 
it worthwhile moving MoreLikeThisParams from o.a.s.common.util to 
o.a.s.common.params at the same time?  I made a note of this in my comments on 
<a href="http://issues.apache.org/jira/browse/SOLR-295">SOLR-295</a>.

 Clean up param interface.  Leave deprecated options in deprecated classes
 -

 Key: SOLR-301
 URL: https://issues.apache.org/jira/browse/SOLR-301
 Project: Solr
  Issue Type: Improvement
Reporter: Ryan McKinley
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 1.3

 Attachments: SOLR-301-ParamCleanup.patch, SOLR-301-ParamCleanup.patch


 In SOLR-135, we moved the parameter handling stuff to a new package: 
 o.a.s.common.params and left @deprecated classes in the old location.
 Classes in the new package should not contain any deprecated options. 
 Additionally, we should aim to separate parameter manipulation logic 
 (DefaultSolrParams, AppendedSolrParams, etc) from 'parameter' interface 
 classes: 'HighlightParams', 'UpdateParams'




[jira] Commented: (SOLR-258) Date based Facets

2007-07-24 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515208
 ] 

Pieter Berkel commented on SOLR-258:


Looking good Hoss, the NOW issue seems to be resolved and the results look 
consistent after a quick test.

 * what should happen if end < start or gap < 0 ... maybe those should be 
 okay as long as both are true. 

It is probably wise to explicitly check for (end < start XOR gap < 0) and 
return an error if so, otherwise the request gets caught in an infinite loop.
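To illustrate the check being suggested, here is a minimal sketch (not the actual SimpleFacets code; the class and method names are illustrative) of validating the date facet parameters up front so that a gap whose sign disagrees with the start/end order is rejected instead of looping forever:

```java
// Hypothetical sketch: reject parameter combinations where stepping from
// 'start' by 'gap' can never reach 'end' (the infinite-loop case).
public class DateFacetParamCheck {
    // Returns an error message, or null if the parameters are usable.
    static String validate(long start, long end, long gapMillis) {
        boolean reversed = end < start;
        boolean negativeGap = gapMillis < 0;
        // Both reversed: iteration still terminates (counting downwards).
        // Exactly one of the two: start + gap never reaches end.
        if (reversed ^ negativeGap) {
            return "date facet 'gap' sign does not match start/end order";
        }
        if (gapMillis == 0) {
            return "date facet 'gap' must be non-zero";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(validate(0L, 100L, 10L));   // ok -> null
        System.out.println(validate(0L, 100L, -10L));  // would loop forever
        System.out.println(validate(100L, 0L, -10L));  // both reversed -> ok
    }
}
```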

Just on the subject of errors, I notice that exceptions thrown by the date 
facet code are caught in SimpleFacets.getFacetCounts() and written out in the 
response:

try {
  res.add("facet_queries", getFacetQueryCounts());
  res.add("facet_fields", getFacetFieldCounts());
  res.add("facet_dates", getFacetDateCounts());
} catch (Exception e) {
  SolrException.logOnce(SolrCore.log, "Exception during facet counts", e);
  res.add("exception", SolrException.toStr(e));
}

This doesn't seem very consistent with the way other handlers deal with 
exceptions (i.e. HTTP response code >= 400); is there any reason why it is done 
this way in SimpleFacets?
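As a hedged sketch of the alternative being suggested: instead of logging the exception and embedding it in the response body, the facet code could rethrow it as an error carrying an HTTP status, so the container answers with >= 400 like other handlers do. `HttpStatusException` below is an illustrative stand-in, not Solr's actual SolrException API:

```java
// Hypothetical sketch: propagate facet failures as an HTTP-status-bearing
// exception rather than res.add("exception", ...).
public class HttpStatusException extends RuntimeException {
    final int code;

    HttpStatusException(int code, String msg, Throwable cause) {
        super(msg, cause);
        this.code = code;
    }

    public static void main(String[] args) {
        try {
            try {
                throw new IllegalArgumentException("bad facet.date.gap");
            } catch (Exception e) {
                // Rethrow so the dispatcher maps it to an error response.
                throw new HttpStatusException(400, "Exception during facet counts", e);
            }
        } catch (HttpStatusException e) {
            System.out.println(e.code + " " + e.getMessage());
        }
    }
}
```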

I also think it would be a good idea to merge the facet_dates response field 
into facet_fields so that all the facet data in the response is stored in 
one location; how feasible would it be to do this?


 Date based Facets
 -

 Key: SOLR-258
 URL: https://issues.apache.org/jira/browse/SOLR-258
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: date_facets.patch, date_facets.patch, date_facets.patch, 
 date_facets.patch, date_facets.patch, date_facets.patch, date_facets.patch


 1) Allow clients to express concepts like...
 * give me facet counts per day for every day this month.
 * give me facet counts per hour for every hour of today.
 * give me facet counts per hour for every hour of a specific day.
 * give me facet counts per hour for every hour of a specific day and 
 give me facet counts for the 
number of matches before that day, or after that day. 
 2) Return all data in a way that makes it easy to use to build filter queries 
 on those date ranges.




[jira] Commented: (SOLR-308) Add a field that generates an unique id when you have none in your data to index

2007-07-19 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513891
 ] 

Pieter Berkel commented on SOLR-308:


From the use case you have provided, it sounds like the unique id will 
change every time you delete and re-insert the document.  If this is the case, 
then perhaps it might be more efficient to use the Lucene document id as your 
unique id value rather than a separate field?  However, as far as I'm aware, 
there currently isn't any way to access the Lucene doc id from Solr (except 
perhaps the Luke request handler)?


 Add a field that generates an unique id when you have none in your data to 
 index
 

 Key: SOLR-308
 URL: https://issues.apache.org/jira/browse/SOLR-308
 Project: Solr
  Issue Type: New Feature
  Components: search
Reporter: Thomas Peuss
Priority: Minor
 Attachments: GeneratedId.patch


 This patch adds a field that generates a unique id when you have no unique 
 id in the data you want to index.




[jira] Commented: (SOLR-258) Date based Facets

2007-07-15 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512778
 ] 

Pieter Berkel commented on SOLR-258:


Sorry that last comment was from me, not posted from my regular computer.  I'll 
be more careful to post as myself and not as a colleague in future (I was 
wondering why JIRA didn't ask me to login, d'oh).

 Date based Facets
 -

 Key: SOLR-258
 URL: https://issues.apache.org/jira/browse/SOLR-258
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: date_facets.patch, date_facets.patch, date_facets.patch, 
 date_facets.patch, date_facets.patch


 1) Allow clients to express concepts like...
 * give me facet counts per day for every day this month.
 * give me facet counts per hour for every hour of today.
 * give me facet counts per hour for every hour of a specific day.
 * give me facet counts per hour for every hour of a specific day and 
 give me facet counts for the 
number of matches before that day, or after that day. 
 2) Return all data in a way that makes it easy to use to build filter queries 
 on those date ranges.




[jira] Issue Comment Edited: (SOLR-258) Date based Facets

2007-07-15 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512778
 ] 

Pieter Berkel edited comment on SOLR-258 at 7/14/07 11:59 PM:
--

Sorry that last comment was from me (not Tristan), not posted from my regular 
computer.  I'll be more careful to post as myself and not as a colleague in 
future (I was wondering why JIRA didn't ask me to login, d'oh).


 was:
Sorry that last comment was from me, not posted from my regular computer.  I'll 
be more careful to post as myself and not as a colleague in future (I was 
wondering why JIRA didn't ask me to login, d'oh).

 Date based Facets
 -

 Key: SOLR-258
 URL: https://issues.apache.org/jira/browse/SOLR-258
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: date_facets.patch, date_facets.patch, date_facets.patch, 
 date_facets.patch, date_facets.patch


 1) Allow clients to express concepts like...
 * give me facet counts per day for every day this month.
 * give me facet counts per hour for every hour of today.
 * give me facet counts per hour for every hour of a specific day.
 * give me facet counts per hour for every hour of a specific day and 
 give me facet counts for the 
number of matches before that day, or after that day. 
 2) Return all data in a way that makes it easy to use to build filter queries 
 on those date ranges.




[jira] Commented: (SOLR-258) Date based Facets

2007-07-13 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512372
 ] 

Pieter Berkel commented on SOLR-258:


I've just tried this patch and the results are impressive!

I agree with Ryan regarding the naming of 'pre', 'post' and 'inner'; using 
simple concrete words will make it easier for developers to understand the 
basic concepts.  At first I was a little confused about how the 'gap' parameter 
was used; perhaps a name like 'interval' would be more indicative of its 
purpose?

While on the topic of gaps / intervals, I can imagine a case where one might 
want facet counts over non-linear intervals, for instance obtaining results 
from: "Last 7 days", "Last 30 days", "Last 90 days", "Last 6 months".  
Obviously you can achieve this by setting facet.date.gap=+1DAY and then 
post-processing the results, but a much more elegant solution would be to allow 
facet.date.gap (or another suitably named param) to accept a 
(comma-delimited) set of explicit partition dates:

facet.date.start=NOW-6MONTHS/DAY
facet.date.end=NOW/DAY
facet.date.gap=NOW-90DAYS/DAY,NOW-30DAYS/DAY,NOW-7DAYS/DAY

It would then be trivial to calculate facet counts for the ranges specified 
above.
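The partition idea can be sketched in miniature: given the sorted boundary dates (start, the comma-delimited cut points, end) as timestamps, counting per range is a simple bucketing pass. This is an illustrative sketch, not the Solr patch; names and the toy timestamps are assumptions.

```java
import java.util.Arrays;

// Hypothetical sketch: count how many document dates fall into each
// half-open range [boundaries[i], boundaries[i+1]) defined by an explicit
// list of partition boundaries.
public class ExplicitDatePartitions {
    static int[] countPerRange(long[] boundaries, long[] docDates) {
        int[] counts = new int[boundaries.length - 1];
        for (long d : docDates) {
            for (int i = 0; i < boundaries.length - 1; i++) {
                if (d >= boundaries[i] && d < boundaries[i + 1]) {
                    counts[i]++;
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Toy timestamps standing in for: start, NOW-90DAYS, NOW-30DAYS,
        // NOW-7DAYS, end (all resolved to absolute dates beforehand).
        long[] bounds = {0, 90, 150, 173, 180};
        long[] docs = {10, 95, 160, 175, 179};
        System.out.println(Arrays.toString(countPerRange(bounds, docs))); // [1, 1, 1, 2]
    }
}
```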

It would be useful to make the 'start' and 'end' parameters optional.  If not 
specified, 'start' should default to the earliest stored date value, and 'end' 
should default to the latest stored date value (assuming that's possible).  
The request should probably return a 400 if 'gap' is not set.

My personal opinion is that 'end' should be a hard limit: the last gap should 
never go past 'end'.  Given that the facet label is always generated from the 
lower value in the range, I don't think truncating the last 'gap' will cause 
problems; however, it may be helpful to return the actual date value for 'end' 
if it was specified as an offset of NOW.

What might be a problem is when both start and end dates are specified as 
offsets of NOW: the value of NOW may not be the same for both.  In one 
of my tests, I set:

facet.date.start=NOW-12MONTHS
facet.date.end=NOW
facet.date.gap=+1MONTH

With some extra debugging output I can see that mostly the value of NOW is the 
same:

<str name="start">2006-07-13T06:06:07.397</str>
<str name="end">2007-07-13T06:06:07.397</str>

However occasionally there is a difference:

<str name="start">2006-07-13T05:48:23.014</str>
<str name="end">2007-07-13T05:48:23.015</str>

This difference alters the number of gaps calculated (+1 when the NOW values 
differ for start and end).  Not sure how this could be fixed, but as you 
mentioned above, it will probably involve changing 
ft.toExternal(ft.toInternal(...)).
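One way to avoid the drift, sketched below under the assumption that every date-math expression can be evaluated against a supplied base time: capture NOW once per request and resolve both 'start' and 'end' against that single timestamp. The `resolve` helper is illustrative, not Solr's actual date-math code.

```java
// Hypothetical sketch: pin NOW to one captured value so start and end can
// never disagree about what NOW means within a single request.
public class PinnedNow {
    // Pretend "NOW-12MONTHS" etc. resolve relative to the supplied base.
    static long resolve(String expr, long nowMillis) {
        if (expr.equals("NOW")) return nowMillis;
        if (expr.equals("NOW-12MONTHS")) return nowMillis - 365L * 24 * 3600 * 1000;
        throw new IllegalArgumentException("unsupported expression: " + expr);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis(); // captured once per request
        long start = resolve("NOW-12MONTHS", now);
        long end = resolve("NOW", now);
        // Both expressions used the same base, so the span is exact.
        System.out.println(end - start == 365L * 24 * 3600 * 1000); // true
    }
}
```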

Thanks again for creating this useful addition, I'll try to test it a bit more 
and see if I can find anything else.


 Date based Facets
 -

 Key: SOLR-258
 URL: https://issues.apache.org/jira/browse/SOLR-258
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: date_facets.patch, date_facets.patch, date_facets.patch, 
 date_facets.patch, date_facets.patch


 1) Allow clients to express concepts like...
 * give me facet counts per day for every day this month.
 * give me facet counts per hour for every hour of today.
 * give me facet counts per hour for every hour of a specific day.
 * give me facet counts per hour for every hour of a specific day and 
 give me facet counts for the 
number of matches before that day, or after that day. 
 2) Return all data in a way that makes it easy to use to build filter queries 
 on those date ranges.




[jira] Commented: (SOLR-281) Search Components (plugins)

2007-07-11 Thread Pieter Berkel (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511678
 ] 

Pieter Berkel commented on SOLR-281:


I really like this modular approach to handling search requests; it will 
greatly simplify the process of adding new functionality (e.g. collapsing, 
faceting, more-like-this) to existing handlers without the need for unnecessary 
code duplication.  My primary goal is to extend the more-like-this handler 
capabilities and make them available to other handlers (such as dismax), and I 
think the proposed solution is a good approach.

Some issues that I can foresee though are:

1) Ordering: it's fairly obvious that certain components need to be called 
before others (e.g. standard / dismax query parsing before faceting / 
highlighting), however there may be cases where the required sequence of events 
is more subtle (e.g. faceting the results of a more-like-this query).  There 
probably needs to be some mechanism to determine the order in which the 
components are prepared / processed.

2) Dependency: a situation may arise where a component depends on operations 
performed by another component (e.g. more-like-this may take advantage of the 
dismax 'bq' parameter); perhaps there needs to be some method of specifying 
component dependencies so that the SearchHandler can load and process required 
components automatically?
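The ordering and dependency points could be combined: if each component declares what it must run after, the handler can derive a run order with a depth-first traversal. This is a minimal sketch of that idea under assumed names; it is not the SOLR-281 patch.

```java
import java.util.*;

// Hypothetical sketch: derive component execution order from declared
// dependencies (name -> names it must run after) via depth-first traversal.
public class ComponentChain {
    static List<String> order(Map<String, List<String>> deps) {
        List<String> out = new ArrayList<>();
        Set<String> done = new HashSet<>();
        for (String c : deps.keySet()) visit(c, deps, done, out);
        return out;
    }

    static void visit(String c, Map<String, List<String>> deps,
                      Set<String> done, List<String> out) {
        if (!done.add(c)) return;            // already scheduled
        for (String d : deps.getOrDefault(c, List.of())) {
            visit(d, deps, done, out);       // schedule dependencies first
        }
        out.add(c);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("query", List.of());
        deps.put("mlt", List.of("query"));
        deps.put("facet", List.of("mlt"));   // e.g. faceting an MLT result
        System.out.println(order(deps));     // [query, mlt, facet]
    }
}
```

The same order would be used twice: once for each component's prepare phase, then again for process.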

I hope this makes sense.  I'm fairly new to Solr development, so I'm afraid my 
contributions to this issue will be mostly limited to (hopefully helpful) 
ideas and suggestions; however, I'm happy to tinker with the patched code from 
above and help test this new component framework as it is developed.

cheers,
Pieter


 Search Components (plugins)
 ---

 Key: SOLR-281
 URL: https://issues.apache.org/jira/browse/SOLR-281
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
 Attachments: SOLR-281-SearchComponents.patch


 A request handler with pluggable search components for things like:
   - standard
   - dismax
   - more-like-this
   - highlighting
   - field collapsing 
 For more discussion, see:
 http://www.nabble.com/search-components-%28plugins%29-tf3898040.html#a11050274




[jira] Created: (SOLR-292) MoreLikeThisHandler generates incorrect facet counts

2007-07-09 Thread Pieter Berkel (JIRA)
MoreLikeThisHandler generates incorrect facet counts


 Key: SOLR-292
 URL: https://issues.apache.org/jira/browse/SOLR-292
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
Reporter: Pieter Berkel
Priority: Minor
 Fix For: 1.3


When obtaining facet counts using the MoreLikeThis handler, the facet 
information returned is generated from the document list returned rather than 
the entire set of matching documents.  For example, if your MoreLikeThis query 
returns by default 10 documents, then getFacetCounts() returns values based 
only on these 10 documents, despite the fact that there may be thousands of 
matching documents in the set.

The soon-to-be uploaded patch addresses this particular issue by changing the 
object type returned by MoreLikeThisHelper.getMoreLikeThis() from DocList to 
DocListAndSet and ensuring that the facet count is generated from the entire 
set rather than the document list.  The MLT functionality of the 
StandardRequestHandler should not be affected by this change.
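The bug can be shown in miniature: facet counts derived from only the returned page of documents (the DocList) understate the counts for the full matching set (the DocSet). Plain collections stand in for Solr's types here; this sketch is illustrative, not the patch itself.

```java
import java.util.*;

// Hypothetical sketch of the SOLR-292 fix in miniature: facet counts must
// come from the full matching set, not just the page of returned docs.
public class FacetFromSet {
    static Map<String, Integer> facetCounts(List<String> fieldValues) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String v : fieldValues) counts.merge(v, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        // Full match set (the "DocSet") vs. the first rows=2 docs (the "DocList").
        List<String> matchSet = List.of("a", "a", "b", "b", "b", "c");
        List<String> pageOnly = matchSet.subList(0, 2);
        System.out.println(facetCounts(pageOnly)); // understated: {a=2}
        System.out.println(facetCounts(matchSet)); // correct: {a=2, b=3, c=1}
    }
}
```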





[jira] Updated: (SOLR-292) MoreLikeThisHandler generates incorrect facet counts

2007-07-09 Thread Pieter Berkel (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pieter Berkel updated SOLR-292:
---

Attachment: MoreLikeThis-FacetCount_SOLR-292.patch

Patch updates src/java/org/apache/solr/handler/MoreLikeThisHandler.java and 
fixes the facet count problem.

 MoreLikeThisHandler generates incorrect facet counts
 

 Key: SOLR-292
 URL: https://issues.apache.org/jira/browse/SOLR-292
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
Reporter: Pieter Berkel
Priority: Minor
 Fix For: 1.3

 Attachments: MoreLikeThis-FacetCount_SOLR-292.patch


 When obtaining facet counts using the MoreLikeThis handler, the facet 
 information returned is generated from the document list returned rather than 
 the entire set of matching documents.  For example, if your MoreLikeThis 
 query returns by default 10 documents, then getFacetCounts() returns values 
 based only on these 10 documents, despite the fact that there may be 
 thousands of matching documents in the set.
 The soon-to-be uploaded patch addresses this particular issue by changing the 
 object type returned by MoreLikeThisHelper.getMoreLikeThis() from DocList to 
 DocListAndSet and ensuring that the facet count is generated from the entire 
 set rather than the document list.  The MLT functionality of the 
 StandardRequestHandler should not be affected by this change.




[jira] Created: (SOLR-295) Implementing MoreLikeThis support in DismaxRequestHandler

2007-07-09 Thread Pieter Berkel (JIRA)
Implementing MoreLikeThis support in DismaxRequestHandler
-

 Key: SOLR-295
 URL: https://issues.apache.org/jira/browse/SOLR-295
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 1.3
Reporter: Pieter Berkel
Priority: Minor


There's nothing too clever about this initial patch (to be uploaded shortly); I 
have simply extracted the MLT code from the StandardRequestHandler and inserted 
it into the DismaxRequestHandler.  However, there are some broader MLT issues 
that I'd also like to address in the near future:

1) (trivial) No "This response format is experimental" warning when MLT is used 
with the StandardRequestHandler (or DismaxRequestHandler).  Not really a big 
deal, but adding it would at least make developers aware of the possibility of 
future changes.

2) (trivial) org.apache.solr.common.util.MoreLikeThisParams should perhaps be 
moved to the more appropriate package org.apache.solr.common.params.

3) (non-trivial) The ability to specify the list of fields that should be 
returned when MLT is invoked from an external handler (i.e. 
StandardRequestHandler).  Currently the field list (FL) parameter is inherited 
from the main query, but I can envisage cases where it would be desirable to 
specify more or fewer return fields in the MLT query than the main query.  One 
complication is that mlt.fl is already used to specify the fields used for 
similarity.  Perhaps mlt.fl is not the best name for this parameter and 
should be renamed to avoid potential conflict / confusion?

4) (fairly-trivial) On a similar note to 3, there is currently no way to 
specify a start value for the rows returned when MLT is invoked from an 
external handler (e.g. StandardRequestHandler); it is hard-coded to 0 (i.e. the 
first mlt.count documents matched).  While I can see the logic in naming the 
parameter mlt.count, it does seem a little inconsistent, and perhaps it would 
be better to rename (or at least alias) it to mlt.rows to be consistent with 
the CommonQueryParameters.  Note that mlt.start is fundamentally different from 
the mlt.match.offset parameter, as the latter deals with documents *matching* 
the initial MLT query while the former deals with documents *returned* by the 
MLT query (hope that makes sense).

I have created a patch that implements mlt.start (to specify the start doc) 
and adds mlt.rows, which can be used interchangeably with mlt.count (though I 
would prefer to remove mlt.count altogether), but since it involves changing 
the method definition of MoreLikeThisHelper.getMoreLikeThese(), I wanted to get 
some opinions before submitting it.
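The proposed parameter handling can be sketched as follows: read mlt.start with a default of 0, and let mlt.rows fall back to the legacy mlt.count name so the two stay interchangeable. The helper and the default row count are illustrative assumptions, not the actual patch.

```java
import java.util.*;

// Hypothetical sketch: paging parameters for MLT results, with mlt.rows
// aliasing the older mlt.count name.
public class MltPaging {
    static int[] startAndRows(Map<String, String> params) {
        int start = Integer.parseInt(params.getOrDefault("mlt.start", "0"));
        String rows = params.getOrDefault("mlt.rows",
                       params.getOrDefault("mlt.count", "5")); // assumed default
        return new int[] { start, Integer.parseInt(rows) };
    }

    public static void main(String[] args) {
        Map<String, String> p = new HashMap<>();
        p.put("mlt.count", "20");  // legacy name still honoured
        p.put("mlt.start", "10");  // new: no longer hard-coded to 0
        System.out.println(Arrays.toString(startAndRows(p))); // [10, 20]
    }
}
```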

5) (non-trivial) Interesting Terms - the ability to return interesting term 
information using the mlt.interestingTerms parameter when MLT is invoked from 
an external handler.  This is perhaps the most useful feature I am looking to 
implement; I can see great benefit in being able to provide a list of 
interesting terms or keywords for each document returned in a standard or 
dismax query.  Currently this is only available from the MLT request handler, 
so perhaps the best approach would be to re-factor the interestingTerms code in 
the MoreLikeThisHandler class and put it somewhere in MoreLikeThisHelper so it 
is available to all handlers?  Again, I would appreciate any comments or 
suggestions.

I've also noted the MLT features suggested by Tristan [ 
http://www.nabble.com/MoreLikeThis-with-DisMax-boost-query---functions-tf4047187.html
 ] which could quite possibly be rolled together with the above points -- I'm 
not sure whether it is better to have a single ticket tracking several related 
issues or to create individual tickets for each issue; however, I will be happy 
to comply with the Solr issue tracking policy on advice from the core 
developers.

regards,
Pieter

