Re: dynamic copyFields
: Syntax aside, the major implication is that DynamicCopy would need a
: virtual function:
: SchemaField getTargetField()

I don't think i've ever looked at DynamicField before today ... but i see what you're talking about, you mean that

  final SchemaField targetField

would need to be replaced with

  SchemaField getTargetField(String sourceField)

right?

yeah that seems simple enough, i'm not sure what Yonik meant by this comment...

  // Instead of storing a type, this could be implemented as a hierarchy
  // with a virtual matches().
  // Given how often a search will be done, however, speed is the overriding
  // concern and I'm not sure which is faster.

... i don't see how this ever comes into play with search.

on the issue of syntax and regex vs glob, i would leave it as a glob for now since that's already supported by the syntax and the impl ... if we want to support regexes that should be done separately in DynamicReplacement where it can be leveraged by both copyField and dynamicField

-Hoss
[jira] Commented: (SOLR-69) PATCH:MoreLikeThis support
[ https://issues.apache.org/jira/browse/SOLR-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493770 ]

Hoss Man commented on SOLR-69:

looking back at the two main use cases Yonik described in his comment from 06/Feb/07...

At the most basic level, a request for MLT results for a single doc by uniqueKey (case #1) is just a simplistic example of asking for MLT results for an arbitrary query (case #2) ... that arbitrary query just happens to be on a uniqueKey field, and only returns one result.

Where things get more complicated is when you start returning other tier 2 type information about the request -- which begs the question: what is tier 1 data?

If the MLT results are added as tier 2 data to the StandardRequestHandler response, then all of the other tier 2 data blocks (highlighting, faceting, debugQuery score explanation, etc.) still refer to the main result from the original query ... this may be what you want in use case #2, but doesn't really make sense for use case #1, where the tier 1 main result only contains the single document you asked for by id ... the score explanation and facet count numbers aren't very interesting in that case.

for case #1, what you really want is for the MLT data to be treated as the primary (tier 1) result set, and all of the tier 2 data is about those results ... highlighting is done on the MLT docs, facet counts are for the MLT docs, debugQuery score explanation tells you *why* the MLT docs are like your original docs, etc.

Case #1 and case #2 are both useful, to address Brian's 02/May/07 comment..

> I've personally never understood the "more documents that don't match this query but are like the documents in this query" ...
> I'm confused as to how querying by query would work -- if a query for 'apache' returned 10 docs, would MLT work on each one and generate n more docs per doc? And would the original query results get returned? What's the ordering?

in your example, yes ... the user's main search on apache would return 10 results sorted by whatever sort they specified. for each of those 10 results, N similar results might be listed to the side (in a smaller font, or as a pop up widget) sorted most likely by how similar they are. even if you don't want to surface those similar docs right there on the main result page, you still need to execute the MLT logic as part of the initial request to know if there are *any* similar docs (so you can surface the link/button for displaying them to the user).

I would even argue there is actually a third use case ...

Case 3) The GUI queries the standard request handler to display a list of documents, with a single subsequent list of similar mlt documents that have things in common with all of the docs in the current page of results displayed elsewhere on the page.

...where case #2 is about having separate MLT lists for each of the matching results, this case is about having a single "if you are interested in *all* of these items, you might also be interested in these other items" list.

case #1 and case #3 can both easily be satisfied with a single MoreLikeThisHandler which takes as its input a generic query (ie: q=id:12345 for case #1, and q=apache for case #3) and then generates a single tier 1 result block of MLT results that relate to all of the docs matching that query (simple case of 1 doc for case #1) ... all other tier 2 data would be in regards to this main MLT result set.

case #2 would still easily be handled by having some new tier 2 MLT data added to the StandardRequestHandler.
PATCH:MoreLikeThis support

Key: SOLR-69
URL: https://issues.apache.org/jira/browse/SOLR-69
Project: Solr
Issue Type: Improvement
Components: search
Reporter: Bertrand Delacretaz
Priority: Minor
Attachments: lucene-queries-2.0.0.jar, lucene-queries-2.1.1-dev.jar, SOLR-69-MoreLikeThisRequestHandler.patch, SOLR-69.patch, SOLR-69.patch, SOLR-69.patch, SOLR-69.patch

Here's a patch that implements simple support of Lucene's MoreLikeThis class.

The MoreLikeThisHelper code is heavily based on (hmm...lifted from might be more appropriate ;-) Erik Hatcher's example mentioned in http://www.mail-archive.com/[EMAIL PROTECTED]/msg00878.html

To use it, add at least the following parameters to a standard or dismax query:

  mlt=true
  mlt.fl=list,of,fields,which,define,similarity

See the MoreLikeThisHelper source code for more parameters.

Here are two URLs that work with the example config, after loading all documents found in exampledocs in the index (just to show that it seems to work - of course you need a larger corpus to make it interesting):

http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=standard&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
Re: dynamic copyFields
Chris Hostetter wrote:
> : Syntax aside, the major implication is that DynamicCopy would need a
> : virtual function:
> : SchemaField getTargetField()
>
> I don't think i've ever looked at DynamicField before today ... but i see what you're talking about, you mean that
>   final SchemaField targetField
> would need to be replaced with
>   SchemaField getTargetField(String sourceField)
> right?

exactly.

> yeah that seems simple enough, i'm not sure what Yonik meant by this comment...
>
>   // Instead of storing a type, this could be implemented as a hierarchy
>   // with a virtual matches().
>   // Given how often a search will be done, however, speed is the overriding
>   // concern and I'm not sure which is faster.
>
> ... i don't see how this ever comes into play with search.

I don't either... I think it only happens at indexing. ResponseWriters do not know (or care) if a field is from a copy field or not.

> on the issue of syntax and regex vs glob, i would leave it as a glob for now since that's already supported by the syntax and the impl ...

agreed.

> if we want to support regexes that should be done separately in DynamicReplacement where it can be leveraged by both copyField and dynamicField

glob is fine for what i need.

Thanks for the feedback, i'll post something on JIRA soon.

ryan
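To make the getTargetField(String sourceField) idea above concrete, here is a rough sketch of a glob-only rule that derives the destination name from whatever the '*' matched in the source name. The class and method names (GlobCopyRule, getTargetFieldName) are made up for illustration and are not taken from the Solr source or from any attached patch; it also works on plain field names rather than SchemaField objects, and assumes exactly one '*' in the source glob.

```java
/**
 * Illustrative sketch only: GlobCopyRule and its methods are hypothetical
 * names, not classes from Solr or from a patch in this thread.
 */
final class GlobCopyRule {
    private final String srcPrefix;   // text before '*' in the source glob
    private final String srcSuffix;   // text after '*' in the source glob
    private final String destPattern; // may contain '*', or be a fixed name

    GlobCopyRule(String sourceGlob, String destPattern) {
        int star = sourceGlob.indexOf('*');
        if (star < 0 || star != sourceGlob.lastIndexOf('*')) {
            throw new IllegalArgumentException(
                "source must contain exactly one '*': " + sourceGlob);
        }
        this.srcPrefix = sourceGlob.substring(0, star);
        this.srcSuffix = sourceGlob.substring(star + 1);
        this.destPattern = destPattern;
    }

    /** True if the field name fits the source glob. */
    boolean matches(String fieldName) {
        return fieldName.length() >= srcPrefix.length() + srcSuffix.length()
            && fieldName.startsWith(srcPrefix)
            && fieldName.endsWith(srcSuffix);
    }

    /** Substitutes the text matched by '*' into the destination pattern. */
    String getTargetFieldName(String sourceField) {
        if (!matches(sourceField)) {
            throw new IllegalArgumentException(
                "field does not match the source glob: " + sourceField);
        }
        String wildcard = sourceField.substring(
            srcPrefix.length(), sourceField.length() - srcSuffix.length());
        // A destination without '*' is returned unchanged (fixed target).
        return destPattern.replace("*", wildcard);
    }
}
```

With a rule built from source tag_* and destination text_*, a source field named tag_color would map to text_color. Because the target is computed per source field, it can no longer be a final member fixed at construction time, which is the implication discussed above.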
[jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493784 ]

Will Johnson commented on SOLR-86:

has anyone brought up the idea of creating post.bat and post.sh scripts that use this java class instead of the curl example that currently ships in example/exampledocs? it would be one less thing for people to figure out and possibly screw up.

[PATCH] standalone updater cli based on httpClient

Key: SOLR-86
URL: https://issues.apache.org/jira/browse/SOLR-86
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Thorsten Scherler
Assigned To: Erik Hatcher
Attachments: simple-post-tool-2007-02-15.patch, simple-post-tool-2007-02-16.patch, simple-post-using-urlconnection-approach.patch, solr-86.diff, solr-86.diff

We need a cross-platform replacement for post.sh. The attached code is a direct replacement for post.sh, since it does exactly the same thing. In the future one can extend the CLI with other features such as auto commit. Right now the code assumes that SOLR-85 is applied, since we are using the servlet from that issue to actually do the update.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-69) PATCH:MoreLikeThis support
[ https://issues.apache.org/jira/browse/SOLR-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-69:

Attachment: SOLR-69-MoreLikeThisRequestHandler.patch

Refactored the MoreLikeThisRequestHandler so that it can support case #1, #2, #3
- added faceting to the MoreLikeThisHandler
- made it possible to remove the original match from the response. This makes the response look the same as ones that come from /select
- Added documentation to: http://wiki.apache.org/solr/MoreLikeThis

PATCH:MoreLikeThis support

Key: SOLR-69
URL: https://issues.apache.org/jira/browse/SOLR-69
Project: Solr
Issue Type: Improvement
Components: search
Reporter: Bertrand Delacretaz
Priority: Minor
Attachments: lucene-queries-2.0.0.jar, lucene-queries-2.1.1-dev.jar, SOLR-69-MoreLikeThisRequestHandler.patch, SOLR-69-MoreLikeThisRequestHandler.patch, SOLR-69.patch, SOLR-69.patch, SOLR-69.patch, SOLR-69.patch

Here's a patch that implements simple support of Lucene's MoreLikeThis class.

The MoreLikeThisHelper code is heavily based on (hmm...lifted from might be more appropriate ;-) Erik Hatcher's example mentioned in http://www.mail-archive.com/[EMAIL PROTECTED]/msg00878.html

To use it, add at least the following parameters to a standard or dismax query:

  mlt=true
  mlt.fl=list,of,fields,which,define,similarity

See the MoreLikeThisHelper source code for more parameters.

Here are two URLs that work with the example config, after loading all documents found in exampledocs in the index (just to show that it seems to work - of course you need a larger corpus to make it interesting):

http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=standard&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=dismax&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score

Results are added to the output like this:

<response>
...
<lst name="moreLikeThis">
  <result name="UTF8TEST" numFound="1" start="0" maxScore="1.5293242">
    <doc>
      <float name="score">1.5293242</float>
      <str name="id">SOLR1000</str>
    </doc>
  </result>
  <result name="SOLR1000" numFound="1" start="0" maxScore="1.5293242">
    <doc>
      <float name="score">1.5293242</float>
      <str name="id">UTF8TEST</str>
    </doc>
  </result>
</lst>

I haven't tested this extensively yet, will do in the next few days. But comments are welcome of course.
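For anyone building these requests from client code rather than typing the URL by hand, a minimal sketch of assembling the query string might look like the following. Only the parameter names (q, mlt, mlt.fl) and the example host come from the issue description; the class and method names (MltUrlSketch, buildUrl) are hypothetical and not part of the patch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

/** Illustrative sketch only: MltUrlSketch is a hypothetical helper, not part of Solr. */
final class MltUrlSketch {

    /** Appends the parameters to the base URL in map iteration order. */
    static String buildUrl(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base);
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            sep = '&'; // every parameter after the first is joined with '&'
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(ex);
        }
    }
}
```

Passing an insertion-ordered map with q=apache, mlt=true and mlt.fl=manu,cat yields the same request as the hand-written URLs, except that the comma in the field list is percent-encoded as %2C.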
[jira] Commented: (SOLR-86) [PATCH] standalone updater cli based on httpClient
[ https://issues.apache.org/jira/browse/SOLR-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493828 ]

Hoss Man commented on SOLR-86:

this will ship in the next release, and the tutorial that will ship with that release already refers to it.

creating a post.sh or post.bat that delegates to this tool seems like it can only complicate things ... file perms, line endings, shell conventions, shebang lines ... all things where portability is a concern, but java -jar post.jar *.xml works damn near anywhere.
Re: [Solr Wiki] Update of Solr1.2 by ryan
On 5/4/07, Apache Wiki [EMAIL PROTECTED] wrote:
>   <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" />
>   }}}
> +  * audit schema.xml duplicate field definition behavior. As is {{
> +    <fieldType name="aaa" ... />
> +    <fieldType name="aaa" ... />
> +
> +    <field name="aaa" ... />
> +    <field name="aaa" ... />
> +
> +    <dynamicField name="aaa_*" ... />
> +    <dynamicField name="aaa_*" ... />
> +  }} quietly continues -- tossing out the first definition. This should add a severe error and optionally abort (using SOLR-179)
> +

Your description is clear, but not the example. Is it a problem if a fieldType and field have the same name, or just two fields? Also, the field/dyn field definition seems okay (because of the underscore). Perhaps we should enforce * to match something (like .+)?

-Mike
[jira] Created: (SOLR-226) support dynamic fields as copyField destination
support dynamic fields as copyField destination

Key: SOLR-226
URL: https://issues.apache.org/jira/browse/SOLR-226
Project: Solr
Issue Type: Improvement
Reporter: Ryan McKinley
Priority: Minor
Fix For: 1.3

I'd like to use a dynamic field as the destination of a copyField:

Given:
  <field name="tag_*" type="string" ... />
  <field name="text_*" type="text" ... />

I want:
  <copyField source="tag_*" dest="text_*" />

For background see:
http://www.nabble.com/copyField-to-a-dynamic-field-tf2300115.html#a6419101
http://www.nabble.com/dynamic-copyFields-tf3683816.html#a10296520
[jira] Commented: (SOLR-216) Improvements to solr.py
[ https://issues.apache.org/jira/browse/SOLR-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493835 ]

Brian Whitman commented on SOLR-216:

Hi Jason, this is really great. I had one small issue -- highlighting did not seem to work. I looked into your code and found you were using hi.fl and hi, not hl.fl and hl. Not sure if your solr expects hi, but mine expects hl. Once I changed lines 453 and 457 to use hl instead of hi, it worked fine.

Improvements to solr.py

Key: SOLR-216
URL: https://issues.apache.org/jira/browse/SOLR-216
Project: Solr
Issue Type: Improvement
Components: clients - python
Affects Versions: 1.2
Reporter: Jason Cater
Priority: Trivial
Attachments: solr.py

I've taken the original solr.py code and extended it to include higher-level functions.

* Requires python 2.3+
* Supports SSL (https://) schema
* Conforms (mostly) to PEP 8 -- the Python Style Guide
* Provides a high-level results object with implicit data type conversion
* Supports batching of update commands
Re: [Solr Wiki] Update of Solr1.2 by ryan
Mike Klaas wrote:
> On 5/4/07, Apache Wiki [EMAIL PROTECTED] wrote:
>>   <requestParsers enableRemoteStreaming="false" multipartUploadLimitInKB="2048" />
>>   }}}
>> +  * audit schema.xml duplicate field definition behavior. As is {{
>> +    <fieldType name="aaa" ... />
>> +    <fieldType name="aaa" ... />
>> +
>> +    <field name="aaa" ... />
>> +    <field name="aaa" ... />
>> +
>> +    <dynamicField name="aaa_*" ... />
>> +    <dynamicField name="aaa_*" ... />
>> +  }} quietly continues -- tossing out the first definition. This should add a severe error and optionally abort (using SOLR-179)
>> +
>
> Your description is clear, but not the example. Is it a problem if a fieldType and field have the same name, or just two fields? Also, the field/dyn field definition seems okay (because of the underscore). Perhaps we should enforce * to match something (like .+)?

Sorry, the problem is not between the various types, it is within them. There is no problem with aaa as both a fieldType and field.

I have not done the audit yet, so i can't fully describe what happens in each case. What I noticed was:

  <field name="aaa" type="text" ... />
  <field name="aaa" type="string" ... />

the first field (with type text) is quietly thrown away and it uses the second.

I looked quickly at the other cases and it looks like fieldType does the same thing. dynamicFields are different in that the second one is ignored.

It is an easy fix: just check whether anything comes out of the map when you put something in.

ryan
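The "check if anything comes out of the map" fix relies on java.util.Map.put returning the value previously stored under the key (or null if there was none). A small sketch of that check, using a made-up FieldRegistry class that stores type names as plain strings rather than the real schema objects:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: FieldRegistry is a hypothetical stand-in for the schema loader. */
final class FieldRegistry {
    private final Map<String, String> typeByName = new HashMap<>();
    private final List<String> errors = new ArrayList<>();

    /** Registers a field and records an error if the name was already defined. */
    void register(String name, String type) {
        // Map.put returns the previous mapping, or null if the key was new.
        String previous = typeByName.put(name, type);
        if (previous != null) {
            errors.add("duplicate field '" + name + "': type '" + previous
                + "' replaced by '" + type + "'");
        }
    }

    String typeOf(String name) {
        return typeByName.get(name);
    }

    List<String> errors() {
        return errors;
    }
}
```

Registering aaa as text and then as string leaves string in the map, matching the behaviour described above, but now produces one recorded error instead of silence. Whether the loader should then abort is a separate decision.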
[jira] Updated: (SOLR-226) support dynamic fields as copyField destination
[ https://issues.apache.org/jira/browse/SOLR-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-226:

Attachment: SOLR-226-DynamicCopyField.patch
[jira] Created: (SOLR-227) Add errors if you define multiple fieldTypes, fields, dynamicFields, requestHandlers with the same name
Add errors if you define multiple fieldTypes, fields, dynamicFields, requestHandlers with the same name

Key: SOLR-227
URL: https://issues.apache.org/jira/browse/SOLR-227
Project: Solr
Issue Type: Bug
Reporter: Ryan McKinley
Fix For: 1.2

The current implementation quietly tosses out one definition in favor of the other...
[jira] Updated: (SOLR-227) Add errors if you define multiple fieldTypes, fields, dynamicFields, requestHandlers with the same name
[ https://issues.apache.org/jira/browse/SOLR-227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-227:

Attachment: SOLR-227-DuplicateNameErrors.patch
SolrParams functions
SolrParams seems to have most options for how to get whom from where, but it is missing:

  public float getFieldFloat(String field, String param, float def);
  public String getFieldParam(String field, String param, String def);

Any objections to adding these functions?
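For reference, a rough sketch of what the two proposed methods could look like. It assumes per-field overrides are spelled f.<field>.<param> and fall back to the plain <param>; that lookup convention is an assumption here, not something stated in this message. The FieldParamsSketch class is hypothetical and backed by a plain Map, not the real SolrParams.

```java
import java.util.Map;

/** Illustrative sketch only: a Map-backed stand-in, not the real SolrParams class. */
final class FieldParamsSketch {
    private final Map<String, String> params;

    FieldParamsSketch(Map<String, String> params) {
        this.params = params;
    }

    /** Per-field value if present, else the global value, else the default. */
    String getFieldParam(String field, String param, String def) {
        // Assumed convention: "f.<field>.<param>" overrides "<param>".
        String val = params.get("f." + field + "." + param);
        if (val == null) {
            val = params.get(param);
        }
        return val != null ? val : def;
    }

    /** Same lookup as getFieldParam, parsed as a float. */
    float getFieldFloat(String field, String param, float def) {
        String val = getFieldParam(field, param, null);
        return val != null ? Float.parseFloat(val) : def;
    }
}
```

With a hypothetical parameter named boost set to 1.5 globally and f.title.boost set to 3.0, the title field resolves to 3.0, any other field to 1.5, and a parameter that is absent everywhere resolves to the supplied default.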