[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557639#action_12557639 ]

Doug Steigerwald commented on SOLR-236:

I copied what was in the QueryComponent.prepare() method because I had to disable the query component to avoid the extra results I was getting. Initially I left CollapseComponent.prepare() empty, but then the results from the query component and the results from the collapse component were both returned (two 'response' sections in the results). The easy solution for me was to copy prepare() from QueryComponent and disable the query component in the request handler. There may be another way, but I was unable to figure it out.

Field collapsing

    Key: SOLR-236
    URL: https://issues.apache.org/jira/browse/SOLR-236
    Project: Solr
    Issue Type: New Feature
    Components: search
    Affects Versions: 1.3
    Reporter: Emmanuel Keller
    Attachments: field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch

This patch includes a new feature called field collapsing. It is used to collapse a group of results with similar values for a given field into a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also "Duplicate detection".
http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
- collapse.field: to choose the field used to group results
- collapse.type: normal (default value) or adjacent
- collapse.max: to select how many continuous results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- field_collapsing.patch for current development version
- field_collapsing_1.1.0.patch for Solr-1.1.0

P.S.: Feedback and misspelling correction are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
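As a rough illustration of how the three parameters named in the description might be combined on a request, here is a minimal sketch that only assembles a query string. The class name CollapseQuery, the method build, and the sample values (ipod, site) are made up for this example; only the parameter names collapse.field, collapse.type and collapse.max come from the issue text, and their exact behaviour depends on which patch is applied.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or of the SOLR-236 patch.
final class CollapseQuery {

    /** Builds a query string carrying the three collapse.* parameters from the issue. */
    static String build(String q, String collapseField, String collapseType, int collapseMax) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("collapse.field", collapseField);  // field used to group results
        params.put("collapse.type", collapseType);    // "normal" or "adjacent"
        params.put("collapse.max", Integer.toString(collapseMax));

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("ipod", "site", "adjacent", 2));
    }
}
```

The resulting string would be appended to whatever request handler path the patched Solr instance exposes; that path is deliberately left out here because the issue does not name it.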
[jira] Commented: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557681#action_12557681 ]

Grant Ingersoll commented on SOLR-445:

Is it reasonable to use the AddUpdateCommand to communicate out of the UpdateHandler that a given document failed? For instance, the update handler could catch any exception and attach it to the command (the next reuse would have to reset it). The various RequestHandlers (XML/CSV) could then check whether the exception is set, add the document to a list of failed docs, and add the failed docs to the response, which can then be written out as needed by the writers.

XmlUpdateRequestHandler bad documents mid batch aborts rest of batch

    Key: SOLR-445
    URL: https://issues.apache.org/jira/browse/SOLR-445
    Project: Solr
    Issue Type: Bug
    Components: update
    Affects Versions: 1.3
    Reporter: Will Johnson
    Assignee: Grant Ingersoll

Has anyone run into the problem of handling bad documents / failures mid-batch? I.e.:

    <add>
      <doc>
        <field name="id">1</field>
      </doc>
      <doc>
        <field name="id">2</field>
        <field name="myDateField">I_AM_A_BAD_DATE</field>
      </doc>
      <doc>
        <field name="id">3</field>
      </doc>
    </add>

Right now Solr adds the first doc and then aborts. It would seem like it should either fail the entire batch, or log a message/return a code and then continue on to add doc 3. Option 1 would seem to be much harder to accomplish and possibly require more memory, while Option 2 would require more information to come back from the API. I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments.
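The idea in the comment above (record the failure on a reusable command, reset it on reuse, and let the request handler collect the failed docs) could look roughly like the following. This is only a sketch under stated assumptions: AddCommand, addDoc and addBatch are hypothetical stand-ins, not Solr's AddUpdateCommand or UpdateHandler, and the date check is a placeholder for whatever validation actually fails.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are Solr classes.
final class BatchAddSketch {

    /** Stand-in for a reusable add command that can carry a failure. */
    static final class AddCommand {
        Map<String, String> doc;
        Exception failure; // set by the update handler when the add fails

        /** Reusing the command must clear any failure left from the previous doc. */
        void reset(Map<String, String> next) {
            doc = next;
            failure = null;
        }
    }

    /** Update-handler side: catch the exception and record it instead of aborting. */
    static void addDoc(AddCommand cmd, List<String> index) {
        try {
            String date = cmd.doc.get("myDateField");
            // Placeholder validation: accept only yyyy-MM-dd shaped values.
            if (date != null && !date.matches("\\d{4}-\\d{2}-\\d{2}")) {
                throw new IllegalArgumentException("bad date: " + date);
            }
            index.add(cmd.doc.get("id"));
        } catch (Exception e) {
            cmd.failure = e;
        }
    }

    /** Request-handler side: returns doc id -> error message for every failed doc. */
    static Map<String, String> addBatch(List<Map<String, String>> docs, List<String> index) {
        AddCommand cmd = new AddCommand();
        Map<String, String> failed = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            cmd.reset(doc);
            addDoc(cmd, index);
            if (cmd.failure != null) {
                failed.put(doc.get("id"), cmd.failure.getMessage());
            }
        }
        return failed;
    }
}
```

With the three documents from the issue description, this sketch would index ids 1 and 3 and report id 2 as failed, which corresponds to Option 2 in the description.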
[jira] Updated: (SOLR-430) SpellcheckerRequest / Response
[ https://issues.apache.org/jira/browse/SOLR-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew Runo updated SOLR-430:

    Component/s: spellchecker

SpellcheckerRequest / Response

    Key: SOLR-430
    URL: https://issues.apache.org/jira/browse/SOLR-430
    Project: Solr
    Issue Type: New Feature
    Components: clients - java, spellchecker
    Affects Versions: 1.3
    Reporter: Matthew Runo
    Fix For: 1.3

SolrJ should support at a minimum a basic SpellcheckRequest and Response. Response should return a set of strings, the suggestions returned by the SpellcheckQueryHandler. Request should accept the basic commands that SC accepts over HTTP.
[jira] Commented: (SOLR-247) Allow facet.field=* to facet on all fields (without knowing what they are)
[ https://issues.apache.org/jira/browse/SOLR-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557719#action_12557719 ]

Matthew Runo commented on SOLR-247:

http://www.nabble.com/Dynamic-fields---Facets-to14739422.html also provides a use case for this to be fixed. While I'd never do a *, I'd love to be able to do a attribute_*. It just makes using the dynamic fields so much easier.

Allow facet.field=* to facet on all fields (without knowing what they are)

    Key: SOLR-247
    URL: https://issues.apache.org/jira/browse/SOLR-247
    Project: Solr
    Issue Type: Improvement
    Reporter: Ryan McKinley
    Priority: Minor
    Attachments: SOLR-247-FacetAllFields.patch

I don't know if this is a good idea to include -- it is potentially a bad idea to use it, but that can be ok. This came out of trying to use faceting for the LukeRequestHandler top term collecting.
http://www.nabble.com/Luke-request-handler-issue-tf3762155.html
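To make the attribute_* use case concrete, here is a small sketch of expanding a glob against a list of known field names. It is an illustration only: FieldGlob, toPattern and expand are hypothetical names, and nothing here reflects how the attached SOLR-247 patch resolves facet.field values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; not taken from the SOLR-247 patch.
final class FieldGlob {

    /** Converts a glob where '*' matches any run of characters into a regex. */
    static Pattern toPattern(String glob) {
        String[] parts = glob.split("\\*", -1); // limit -1 keeps trailing empty parts
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(".*");
            }
            sb.append(Pattern.quote(parts[i])); // everything except '*' is literal
        }
        return Pattern.compile(sb.toString());
    }

    /** Returns the field names matching the glob, in their original order. */
    static List<String> expand(String glob, List<String> fieldNames) {
        Pattern p = toPattern(glob);
        List<String> matches = new ArrayList<>();
        for (String name : fieldNames) {
            if (p.matcher(name).matches()) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```

Given the field names id, attribute_color, attribute_size and name, expanding attribute_* would select only the two attribute_ fields, while a bare * would select all four.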
[jira] Issue Comment Edited: (SOLR-247) Allow facet.field=* to facet on all fields (without knowing what they are)
[ https://issues.apache.org/jira/browse/SOLR-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557719#action_12557719 ]

mruno edited comment on SOLR-247 at 1/10/08 9:45 AM:

http://www.nabble.com/Dynamic-fields---Facets-to14739422.html also provides a use case for this to be fixed. While I'd never do a '*', I'd love to be able to do a 'attribute_*'. It just makes using the dynamic fields so much easier.

was (Author: mruno):

http://www.nabble.com/Dynamic-fields---Facets-to14739422.html also provides a use case for this to be fixed. While I'd never do a *, I'd love to be able to do a attribute_*. It just makes using the dynamic fields so much easier.

Allow facet.field=* to facet on all fields (without knowing what they are)

    Key: SOLR-247
    URL: https://issues.apache.org/jira/browse/SOLR-247
    Project: Solr
    Issue Type: Improvement
    Reporter: Ryan McKinley
    Priority: Minor
    Attachments: SOLR-247-FacetAllFields.patch

I don't know if this is a good idea to include -- it is potentially a bad idea to use it, but that can be ok. This came out of trying to use faceting for the LukeRequestHandler top term collecting.
http://www.nabble.com/Luke-request-handler-issue-tf3762155.html
[jira] Issue Comment Edited: (SOLR-247) Allow facet.field=* to facet on all fields (without knowing what they are)
[ https://issues.apache.org/jira/browse/SOLR-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557719#action_12557719 ]

mruno edited comment on SOLR-247 at 1/10/08 9:46 AM:

http://www.nabble.com/Dynamic-fields---Facets-to14739422.html also provides a use case for this to be fixed. While I'd never do a facet on the wildcard, I'd love to be able to do one on attribute_wildcard. It just makes using the dynamic fields so much easier.

was (Author: mruno):

http://www.nabble.com/Dynamic-fields---Facets-to14739422.html also provides a use case for this to be fixed. While I'd never do a facet on *, I'd love to be able to do one on attribute_*. It just makes using the dynamic fields so much easier.

Allow facet.field=* to facet on all fields (without knowing what they are)

    Key: SOLR-247
    URL: https://issues.apache.org/jira/browse/SOLR-247
    Project: Solr
    Issue Type: Improvement
    Reporter: Ryan McKinley
    Priority: Minor
    Attachments: SOLR-247-FacetAllFields.patch

I don't know if this is a good idea to include -- it is potentially a bad idea to use it, but that can be ok. This came out of trying to use faceting for the LukeRequestHandler top term collecting.
http://www.nabble.com/Luke-request-handler-issue-tf3762155.html
[jira] Reopened: (SOLR-446) TextResponseWriter should be able to work with SolrDocument and SolrDocumentList
[ https://issues.apache.org/jira/browse/SOLR-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man reopened SOLR-446:

sorry, i just noticed something ... in commit r610156 the new writeDoc(String name, SolrDocument doc, Set<String> returnFields, boolean includeScore) methods all seem to be ignoring the returnFields param completely. doesn't that mean any handler using SolrDocuments won't respect the fl param?

TextResponseWriter should be able to work with SolrDocument and SolrDocumentList

    Key: SOLR-446
    URL: https://issues.apache.org/jira/browse/SOLR-446
    Project: Solr
    Issue Type: Improvement
    Affects Versions: 1.3
    Reporter: Ryan McKinley
    Assignee: Ryan McKinley
    Fix For: 1.3
    Attachments: SOLR-446-WriteSolrDocument.patch

ResponseWriters should be able to write SolrDocuments the same way they write Documents. This will be useful for SOLR-303 or other RequestHandlers that modify a SolrDocument and return the result.
[jira] Commented: (SOLR-247) Allow facet.field=* to facet on all fields (without knowing what they are)
[ https://issues.apache.org/jira/browse/SOLR-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557884#action_12557884 ]

Hoss Man commented on SOLR-247:

i've put some thoughts on the broader issues of having solr admin control over how field names are dealt with (globs, regexes, aliasing, etc...) in various contexts on the wiki...

http://wiki.apache.org/solr/FieldAliasesAndGlobsInParams

...it might be best to use that as a whiteboard for a design discussion since the ultimate issues are a little bigger than this issue originally set out to tackle.

Allow facet.field=* to facet on all fields (without knowing what they are)

    Key: SOLR-247
    URL: https://issues.apache.org/jira/browse/SOLR-247
    Project: Solr
    Issue Type: Improvement
    Reporter: Ryan McKinley
    Priority: Minor
    Attachments: SOLR-247-FacetAllFields.patch

I don't know if this is a good idea to include -- it is potentially a bad idea to use it, but that can be ok. This came out of trying to use faceting for the LukeRequestHandler top term collecting.
http://www.nabble.com/Luke-request-handler-issue-tf3762155.html
[jira] Commented: (SOLR-446) TextResponseWriter should be able to work with SolrDocument and SolrDocumentList
[ https://issues.apache.org/jira/browse/SOLR-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557887#action_12557887 ]

Ryan McKinley commented on SOLR-446:

good catch Hoss! Looking at it again, the 'score' bit is weird too -- you would get duplicate 'score' fields if you chained this (i think)

{code:java}
if (includeScore) {
  writeVal("score", doc.getFirstValue("score"));
}
{code}

perhaps it should be:

{code:java}
Index: src/java/org/apache/solr/request/XMLWriter.java
===================================================================
--- src/java/org/apache/solr/request/XMLWriter.java (revision 610424)
+++ src/java/org/apache/solr/request/XMLWriter.java (working copy)
@@ -342,11 +342,14 @@
 
     startTag("doc", name, false);
     incLevel();
 
-    if (includeScore) {
-      writeVal("score", doc.getFirstValue("score"));
+    if (includeScore && returnFields != null ) {
+      returnFields.add( "score" );
     }
 
     for (String fname : doc.getFieldNames()) {
+      if (returnFields!=null && !returnFields.contains(fname)) {
+        continue;
+      }
       Object val = doc.getFieldValue(fname);
       if (val instanceof Collection) {
{code}

TextResponseWriter should be able to work with SolrDocument and SolrDocumentList

    Key: SOLR-446
    URL: https://issues.apache.org/jira/browse/SOLR-446
    Project: Solr
    Issue Type: Improvement
    Affects Versions: 1.3
    Reporter: Ryan McKinley
    Assignee: Ryan McKinley
    Fix For: 1.3
    Attachments: SOLR-446-WriteSolrDocument.patch

ResponseWriters should be able to write SolrDocuments the same way they write Documents. This will be useful for SOLR-303 or other RequestHandlers that modify a SolrDocument and return the result.
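The filtering behaviour discussed in this comment (honour returnFields, emit 'score' at most once) can be sketched outside of Solr with plain maps. This is an illustration under assumptions, not the committed fix: ReturnFieldsFilter and filter are hypothetical names, a document is modelled as a Map, and unlike the proposed diff the sketch does not add "score" to the caller's set.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; a Map stands in for SolrDocument.
final class ReturnFieldsFilter {

    /**
     * Keeps only the requested fields. A null returnFields means "return every field".
     * When includeScore is true, the "score" field is kept even if it was not requested.
     * Because fields are copied once by name, "score" can never appear twice.
     */
    static Map<String, Object> filter(Map<String, Object> doc,
                                      Set<String> returnFields,
                                      boolean includeScore) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            String fname = e.getKey();
            boolean wanted = returnFields == null
                    || returnFields.contains(fname)
                    || (includeScore && "score".equals(fname));
            if (wanted) {
                out.put(fname, e.getValue());
            }
        }
        return out;
    }
}
```

Leaving the caller's returnFields set untouched is a design choice of this sketch: mutating a shared set would leak "score" into later calls that did not ask for it.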
[jira] Updated: (SOLR-303) Distributed Search over HTTP
[ https://issues.apache.org/jira/browse/SOLR-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-303:

    Attachment: distributed.patch

New patch attached... this one implements count tiebreaking by index order (to match the non-distributed faceting).

Distributed Search over HTTP

    Key: SOLR-303
    URL: https://issues.apache.org/jira/browse/SOLR-303
    Project: Solr
    Issue Type: New Feature
    Components: search
    Reporter: Sharad Agarwal
    Assignee: Yonik Seeley
    Attachments: distributed.patch, distributed.patch, distributed.patch, distributed.patch, distributed.patch, distributed.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.patch, fedsearch.stu.patch, fedsearch.stu.patch

Searching over multiple shards and aggregating results. Motivated by http://wiki.apache.org/solr/DistributedSearch
[jira] Commented: (SOLR-444) hl.fl parameter not checked
[ https://issues.apache.org/jira/browse/SOLR-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557919#action_12557919 ]

Sergey Dryganets commented on SOLR-444:

OK, create the following search schema:

    <?xml version="1.0" ?>
    <schema name="dw-solr" version="1.1">
      <types>
        <fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
        <fieldtype name="integer" class="solr.IntField" omitNorms="true"/>
        <!-- not case sensitive text field -->
        <fieldtype name="ncs_text" class="solr.TextField" positionIncrementGap="100">
          <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
        </fieldtype>
        <!-- case sensitive text field -->
        <fieldtype name="cs_text" class="solr.TextField" positionIncrementGap="100">
          <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
            <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
        </fieldtype>
      </types>
      <fields>
        <field name="id" type="integer" indexed="true" stored="true"/>
        <field name="post_text" type="cs_text" indexed="false" stored="true" multiValued="true"/>
        <field name="cs_post_text" type="cs_text" indexed="true" stored="false" multiValued="true"/>
        <field name="ncs_post_text" type="ncs_text" indexed="true" stored="false" multiValued="true"/>
      </fields>
      <!-- field to use to determine and enforce document uniqueness. -->
      <uniqueKey>id</uniqueKey>
      <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
      <solrQueryParser defaultOperator="OR"/>
      <copyField source="post_text" dest="cs_post_text"/>
      <copyField source="post_text" dest="ncs_post_text"/>
    </schema>

Add the following document to the index:

    <add>
      <doc>
        <field name="id">2</field>
        <field name="post_text">Test</field>
      </doc>
    </add>

and request search results with the following parameters:

    fl=*,score&q=cs_post_text:Test&start=0&rows=10&hl=true

This results in an NPE, while

    fl=*,score&q=cs_post_text:Test&start=0&rows=10&hl=true&hl.fl=post_text

returns a good result.

hl.fl parameter not checked

    Key: SOLR-444
    URL: https://issues.apache.org/jira/browse/SOLR-444
    Project: Solr
    Issue Type: Bug
    Components: highlighter
    Affects Versions: 1.3
    Reporter: Sergey Dryganets

This exception appears if an empty string is sent in the hl.fl request parameter:

    java.lang.NullPointerException
        at org.apache.solr.highlight.SolrHighlighter.doHighlighting(SolrHighlighter.java:270)
        at org.apache.solr.handler.StandardRequestHandler.handleRequestBody(StandardRequestHandler.java:165)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:78)
        at
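One way to avoid this kind of failure is to resolve the highlight field list defensively before using it, treating a missing or blank hl.fl the same way. The sketch below is an assumption-laden illustration, not the fix applied to SolrHighlighter: HighlightFields and resolve are hypothetical names, and falling back to a caller-supplied default list is just one possible policy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not code from SolrHighlighter.
final class HighlightFields {

    /**
     * Splits hl.fl on commas and whitespace. If the parameter is missing, empty,
     * or contains only separators, the supplied defaults are returned instead,
     * so callers never receive null.
     */
    static List<String> resolve(String hlFl, List<String> defaults) {
        List<String> fields = new ArrayList<>();
        if (hlFl != null) {
            for (String f : hlFl.split("[,\\s]+")) {
                if (!f.isEmpty()) {
                    fields.add(f);
                }
            }
        }
        return fields.isEmpty() ? new ArrayList<>(defaults) : fields;
    }
}
```

For the requests in the comment, a call with hl.fl absent would fall back to the defaults, while hl.fl=post_text would yield a single-element list.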
[jira] Issue Comment Edited: (SOLR-444) hl.fl parameter not checked
[ https://issues.apache.org/jira/browse/SOLR-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557919#action_12557919 ]

22dsse edited comment on SOLR-444 at 1/10/08 11:08 PM, adding:

PS: I use the latest solr version from svn for this test
[jira] Created: (SOLR-454) Some confusing bugs with highlighting and
Some confusing bugs with highlighting and

    Key: SOLR-454
    URL: https://issues.apache.org/jira/browse/SOLR-454
    Project: Solr
    Issue Type: Bug
    Components: highlighter, search
    Affects Versions: 1.3
    Reporter: Sergey Dryganets
[jira] Updated: (SOLR-454) Some confusing bugs with highlighting and
[ https://issues.apache.org/jira/browse/SOLR-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Dryganets updated SOLR-454:

    Attachment: schema.xml

schema.xml to reproduce problems

Some confusing bugs with highlighting and

    Key: SOLR-454
    URL: https://issues.apache.org/jira/browse/SOLR-454
    Project: Solr
    Issue Type: Bug
    Components: highlighter, search
    Affects Versions: 1.3
    Reporter: Sergey Dryganets
    Attachments: schema.xml
[jira] Closed: (SOLR-454) Some confusing bugs with highlighting and
[ https://issues.apache.org/jira/browse/SOLR-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Dryganets closed SOLR-454.

    Resolution: Invalid

Some confusing bugs with highlighting and

    Key: SOLR-454
    URL: https://issues.apache.org/jira/browse/SOLR-454
    Project: Solr
    Issue Type: Bug
    Components: highlighter, search
    Affects Versions: 1.3
    Reporter: Sergey Dryganets
    Attachments: schema.xml