[jira] [Created] (SOLR-11152) ClassNotFoundException: com.uwyn.jhighlight.renderer.XhtmlRendererFactory
Simon Endele created SOLR-11152: --- Summary: ClassNotFoundException: com.uwyn.jhighlight.renderer.XhtmlRendererFactory Key: SOLR-11152 URL: https://issues.apache.org/jira/browse/SOLR-11152 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - Solr Cell (Tika extraction) Affects Versions: 6.6 Reporter: Simon Endele We get the following error when trying to index/extract a tgz file with Solr 6.6.0: {code:java} Caused by: java.lang.NoClassDefFoundError: com/uwyn/jhighlight/renderer/XhtmlRendererFactory at org.apache.tika.parser.code.SourceCodeParser.getRenderer(SourceCodeParser.java:132) at org.apache.tika.parser.code.SourceCodeParser.parse(SourceCodeParser.java:111) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:219) at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:182) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:219) at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:182) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) at org.apache.tika.parser.pkg.CompressorParser.parse(CompressorParser.java:164) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477) at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723) at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529) ... 29 more Caused by: java.lang.ClassNotFoundException: com.uwyn.jhighlight.renderer.XhtmlRendererFactory at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 60 more {code} It seems like the dependency [com.uwyn:jhighlight:1.0|https://mvnrepository.com/artifact/com.uwyn/jhighlight/1.0] is missing in {{contrib/extraction/lib}} in the Solr installation. When placing it there, the indexation works perfectly. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
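A quick way to confirm a report like the one above is to ask a class loader whether it can see the class at all. The following is a minimal sketch, not Solr or Tika code: ClasspathCheck and isClassAvailable are hypothetical names, and which loader to pass is an assumption (Solr loads contrib jars through its own resource loader, so the system loader used in a plain test may not be the relevant one).

```java
/** Hypothetical diagnostic helper; not part of Solr or Tika. */
final class ClasspathCheck {
    private ClasspathCheck() {}

    /**
     * Returns true if the given loader can locate the named class.
     * The class is looked up without running its static initializers.
     */
    static boolean isClassAvailable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // ClassNotFoundException: the class is not visible to this loader.
            // LinkageError (e.g. NoClassDefFoundError): a class it depends on is missing.
            return false;
        }
    }
}
```

If the check returns false for com.uwyn.jhighlight.renderer.XhtmlRendererFactory with the loader that serves contrib/extraction/lib, that would be consistent with the jar missing from that directory, as described in the report.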
[jira] [Updated] (SOLR-6690) Highlight expanded results
[ https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-6690: --- Issue Type: Bug (was: Wish) > Highlight expanded results > -- > > Key: SOLR-6690 > URL: https://issues.apache.org/jira/browse/SOLR-6690 > Project: Solr > Issue Type: Bug > Components: highlighter >Reporter: Simon Endele > Labels: expand, highlight > Attachments: HighlightComponent.java.patch > > > Is it possible to highlight documents in the "expand" section in the Solr > response? > I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: > "All downstream components (faceting, highlighting, etc...) will work with > the collapsed result set." > So I tried to put the highlight component after the expand component like > this: > {code:xml} > query > facet > stats > debug > expand > highlight > {code} > But with no effect. > Is there another switch that needs to be flipped or could this be implemented > easily? > IMHO this is quite a common use case. And it was possible to highlight all > results of a group with the old grouping. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3085) Fix the dismax/edismax stopwords mm issue
[ https://issues.apache.org/jira/browse/SOLR-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642843#comment-14642843 ] Simon Endele commented on SOLR-3085: We're currently experiencing the same issue with query terms that only contain non-alphanumerical characters, which are removed by the StandardTokenizer or WordDelimiterFilter, e.g. miles more. Will this case also be addressed by {{mm.autoRelax}}? Fix the dismax/edismax stopwords mm issue - Key: SOLR-3085 URL: https://issues.apache.org/jira/browse/SOLR-3085 Project: Solr Issue Type: Bug Components: query parsers Reporter: Jan Høydahl Labels: MinimumShouldMatch, dismax, edismax, stopwords Fix For: Trunk Attachments: SOLR-3085.patch, SOLR-3085.patch, SOLR-3085.patch As discussed here http://search-lucene.com/m/Wr7iz1a95jx and here http://search-lucene.com/m/Yne042qEyCq1 and here http://search-lucene.com/m/RfAp82nSsla DisMax has an issue with stopwords if not all fields used in QF have exactly same stopword lists. Typical solution is to not use stopwords or harmonize stopword lists across all fields in your QF, or relax the MM to a lower percentag. Sometimes these are not acceptable workarounds, and we should find a better solution. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7314) Constants missing in Solrj
[ https://issues.apache.org/jira/browse/SOLR-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-7314: --- Description: There are some parameter names/values, for which constants are missing in SolrJ. One has always to declare constants for them by herself (or hard-code them). * defType * edismax (value for defType) * dismax (value for defType) * lucene (value for defType) * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but none without dot) * [elevated] (pseudo field for the QueryElevationComponent) See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html Maybe there are even more, but these are the ones I always stumble upon. Of course there are constants in the Solr Core code, but typically one doesn't want to have a dependency on it when implementing a client. was: There are some parameter names/values, for which constants are missing in SolrJ. One has always to declare constants for them by herself (or hard-code them). * defType * edismax (value for defType) * dismax (value for defType) * lucene (value for defType) * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but none without dot) See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html Maybe there are even more, but these are the ones I always stumble upon. Of course there are constants in the Solr Core code, but typically one doesn't want to have a dependency on it when implementing a client. Constants missing in Solrj -- Key: SOLR-7314 URL: https://issues.apache.org/jira/browse/SOLR-7314 Project: Solr Issue Type: Wish Components: SolrJ Reporter: Simon Endele There are some parameter names/values, for which constants are missing in SolrJ. One has always to declare constants for them by herself (or hard-code them). 
* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = "spellcheck.", but none without the dot)
* [elevated] (pseudo field for the QueryElevationComponent)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html
Maybe there are even more, but these are the ones I always stumble upon. Of course there are constants in the Solr Core code, but typically one doesn't want to have a dependency on it when implementing a client.
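Until such constants exist in SolrJ, a client typically declares them itself. The following is a minimal sketch of what that looks like; the class name ClientSolrParams, the constant names, and the param helper are all hypothetical and not part of SolrJ. Only the string values are taken from the list above.

```java
/** Hypothetical client-side constants; the names are illustrative, not SolrJ API. */
final class ClientSolrParams {
    private ClientSolrParams() {}

    // Parameter name that selects the query parser.
    static final String DEF_TYPE = "defType";

    // Values for defType.
    static final String DEF_TYPE_EDISMAX = "edismax";
    static final String DEF_TYPE_DISMAX = "dismax";
    static final String DEF_TYPE_LUCENE = "lucene";

    // Component switch without the trailing dot.
    static final String SPELLCHECK = "spellcheck";

    // Pseudo field for the QueryElevationComponent.
    static final String ELEVATED_FIELD = "[elevated]";

    /** Renders a name/value pair as it would appear in a request URL (no escaping). */
    static String param(String name, String value) {
        return name + "=" + value;
    }
}
```

For example, ClientSolrParams.param(ClientSolrParams.DEF_TYPE, ClientSolrParams.DEF_TYPE_EDISMAX) yields the string defType=edismax without any literal being repeated at the call site.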
[jira] [Commented] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section
[ https://issues.apache.org/jira/browse/SOLR-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392527#comment-14392527 ] Simon Endele commented on SOLR-6709: Thank you guys very much for fixing/reviewing and happy Easter! ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section --- Key: SOLR-6709 URL: https://issues.apache.org/jira/browse/SOLR-6709 Project: Solr Issue Type: Bug Components: SolrJ Reporter: Simon Endele Assignee: Varun Thacker Fix For: Trunk, 5.2 Attachments: SOLR-6709.patch, SOLR-6709.patch, SOLR-6709.patch, test-response.xml Shouldn't the following code work on the attached input file? It matches the structure of a Solr response with wt=xml. {code}import java.io.InputStream; import org.apache.solr.client.solrj.ResponseParser; import org.apache.solr.client.solrj.impl.XMLResponseParser; import org.apache.solr.client.solrj.response.QueryResponse; import org.apache.solr.common.util.NamedList; import org.junit.Test; public class ParseXmlExpandedTest { @Test public void test() { ResponseParser responseParser = new XMLResponseParser(); InputStream inStream = getClass() .getResourceAsStream(test-response.xml); NamedListObject response = responseParser .processResponse(inStream, UTF-8); QueryResponse queryResponse = new QueryResponse(response, null); } }{code} Unexpectedly (for me), it throws a java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap cannot be cast to java.util.Map at org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:126) Am I missing something, is XMLResponseParser deprecated or something? We use a setup like this to mock a QueryResponse for unit tests in our service that post-processes the Solr response. Obviously, it works with the javabin format which SolrJ uses internally. But that is no appropriate format for unit tests, where the response should be human readable. 
I think there's some conversion missing in QueryResponse or XMLResponseParser. Note: The null value supplied as SolrServer argument to the constructor of QueryResponse shouldn't have an effect as the error occurs before the parameter is even used.
[jira] [Created] (SOLR-7314) Constants missing in Solrj
Simon Endele created SOLR-7314: -- Summary: Constants missing in Solrj Key: SOLR-7314 URL: https://issues.apache.org/jira/browse/SOLR-7314 Project: Solr Issue Type: Wish Components: SolrJ Reporter: Simon Endele

There are some parameter names/values for which constants are missing in SolrJ. One always has to declare constants for them oneself (or hard-code them).
* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = "spellcheck.", but none without the dot)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html
Maybe there are even more, but these are the ones I always stumble upon.
[jira] [Updated] (SOLR-7314) Constants missing in Solrj
[ https://issues.apache.org/jira/browse/SOLR-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-7314: --- Description: There are some parameter names/values, for which constants are missing in SolrJ. One has always to declare constants for them by herself (or hard-code them). * defType * edismax (value for defType) * dismax (value for defType) * lucene (value for defType) * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but none without dot) See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html Maybe there are even more, but these are the ones I always stumble upon. Of course there are constants in the Solr Core code, but typically one doesn't want to have a dependency on it when implementing a client. was: There are some parameter names/values, for which constants are missing in SolrJ. One has always to declare constants for them by herself (or hard-code them). * defType * edismax (value for defType) * dismax (value for defType) * lucene (value for defType) * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but none without dot) See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html Maybe there are even more, but these are the ones I always stumble upon. Constants missing in Solrj -- Key: SOLR-7314 URL: https://issues.apache.org/jira/browse/SOLR-7314 Project: Solr Issue Type: Wish Components: SolrJ Reporter: Simon Endele There are some parameter names/values, for which constants are missing in SolrJ. One has always to declare constants for them by herself (or hard-code them). * defType * edismax (value for defType) * dismax (value for defType) * lucene (value for defType) * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but none without dot) See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html Maybe there are even more, but these are the ones I always stumble upon. 
Of course there are constants in the Solr Core code, but typically one doesn't want to have a dependency on it when implementing a client.
[jira] [Commented] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14343414#comment-14343414 ] Simon Endele commented on SOLR-5332: +1 for this feature. We use the EdgeNGramFilterFactory on a tokenized field (in order to implement a prefix search on index time) with minGramSize=3. Unfortunately we observed that tokens with length 1 or 2 are actually deleted, unexpectedly from our point of view. Using a second field (though complicated IMHO) would address query-issues, but it gets awkward when it comes to highlighting or phrase searches. For instance when searching for us rep - the field with EdgeNGramFilterFactory highlights rep in representative, but not US as this token has been removed, - the field without EdgeNGramFilterFactory highlights US, but not representative as it has no prefixes indexed. Bringing these highlightings together in one string is a quite complex task. Not speaking of a phrase search, which does not work at all for the example above. We use minGramSize=3 to reduce collisions of prefixes and abbreviations (like US and usage) and reduce the index size. I admit, this does not prevent all collisions (e.g. USA still collides with usage), but it's a compromise. Nevertheless, minGramSize is a nice feature of EdgeNGramFilterFactory, but it lacks a preserveOriginal flag IMO. Add preserve original setting to the EdgeNGramFilterFactory - Key: SOLR-5332 URL: https://issues.apache.org/jira/browse/SOLR-5332 Project: Solr Issue Type: Wish Affects Versions: 4.4, 4.5, 4.5.1, 4.6 Reporter: Alexander S. Hi, as described here: http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html the problem is in that if you have these 2 strings to index: 1. facebook.com/someuser.1 2. facebook.com/someveryandverylongusername and the edge ngram filter factory with min and max gram size settings 2 and 25, search requests for these urls will fail. 
But search requests for:
1. facebook.com/someuser
2. facebook.com/someveryandverylonguserna
will work properly. That is because the first URL has 1 at the end, which is shorter than the allowed min gram size. In the second URL the user name is longer than the max gram size (27 characters). It would be good to have a preserve original option that adds the original string to the index if it does not fit the allowed gram size, so that the 1 and someveryandverylongusername tokens are also added to the index. Best, Alex
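The requested behaviour can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Lucene's EdgeNGramTokenFilter: the class EdgeNGramSketch, the method edgeNGrams, and the exact rule for when the original is kept (token shorter than minGram or longer than maxGram) are assumptions drawn from the description above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of edge n-gram generation with a preserveOriginal switch. */
final class EdgeNGramSketch {
    private EdgeNGramSketch() {}

    /**
     * Returns the prefixes of token with lengths minGram..maxGram.
     * With preserveOriginal, a token that is shorter than minGram or longer
     * than maxGram is additionally emitted unchanged instead of being lost.
     */
    static List<String> edgeNGrams(String token, int minGram, int maxGram, boolean preserveOriginal) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(token.substring(0, len));
        }
        boolean outsideRange = token.length() < minGram || token.length() > maxGram;
        if (preserveOriginal && outsideRange) {
            grams.add(token);
        }
        return grams;
    }
}
```

With min and max gram sizes of 2 and 25, the token 1 produces no grams at all unless preserveOriginal is set, and the 27-character user name is only indexed in full when the flag is set. The same applies to the token us with minGramSize=3 from the comment above.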
[jira] [Created] (SOLR-6782) PostingsSolrHighlighter produces strange highlight results
Simon Endele created SOLR-6782: -- Summary: PostingsSolrHighlighter produces strange highlight results Key: SOLR-6782 URL: https://issues.apache.org/jira/browse/SOLR-6782 Project: Solr Issue Type: Bug Components: highlighter Reporter: Simon Endele

If {{hl.fl}} contains commas _and_ whitespace, e.g. {{hl.fl=title, content}}, the PostingsSolrHighlighter produces the following result:
{code}
"highlighting": {
  "mydoc1": { "title": [], "": [], "content": [ "my highlighted content." ] },
  "mydoc2": { "title": [], "": [], "content": [ "my highlighted content 2." ] }
},
{code}
Two things:
- The space following the comma leads to an empty field name (or even several of them in the case of a longer field list).
- Why is {{"title": [],}} included in the response (though {{hl.defaultSummary}} is not set)?

Tested with Solr 4.10.2.
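The empty field name in the response above is what one would expect if the field list is split on single delimiters and empty pieces are kept. The following is a minimal sketch of a tolerant split, assuming commas and whitespace are both delimiters; HlFlSplitSketch and splitFieldList are hypothetical names and this is not the code from the highlighter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split an hl.fl value without producing empty field names. */
final class HlFlSplitSketch {
    private HlFlSplitSketch() {}

    static List<String> splitFieldList(String hlFl) {
        List<String> fields = new ArrayList<>();
        // Treat any run of commas and whitespace as a single delimiter.
        for (String part : hlFl.split("[,\\s]+")) {
            // A leading delimiter still yields one empty first element; drop it.
            if (!part.isEmpty()) {
                fields.add(part);
            }
        }
        return fields;
    }
}
```

With this split, hl.fl=title, content yields exactly the two names title and content.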
[jira] [Updated] (SOLR-6782) PostingsSolrHighlighter produces strange highlight results
[ https://issues.apache.org/jira/browse/SOLR-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-6782: --- Attachment: SOLR-6782.patch I'm not a Solr expert, but if I understand the code right, this can be fixed with a few lines. Added a patch that addresses both issues. The request above now produces the following response: {code} highlighting: { mydoc1: { content: [ my highlighted content. ] }, mydoc2: { content: [ my highlighted content 2. ] } }, {code} Seems to work with {{hl.defaultSummary=true}}, too. Response: {code} highlighting: { mydoc1: { title: [ My Summary. ], content: [ my highlighted content. ] }, mydoc2: { title: [ My Summary 2. ], content: [ my highlighted content 2. ] } }, {code} PostingsSolrHighlighter produces strange highlight results -- Key: SOLR-6782 URL: https://issues.apache.org/jira/browse/SOLR-6782 Project: Solr Issue Type: Bug Components: highlighter Reporter: Simon Endele Attachments: SOLR-6782.patch If {{hl.fl}} contains commas _and_ whitespaces, e.g. {{hl.fl=title, content}}, the PostingsSolrHighlighter produces the following result: {code} highlighting: { mydoc1: { title: [], : [], content: [ my highlighted content. ] }, mydoc2: { title: [], : [], content: [ my highlighted content 2. ] } }, {code} Two things: - The space followed by the comma leads to an empty field (or even a bunch in case of longer field list). - Why is {{title: [],}} included in the response (though {{hl.defaultSummary}} is not set)? Tested with Solr 4.10.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6783) SolrHighlighter does not accept globs in multi-valued hl.fl argument
Simon Endele created SOLR-6783: -- Summary: SolrHighlighter does not accept globs in multi-valued hl.fl argument Key: SOLR-6783 URL: https://issues.apache.org/jira/browse/SOLR-6783 Project: Solr Issue Type: Bug Reporter: Simon Endele These two cases work correctly: - hl.fl = *_text - hl.fl = title_text,content_text,myfield But the expression {{hl.fl=*_text,myfield}} results in empty highlighted docs when the default highlighter is used. Using the PostingsSolrHighlighter it even causes the following exception: {code} java.lang.IllegalArgumentException: fieldsIn must not be empty at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:342) at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:303) at org.apache.solr.highlight.PostingsSolrHighlighter.doHighlighting(PostingsSolrHighlighter.java:140) at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:146) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218) {code} Not yet tested with FastVectorHighlighter. Tested with Solr 4.10.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (SOLR-6783) SolrHighlighter does not accept globs in multi-valued hl.fl argument
[ https://issues.apache.org/jira/browse/SOLR-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele closed SOLR-6783. -- Resolution: Duplicate SolrHighlighter does not accept globs in multi-valued hl.fl argument Key: SOLR-6783 URL: https://issues.apache.org/jira/browse/SOLR-6783 Project: Solr Issue Type: Bug Reporter: Simon Endele These two cases work correctly: - hl.fl = *_text - hl.fl = title_text,content_text,myfield But the expression {{hl.fl=*_text,myfield}} results in empty highlighted docs when the default highlighter is used. Using the PostingsSolrHighlighter it even causes the following exception: {code} java.lang.IllegalArgumentException: fieldsIn must not be empty at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:342) at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:303) at org.apache.solr.highlight.PostingsSolrHighlighter.doHighlighting(PostingsSolrHighlighter.java:140) at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:146) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218) {code} Not yet tested with FastVectorHighlighter. Tested with Solr 4.10.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5127) Allow multiple wildcards in hl.fl
[ https://issues.apache.org/jira/browse/SOLR-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222931#comment-14222931 ] Simon Endele commented on SOLR-5127: I implemented a similar solution, which seems to work for us. May be interesting: Using the PostingsSolrHighlighter an expression like {{hl.fl=*_text,myfield}} even causes the following exception: {code} java.lang.IllegalArgumentException: fieldsIn must not be empty at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:342) at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:303) at org.apache.solr.highlight.PostingsSolrHighlighter.doHighlighting(PostingsSolrHighlighter.java:140) at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:146) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218) {code} Allow multiple wildcards in hl.fl - Key: SOLR-5127 URL: https://issues.apache.org/jira/browse/SOLR-5127 Project: Solr Issue Type: New Feature Components: highlighter Affects Versions: 3.6.1, 4.4 Reporter: Sven-S. Porst Attachments: highlight-wildcards.patch When a wildcard is present in the hl.fl field, the field is not split up at commas/spaces into components. As a consequence settings like hl.fl=*_highlight,*_data do not work. Splitting the string first and evaluating wildcards on each component afterwards would be more powerful and consistent with the documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
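The approach proposed in the issue, splitting first and evaluating wildcards per component afterwards, can be sketched as follows. This is an illustration only, not the attached patch: HlFlGlobSketch, expand, and globToPattern are hypothetical names, and the list of known field names is assumed to be supplied by the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical sketch: expand each hl.fl component separately. */
final class HlFlGlobSketch {
    private HlFlGlobSketch() {}

    /** Splits hlFl on commas/whitespace, then resolves every component on its own. */
    static List<String> expand(String hlFl, List<String> knownFields) {
        Set<String> result = new LinkedHashSet<>(); // keeps order, removes duplicates
        for (String part : hlFl.split("[,\\s]+")) {
            if (part.isEmpty()) {
                continue;
            }
            if (part.indexOf('*') < 0) {
                result.add(part); // plain field name, taken as-is
                continue;
            }
            Pattern pattern = globToPattern(part);
            for (String field : knownFields) {
                if (pattern.matcher(field).matches()) {
                    result.add(field);
                }
            }
        }
        return new ArrayList<>(result);
    }

    /** Converts a glob in which '*' matches any character sequence into a regex. */
    static Pattern globToPattern(String glob) {
        StringBuilder regex = new StringBuilder();
        String[] literals = glob.split("\\*", -1);
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.compile(regex.toString());
    }
}
```

Under these assumptions, hl.fl=*_text,myfield against the fields title_text, content_text, myfield and id resolves to title_text, content_text and myfield, so the highlighter is never handed an empty field list.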
[jira] [Created] (SOLR-6759) ExpandComponent does not call finish() on DelegatingCollectors
Simon Endele created SOLR-6759: -- Summary: ExpandComponent does not call finish() on DelegatingCollectors Key: SOLR-6759 URL: https://issues.apache.org/jira/browse/SOLR-6759 Project: Solr Issue Type: Bug Reporter: Simon Endele

We have a PostFilter for ACL filtering in action that has a structure similar to CollapsingQParserPlugin, i.e. its DelegatingCollector gathers all documents and only calls delegate.collect() for them at the end, in its finish() method. In contrast to CollapsingQParserPlugin, our PostFilter is also called by the ExpandComponent (on purpose). But as the finish() method is never called by the ExpandComponent, the expand section in the result is always empty. Tested with Solr 4.10.2.
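Why a missing finish() call leaves the result empty can be shown with a stripped-down model of such a collector chain. The types below are hypothetical stand-ins written for this illustration; they are not Solr's DelegatingCollector or ExpandComponent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a buffering post filter; not Solr code. */
final class FinishSketch {
    private FinishSketch() {}

    interface DocCollector {
        void collect(int doc);
        default void finish() {}
    }

    /** Terminal collector that simply records what reaches it. */
    static final class RecordingCollector implements DocCollector {
        final List<Integer> docs = new ArrayList<>();
        @Override public void collect(int doc) { docs.add(doc); }
    }

    /** Buffers every document and forwards them to the delegate only in finish(). */
    static final class BufferingFilter implements DocCollector {
        private final DocCollector delegate;
        private final List<Integer> buffered = new ArrayList<>();

        BufferingFilter(DocCollector delegate) { this.delegate = delegate; }

        @Override public void collect(int doc) { buffered.add(doc); }

        @Override public void finish() {
            for (int doc : buffered) {
                delegate.collect(doc);
            }
            buffered.clear();
            delegate.finish();
        }
    }

    /** Feeds docs through the filter and returns what the terminal collector saw. */
    static List<Integer> run(int[] docs, boolean callFinish) {
        RecordingCollector sink = new RecordingCollector();
        BufferingFilter filter = new BufferingFilter(sink);
        for (int doc : docs) {
            filter.collect(doc);
        }
        if (callFinish) {
            filter.finish();
        }
        return sink.docs;
    }
}
```

In this model, run(docs, false) returns an empty list because nothing ever leaves the buffer, which mirrors the empty expand section described above, while run(docs, true) delivers every document.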
[jira] [Updated] (SOLR-6759) ExpandComponent does not call finish() on DelegatingCollectors
[ https://issues.apache.org/jira/browse/SOLR-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-6759: --- Attachment: ExpandComponent.java.patch

I'm not a Solr expert, but if I understand the code right, this can be fixed with a few lines. Added a patch. Seems to work for us.

ExpandComponent does not call finish() on DelegatingCollectors -- Key: SOLR-6759 URL: https://issues.apache.org/jira/browse/SOLR-6759 Project: Solr Issue Type: Bug Reporter: Simon Endele Attachments: ExpandComponent.java.patch

We have a PostFilter for ACL filtering in action that has a structure similar to CollapsingQParserPlugin, i.e. its DelegatingCollector gathers all documents and only calls delegate.collect() for them at the end, in its finish() method. In contrast to CollapsingQParserPlugin, our PostFilter is also called by the ExpandComponent (on purpose). But as the finish() method is never called by the ExpandComponent, the expand section in the result is always empty. Tested with Solr 4.10.2.
[jira] [Updated] (SOLR-6690) Highlight expanded results
[ https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-6690: --- Description: Is it possible to highlight documents in the expand section in the Solr response? I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: All downstream components (faceting, highlighting, etc...) will work with the collapsed result set. So I tried to put the highlight component after the expand component like this: {code:xml}arr name=components strquery/str strfacet/str strstats/str strdebug/str strexpand/str strhighlight/str /arr{code} But with no effect. Is there another switch that needs to be flipped or could this be implemented easily? IMHO this is quite a common use case. And it was possible to highlight all results of a group with the old grouping. was: Is it possible to highlight documents in the expand section in the Solr response? I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: All downstream components (faceting, highlighting, etc...) will work with the collapsed result set. So I tried to put the highlight component after the expand component like this: {code:xml}arr name=components strquery/str strfacet/str strstats/str strdebug/str strexpand/str strhighlight/str /arr{code} But with no effect. Is there another switch that needs to be flipped or could this be implemented easily? IMHO this is quite a common use case... Highlight expanded results -- Key: SOLR-6690 URL: https://issues.apache.org/jira/browse/SOLR-6690 Project: Solr Issue Type: Wish Components: highlighter Reporter: Simon Endele Labels: expand, highlight Attachments: HighlightComponent.java.patch Is it possible to highlight documents in the expand section in the Solr response? I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: All downstream components (faceting, highlighting, etc...) will work with the collapsed result set. 
So I tried to put the highlight component after the expand component like this:
{code:xml}
<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>
{code}
But with no effect. Is there another switch that needs to be flipped or could this be implemented easily? IMHO this is quite a common use case. And it was possible to highlight all results of a group with the old grouping.
[jira] [Updated] (SOLR-6690) Highlight expanded results
[ https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-6690: --- Priority: Major (was: Minor)

Highlight expanded results -- Key: SOLR-6690 URL: https://issues.apache.org/jira/browse/SOLR-6690 Project: Solr Issue Type: Wish Components: highlighter Reporter: Simon Endele Labels: expand, highlight Attachments: HighlightComponent.java.patch

Is it possible to highlight documents in the "expand" section in the Solr response? I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: "All downstream components (faceting, highlighting, etc...) will work with the collapsed result set." So I tried to put the highlight component after the expand component like this:
{code:xml}
<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>
{code}
But with no effect. Is there another switch that needs to be flipped or could this be implemented easily? IMHO this is quite a common use case...
[jira] [Created] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section
Simon Endele created SOLR-6709: -- Summary: ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section Key: SOLR-6709 URL: https://issues.apache.org/jira/browse/SOLR-6709 Project: Solr Issue Type: Bug Components: SolrJ Reporter: Simon Endele

Shouldn't the following code work on the attached input file? It matches the structure of a Solr response with wt=xml.
{code}
import java.io.InputStream;
import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;
import org.junit.Test;

public class ParseXmlExpandedTest {
  @Test
  public void test() {
    ResponseParser responseParser = new XMLResponseParser();
    // test-response.xml is the attached file, placed next to this class on the classpath
    InputStream inStream = getClass().getResourceAsStream("test-response.xml");
    NamedList<Object> response = responseParser.processResponse(inStream, "UTF-8");
    // The ClassCastException below is thrown from this constructor
    QueryResponse queryResponse = new QueryResponse(response, null);
  }
}
{code}
Unexpectedly (for me), it throws a java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap cannot be cast to java.util.Map at org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:126)

Am I missing something? Is XMLResponseParser deprecated or something? We use a setup like this to mock a QueryResponse for unit tests in our service that post-processes the Solr response. Obviously, it works with the javabin format which SolrJ uses internally. But that is not an appropriate format for unit tests, where the response should be human readable. I think there's some conversion missing in QueryResponse or XMLResponseParser. Note: The null value supplied as SolrServer argument to the constructor of QueryResponse shouldn't have an effect as the error occurs before the parameter is even used.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section
[ https://issues.apache.org/jira/browse/SOLR-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Endele updated SOLR-6709:
---
    Attachment: test-response.xml
[jira] [Updated] (SOLR-6690) Highlight expanded results
[ https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Endele updated SOLR-6690:
---
    Attachment: HighlightComponent.java.patch

Added a patch for Solr core trunk. I'm not a Solr core expert. It's just a rough sketch, but it seems to work.

Still to do:
- The order of the ExpandComponent and the HighlightComponent needs to be switched to make it work (as mentioned in the issue description). I'm not sure what effects changing the default order in org.apache.solr.handler.component.SearchHandler.getDefaultComponents() may have.
- It would be good to have a config param to turn this on, I guess. Suggestion: {{hl.expanded=true/false}}.
[jira] [Updated] (SOLR-6690) Highlight expanded results
[ https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Endele updated SOLR-6690:
---
    Description:
Is it possible to highlight documents in the expand section in the Solr response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: "All downstream components (faceting, highlighting, etc...) will work with the collapsed result set."

So I tried to put the highlight component after the expand component like this:

{code:xml}
<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>
{code}

But with no effect. Is there another switch that needs to be flipped, or could this be implemented easily? IMHO this is quite a common use case...

  was:
Is it possible to apply the highlighting to documents in the expand section in the Solr response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: "All downstream components (faceting, highlighting, etc...) will work with the collapsed result set."

So I tried to put the highlight component after the expand component like this:

{code:xml}
<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>
{code}

But with no effect. Is there another switch that needs to be flipped, or could this be implemented easily? IMHO this is quite a common use case...
[jira] [Updated] (SOLR-6690) Highlight expanded results
[ https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Endele updated SOLR-6690:
---
    Component/s: highlighter
[jira] [Created] (SOLR-6690) Highlight expanded results
Simon Endele created SOLR-6690:
--

Summary: Highlight expanded results
Key: SOLR-6690
URL: https://issues.apache.org/jira/browse/SOLR-6690
Project: Solr
Issue Type: Wish
Reporter: Simon Endele
Priority: Minor

Is it possible to apply the highlighting to documents in the expand section in the Solr response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states: "All downstream components (faceting, highlighting, etc...) will work with the collapsed result set."

So I tried to put the highlight component after the expand component like this:

{code:xml}
<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>
{code}

But with no effect. Is there another switch that needs to be flipped, or could this be implemented easily? IMHO this is quite a common use case...
[jira] [Commented] (SOLR-1763) Integrate Solr Cell/Tika as an UpdateRequestProcessor
[ https://issues.apache.org/jira/browse/SOLR-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163485#comment-14163485 ]

Simon Endele commented on SOLR-1763:

I'd appreciate this feature, because it would also make it possible to post-process the output of Tika.

Integrate Solr Cell/Tika as an UpdateRequestProcessor
-
Key: SOLR-1763
URL: https://issues.apache.org/jira/browse/SOLR-1763
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Jan Høydahl
Labels: extracting_request_handler, solr_cell, tika, update_request_handler

From Chris Hostetter's original post in solr-dev:

As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an extractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with tika, and add the various XPath matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates fairly trivial? -Hoss

I couldn't agree more, so I decided to add it as an issue.
[jira] [Commented] (SOLR-6158) Solr looks up configSets in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027497#comment-14027497 ]

Simon Endele commented on SOLR-6158:

No problem. Thanks a lot for the quick response and the fix!
[jira] [Commented] (SOLR-6158) Solr looks up configSets in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027518#comment-14027518 ]

Simon Endele commented on SOLR-6158:

For all who may stumble upon this: Your solr.xml should look like this (for the example project):

{code:xml}
<solr>
  <str name="configSetBaseDir">${configSetBaseDir:solr/configsets}</str>
  ...
</solr>
{code}
[jira] [Created] (SOLR-6158) Solr looks up configSets in the wrong directory
Simon Endele created SOLR-6158:
--

Summary: Solr looks up configSets in the wrong directory
Key: SOLR-6158
URL: https://issues.apache.org/jira/browse/SOLR-6158
Project: Solr
Issue Type: Bug
Affects Versions: 4.8.1, 4.8
Reporter: Simon Endele

I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to create Named Config Sets based on the Solr example shipped with Solr 4.8.1 (like it's done in the tutorial; same problem with 4.8.0).

Creating a new core with a configSet seems to work (directory 'books' and 'books/core.properties' are created correctly). But loading the new core does not work:

{code:none}
67446 [qtp25155085-11] INFO org.apache.solr.handler.admin.CoreAdminHandler core create command configSet=generic&name=books&action=CREATE
67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer Unable to create core: books
org.apache.solr.common.SolrException: Could not load configuration from directory C:\dev\solr-4.8.1\example\configsets\generic
  at org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
  at org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
  at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
  ...
{code}

It seems like Solr looks up the config sets in the wrong directory: C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of C:\dev\solr-4.8.1\example\solr\configsets\generic (as stated in the tutorial and the documentation on https://cwiki.apache.org/confluence/display/solr/Config+Sets).

Moving the configsets directory one level up (into 'example') will work. But according to the documentation (and the tutorial) it should be located in the solr home directory.

In case I'm completely wrong and everything works as expected, how can one configure the configsets directory be configured? The documentation on https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a configurable configset base directory, but I can't find any information on the web.

Another thing: If it worked as I expect, the references <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> etc. in solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one more ../ added, I guess (missing in the tutorial).
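The remark about the extra ../ can be checked with plain path arithmetic. The following is a minimal sketch, not Solr code: it assumes that a relative lib dir is resolved against the core's or config set's own directory, uses a Unix-style install path /dev/solr-4.8.1 as a stand-in for C:\dev\solr-4.8.1, and takes collection1 as the stock example core. The class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: shows where a relative <lib dir="..."> ends up
// when resolved against different base directories.
class ConfigSetPaths {

    // Resolve the relative lib dir against the base directory and collapse the ".." segments.
    static Path resolveLibDir(Path baseDir, String relativeLibDir) {
        return baseDir.resolve(relativeLibDir).normalize();
    }

    public static void main(String[] args) {
        Path install = Paths.get("/dev/solr-4.8.1"); // assumed install location
        Path coreDir = install.resolve("example/solr/collection1");
        Path configSetDir = install.resolve("example/solr/configsets/generic");

        // Three levels up from the stock core reaches the install root.
        System.out.println(resolveLibDir(coreDir, "../../../contrib/extraction/lib"));
        // Three levels up from the config set stops one level short, inside 'example'.
        System.out.println(resolveLibDir(configSetDir, "../../../contrib/extraction/lib"));
        // A fourth "../" reaches the install root again.
        System.out.println(resolveLibDir(configSetDir, "../../../../contrib/extraction/lib"));
    }
}
```

Under these assumptions the config set sits one directory deeper than a regular core, which is why the same relative reference would need one more ../ to reach the contrib directory.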
[jira] [Updated] (SOLR-6158) Solr looks up configSets in the wrong directory
[ https://issues.apache.org/jira/browse/SOLR-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Endele updated SOLR-6158:
---
    Description:
I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to create Named Config Sets based on the Solr example shipped with Solr 4.8.1 (like it's done in the tutorial; same problem with 4.8.0).

Creating a new core with a configSet seems to work (directory 'books' and 'books/core.properties' are created correctly). But loading the new core does not work:

{code:none}
67446 [qtp25155085-11] INFO org.apache.solr.handler.admin.CoreAdminHandler core create command configSet=generic&name=books&action=CREATE
67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer Unable to create core: books
org.apache.solr.common.SolrException: Could not load configuration from directory C:\dev\solr-4.8.1\example\configsets\generic
  at org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
  at org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
  at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
  ...
{code}

It seems like Solr looks up the config sets in the wrong directory: C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of C:\dev\solr-4.8.1\example\solr\configsets\generic (as stated in the tutorial and the documentation on https://cwiki.apache.org/confluence/display/solr/Config+Sets).

Moving the configsets directory one level up (into 'example') will work. But according to the documentation (and the tutorial) it should be located in the solr home directory.

In case I'm completely wrong and everything works as expected, how can the configsets directory be configured? The documentation on https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a configurable configset base directory, but I can't find any information on the web.

Another thing: If it worked as I expect, the references <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> etc. in solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one more ../ added, I guess (missing in the tutorial).

  was:
I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to create Named Config Sets based on the Solr example shipped with Solr 4.8.1 (like it's done in the tutorial; same problem with 4.8.0).

Creating a new core with a configSet seems to work (directory 'books' and 'books/core.properties' are created correctly). But loading the new core does not work:

{code:none}
67446 [qtp25155085-11] INFO org.apache.solr.handler.admin.CoreAdminHandler core create command configSet=generic&name=books&action=CREATE
67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer Unable to create core: books
org.apache.solr.common.SolrException: Could not load configuration from directory C:\dev\solr-4.8.1\example\configsets\generic
  at org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
  at org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
  at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
  ...
{code}

It seems like Solr looks up the config sets in the wrong directory: C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of C:\dev\solr-4.8.1\example\solr\configsets\generic (as stated in the tutorial and the documentation on https://cwiki.apache.org/confluence/display/solr/Config+Sets).

Moving the configsets directory one level up (into 'example') will work. But according to the documentation (and the tutorial) it should be located in the solr home directory.

In case I'm completely wrong and everything works as expected, how can one configure the configsets directory be configured? The documentation on https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a configurable configset base directory, but I can't find any information on the web.

Another thing: If it worked as I expect, the references <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> etc. in solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one more ../ added, I guess (missing in the tutorial).
[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894749#comment-13894749 ]

Simon Endele commented on SOLR-5027:

Hi Joel,

a similar question to Phil John's one: Is it correct that no equivalent for group.limit of the old grouping is/will be available? I.e. only one document is returned for each group and the ExpandComponent can be used to get more, right?

I always thought that the aim of the ExpandComponent is to return _additional_ docs, in the sense that these documents were not hit by the query (we wrote a component ourselves for that, based on the old grouping functionality). Will that be possible with the ExpandComponent, or will it only be possible to fetch n (or all) documents of each group that were hit and collapsed by the CollapsingQParserPlugin (each only for a single page, of course)?

See also my question above concerning a filter query for the ExpandComponent.

Thanks in advance,
Simon

Field Collapsing PostFilter
---
Key: SOLR-5027
URL: https://issues.apache.org/jira/browse/SOLR-5027
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
Fix For: 4.6, 5.0
Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch

This ticket introduces the *CollapsingQParserPlugin*.

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high.

For example in one performance test, a search with 10 million full results and 1 million collapsed groups:
Standard grouping with ngroups: 17 seconds.
CollapsingQParserPlugin: 300 milliseconds.

Sample syntax:

Collapse based on the highest scoring document:
{code}
fq={!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}

There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate group.
collapse : collapses all docs with a null value into a single group using either highest score, or min/max.

The CollapsingQParserPlugin also fully supports the QueryElevationComponent.

*Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
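The sample syntax above can be assembled programmatically. This is only a sketch of string composition under stated assumptions: the class and method names are hypothetical and not part of Solr or SolrJ, and no escaping of field names or values is attempted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that builds the value of an fq parameter
// using the {!collapse ...} local-params syntax from the ticket.
class CollapseFilter {

    // field: the collapse field; options: extra local params such as min, max or nullPolicy.
    static String build(String field, Map<String, String> options) {
        StringBuilder sb = new StringBuilder("{!collapse field=").append(field);
        for (Map.Entry<String, String> option : options.entrySet()) {
            sb.append(' ').append(option.getKey()).append('=').append(option.getValue());
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("nullPolicy", "expand"); // one of: ignore, expand, collapse
        // Prints {!collapse field=field_name nullPolicy=expand}
        System.out.println(build("field_name", options));
    }
}
```

In a SolrJ client the resulting string would typically be passed as a filter query, for example through SolrQuery.addFilterQuery.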
[jira] [Created] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames
Simon Endele created SOLR-5375:
--

Summary: Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames
Key: SOLR-5375
URL: https://issues.apache.org/jira/browse/SOLR-5375
Project: Solr
Issue Type: Bug
Reporter: Simon Endele
Priority: Minor

Can be reproduced with the following command and the example configuration shipped with Solr:

cd exampledocs
curl -F "file=@hd.xml" "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=true&literal.content_type=mytype"

The added doc contains both values:
http://localhost:8983/solr/collection1/select?q=id%3Amyid&wt=xml&indent=true

{code:xml}
<arr name="content_type">
  <str>mytype</str>
  <str>application/xml</str>
</arr>
{code}

If the corresponding field is not multi-valued, the request raises an org.apache.solr.common.SolrException: ERROR: multiple values encountered for non multiValued field content_type:

Debugging the code (Solr 4.4.0), I found out that the parameter lowernames is not considered in several places in org.apache.solr.handler.extraction.SolrContentHandler that look like this:

{code}
if (literalsOverride && literalFieldNames.contains(name)) continue;
{code}

The same problem occurs for the following command (though its correctness could be discussed):

curl -F "file=@hd.xml" "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=false&fmap.Content-Type=content_type&literal.content_type=mytype"
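To illustrate the mismatch, here is a minimal stand-alone sketch of how the check could take lowernames into account. It is not the SolrContentHandler code and not the eventual fix: the class and method names are hypothetical, and the normalization (lowercase, with every character that is not a letter or digit replaced by an underscore) is an assumption about what lowernames does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of comparing Tika metadata names with literal
// field names only after the lowernames normalization has been applied.
class LiteralOverrideCheck {

    // Assumed lowernames rule: lowercase letters and digits, everything else becomes '_'.
    static String lowerName(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (char ch : name.toCharArray()) {
            sb.append(Character.isLetterOrDigit(ch) ? Character.toLowerCase(ch) : '_');
        }
        return sb.toString();
    }

    // True if the Tika value should be dropped because a literal overrides it.
    static boolean shouldSkip(String tikaName, boolean literalsOverride,
                              boolean lowerNames, Set<String> literalFieldNames) {
        String effectiveName = lowerNames ? lowerName(tikaName) : tikaName;
        return literalsOverride && literalFieldNames.contains(effectiveName);
    }

    public static void main(String[] args) {
        Set<String> literals = new HashSet<>(Arrays.asList("id", "content_type"));
        // Comparing the raw Tika name misses the literal; the normalized name matches it.
        System.out.println(shouldSkip("Content-Type", true, false, literals));
        System.out.println(shouldSkip("Content-Type", true, true, literals));
    }
}
```

The point of the sketch is only the order of operations: the name is normalized first and compared second, so that Content-Type from Tika and literal.content_type refer to the same field.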
[jira] [Commented] (SOLR-1856) In Solr Cell, literals should override Tika-parsed values
[ https://issues.apache.org/jira/browse/SOLR-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801643#comment-13801643 ]

Simon Endele commented on SOLR-1856:

Did so, see SOLR-5375.

In Solr Cell, literals should override Tika-parsed values
-
Key: SOLR-1856
URL: https://issues.apache.org/jira/browse/SOLR-1856
Project: Solr
Issue Type: Improvement
Components: contrib - Solr Cell (Tika extraction)
Reporter: Chris Harris
Assignee: Jan Høydahl
Fix For: 4.0-BETA, 5.0
Attachments: SOLR-1856.patch, SOLR-1856.patch

I propose that ExtractingRequestHandler / SolrCell literals should take precedence over Tika-parsed metadata in all situations, including where multiValued=true. (Compare SOLR-1633?)

My personal motivation is that I have several fields (e.g. title, date) where my own metadata is much superior to what Tika offers, and I want to throw those Tika values away. (I actually wouldn't mind throwing away _all_ Tika-parsed values, but let's set that aside.)

SOLR-1634 is one potential approach to this, but the fix here might be simpler. I'll attach a patch shortly.
[jira] [Comment Edited] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames
[ https://issues.apache.org/jira/browse/SOLR-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801752#comment-13801752 ] Simon Endele edited comment on SOLR-5375 at 10/22/13 12:08 PM: --- It's not as easy as I thought in the first place as there's another issue that bothers me and touches this one: From my expectation, fmap should only be applied to the values returned from Tika and not to literals. So currently it is not possible to declare the following mapping (assuming lowernames=true): literal.content_type = schema field content_type content_type from Tika = schema field content_type_tika This is what the following request should do IMO: literal.content_type=mytypefmap.content_type=content_type_tika Instead both values are stored to content_type_tika. The same problem exists for lowernames. If enabled it is not possible to fill schema fields containing upper-case letters using an ContentStreamUpdateRequest. But this is a question of expected behavior and I'm afraid this would cause backwards compatibility issues. What do you think? was (Author: simon.endele): It's not as easy as I thought in the first place as there's another issue that bothers me and touches this one: From my expectation, fmap should only be applied to the values returned from Tika and not to literals. So currently it is not possible to declare the following mapping (assuming lowernames=true): literal.content_type = schema field content_type content_type from Tika = schema field content_type_tika what the following request should do IMO: literal.content_type=mytypefmap.content_type=content_type_tika Instead both values are stored to content_type_tika. The same problem exists for lowernames. If enabled it is not possible to fill schema fields containing upper-case letters using an ContentStreamUpdateRequest. But this is a question of expected behavior and I'm afraid this would cause backwards compatibility issues. What do you think? 
Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames --- Key: SOLR-5375 URL: https://issues.apache.org/jira/browse/SOLR-5375 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Reporter: Simon Endele Priority: Minor Fix For: 4.6
Can be reproduced with the following command and the example configuration shipped with Solr:
cd exampledocs
curl -F "file=@hd.xml" "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=true&literal.content_type=mytype"
The added doc contains both values: http://localhost:8983/solr/collection1/select?q=id%3Amyid&wt=xml&indent=true
{code:xml}<arr name="content_type"> <str>mytype</str> <str>application/xml</str> </arr>{code}
If the corresponding field is not multi-valued, the request raises an org.apache.solr.common.SolrException: ERROR: multiple values encountered for non multiValued field content_type
Debugging the code (Solr 4.4.0) I found out that the parameter lowernames is not considered at several places in org.apache.solr.handler.extraction.SolrContentHandler looking like:
{code}if (literalsOverride && literalFieldNames.contains(name)) continue;{code}
The same problem occurs for the following command (though its correctness could be discussed):
curl -F "file=@hd.xml" "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=false&fmap.Content-Type=content_type&literal.content_type=mytype"
-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
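The check quoted in the report compares the raw Tika field name against the literal field names. A minimal sketch of the idea behind a fix follows: normalize the name first when lowernames is enabled. This is not the code of SolrContentHandler nor the attached patch; the class, both method names, and the normalization rule (lower-case, every non-alphanumeric character replaced by an underscore) are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not the code of SolrContentHandler or the attached patch.
class LiteralsOverrideSketch {

    // Assumed normalization for lowernames=true: lower-case the name and
    // replace every character that is not a letter or digit with '_'.
    static String lowerName(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (char c : name.toLowerCase(Locale.ROOT).toCharArray()) {
            sb.append(Character.isLetterOrDigit(c) ? c : '_');
        }
        return sb.toString();
    }

    // Returns true if a Tika metadata field should be skipped because a
    // literal with the same effective field name was supplied.
    static boolean skipTikaField(String tikaName, Set<String> literalFieldNames,
                                 boolean literalsOverride, boolean lowerNames) {
        String effective = lowerNames ? lowerName(tikaName) : tikaName;
        return literalsOverride && literalFieldNames.contains(effective);
    }

    public static void main(String[] args) {
        Set<String> literals = new HashSet<>();
        literals.add("content_type");
        // Comparing the raw Tika name misses the literal ...
        System.out.println(literals.contains("Content-Type"));
        // ... while comparing the normalized name finds it.
        System.out.println(skipTikaField("Content-Type", literals, true, true));
    }
}
```

The sketch covers only the lowernames case; the fmap variant from the second curl command would need the mapped name to be resolved in the same place.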
[jira] [Commented] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames
[ https://issues.apache.org/jira/browse/SOLR-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801752#comment-13801752 ]
Simon Endele commented on SOLR-5375:
It's not as easy as I first thought, because there's another issue that bothers me and touches this one: I would expect fmap to be applied only to the values returned from Tika and not to literals. So currently it is not possible to declare the following mapping (assuming lowernames=true):
literal.content_type => schema field content_type
content_type from Tika => schema field content_type_tika
This is what the following request should do IMO:
literal.content_type=mytype&fmap.content_type=content_type_tika
Instead both values are stored in content_type_tika.
The same problem exists for lowernames: if it is enabled, it is not possible to fill schema fields containing upper-case letters using a ContentStreamUpdateRequest.
But this is a question of expected behavior, and I'm afraid changing it would cause backwards-compatibility issues. What do you think?
[jira] [Updated] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames
[ https://issues.apache.org/jira/browse/SOLR-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-5375: --- Attachment: SolrContentHandler.java.patch Added a patch for trunk that addresses only this specific issue. Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames --- Key: SOLR-5375 URL: https://issues.apache.org/jira/browse/SOLR-5375 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Reporter: Simon Endele Priority: Minor Fix For: 4.6 Attachments: SolrContentHandler.java.patch
[jira] [Commented] (SOLR-1856) In Solr Cell, literals should override Tika-parsed values
[ https://issues.apache.org/jira/browse/SOLR-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800850#comment-13800850 ]
Simon Endele commented on SOLR-1856:
Debugging the code (Solr 4.4.0) I found out that the parameter lowernames is not considered. The request lowernames=true&literalsOverride=true&literal.url=myurl still raises an org.apache.solr.common.SolrException: ERROR: multiple values encountered for non multiValued field url: [.., ..], if a URL is extracted from the metadata of the binary.
In Solr Cell, literals should override Tika-parsed values - Key: SOLR-1856 URL: https://issues.apache.org/jira/browse/SOLR-1856 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Chris Harris Assignee: Jan Høydahl Fix For: 4.0-BETA, 5.0 Attachments: SOLR-1856.patch, SOLR-1856.patch
I propose that ExtractingRequestHandler / SolrCell literals should take precedence over Tika-parsed metadata in all situations, including where multiValued=true. (Compare SOLR-1633?) My personal motivation is that I have several fields (e.g. title, date) where my own metadata is much superior to what Tika offers, and I want to throw those Tika values away. (I actually wouldn't mind throwing away _all_ Tika-parsed values, but let's set that aside.) SOLR-1634 is one potential approach to this, but the fix here might be simpler. I'll attach a patch shortly.
[jira] [Commented] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777303#comment-13777303 ]
Simon Endele commented on SOLR-5027:
Sounds good. I propose to add an additional parameter expand.fq to restrict the expanded documents to a certain filter query. Sometimes the complete groups are very large and should only be expanded by one or a few representatives of that group. Other group members that are not hit by the main query are not interesting (at least in the first place). Note that this is different from adding a basic filter query, since documents that are hit by the main query but not by expand.fq are kept.
Example: Group consisting of: representative A, more group members B and C. Query hits B, group is expanded by A, but not C (due to expand.fq) => Result: A, B
A filter query before expanding would filter out B and thus yield no results for this group. A filter query after expanding would filter out B and C and thus keep only A.
Is that technically possible? Maybe this is worth a separate issue...
Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch
This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow.
Initial syntax: fq={!collapse field=<field_name>}
All documents in a group will be collapsed to the highest ranking document in the group. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.
Initial syntax:
expand=true - Turns on the expand component.
expand.field=<field> - Expands results for this field.
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
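The semantics proposed for expand.fq in the comment above can be illustrated with a small in-memory simulation. This is only a sketch of the intended result set, not Solr code: expand.fq is a proposed parameter, and the class and method names below are hypothetical. A group member is kept if the main query hit it or if it matches the expand filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical in-memory illustration of the proposed expand.fq semantics.
class ExpandFqSketch {

    // Keeps a group member if the main query hit it, or if it matches the
    // proposed expand filter. Members matching neither are dropped.
    static List<String> expandGroup(List<String> groupMembers,
                                    Set<String> mainQueryHits,
                                    Predicate<String> expandFq) {
        List<String> result = new ArrayList<>();
        for (String doc : groupMembers) {
            if (mainQueryHits.contains(doc) || expandFq.test(doc)) {
                result.add(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("A", "B", "C");
        Set<String> hits = Collections.singleton("B");    // main query hits B
        Predicate<String> isRepresentative = "A"::equals; // stands in for expand.fq
        System.out.println(expandGroup(group, hits, isRepresentative)); // prints [A, B]
    }
}
```

An ordinary filter query would instead be a plain conjunction with the main query, which is why it cannot produce the result A, B in this example.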
[jira] [Comment Edited] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777303#comment-13777303 ]
Simon Endele edited comment on SOLR-5027 at 9/25/13 9:19 AM:
-
Sounds good. I propose to add an additional parameter expand.fq to restrict the expanded documents to a certain filter query. Sometimes the complete groups are very large and should only be expanded by one or a few representatives of that group (which can be addressed with a filter query). Other group members that are not hit by the main query are not interesting (at least in the first place). Note that this is different from adding a basic filter query, since documents that are hit by the main query but not by expand.fq are kept.
Example: Group consisting of: representative A, more group members B and C. Query hits B, group is expanded by A (due to expand.fq), but not C => Result: A, B
A filter query before expanding would filter out B and thus yield no results for this group. A filter query after expanding would filter out B and C and thus keep only A.
Is that technically possible? Maybe this is worth a separate issue...
was (Author: simon.endele): Sounds good. I propose to add an additional parameter expand.fq to restrict the expanded documents to a certain filter query. Sometimes the complete groups are very large and should only be expanded by one or a few representatives of that group. Other group members that are not hit by the main query are not interesting (at least in the first place). Note that this is different from adding a basic filter query, since documents that are hit by the main query but not by expand.fq are kept. Example: Group consisting of: representative A, more group members B and C. Query hits B, group is expanded by A, but not C (due to expand.fq) => Result: A, B A filter query before expanding would filter out B and thus yield no results for this group. A filter query after expanding would filter out B and C thus keep only A. Is that technically possible? Maybe this is worth a separate issue...
Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch
[jira] [Created] (SOLR-5270) lastModified not updating when selecting another core in Core Admin
Simon Endele created SOLR-5270: -- Summary: lastModified not updating when selecting another core in Core Admin Key: SOLR-5270 URL: https://issues.apache.org/jira/browse/SOLR-5270 Project: Solr Issue Type: Bug Components: web gui Reporter: Simon Endele Priority: Minor
When selecting a core in the section Core Admin in the Solr Admin web UI, data like dataDir, version, numDocs, maxDoc are updated via JavaScript, but lastModified is not. A refresh of the page does the trick. I had a look at the network traffic of my browser, and it seems that the JSON fetched via AJAX contains the correct information.
Can be reproduced in different browsers with the example by cloning collection1 into a collection2 and indexing collection2 anew by calling java -jar post.jar *.xml in the exampledocs directory. Tested with Solr 4.4.0.
[jira] [Commented] (SOLR-2216) Highlighter query exceeds maxBooleanClause limit due to range query
[ https://issues.apache.org/jira/browse/SOLR-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777691#comment-13777691 ]
Simon Endele commented on SOLR-2216:
Am I right in assuming that this isn't a problem when using the FastVectorHighlighter or the PostingsHighlighter?
Highlighter query exceeds maxBooleanClause limit due to range query --- Key: SOLR-2216 URL: https://issues.apache.org/jira/browse/SOLR-2216 Project: Solr Issue Type: Bug Components: highlighter Affects Versions: 1.4.1
Environment:
Linux solr-2.bizjournals.int 2.6.18-194.3.1.el5 #1 SMP Thu May 13 13:08:30 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_21 Java(TM) SE Runtime Environment (build 1.6.0_21-b06) Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
JAVA_OPTS=-client -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.access.file=/root/.jmxaccess -Dcom.sun.management.jmxremote.password.file=/root/.jmxpasswd -Dcom.sun.management.jmxremote.ssl=false -XX:+UseCompressedOops -XX:MaxPermSize=512M -Xms10240M -Xmx15360M -XX:+UseParallelGC -XX:+AggressiveOpts -XX:NewRatio=5
top - 11:38:49 up 124 days, 22:37, 1 user, load average: 5.20, 4.35, 3.90
Tasks: 220 total, 1 running, 219 sleeping, 0 stopped, 0 zombie
Cpu(s): 47.5%us, 2.9%sy, 0.0%ni, 49.5%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 24679008k total, 18179980k used, 6499028k free, 125424k buffers
Swap: 26738680k total, 29276k used, 26709404k free, 8187444k cached
Reporter: Ken Stanley
For a full detail of the issue, please see the mailing list: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201011.mbox/%3CAANLkTimE8z8yOni+u0Nsbgct1=ef7e+su0_waku2c...@mail.gmail.com%3E
The nutshell version of the issue is that when I have a query that contains ranges on a specific (non-highlighted) field, the highlighter component is attempting to create a query that exceeds the value of
maxBooleanClauses set from solrconfig.xml. This is despite my explicit setting of hl.field, hl.requireFieldMatch, and various other highlight options in the query. As suggested by Koji in the follow-up response, I removed the range queries from my main query, and Solr and highlighting were happy to fulfill my request. It was suggested that if removing the range queries worked, this might be a bug, hence this JIRA ticket. For what it is worth, if I move my range queries into an fq, I do not get the exception about exceeding maxBooleanClauses, and I get the effect that I was looking for.
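The workaround described at the end of the report (moving the range restriction out of q and into fq) could look roughly like the following sketch, which only assembles the request parameters. The class and method names, the field names headline and pub_date, and the example range are assumptions for illustration; q, fq, hl and hl.fl are standard Solr request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: keep the user query in q and put the range into fq,
// so that the highlighter only has to deal with the terms of q.
class RangeInFilterQuerySketch {

    static String buildQueryString(String userQuery, String rangeFilter, String highlightField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);          // scored and highlighted
        params.put("fq", rangeFilter);       // only restricts the result set
        params.put("hl", "true");
        params.put("hl.fl", highlightField);

        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(e.getKey()).append('=').append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Field names and range are made up for this example.
        System.out.println(buildQueryString(
                "headline:solr", "pub_date:[2010-01-01T00:00:00Z TO *]", "headline"));
    }
}
```

Note that a filter query does not contribute to scoring, so this is only equivalent when the range was meant as a pure restriction.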
[jira] [Commented] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml
[ https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773140#comment-13773140 ]
Simon Endele commented on SOLR-5249:
Wow, thanks for your quick and detailed response! I'm using Eclipse with default settings, so I thought this might bother some more people like me. Eclipse inserts line-breaks and white-spaces at other places in the solrconfig.xml, which are ignored, for example in the defaults-section of a request handler:
{code}<str name="hl.fl">content title field1 field2 field3 field4 </str>{code}
Ok, this is maybe a bad example as the field list is parsed. As far as I know, class names are Java identifiers, which cannot contain any white-spaces. This particular code fragment only handles class names and no files, doesn't it?
ClassNotFoundException due to white-spaces in solrconfig.xml Key: SOLR-5249 URL: https://issues.apache.org/jira/browse/SOLR-5249 Project: Solr Issue Type: Bug Reporter: Simon Endele Priority: Minor Attachments: SolrResourceLoader.java.patch Original Estimate: 1h Remaining Estimate: 1h
Due to auto-formatting by a text editor/IDE there may be line-breaks after class names in the solrconfig.xml, for example:
{code:xml}<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
</str>
[...]
</lst>
</searchComponent>{code}
This will raise an exception in SolrResourceLoader as the white-spaces are not stripped from the class name:
{code}Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.spelling.suggest.fst.WFSTLookupFactory '
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:449)
at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:471)
at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:467)
at org.apache.solr.spelling.suggest.Suggester.init(Suggester.java:102)
at org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:623)
at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:601)
at org.apache.solr.core.SolrCore.init(SolrCore.java:830)
... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
... 19 more{code}
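The failure mode and the obvious remedy can be reproduced outside Solr with plain `Class.forName`. The following is a minimal sketch under that assumption; it is not the attached SolrResourceLoader.java.patch, and the class and method names are hypothetical. A class name that still carries the trailing line-break from the XML text node is not found, while the trimmed name loads.

```java
// Hypothetical sketch, not the attached patch: trim the configured class
// name before handing it to the class loader.
class TrimmedClassNameSketch {

    // Loads a class by the name read from configuration, ignoring leading
    // and trailing white-space such as the line-break an IDE may insert.
    static Class<?> findClass(String configuredName) throws ClassNotFoundException {
        return Class.forName(configuredName.trim());
    }

    // Returns whether the name can be loaded, with or without trimming.
    static boolean loads(String configuredName, boolean trim) {
        try {
            if (trim) {
                findClass(configuredName);
            } else {
                Class.forName(configuredName);
            }
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a class name followed by a line-break in solrconfig.xml.
        String fromConfig = "java.util.ArrayList \n";
        System.out.println(loads(fromConfig, false)); // untrimmed name
        System.out.println(loads(fromConfig, true));  // trimmed name
    }
}
```

`String.trim()` removes all leading and trailing characters up to U+0020, which covers spaces, tabs and line-breaks.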
[jira] [Comment Edited] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml
[ https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773140#comment-13773140 ]
Simon Endele edited comment on SOLR-5249 at 9/20/13 4:18 PM:
-
Wow, thanks for your quick and detailed response! I'm using Eclipse with default settings, so I thought this might bother some more people like me. Eclipse inserts line-breaks and white-spaces at other places in the solrconfig.xml, which are ignored, for example in the defaults-section of a request handler:
{code}<str name="hl.fl">content title field1 field2 field3 field4 </str>{code}
Ok, this is maybe a bad example as the field list is parsed. As far as I know, class names are Java identifiers, which cannot contain any white-spaces. This particular code fragment only handles class names and no files, doesn't it?
was (Author: simon.endele): Wow, thanks for your quick and detailed response! I'm using Eclipse with default settings, so I thought this might bother some more people like me. Eclipse inserts line-breaks and white-spaces at other places in the solrconfig.xml, which are ignored, for example in the defaults-section of a request handler: {code}<str name="hl.fl">content title field1 field2 field3 field4 </str>{code} Ok, this is maybe a bad example as the field list ist parsed. As far I know class names are Java identifiers, which cannot contain any white-spaces. This certain code fragment only handles class names and no files, doesn't it?
[jira] [Created] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml
Simon Endele created SOLR-5249: -- Summary: ClassNotFoundException due to white-spaces in solrconfig.xml Key: SOLR-5249 URL: https://issues.apache.org/jira/browse/SOLR-5249 Project: Solr Issue Type: Bug Components: SearchComponents - other Reporter: Simon Endele Priority: Minor
Due to auto-formatting by a text editor/IDE there may be line-breaks after class names in the solrconfig.xml, for example:
{code:xml}<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
</str>
[...]
</lst>
</searchComponent>{code}
This will raise an exception in SolrResourceLoader as the white-spaces are not stripped from the class name:
{code}Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.spelling.suggest.fst.WFSTLookupFactory '
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:449)
at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:471)
at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:467)
at org.apache.solr.spelling.suggest.Suggester.init(Suggester.java:102)
at org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:623)
at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:601)
at org.apache.solr.core.SolrCore.init(SolrCore.java:830)
... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
... 19 more{code}
[jira] [Updated] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml
[ https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Endele updated SOLR-5249: --- Attachment: SolrResourceLoader.java.patch Uploaded a patch for trunk. ClassNotFoundException due to white-spaces in solrconfig.xml Key: SOLR-5249 URL: https://issues.apache.org/jira/browse/SOLR-5249 Project: Solr Issue Type: Bug Components: SearchComponents - other Reporter: Simon Endele Priority: Minor Attachments: SolrResourceLoader.java.patch Original Estimate: 1h Remaining Estimate: 1h
[jira] [Updated] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml
[ https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Endele updated SOLR-5249:
---
Component/s: (was: SearchComponents - other)

ClassNotFoundException due to white-spaces in solrconfig.xml

Key: SOLR-5249
URL: https://issues.apache.org/jira/browse/SOLR-5249
Project: Solr
Issue Type: Bug
Reporter: Simon Endele
Priority: Minor
Attachments: SolrResourceLoader.java.patch
Original Estimate: 1h
Remaining Estimate: 1h

Due to auto-formatting by a text editor/IDE there may be line-breaks after class names in the solrconfig.xml, for example:
{code:xml}
<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
    </str>
    [...]
  </lst>
</searchComponent>
{code}
This will raise an exception in SolrResourceLoader as the white-spaces are not stripped from the class name:
{code}
Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.spelling.suggest.fst.WFSTLookupFactory '
  at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:449)
  at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:471)
  at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:467)
  at org.apache.solr.spelling.suggest.Suggester.init(Suggester.java:102)
  at org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:623)
  at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:601)
  at org.apache.solr.core.SolrCore.init(SolrCore.java:830)
  ... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
  at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:264)
  at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
  ... 19 more
{code}
[jira] [Commented] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765381#comment-13765381 ]

Simon Endele commented on SOLR-5027:

What exactly do you mean by "there is no concept of ngroups or group facets"?
Does that mean there will be no way to return the number of groups, as the request parameter group.ngroups currently does?
Will it still be possible to decide whether faceting is done before or after collapsing, similar to group.facet?

Result Set Collapse and Expand Plugins
--
Key: SOLR-5027
URL: https://issues.apache.org/jira/browse/SOLR-5027
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch

This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*.

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This allows field collapsing to be done within the normal search flow.

Initial syntax:
fq={!collapse field=<field_name>}

All documents in a group will be collapsed to the highest ranking document in the group.

The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided.

Initial syntax:
expand=true - Turns on the expand component.
expand.field=<field> - Expands results for this field.
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=<sort spec> - The sort spec for the expanded documents. Default is score.
expand.rows=500 - The max number of expanded results to bring back. Default is 500.
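To make the "initial syntax" in the SOLR-5027 description concrete, the sketch below assembles the listed parameters into a URL-encoded query string using only the JDK. The parameter names come from the ticket text as quoted; the match-all q parameter, the field name group_id, and the class name CollapseExpandParams are assumptions for illustration, and the syntax that was eventually released may differ from this early draft.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper; builds a query string from the parameters named in the ticket. */
final class CollapseExpandParams {

    static String build(String field) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");                              // assumed match-all query
        params.put("fq", "{!collapse field=" + field + "}"); // collapse as a post filter
        params.put("expand", "true");                        // turn on the expand component
        params.put("expand.field", field);
        params.put("expand.limit", "5");
        params.put("expand.rows", "500");

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            // Local-params braces, '!' and '=' must be percent-encoded in a URL.
            query.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("group_id"));
    }
}
```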
[jira] [Commented] (SOLR-5230) Call DelegatingCollector.finish() during grouping
[ https://issues.apache.org/jira/browse/SOLR-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764157#comment-13764157 ]

Simon Endele commented on SOLR-5230:

Applied the patch and it seems to work at first glance.
Thank you very much for your quick reaction to https://issues.apache.org/jira/browse/SOLR-5020, Joel!
But for some scenarios (e.g. expensive post-filters) it might be a drawback that the two grouping phases cannot be distinguished in the finish() method.
What do you think about introducing a second method DelegatingCollector.finishAfterGrouping() or similar that is called in the second phase instead of finish()?

Call DelegatingCollector.finish() during grouping
-
Key: SOLR-5230
URL: https://issues.apache.org/jira/browse/SOLR-5230
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.4
Reporter: Joel Bernstein
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5230.patch

This is an add-on to SOLR-5020 to call the new DelegatingCollector.finish() method from inside the grouping flow.
[jira] [Comment Edited] (SOLR-5230) Call DelegatingCollector.finish() during grouping
[ https://issues.apache.org/jira/browse/SOLR-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764157#comment-13764157 ]

Simon Endele edited comment on SOLR-5230 at 9/11/13 10:26 AM:
--
Applied the patch and it seems to work at a first glance.
Thank you very much for your quick reaction on SOLR-5020, Joel!
But for some scenarios (e.g. expensive post-filters) it might be a drawback that the phases cannot be distinguished in the finish() method.
What do you think about introducing a second method DelegatingCollector.finishAfterGrouping() or similar that is called in the second phase instead of finish()?

was (Author: simon.endele):
Applied the patch and it seems to work at a first glance.
Thank you very much for your quick reaction on https://issues.apache.org/jira/browse/SOLR-5020, Joel!
But for some scenarios (e.g. expensive post-filters) it might be a drawback that the phases cannot be distinguished in the finish() method.
What do you think about introducing a second method DelegatingCollector.finishAfterGrouping() or similar that is called in the second phase instead of finish()?

Call DelegatingCollector.finish() during grouping
-
Key: SOLR-5230
URL: https://issues.apache.org/jira/browse/SOLR-5230
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.4
Reporter: Joel Bernstein
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5230.patch

This is an add-on to SOLR-5020 to call the new DelegatingCollector.finish() method from inside the grouping flow.
[jira] [Commented] (SOLR-5230) Call DelegatingCollector.finish() during grouping
[ https://issues.apache.org/jira/browse/SOLR-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764390#comment-13764390 ]

Simon Endele commented on SOLR-5230:

Hm, I'm quite sure that collect() is called for all docs in both phases.
Excerpt from my final result:

<lst name="grouped">
  <lst name="group_id">
    <int name="matches">61</int>
    <int name="ngroups">35</int>
    <arr name="groups">
    [...]

And collect() is called 61 times in each of the two phases, each run followed by one call to finish().

Call DelegatingCollector.finish() during grouping
-
Key: SOLR-5230
URL: https://issues.apache.org/jira/browse/SOLR-5230
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.4
Reporter: Joel Bernstein
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5230.patch

This is an add-on to SOLR-5020 to call the new DelegatingCollector.finish() method from inside the grouping flow.
[jira] [Commented] (SOLR-5020) Add finish() method to DelegatingCollector
[ https://issues.apache.org/jira/browse/SOLR-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762988#comment-13762988 ]

Simon Endele commented on SOLR-5020:

It looks like this isn't working in combination with grouping. Is that possible?
I applied the attached patch to my Solr 4.4.0 workspace containing an AclQParserPlugin as described here: http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/
It works without grouping, but when grouping is activated, the collect() method is still called while finish() is not.

Add finish() method to DelegatingCollector
--
Key: SOLR-5020
URL: https://issues.apache.org/jira/browse/SOLR-5020
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
Fix For: 4.5, 5.0
Attachments: SOLR-5020.patch

This issue adds a finish() method to the DelegatingCollector class so that it can be notified when collection is complete.

The current collect() method assumes that the delegating collector will either forward on the document or not with each call. The finish() method will allow DelegatingCollectors to have more sophisticated behavior.

For example, a Field Collapsing delegating collector could collapse the documents as the collect() method is being called. Then, when the finish() method is called, it could pass the collapsed documents to the delegate collectors.

This would allow grouping to be implemented within the PostFilter framework.
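The collapse-then-forward behavior that the SOLR-5020 description outlines can be modeled in a few lines. This is a sketch under stated assumptions: SimpleCollector, RecordingCollector and CollapsingCollector are simplified stand-ins invented for illustration, not Solr's DelegatingCollector API, and group ids and scores are passed in as plain arrays rather than read from an index.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Demo entry point: collapses four docs in two groups down to the best doc per group. */
final class FinishDemo {
    static List<Integer> run() {
        RecordingCollector sink = new RecordingCollector();
        int[] groupOfDoc = {0, 0, 1, 1};          // hypothetical doc -> group id
        float[] scoreOfDoc = {1f, 3f, 2f, 0.5f};  // hypothetical doc -> score
        CollapsingCollector collapsing = new CollapsingCollector(sink, groupOfDoc, scoreOfDoc);
        for (int doc = 0; doc < groupOfDoc.length; doc++) {
            collapsing.collect(doc);              // nothing is forwarded yet
        }
        collapsing.finish();                      // forwards one doc per group
        return sink.docs;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}

/** Simplified stand-in for a collector; NOT Solr's DelegatingCollector. */
interface SimpleCollector {
    void collect(int doc);
    default void finish() {}
}

/** Terminal collector that just records what it receives. */
final class RecordingCollector implements SimpleCollector {
    final List<Integer> docs = new ArrayList<>();
    @Override public void collect(int doc) { docs.add(doc); }
}

/** Buffers the best-scoring doc per group in collect() and forwards them in finish(). */
final class CollapsingCollector implements SimpleCollector {
    private final SimpleCollector delegate;
    private final int[] groupOfDoc;
    private final float[] scoreOfDoc;
    private final Map<Integer, Integer> bestDocPerGroup = new LinkedHashMap<>();

    CollapsingCollector(SimpleCollector delegate, int[] groupOfDoc, float[] scoreOfDoc) {
        this.delegate = delegate;
        this.groupOfDoc = groupOfDoc;
        this.scoreOfDoc = scoreOfDoc;
    }

    @Override public void collect(int doc) {
        int group = groupOfDoc[doc];
        Integer best = bestDocPerGroup.get(group);
        if (best == null || scoreOfDoc[doc] > scoreOfDoc[best]) {
            bestDocPerGroup.put(group, doc);
        }
    }

    @Override public void finish() {
        for (int doc : bestDocPerGroup.values()) {
            delegate.collect(doc);
        }
        delegate.finish();
    }
}
```

The point of the model is that forwarding is deferred until finish(); if finish() is never invoked, as reported above for the grouping flow, the delegate receives nothing.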
[jira] [Commented] (LUCENE-5021) NextDoc NPE safety when bulk collecting
[ https://issues.apache.org/jira/browse/LUCENE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761664#comment-13761664 ]

Simon Endele commented on LUCENE-5021:
--
I think what you originally searched for is this: SOLR-5020

NextDoc NPE safety when bulk collecting
---
Key: LUCENE-5021
URL: https://issues.apache.org/jira/browse/LUCENE-5021
Project: Lucene - Core
Issue Type: Improvement
Components: core/index, core/other
Affects Versions: 3.6.2
Environment: Any with custom filters
Reporter: Alexis Torres Paderewski
Labels: NPE, Null-Safety, Scorer

Hello,
I would like to apply ACLs once as a PostFilter, so I need to batch this call, since round trips would severely decrease performance.
I tried to just stack the documents in the DelegatingCollector using this collect():

@Override
public void collect(int doc) throws IOException {
  while ((doc = scorer.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
    docs.put(getDocumentId(doc), doc);
  }
  batchCollect();
}

Depending on the Scorer, this may or may not work. It works when the Scorer is safe, that is, when it handles being called once again after it has been exhausted. This is the case for e.g. DisjunctionMaxScorer and ConstantScorer:

if (numScorers == 0) return doc = NO_MORE_DOCS;

On the other hand, when using the DisjunctionSumScorer, it either asserts on NO_MORE_DOCS or throws an NPE.
Shouldn't we copy the DisjunctionMaxScorer mechanism to protect nextDoc() of an exhausted iterator, using either the current doc or a check on the number of subScorers?
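The guard requested in LUCENE-5021 amounts to remembering that an iterator is exhausted and never advancing it again. The sketch below models this with invented types (DocIterator, StrictIterator, ExhaustionSafeIterator); it does not use Lucene's classes, and only the sentinel value mirrors DocIdSetIterator.NO_MORE_DOCS, which is Integer.MAX_VALUE. The strict iterator throws an IllegalStateException as a stand-in for the assertion failure or NPE described in the report.

```java
import java.util.ArrayList;
import java.util.List;

/** Demo: drains an iterator twice, as a bulk collect() called repeatedly would. */
final class ExhaustionDemo {
    static List<Integer> drainTwice(DocIterator it) {
        List<Integer> seen = new ArrayList<>();
        for (int pass = 0; pass < 2; pass++) {
            int doc;
            while ((doc = it.nextDoc()) != DocIterator.NO_MORE_DOCS) {
                seen.add(doc);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(drainTwice(new ExhaustionSafeIterator(new StrictIterator(3, 7))));
    }
}

/** Simplified stand-in for a doc id iterator; NOT Lucene's DocIdSetIterator. */
interface DocIterator {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int nextDoc();
}

/** Fails when advanced after exhaustion, like the unsafe scorers in the report. */
final class StrictIterator implements DocIterator {
    private final int[] docs;
    private int pos = 0;
    private boolean exhausted = false;

    StrictIterator(int... docs) { this.docs = docs; }

    @Override public int nextDoc() {
        if (exhausted) {
            throw new IllegalStateException("nextDoc() called after exhaustion");
        }
        if (pos < docs.length) {
            return docs[pos++];
        }
        exhausted = true;
        return NO_MORE_DOCS;
    }
}

/** Wrapper that keeps returning NO_MORE_DOCS without touching the delegate again. */
final class ExhaustionSafeIterator implements DocIterator {
    private final DocIterator in;
    private boolean exhausted = false;

    ExhaustionSafeIterator(DocIterator in) { this.in = in; }

    @Override public int nextDoc() {
        if (exhausted) {
            return NO_MORE_DOCS;
        }
        int doc = in.nextDoc();
        if (doc == NO_MORE_DOCS) {
            exhausted = true;
        }
        return doc;
    }
}
```

Passing the StrictIterator directly to drainTwice() would throw on the second pass; wrapping it makes the repeated call harmless.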
[jira] [Commented] (SOLR-4645) Missing Adobe XMP library can abort DataImportHandler process
[ https://issues.apache.org/jira/browse/SOLR-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692961#comment-13692961 ]

Simon Endele commented on SOLR-4645:

Had the same problem. Worked for me. Thanks.
When building solr.war with integrated SolrCell using Maven, one can also use:

<dependency>
  <groupId>com.adobe.xmp</groupId>
  <artifactId>xmpcore</artifactId>
  <version>5.1.2</version>
</dependency>

See http://mvnrepository.com/artifact/com.adobe.xmp/xmpcore

Missing Adobe XMP library can abort DataImportHandler process
-
Key: SOLR-4645
URL: https://issues.apache.org/jira/browse/SOLR-4645
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler, contrib - Solr Cell (Tika extraction)
Affects Versions: 4.2
Reporter: Alexandre Rafalovitch
Priority: Minor
Fix For: 5.0

Solr distribution is missing the Adobe XMP library ( http://www.adobe.com/devnet/xmp.html ). In a particular code path, DIH+Tika tries to load an XMPException and fails with ClassNotFound. The library is present in Tika's distribution.