from:"Simon Endele \(JIRA\)"

[jira] [Created] (SOLR-11152) ClassNotFoundException: com.uwyn.jhighlight.renderer.XhtmlRendererFactory

2017-07-26 Thread Simon Endele (JIRA)

Simon Endele created SOLR-11152:
---

 Summary: ClassNotFoundException: 
com.uwyn.jhighlight.renderer.XhtmlRendererFactory
 Key: SOLR-11152
 URL: https://issues.apache.org/jira/browse/SOLR-11152
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 6.6
Reporter: Simon Endele


We get the following error when trying to index/extract a tgz file with Solr 
6.6.0:

{code:java}
Caused by: java.lang.NoClassDefFoundError: com/uwyn/jhighlight/renderer/XhtmlRendererFactory
    at org.apache.tika.parser.code.SourceCodeParser.getRenderer(SourceCodeParser.java:132)
    at org.apache.tika.parser.code.SourceCodeParser.parse(SourceCodeParser.java:111)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
    at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:219)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:182)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
    at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:219)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:182)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
    at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
    at org.apache.tika.parser.pkg.CompressorParser.parse(CompressorParser.java:164)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
    at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
    ... 29 more
Caused by: java.lang.ClassNotFoundException: com.uwyn.jhighlight.renderer.XhtmlRendererFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 60 more
{code}

It seems that the dependency 
[com.uwyn:jhighlight:1.0|https://mvnrepository.com/artifact/com.uwyn/jhighlight/1.0] 
is missing from {{contrib/extraction/lib}} in the Solr installation.

After placing the JAR there, indexing works perfectly.
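
A missing dependency of this kind can be confirmed by probing the class loader directly. The following is only a minimal sketch: {{ClasspathProbe}} and its methods are hypothetical helpers, not part of Solr or Tika. Inside Solr the probe would have to run against the class loader that actually loads the contrib libraries for the answer to be meaningful.

```java
/** Hypothetical helper (not part of Solr/Tika): checks whether a class is visible to a loader. */
final class ClasspathProbe {
    private ClasspathProbe() {}

    /** Returns true if the named class can be located by the given loader, without initializing it. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so both failure modes from the trace are covered.
            return false;
        }
    }

    /** Convenience overload that probes the loader of this helper class. */
    static boolean isLoadable(String className) {
        return isLoadable(className, ClasspathProbe.class.getClassLoader());
    }
}
```

Calling {{ClasspathProbe.isLoadable("com.uwyn.jhighlight.renderer.XhtmlRendererFactory")}} before and after copying the JAR would show whether the class has become visible to that loader.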



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-6690) Highlight expanded results

2017-05-16 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6690:
---
Issue Type: Bug  (was: Wish)

> Highlight expanded results
> --
>
> Key: SOLR-6690
> URL: https://issues.apache.org/jira/browse/SOLR-6690
> Project: Solr
>  Issue Type: Bug
>  Components: highlighter
>Reporter: Simon Endele
>  Labels: expand, highlight
> Attachments: HighlightComponent.java.patch
>
>
> Is it possible to highlight documents in the "expand" section in the Solr 
> response?
> I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
> "All downstream components (faceting, highlighting, etc...) will work with 
> the collapsed result set."
> So I tried to put the highlight component after the expand component like 
> this:
> {code:xml}<arr name="components">
>   <str>query</str>
>   <str>facet</str>
>   <str>stats</str>
>   <str>debug</str>
>   <str>expand</str>
>   <str>highlight</str>
> </arr>{code}
> But with no effect.
> Is there another switch that needs to be flipped or could this be implemented 
> easily?
> IMHO this is quite a common use case. And it was possible to highlight all 
> results of a group with the old grouping.




[jira] [Commented] (SOLR-3085) Fix the dismax/edismax stopwords mm issue

2015-07-27 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642843#comment-14642843
 ] 

Simon Endele commented on SOLR-3085:


We're currently experiencing the same issue with query terms that consist only of 
non-alphanumeric characters, which are removed by the StandardTokenizer or 
WordDelimiterFilter, e.g. the "&" in "miles & more".
Will this case also be addressed by {{mm.autoRelax}}?
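
To make the case concrete, here is a rough sketch of how a term can survive analysis in one field and vanish in another. Everything in it is an assumption for illustration: {{MmClauseSketch}} is a hypothetical class, the toy analyzer merely strips non-alphanumeric characters and is not StandardTokenizer or WordDelimiterFilter, and the query "miles & more" is assumed as the example. Terms reported by {{unevenTerms}} are the ones that can make a strict mm unsatisfiable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration: finds query terms that only some fields retain after analysis. */
final class MmClauseSketch {
    private MmClauseSketch() {}

    /** Toy analyzer (assumption): drops every character that is not a letter or digit. */
    static String stripNonAlnum(String term) {
        return term.replaceAll("[^\\p{L}\\p{N}]", "");
    }

    /** Returns whitespace-separated terms kept by at least one field analyzer but dropped by another. */
    static List<String> unevenTerms(String query, Map<String, Function<String, String>> fieldAnalyzers) {
        List<String> uneven = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            int kept = 0;
            for (Function<String, String> analyzer : fieldAnalyzers.values()) {
                if (!analyzer.apply(term).isEmpty()) {
                    kept++;
                }
            }
            if (kept > 0 && kept < fieldAnalyzers.size()) {
                uneven.add(term);
            }
        }
        return uneven;
    }
}
```

With one field using the toy analyzer and another keeping terms verbatim, the "&" is reported as uneven. If all fields use the same analyzer, nothing is reported, which mirrors the workaround of harmonizing analysis across all QF fields.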

 Fix the dismax/edismax stopwords mm issue
 -

 Key: SOLR-3085
 URL: https://issues.apache.org/jira/browse/SOLR-3085
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Jan Høydahl
  Labels: MinimumShouldMatch, dismax, edismax, stopwords
 Fix For: Trunk

 Attachments: SOLR-3085.patch, SOLR-3085.patch, SOLR-3085.patch


 As discussed here http://search-lucene.com/m/Wr7iz1a95jx and here 
 http://search-lucene.com/m/Yne042qEyCq1 and here 
 http://search-lucene.com/m/RfAp82nSsla DisMax has an issue with stopwords if 
 not all fields used in QF have exactly the same stopword lists.
 The typical solution is to not use stopwords, to harmonize stopword lists across 
 all fields in your QF, or to relax the MM to a lower percentage. Sometimes these 
 are not acceptable workarounds, and we should find a better solution.




[jira] [Updated] (SOLR-7314) Constants missing in Solrj

2015-04-08 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-7314:
---
Description: 
There are some parameter names/values for which constants are missing in 
SolrJ. One always has to declare constants for them oneself (or hard-code 
them).

* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = "spellcheck.", but 
none without the dot)
* [elevated] (pseudo field for the QueryElevationComponent)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html

Maybe there are even more, but these are the ones I always stumble upon.
Of course there are constants in the Solr Core code, but typically one doesn't 
want to have a dependency on it when implementing a client.

  was:
There are some parameter names/values, for which constants are missing in 
SolrJ. One has always to declare constants for them by herself (or hard-code 
them).

* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but 
none without dot)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html

Maybe there are even more, but these are the ones I always stumble upon.
Of course there are constants in the Solr Core code, but typically one doesn't 
want to have a dependency on it when implementing a client.


 Constants missing in Solrj
 --

 Key: SOLR-7314
 URL: https://issues.apache.org/jira/browse/SOLR-7314
 Project: Solr
  Issue Type: Wish
  Components: SolrJ
Reporter: Simon Endele

 There are some parameter names/values, for which constants are missing in 
 SolrJ. One has always to declare constants for them by herself (or hard-code 
 them).
 * defType
 * edismax (value for defType)
 * dismax (value for defType)
 * lucene (value for defType)
 * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but 
 none without dot)
 * [elevated] (pseudo field for the QueryElevationComponent)
 See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html
 Maybe there are even more, but these are the ones I always stumble upon.
 Of course there are constants in the Solr Core code, but typically one 
 doesn't want to have a dependency on it when implementing a client.
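
Until such constants exist in SolrJ, a client typically ends up with a small holder class of its own. The sketch below is an assumption of what that might look like: the class name {{ExtraSolrParams}} and the constant names are invented here and are not SolrJ API. Only the string values come from the list above.

```java
/** Hypothetical client-side constants (names invented here; not part of SolrJ). */
final class ExtraSolrParams {
    private ExtraSolrParams() {}

    /** Name of the parameter that selects the query parser. */
    static final String DEF_TYPE = "defType";

    /** Values for defType. */
    static final String DEF_TYPE_EDISMAX = "edismax";
    static final String DEF_TYPE_DISMAX = "dismax";
    static final String DEF_TYPE_LUCENE = "lucene";

    /** The spellcheck parameter without the trailing dot of the prefix constant. */
    static final String SPELLCHECK = "spellcheck";

    /** Pseudo field of the QueryElevationComponent. */
    static final String ELEVATED_FIELD = "[elevated]";
}
```

Keeping these in one place at least avoids scattering hard-coded strings across the client code.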




[jira] [Commented] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section

2015-04-02 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392527#comment-14392527
 ] 

Simon Endele commented on SOLR-6709:


Thank you guys very much for fixing/reviewing and happy Easter!

 ClassCastException in QueryResponse after applying XMLResponseParser on a 
 response containing an expanded section
 ---

 Key: SOLR-6709
 URL: https://issues.apache.org/jira/browse/SOLR-6709
 Project: Solr
  Issue Type: Bug
  Components: SolrJ
Reporter: Simon Endele
Assignee: Varun Thacker
 Fix For: Trunk, 5.2

 Attachments: SOLR-6709.patch, SOLR-6709.patch, SOLR-6709.patch, 
 test-response.xml


 Shouldn't the following code work on the attached input file?
 It matches the structure of a Solr response with wt=xml.
 {code}import java.io.InputStream;
 import org.apache.solr.client.solrj.ResponseParser;
 import org.apache.solr.client.solrj.impl.XMLResponseParser;
 import org.apache.solr.client.solrj.response.QueryResponse;
 import org.apache.solr.common.util.NamedList;
 import org.junit.Test;
 public class ParseXmlExpandedTest {
   @Test
   public void test() {
     ResponseParser responseParser = new XMLResponseParser();
     // test-response.xml is the file attached to this issue
     InputStream inStream = getClass()
         .getResourceAsStream("test-response.xml");
     NamedList<Object> response = responseParser
         .processResponse(inStream, "UTF-8");
     // The ClassCastException described below is thrown here
     QueryResponse queryResponse = new QueryResponse(response, null);
   }
 }{code}
 Unexpectedly (for me), it throws a
 java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap 
 cannot be cast to java.util.Map
 at 
 org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:126)
 Am I missing something, is XMLResponseParser deprecated or something?
 We use a setup like this to mock a QueryResponse for unit tests in our 
 service that post-processes the Solr response.
 Obviously, it works with the javabin format which SolrJ uses internally.
 But that is no appropriate format for unit tests, where the response should 
 be human readable.
 I think there's some conversion missing in QueryResponse or XMLResponseParser.
 Note: The null value supplied as SolrServer argument to the constructor of 
 QueryResponse shouldn't have an effect as the error occurs before the 
 parameter is even used.
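
The kind of conversion that appears to be missing can be sketched with plain JDK types. This is not the attached SOLR-6709.patch and does not use SolrJ classes: a list of name/value entries stands in for the ordered name/value structure the XML parser returns, and {{NamedListToMapSketch}} is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in: turns ordered name/value pairs into a java.util.Map. */
final class NamedListToMapSketch {
    private NamedListToMapSketch() {}

    /**
     * Copies the pairs into a LinkedHashMap so that insertion order is preserved.
     * If a name occurs more than once, the last value wins.
     */
    static Map<String, Object> toMap(List<Map.Entry<String, Object>> pairs) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (Map.Entry<String, Object> pair : pairs) {
            map.put(pair.getKey(), pair.getValue());
        }
        return map;
    }
}
```

A consumer that expects a {{java.util.Map}} could then work with either response format, assuming duplicate names do not matter for the section in question.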




[jira] [Created] (SOLR-7314) Constants missing in Solrj

2015-03-26 Thread Simon Endele (JIRA)

Simon Endele created SOLR-7314:
--

 Summary: Constants missing in Solrj
 Key: SOLR-7314
 URL: https://issues.apache.org/jira/browse/SOLR-7314
 Project: Solr
  Issue Type: Wish
  Components: SolrJ
Reporter: Simon Endele


There are some parameter names/values, for which constants are missing in 
SolrJ. One has always to declare constants for them by herself (or hard-code 
them).

* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but 
none without dot)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html

Maybe there are even more, but these are the ones I always stumble upon.




[jira] [Updated] (SOLR-7314) Constants missing in Solrj

2015-03-26 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-7314:
---
Description: 
There are some parameter names/values, for which constants are missing in 
SolrJ. One has always to declare constants for them by herself (or hard-code 
them).

* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but 
none without dot)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html

Maybe there are even more, but these are the ones I always stumble upon.
Of course there are constants in the Solr Core code, but typically one doesn't 
want to have a dependency on it when implementing a client.

  was:
There are some parameter names/values, for which constants are missing in 
SolrJ. One has always to declare constants for them by herself (or hard-code 
them).

* defType
* edismax (value for defType)
* dismax (value for defType)
* lucene (value for defType)
* spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but 
none without dot)

See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html

Maybe there are even more, but these are the ones I always stumble upon.


 Constants missing in Solrj
 --

 Key: SOLR-7314
 URL: https://issues.apache.org/jira/browse/SOLR-7314
 Project: Solr
  Issue Type: Wish
  Components: SolrJ
Reporter: Simon Endele

 There are some parameter names/values, for which constants are missing in 
 SolrJ. One has always to declare constants for them by herself (or hard-code 
 them).
 * defType
 * edismax (value for defType)
 * dismax (value for defType)
 * lucene (value for defType)
 * spellcheck (there's SpellingParams.SPELLCHECK_PREFIX = spellcheck., but 
 none without dot)
 See http://lucene.apache.org/solr/5_0_0/solr-solrj/constant-values.html
 Maybe there are even more, but these are the ones I always stumble upon.
 Of course there are constants in the Solr Core code, but typically one 
 doesn't want to have a dependency on it when implementing a client.




[jira] [Commented] (SOLR-5332) Add preserve original setting to the EdgeNGramFilterFactory

2015-03-02 Thread Simon Endele (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343414#comment-14343414
]

Simon Endele commented on SOLR-5332:

+1 for this feature.
We use the EdgeNGramFilterFactory on a tokenized field (in order to implement a
prefix search at index time) with minGramSize=3.
Unfortunately we observed that tokens of length 1 or 2 are actually deleted,
which was unexpected from our point of view.

Using a second field (though complicated IMHO) would address query issues, but
it gets awkward when it comes to highlighting or phrase searches.
For instance, when searching for "us rep"
- the field with EdgeNGramFilterFactory highlights "rep" in "representative",
but not "US", as this token has been removed,
- the field without EdgeNGramFilterFactory highlights "US", but not
"representative", as it has no prefixes indexed.

Bringing these highlightings together in one string is quite a complex task,
not to mention a phrase search, which does not work at all for the example
above.

We use minGramSize=3 to reduce collisions of prefixes and abbreviations (like
US and usage) and reduce the index size.
I admit, this does not prevent all collisions (e.g. USA still collides with
usage), but it's a compromise.

Nevertheless, minGramSize is a nice feature of EdgeNGramFilterFactory, but it
lacks a preserveOriginal flag IMO.
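
The requested behaviour can be sketched in a few lines of plain Java. This is an illustration only, not the Lucene filter: {{EdgeNGramSketch}} is a hypothetical class, and the {{preserveOriginal}} semantics shown (keep the original token whenever it is shorter than minGram or longer than maxGram) are an assumption based on the issue description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of front edge n-grams with an optional preserve-original flag. */
final class EdgeNGramSketch {
    private EdgeNGramSketch() {}

    static List<String> edgeNGrams(String token, int minGram, int maxGram, boolean preserveOriginal) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, token.length());
        // Emit prefixes of length minGram..upper; nothing is emitted for tokens shorter than minGram.
        for (int n = minGram; n <= upper; n++) {
            grams.add(token.substring(0, n));
        }
        // Assumed semantics: keep the full token if it does not fit the allowed gram sizes.
        boolean outsideRange = token.length() < minGram || token.length() > maxGram;
        if (preserveOriginal && outsideRange) {
            grams.add(token);
        }
        return grams;
    }
}
```

With minGramSize=3, the token "us" produces no grams at all unless the flag is set, which is exactly the case described above. A token such as "usage" is unaffected by the flag because it already appears as its own longest prefix.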

Add preserve original setting to the EdgeNGramFilterFactory
-

Key: SOLR-5332
URL: https://issues.apache.org/jira/browse/SOLR-5332
Project: Solr
Issue Type: Wish
Affects Versions: 4.4, 4.5, 4.5.1, 4.6
Reporter: Alexander S.

Hi, as described here:
http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html
the problem is in that if you have these 2 strings to index:
1. facebook.com/someuser.1
2. facebook.com/someveryandverylongusername
and the edge ngram filter factory with min and max gram size settings 2 and
25, search requests for these urls will fail.
But search requests for:
1. facebook.com/someuser
2. facebook.com/someveryandverylonguserna
will work properly.
It's because the first url has 1 at the end, which is shorter than the allowed
min gram size. In the second url the user name is longer than the max gram
size (27 characters).
Would be good to have a preserve original option, that will add the
original string to the index if it does not fit the allowed gram size, so
that 1 and someveryandverylongusername tokens will also be added to the
index.
Best,
Alex


[jira] [Created] (SOLR-6782) PostingsSolrHighlighter produces strange highlight results

2014-11-24 Thread Simon Endele (JIRA)

Simon Endele created SOLR-6782:
--

 Summary: PostingsSolrHighlighter produces strange highlight results
 Key: SOLR-6782
 URL: https://issues.apache.org/jira/browse/SOLR-6782
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Reporter: Simon Endele


If {{hl.fl}} contains commas _and_ whitespaces, e.g. {{hl.fl=title, content}}, 
the PostingsSolrHighlighter produces the following result:
{code}
  "highlighting": {
    "mydoc1": {
      "title": [],
      "": [],
      "content": [
        "my highlighted content. "
      ]
    },
    "mydoc2": {
      "title": [],
      "": [],
      "content": [
        "my highlighted content 2. "
      ]
    }
  },
{code}

Two things:
- The space following the comma leads to an empty field (or even several in 
case of a longer field list).
- Why is {{"title": []}} included in the response (though 
{{hl.defaultSummary}} is not set)?

Tested with Solr 4.10.2.
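
The empty field name is what a naive split on single separators would produce. The sketch below shows one way to avoid it. It is an assumption for illustration, not the code of the highlighter and not the patch attached to this issue; {{FieldListSplitter}} is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splits an hl.fl-style list on commas and whitespace, dropping empty names. */
final class FieldListSplitter {
    private FieldListSplitter() {}

    static List<String> splitFieldList(String fieldList) {
        List<String> fields = new ArrayList<>();
        // Treat any run of commas and whitespace as a single separator.
        for (String field : fieldList.split("[,\\s]+")) {
            // A leading separator still yields one empty first element, so filter it out.
            if (!field.isEmpty()) {
                fields.add(field);
            }
        }
        return fields;
    }
}
```

For {{hl.fl=title, content}} this yields only the two real field names, whereas splitting on each single comma or space would also yield an empty string between them.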




[jira] [Updated] (SOLR-6782) PostingsSolrHighlighter produces strange highlight results

2014-11-24 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6782:
---
Attachment: SOLR-6782.patch

I'm not a Solr expert, but if I understand the code right, this can be fixed 
with a few lines.

Added a patch that addresses both issues. The request above now produces the 
following response:
{code}
  "highlighting": {
    "mydoc1": {
      "content": [
        "my highlighted content. "
      ]
    },
    "mydoc2": {
      "content": [
        "my highlighted content 2. "
      ]
    }
  },
{code}

Seems to work with {{hl.defaultSummary=true}}, too. Response:
{code}
  "highlighting": {
    "mydoc1": {
      "title": [
        "My Summary."
      ],
      "content": [
        "my highlighted content. "
      ]
    },
    "mydoc2": {
      "title": [
        "My Summary 2."
      ],
      "content": [
        "my highlighted content 2. "
      ]
    }
  },
{code}

 PostingsSolrHighlighter produces strange highlight results
 --

 Key: SOLR-6782
 URL: https://issues.apache.org/jira/browse/SOLR-6782
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Reporter: Simon Endele
 Attachments: SOLR-6782.patch


 If {{hl.fl}} contains commas _and_ whitespaces, e.g. {{hl.fl=title, 
 content}}, the PostingsSolrHighlighter produces the following result:
 {code}
   "highlighting": {
     "mydoc1": {
       "title": [],
       "": [],
       "content": [
         "my highlighted content. "
       ]
     },
     "mydoc2": {
       "title": [],
       "": [],
       "content": [
         "my highlighted content 2. "
       ]
     }
   },
 {code}
 Two things:
 - The space following the comma leads to an empty field (or even several in 
 case of a longer field list).
 - Why is {{"title": []}} included in the response (though 
 {{hl.defaultSummary}} is not set)?
 Tested with Solr 4.10.2.




[jira] [Created] (SOLR-6783) SolrHighlighter does not accept globs in multi-valued hl.fl argument

2014-11-24 Thread Simon Endele (JIRA)

Simon Endele created SOLR-6783:
--

 Summary: SolrHighlighter does not accept globs in multi-valued 
hl.fl argument
 Key: SOLR-6783
 URL: https://issues.apache.org/jira/browse/SOLR-6783
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele


These two cases work correctly:
- hl.fl = *_text
- hl.fl = title_text,content_text,myfield

But the expression {{hl.fl=*_text,myfield}} results in empty highlighted docs 
when the default highlighter is used.
Using the PostingsSolrHighlighter it even causes the following exception:
{code}
java.lang.IllegalArgumentException: fieldsIn must not be empty
    at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:342)
    at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:303)
    at org.apache.solr.highlight.PostingsSolrHighlighter.doHighlighting(PostingsSolrHighlighter.java:140)
    at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:146)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
{code}
Not yet tested with FastVectorHighlighter.

Tested with Solr 4.10.2.




[jira] [Closed] (SOLR-6783) SolrHighlighter does not accept globs in multi-valued hl.fl argument

2014-11-24 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele closed SOLR-6783.
--
Resolution: Duplicate

 SolrHighlighter does not accept globs in multi-valued hl.fl argument
 

 Key: SOLR-6783
 URL: https://issues.apache.org/jira/browse/SOLR-6783
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele

 These two cases work correctly:
 - hl.fl = *_text
 - hl.fl = title_text,content_text,myfield
 But the expression {{hl.fl=*_text,myfield}} results in empty highlighted docs 
 when the default highlighter is used.
 Using the PostingsSolrHighlighter it even causes the following exception:
 {code}
 java.lang.IllegalArgumentException: fieldsIn must not be empty
     at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:342)
     at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:303)
     at org.apache.solr.highlight.PostingsSolrHighlighter.doHighlighting(PostingsSolrHighlighter.java:140)
     at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:146)
     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
 {code}
 Not yet tested with FastVectorHighlighter.
 Tested with Solr 4.10.2.




[jira] [Commented] (SOLR-5127) Allow multiple wildcards in hl.fl

2014-11-24 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222931#comment-14222931
 ] 

Simon Endele commented on SOLR-5127:


I implemented a similar solution, which seems to work for us.

May be interesting:
Using the PostingsSolrHighlighter an expression like {{hl.fl=*_text,myfield}} 
even causes the following exception:
{code}
java.lang.IllegalArgumentException: fieldsIn must not be empty
    at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFieldsAsObjects(PostingsHighlighter.java:342)
    at org.apache.lucene.search.postingshighlight.PostingsHighlighter.highlightFields(PostingsHighlighter.java:303)
    at org.apache.solr.highlight.PostingsSolrHighlighter.doHighlighting(PostingsSolrHighlighter.java:140)
    at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:146)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:218)
{code}

 Allow multiple wildcards in hl.fl
 -

 Key: SOLR-5127
 URL: https://issues.apache.org/jira/browse/SOLR-5127
 Project: Solr
  Issue Type: New Feature
  Components: highlighter
Affects Versions: 3.6.1, 4.4
Reporter: Sven-S. Porst
 Attachments: highlight-wildcards.patch


 When a wildcard is present in the hl.fl field, the field is not split up at 
 commas/spaces into components. As a consequence settings like 
 hl.fl=*_highlight,*_data do not work.
 Splitting the string first and evaluating wildcards on each component 
 afterwards would be more powerful and consistent with the documentation.
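
The "split first, then evaluate wildcards per component" idea can be sketched as follows. This is an illustration under assumptions, not the attached highlight-wildcards.patch: {{HlFlExpander}} is a hypothetical class, and the known field names are passed in as a plain collection rather than read from a Solr schema or document.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical sketch: expands an hl.fl-style list in which any component may contain '*'. */
final class HlFlExpander {
    private HlFlExpander() {}

    static Set<String> expand(String hlFl, Collection<String> knownFields) {
        Set<String> result = new LinkedHashSet<>();
        // Step 1: split at commas/whitespace. Step 2: evaluate wildcards on each component.
        for (String part : hlFl.split("[,\\s]+")) {
            if (part.isEmpty()) {
                continue;
            }
            if (part.indexOf('*') < 0) {
                result.add(part); // plain field name, taken as-is
                continue;
            }
            Pattern pattern = Pattern.compile(globToRegex(part));
            for (String field : knownFields) {
                if (pattern.matcher(field).matches()) {
                    result.add(field);
                }
            }
        }
        return result;
    }

    /** Translates '*' into ".*" and quotes every other character literally. */
    static String globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            regex.append(c == '*' ? ".*" : Pattern.quote(String.valueOf(c)));
        }
        return regex.toString();
    }
}
```

With this ordering, {{hl.fl=*_text,myfield}} and {{hl.fl=*_highlight,*_data}} both resolve to concrete field names instead of an empty list.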




[jira] [Created] (SOLR-6759) ExpandComponent does not call finish() on DelegatingCollectors

2014-11-19 Thread Simon Endele (JIRA)

Simon Endele created SOLR-6759:
--

 Summary: ExpandComponent does not call finish() on 
DelegatingCollectors
 Key: SOLR-6759
 URL: https://issues.apache.org/jira/browse/SOLR-6759
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele


We have a PostFilter for ACL filtering in use that has a structure similar to 
CollapsingQParserPlugin, i.e. its DelegatingCollector gathers all documents 
and only calls delegate.collect() for them in its finish() method.

In contrast to CollapsingQParserPlugin, our PostFilter is also called by the 
ExpandComponent (on purpose).
But as the finish() method is never called by the ExpandComponent, the expand 
section in the result is always empty.

Tested with Solr 4.10.2.
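
The effect can be reproduced with a toy model of the collector chain. The types below are simplified stand-ins invented for this sketch, not Solr's DelegatingCollector or ExpandComponent, and the "ACL check" is just an even/odd test on the document id.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of a buffering post filter that only forwards documents in finish(). */
final class FinishSketch {
    private FinishSketch() {}

    interface Collector {
        void collect(int doc);
        default void finish() {}
    }

    /** Final collector that simply records what reaches it. */
    static final class ListCollector implements Collector {
        final List<Integer> docs = new ArrayList<>();
        @Override public void collect(int doc) { docs.add(doc); }
    }

    /** Buffers all documents and forwards the permitted ones only when finish() is called. */
    static final class BufferingFilter implements Collector {
        private final Collector delegate;
        private final List<Integer> buffered = new ArrayList<>();
        BufferingFilter(Collector delegate) { this.delegate = delegate; }
        @Override public void collect(int doc) { buffered.add(doc); }
        @Override public void finish() {
            for (int doc : buffered) {
                if (doc % 2 == 0) { // stand-in for the real ACL decision
                    delegate.collect(doc);
                }
            }
            delegate.finish();
        }
    }

    /** Runs the chain over the given docs, optionally skipping the finish() call. */
    static List<Integer> run(int[] docs, boolean callFinish) {
        ListCollector sink = new ListCollector();
        Collector top = new BufferingFilter(sink);
        for (int doc : docs) {
            top.collect(doc);
        }
        if (callFinish) {
            top.finish();
        }
        return sink.docs;
    }
}
```

Without the finish() call nothing ever reaches the final collector, which matches the always-empty expand section described above. With the call, the buffered documents are forwarded.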




[jira] [Updated] (SOLR-6759) ExpandComponent does not call finish() on DelegatingCollectors

2014-11-19 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6759:
---
Attachment: ExpandComponent.java.patch

I'm not a Solr expert, but if I understand the code right, this can be fixed 
with a few lines.
Added a patch. Seems to work for us.

 ExpandComponent does not call finish() on DelegatingCollectors
 --

 Key: SOLR-6759
 URL: https://issues.apache.org/jira/browse/SOLR-6759
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele
 Attachments: ExpandComponent.java.patch


 We have a PostFilter for ACL filtering in use that has a structure similar to 
 CollapsingQParserPlugin, i.e. its DelegatingCollector gathers all 
 documents and only calls delegate.collect() for them in its finish() 
 method.
 In contrast to CollapsingQParserPlugin, our PostFilter is also called by the 
 ExpandComponent (on purpose).
 But as the finish() method is never called by the ExpandComponent, the expand 
 section in the result is always empty.
 Tested with Solr 4.10.2.




[jira] [Updated] (SOLR-6690) Highlight expanded results

2014-11-10 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6690:
---
Description: 
Is it possible to highlight documents in the "expand" section in the Solr 
response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
"All downstream components (faceting, highlighting, etc...) will work with the 
collapsed result set."

So I tried to put the highlight component after the expand component like this:
{code:xml}<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>{code}
But with no effect.

Is there another switch that needs to be flipped or could this be implemented 
easily?
IMHO this is quite a common use case. And it was possible to highlight all 
results of a group with the old grouping.

  was:
Is it possible to highlight documents in the "expand" section in the Solr 
response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
"All downstream components (faceting, highlighting, etc...) will work with the 
collapsed result set."

So I tried to put the highlight component after the expand component like this:
{code:xml}<arr name="components">
  <str>query</str>
  <str>facet</str>
  <str>stats</str>
  <str>debug</str>
  <str>expand</str>
  <str>highlight</str>
</arr>{code}
But with no effect.

Is there another switch that needs to be flipped or could this be implemented 
easily?
IMHO this is quite a common use case...


 Highlight expanded results
 --

 Key: SOLR-6690
 URL: https://issues.apache.org/jira/browse/SOLR-6690
 Project: Solr
  Issue Type: Wish
  Components: highlighter
Reporter: Simon Endele
  Labels: expand, highlight
 Attachments: HighlightComponent.java.patch


 Is it possible to highlight documents in the "expand" section in the Solr 
 response?
 I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
 "All downstream components (faceting, highlighting, etc...) will work with 
 the collapsed result set."
 So I tried to put the highlight component after the expand component like 
 this:
 {code:xml}<arr name="components">
   <str>query</str>
   <str>facet</str>
   <str>stats</str>
   <str>debug</str>
   <str>expand</str>
   <str>highlight</str>
 </arr>{code}
 But with no effect.
 Is there another switch that needs to be flipped or could this be implemented 
 easily?
 IMHO this is quite a common use case. And it was possible to highlight all 
 results of a group with the old grouping.




[jira] [Updated] (SOLR-6690) Highlight expanded results

2014-11-06 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6690:
---
Priority: Major  (was: Minor)

 Highlight expanded results
 --

 Key: SOLR-6690
 URL: https://issues.apache.org/jira/browse/SOLR-6690
 Project: Solr
  Issue Type: Wish
  Components: highlighter
Reporter: Simon Endele
  Labels: expand, highlight
 Attachments: HighlightComponent.java.patch


 Is it possible to highlight documents in the expand section in the Solr 
 response?
 I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
 All downstream components (faceting, highlighting, etc...) will work with 
 the collapsed result set.
 So I tried to put the highlight component after the expand component like 
 this:
 {code:xml}<arr name="components">
   <str>query</str>
   <str>facet</str>
   <str>stats</str>
   <str>debug</str>
   <str>expand</str>
   <str>highlight</str>
 </arr>{code}
 But with no effect.
 Is there another switch that needs to be flipped or could this be implemented 
 easily?
 IMHO this is quite a common use case...




[jira] [Created] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section

2014-11-06 Thread Simon Endele (JIRA)

Simon Endele created SOLR-6709:
--

 Summary: ClassCastException in QueryResponse after applying 
XMLResponseParser on a response containing an expanded section
 Key: SOLR-6709
 URL: https://issues.apache.org/jira/browse/SOLR-6709
 Project: Solr
  Issue Type: Bug
  Components: SolrJ
Reporter: Simon Endele


Shouldn't the following code work on the attached input file?
It matches the structure of a Solr response with wt=xml.

{code}import java.io.InputStream;
import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;
import org.junit.Test;

public class ParseXmlExpandedTest {
    @Test
    public void test() {
        // Parse the attached XML response (wt=xml) from the test classpath.
        ResponseParser responseParser = new XMLResponseParser();
        InputStream inStream = getClass()
                .getResourceAsStream("test-response.xml");
        NamedList<Object> response = responseParser
                .processResponse(inStream, "UTF-8");
        // Building the QueryResponse is the step that fails.
        QueryResponse queryResponse = new QueryResponse(response, null);
    }
}{code}

Unexpectedly (for me), it throws a
java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap 
cannot be cast to java.util.Map
at 
org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:126)

Am I missing something, is XMLResponseParser deprecated or something?

We use a setup like this to mock a QueryResponse for unit tests in our 
service that post-processes the Solr response.
Obviously, it works with the javabin format which SolrJ uses internally.
But that is not an appropriate format for unit tests, where the response should be 
human readable.

I think there's some conversion missing in QueryResponse or XMLResponseParser.

Note: The null value supplied as SolrServer argument to the constructor of 
QueryResponse shouldn't have an effect as the error occurs before the parameter 
is even used.




[jira] [Updated] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section

2014-11-06 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6709:
---
Attachment: test-response.xml

 ClassCastException in QueryResponse after applying XMLResponseParser on a 
 response containing an expanded section
 ---

 Key: SOLR-6709
 URL: https://issues.apache.org/jira/browse/SOLR-6709
 Project: Solr
  Issue Type: Bug
  Components: SolrJ
Reporter: Simon Endele
 Attachments: test-response.xml


 Shouldn't the following code work on the attached input file?
 It matches the structure of a Solr response with wt=xml.
 {code}import java.io.InputStream;
 import org.apache.solr.client.solrj.ResponseParser;
 import org.apache.solr.client.solrj.impl.XMLResponseParser;
 import org.apache.solr.client.solrj.response.QueryResponse;
 import org.apache.solr.common.util.NamedList;
 import org.junit.Test;
 public class ParseXmlExpandedTest {
     @Test
     public void test() {
         // Parse the attached XML response (wt=xml) from the test classpath.
         ResponseParser responseParser = new XMLResponseParser();
         InputStream inStream = getClass()
                 .getResourceAsStream("test-response.xml");
         NamedList<Object> response = responseParser
                 .processResponse(inStream, "UTF-8");
         // Building the QueryResponse is the step that fails.
         QueryResponse queryResponse = new QueryResponse(response, null);
     }
 }{code}
 Unexpectedly (for me), it throws a
 java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap 
 cannot be cast to java.util.Map
 at 
 org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:126)
 Am I missing something, is XMLResponseParser deprecated or something?
 We use a setup like this to mock a QueryResponse for unit tests in our 
 service that post-processes the Solr response.
 Obviously, it works with the javabin format which SolrJ uses internally.
 But that is not an appropriate format for unit tests, where the response should 
 be human readable.
 I think there's some conversion missing in QueryResponse or XMLResponseParser.
 Note: The null value supplied as SolrServer argument to the constructor of 
 QueryResponse shouldn't have an effect as the error occurs before the 
 parameter is even used.




[jira] [Updated] (SOLR-6690) Highlight expanded results

2014-11-05 Thread Simon Endele (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Simon Endele updated SOLR-6690:
---
Attachment: HighlightComponent.java.patch

Added a patch for Solr core trunk.
I'm not a Solr core expert. It's just a rough sketch, but it seems to work.

Still to do:
- The order of the ExpandComponent and the HighlightComponent needs to be
switched to make it work (as mentioned in the issue description). I'm not sure
what effects changing the default order in
org.apache.solr.handler.component.SearchHandler.getDefaultComponents() may have.
- It would be good to have a config param to turn this on, I guess. Suggestion:
{{hl.expanded=true/false}}.

Highlight expanded results
--

Key: SOLR-6690
URL: https://issues.apache.org/jira/browse/SOLR-6690
Project: Solr
Issue Type: Wish
Reporter: Simon Endele
Priority: Minor
Labels: expand, highlight
Attachments: HighlightComponent.java.patch

Is it possible to apply the highlighting to documents in the expand section
in the Solr response?
I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
All downstream components (faceting, highlighting, etc...) will work with
the collapsed result set.
So I tried to put the highlight component after the expand component like
this:
{code:xml}<arr name="components">
<str>query</str>
<str>facet</str>
<str>stats</str>
<str>debug</str>
<str>expand</str>
<str>highlight</str>
</arr>{code}
But with no effect.
Is there another switch that needs to be flipped or could this be implemented
easily?
IMHO this is quite a common use case...


[jira] [Updated] (SOLR-6690) Highlight expanded results

2014-11-05 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6690:
---
Description: 
Is it possible to highlight documents in the expand section in the Solr 
response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
All downstream components (faceting, highlighting, etc...) will work with the 
collapsed result set.

So I tried to put the highlight component after the expand component like this:
{code:xml}<arr name="components">
<str>query</str>
<str>facet</str>
<str>stats</str>
<str>debug</str>
<str>expand</str>
<str>highlight</str>
</arr>{code}
But with no effect.

Is there another switch that needs to be flipped or could this be implemented 
easily?
IMHO this is quite a common use case...

  was:
Is it possible to apply the highlighting to documents in the expand section 
in the Solr response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
All downstream components (faceting, highlighting, etc...) will work with the 
collapsed result set.

So I tried to put the highlight component after the expand component like this:
{code:xml}<arr name="components">
<str>query</str>
<str>facet</str>
<str>stats</str>
<str>debug</str>
<str>expand</str>
<str>highlight</str>
</arr>{code}
But with no effect.

Is there another switch that needs to be flipped or could this be implemented 
easily?
IMHO this is quite a common use case...


 Highlight expanded results
 --

 Key: SOLR-6690
 URL: https://issues.apache.org/jira/browse/SOLR-6690
 Project: Solr
  Issue Type: Wish
Reporter: Simon Endele
Priority: Minor
  Labels: expand, highlight
 Attachments: HighlightComponent.java.patch


 Is it possible to highlight documents in the expand section in the Solr 
 response?
 I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
 All downstream components (faceting, highlighting, etc...) will work with 
 the collapsed result set.
 So I tried to put the highlight component after the expand component like 
 this:
 {code:xml}<arr name="components">
   <str>query</str>
   <str>facet</str>
   <str>stats</str>
   <str>debug</str>
   <str>expand</str>
   <str>highlight</str>
 </arr>{code}
 But with no effect.
 Is there another switch that needs to be flipped or could this be implemented 
 easily?
 IMHO this is quite a common use case...




[jira] [Updated] (SOLR-6690) Highlight expanded results

2014-11-05 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6690:
---
Component/s: highlighter

 Highlight expanded results
 --

 Key: SOLR-6690
 URL: https://issues.apache.org/jira/browse/SOLR-6690
 Project: Solr
  Issue Type: Wish
  Components: highlighter
Reporter: Simon Endele
Priority: Minor
  Labels: expand, highlight
 Attachments: HighlightComponent.java.patch


 Is it possible to highlight documents in the expand section in the Solr 
 response?
 I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
 All downstream components (faceting, highlighting, etc...) will work with 
 the collapsed result set.
 So I tried to put the highlight component after the expand component like 
 this:
 {code:xml}<arr name="components">
   <str>query</str>
   <str>facet</str>
   <str>stats</str>
   <str>debug</str>
   <str>expand</str>
   <str>highlight</str>
 </arr>{code}
 But with no effect.
 Is there another switch that needs to be flipped or could this be implemented 
 easily?
 IMHO this is quite a common use case...




[jira] [Created] (SOLR-6690) Highlight expanded results

2014-11-03 Thread Simon Endele (JIRA)

Simon Endele created SOLR-6690:
--

 Summary: Highlight expanded results
 Key: SOLR-6690
 URL: https://issues.apache.org/jira/browse/SOLR-6690
 Project: Solr
  Issue Type: Wish
Reporter: Simon Endele
Priority: Minor


Is it possible to apply the highlighting to documents in the expand section 
in the Solr response?

I'm aware that https://cwiki.apache.org/confluence/x/jiBqAg states:
All downstream components (faceting, highlighting, etc...) will work with the 
collapsed result set.

So I tried to put the highlight component after the expand component like this:
{code:xml}<arr name="components">
<str>query</str>
<str>facet</str>
<str>stats</str>
<str>debug</str>
<str>expand</str>
<str>highlight</str>
</arr>{code}
But with no effect.

Is there another switch that needs to be flipped or could this be implemented 
easily?
IMHO this is quite a common use case...




[jira] [Commented] (SOLR-1763) Integrate Solr Cell/Tika as an UpdateRequestProcessor

2014-10-08 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163485#comment-14163485
 ] 

Simon Endele commented on SOLR-1763:


I'd appreciate this feature, because it would also be possible to post-process 
the output of Tika.

 Integrate Solr Cell/Tika as an UpdateRequestProcessor
 -

 Key: SOLR-1763
 URL: https://issues.apache.org/jira/browse/SOLR-1763
 Project: Solr
  Issue Type: New Feature
  Components: update
Reporter: Jan Høydahl
  Labels: extracting_request_handler, solr_cell, tika, 
 update_request_handler

 From Chris Hostetter's original post in solr-dev:
 As someone with very little knowledge of Solr Cell and/or Tika, I find myself 
 wondering if ExtractingRequestHandler would make more sense as an 
 extractingUpdateProcessor -- where it could be configured to take either 
 binary fields (or string fields containing URLs) out of the Documents, parse 
 them with tika, and add the various XPath matching hunks of text back into 
 the document as new fields.
 Then ExtractingRequestHandler just becomes a handler that slurps up its 
 ContentStreams and adds them as binary data fields and adds the other literal 
 params as fields.
 Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths 
 in XML and CSV based updates fairly trivial?
 -Hoss
 I couldn't agree more, so I decided to add it as an issue.




[jira] [Commented] (SOLR-6158) Solr looks up configSets in the wrong directory

2014-06-11 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027497#comment-14027497
 ] 

Simon Endele commented on SOLR-6158:


No problem. Thanks a lot for the quick response and the fix!

 Solr looks up configSets in the wrong directory
 ---

 Key: SOLR-6158
 URL: https://issues.apache.org/jira/browse/SOLR-6158
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.8, 4.8.1
Reporter: Simon Endele
Assignee: Alan Woodward
 Attachments: SOLR-6158.patch


 I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to 
 create Named Config Sets based on the Solr example shipped with Solr 4.8.1 
 (like it's done in the tutorial, same problem with 4.8.0).
 Creating a new core with a configSet seems to work (directory 'books' and 
 'books/core.properties' are created correctly).
 But loading the new core does not work:
 {code:none}67446 [qtp25155085-11] INFO  
 org.apache.solr.handler.admin.CoreAdminHandler  core create command 
 configSet=generic&name=books&action=CREATE
 67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer  Unable to 
 create core: books
 org.apache.solr.common.SolrException: Could not load configuration from 
 directory C:\dev\solr-4.8.1\example\configsets\generic
 at 
 org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
 at 
 org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
 at 
 org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
 ...
 {code}
 It seems like Solr looks up the config sets in the wrong directory:
 C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of
 C:\dev\solr-4.8.1\example\solr\configsets\generic (like stated in the 
 tutorial and the documentation on 
 https://cwiki.apache.org/confluence/display/solr/Config+Sets)
 Moving the configsets directory one level up (into 'example') will work.
 But according to the documentation (and the tutorial) it should be located in
 the Solr home directory.
 In case I'm completely wrong and everything works as expected, how can the
 configsets directory be configured?
 The documentation on
 https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a
 configurable configset base directory, but I can't find any information about
 it on the web.
 Another thing: If it worked as I expect, references such as
 <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> in
 solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one
 more ../ added, I guess (missing in the tutorial).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-6158) Solr looks up configSets in the wrong directory

2014-06-11 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027518#comment-14027518
 ] 

Simon Endele commented on SOLR-6158:


For all who may stumble upon this: Your solr.xml should look like this (for the 
example project):
{code:xml}
<solr>
  <str name="configSetBaseDir">${configSetBaseDir:solr/configsets}</str>
  ...
</solr>
{code}
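
The value ${configSetBaseDir:solr/configsets} reads as "use the configSetBaseDir property if it is set, otherwise fall back to solr/configsets". The lookup rule can be sketched in plain Java as below. This is only an illustration, not Solr's own implementation: the class name, the method name and the property map (standing in for system properties) are assumptions made for the example, and failing on a missing property without a default is a choice of this sketch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PropertyDefaultSketch {
    // Matches ${name} or ${name:default}. Hypothetical helper, not Solr's parser.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    /** Replaces each placeholder with the property value, or with its default. */
    static String substitute(String value, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String fallback = m.group(2); // null when no ":default" part is given
            String resolved = props.containsKey(name) ? props.get(name) : fallback;
            if (resolved == null) {
                // Sketch-level decision: fail loudly instead of guessing a value.
                throw new IllegalArgumentException("No value or default for property " + name);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "${configSetBaseDir:solr/configsets}";
        // Property not set: the default after the colon is used.
        System.out.println(substitute(raw, Collections.<String, String>emptyMap()));
        // Property set: its value overrides the default.
        System.out.println(substitute(raw,
                Collections.singletonMap("configSetBaseDir", "/opt/configsets")));
    }
}
```

With the default in place, the example project keeps working without any extra startup option, while a deployment can still point the base directory elsewhere by setting the property.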

 Solr looks up configSets in the wrong directory
 ---

 Key: SOLR-6158
 URL: https://issues.apache.org/jira/browse/SOLR-6158
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.8, 4.8.1
Reporter: Simon Endele
Assignee: Alan Woodward
 Attachments: SOLR-6158.patch


 I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to 
 create Named Config Sets based on the Solr example shipped with Solr 4.8.1 
 (like it's done in the tutorial, same problem with 4.8.0).
 Creating a new core with a configSet seems to work (directory 'books' and 
 'books/core.properties' are created correctly).
 But loading the new core does not work:
 {code:none}67446 [qtp25155085-11] INFO  
 org.apache.solr.handler.admin.CoreAdminHandler  core create command 
 configSet=generic&name=books&action=CREATE
 67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer  Unable to 
 create core: books
 org.apache.solr.common.SolrException: Could not load configuration from 
 directory C:\dev\solr-4.8.1\example\configsets\generic
 at 
 org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
 at 
 org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
 at 
 org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
 ...
 {code}
 It seems like Solr looks up the config sets in the wrong directory:
 C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of
 C:\dev\solr-4.8.1\example\solr\configsets\generic (like stated in the 
 tutorial and the documentation on 
 https://cwiki.apache.org/confluence/display/solr/Config+Sets)
 Moving the configsets directory one level up (into 'example') will work.
 But according to the documentation (and the tutorial) it should be located in
 the Solr home directory.
 In case I'm completely wrong and everything works as expected, how can the
 configsets directory be configured?
 The documentation on
 https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a
 configurable configset base directory, but I can't find any information about
 it on the web.
 Another thing: If it worked as I expect, references such as
 <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> in
 solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one
 more ../ added, I guess (missing in the tutorial).




[jira] [Created] (SOLR-6158) Solr looks up configSets in the wrong directory

2014-06-10 Thread Simon Endele (JIRA)

Simon Endele created SOLR-6158:
--

 Summary: Solr looks up configSets in the wrong directory
 Key: SOLR-6158
 URL: https://issues.apache.org/jira/browse/SOLR-6158
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.8.1, 4.8
Reporter: Simon Endele


I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to 
create Named Config Sets based on the Solr example shipped with Solr 4.8.1 
(like it's done in the tutorial, same problem with 4.8.0).
Creating a new core with a configSet seems to work (directory 'books' and 
'books/core.properties' are created correctly).

But loading the new core does not work:
{code:none}67446 [qtp25155085-11] INFO  
org.apache.solr.handler.admin.CoreAdminHandler  core create command 
configSet=generic&name=books&action=CREATE
67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer  Unable to 
create core: books
org.apache.solr.common.SolrException: Could not load configuration from 
directory C:\dev\solr-4.8.1\example\configsets\generic
at 
org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
at 
org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
...
{code}

It seems like Solr looks up the config sets in the wrong directory:
C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of
C:\dev\solr-4.8.1\example\solr\configsets\generic (like stated in the tutorial 
and the documentation on 
https://cwiki.apache.org/confluence/display/solr/Config+Sets)
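
The two locations differ only in the directory that the relative name "configsets" is resolved against. A small Java sketch of that difference follows. It is an illustration under assumptions, not Solr code: the class and method names are made up for the example, and the paths are relative stand-ins for the C:\dev\... directories in the log.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ConfigSetDirSketch {
    /** Resolves a possibly relative configset base directory against a base directory. */
    static Path resolve(Path base, String configSetBaseDir) {
        return base.resolve(configSetBaseDir).normalize();
    }

    public static void main(String[] args) {
        Path workingDir = Paths.get("solr-4.8.1", "example"); // where Solr is started
        Path solrHome = workingDir.resolve("solr");            // the Solr home directory

        // Resolved against the working directory: the location seen in the log.
        System.out.println(resolve(workingDir, "configsets").resolve("generic"));
        // Resolved against the Solr home: the location the documentation describes.
        System.out.println(resolve(solrHome, "configsets").resolve("generic"));
    }
}
```

Resolving "solr/configsets" against the working directory yields the same path as resolving "configsets" against the Solr home, which is why either moving the directory or adjusting the configured base directory makes the core load.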

Moving the configsets directory one level up (into 'example') will work.
But according to the documentation (and the tutorial) it should be located in
the Solr home directory.

In case I'm completely wrong and everything works as expected, how can one
configure the configsets directory?
The documentation on
https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a
configurable configset base directory, but I can't find any information about
it on the web.

Another thing: If it worked as I expect, references such as
<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> in
solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one
more ../ added, I guess (missing in the tutorial).




[jira] [Updated] (SOLR-6158) Solr looks up configSets in the wrong directory

2014-06-10 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-6158:
---

Description: 
I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to 
create Named Config Sets based on the Solr example shipped with Solr 4.8.1 
(like it's done in the tutorial, same problem with 4.8.0).
Creating a new core with a configSet seems to work (directory 'books' and 
'books/core.properties' are created correctly).

But loading the new core does not work:
{code:none}67446 [qtp25155085-11] INFO  
org.apache.solr.handler.admin.CoreAdminHandler  core create command 
configSet=generic&name=books&action=CREATE
67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer  Unable to 
create core: books
org.apache.solr.common.SolrException: Could not load configuration from 
directory C:\dev\solr-4.8.1\example\configsets\generic
at 
org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
at 
org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
...
{code}

It seems like Solr looks up the config sets in the wrong directory:
C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of
C:\dev\solr-4.8.1\example\solr\configsets\generic (like stated in the tutorial 
and the documentation on 
https://cwiki.apache.org/confluence/display/solr/Config+Sets)

Moving the configsets directory one level up (into 'example') will work.
But according to the documentation (and the tutorial) it should be located in
the Solr home directory.

In case I'm completely wrong and everything works as expected, how can the
configsets directory be configured?
The documentation on
https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a
configurable configset base directory, but I can't find any information about
it on the web.

Another thing: If it worked as I expect, references such as
<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> in
solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one
more ../ added, I guess (missing in the tutorial).

  was:
I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to 
create Named Config Sets based on the Solr example shipped with Solr 4.8.1 
(like it's done in the tutorial, same problem with 4.8.0).
Creating a new core with a configSet seems to work (directory 'books' and 
'books/core.properties' are created correctly).

But loading the new core does not work:
{code:none}67446 [qtp25155085-11] INFO  
org.apache.solr.handler.admin.CoreAdminHandler  core create command 
configSet=generic&name=books&action=CREATE
67452 [qtp25155085-11] ERROR org.apache.solr.core.CoreContainer  Unable to 
create core: books
org.apache.solr.common.SolrException: Could not load configuration from 
directory C:\dev\solr-4.8.1\example\configsets\generic
at 
org.apache.solr.core.ConfigSetService$Default.locateInstanceDir(ConfigSetService.java:145)
at 
org.apache.solr.core.ConfigSetService$Default.createCoreResourceLoader(ConfigSetService.java:130)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:58)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
...
{code}

It seems like Solr looks up the config sets in the wrong directory:
C:\dev\solr-4.8.1\example\configsets\generic (in the log above) instead of
C:\dev\solr-4.8.1\example\solr\configsets\generic (like stated in the tutorial 
and the documentation on 
https://cwiki.apache.org/confluence/display/solr/Config+Sets)

Moving the configsets directory one level up (into 'example') will work.
But according to the documentation (and the tutorial) it should be located in
the Solr home directory.

In case I'm completely wrong and everything works as expected, how can one
configure the configsets directory?
The documentation on
https://cwiki.apache.org/confluence/display/solr/Config+Sets mentions a
configurable configset base directory, but I can't find any information about
it on the web.

Another thing: If it worked as I expect, references such as
<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" /> in
solr-4.8.1/example/solr/configsets/generic/conf/solrconfig.xml should get one
more ../ added, I guess (missing in the tutorial).


 Solr looks up configSets in the wrong directory
 ---

 Key: SOLR-6158
 URL: https://issues.apache.org/jira/browse/SOLR-6158
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.8, 4.8.1
Reporter: Simon Endele

 I tried the small tutorial on http://heliosearch.org/solr-4-8-features/ to 
 create

[jira] [Commented] (SOLR-5027) Field Collapsing PostFilter

2014-02-07 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894749#comment-13894749
 ] 

Simon Endele commented on SOLR-5027:


Hi Joel,

a similar question to Phil John's one: Is it correct that no equivalent for 
group.limit of the old grouping is/will be available?
I.e. only one document is returned for each group and the ExpandComponent can 
be used to get more, right?

I always thought that the aim of the ExpandComponent is to return _additional_ 
docs in a sense that these documents were not hit by the query (we wrote a 
component by ourselves for that based on the old grouping functionality).
Will that be possible with the ExpandComponent, or will it only be possible to 
fetch n (or all) documents of each group that were hit and collapsed by the 
CollapsingQParserPlugin (each only for a single page, of course)?

See also my question above concerning a filter query for the ExpandComponent.

Thanks in advance,
Simon

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milliseconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The CollapsingQParserPlugin also fully supports the QueryElevationComponent
 *Note:*  The July 16 patch also includes an ExpandComponent that expands the 
 collapsed groups for the current search result page. This functionality will 
 be moved to its own ticket.
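
The three null policies listed above can be illustrated on a small in-memory list. The Java sketch below is not the CollapsingQParserPlugin itself: the Doc type, the method names and the sample data are assumptions made for the example, it only covers collapsing on the highest score, and it returns ids in first-seen group order rather than in Solr's result order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NullPolicySketch {
    enum NullPolicy { IGNORE, EXPAND, COLLAPSE }

    /** Hypothetical stand-in for a search hit: id, collapse field value, score. */
    static final class Doc {
        final String id;
        final String group; // null when the collapse field has no value
        final double score;

        Doc(String id, String group, double score) {
            this.id = id;
            this.group = group;
            this.score = score;
        }
    }

    /** Keeps the highest-scoring doc per group; null-valued docs follow the policy. */
    static List<String> collapse(List<Doc> docs, NullPolicy policy) {
        Map<String, Doc> best = new LinkedHashMap<>();
        int nullCounter = 0;
        for (Doc d : docs) {
            String key;
            if (d.group != null) {
                key = "g:" + d.group;              // one bucket per field value
            } else if (policy == NullPolicy.IGNORE) {
                continue;                          // drop docs without a value
            } else if (policy == NullPolicy.EXPAND) {
                key = "null:" + (nullCounter++);   // every null doc is its own group
            } else {
                key = "null";                      // all null docs share one group
            }
            Doc current = best.get(key);
            if (current == null || d.score > current.score) {
                best.put(key, d);
            }
        }
        List<String> ids = new ArrayList<>();
        for (Doc d : best.values()) {
            ids.add(d.id);
        }
        return ids;
    }

    static List<Doc> sample() {
        return Arrays.asList(
                new Doc("a", "g1", 1.0), new Doc("b", "g1", 2.0),
                new Doc("c", null, 0.5), new Doc("d", null, 0.9),
                new Doc("e", "g2", 0.3));
    }

    public static void main(String[] args) {
        System.out.println(collapse(sample(), NullPolicy.IGNORE));   // [b, e]
        System.out.println(collapse(sample(), NullPolicy.EXPAND));   // [b, c, d, e]
        System.out.println(collapse(sample(), NullPolicy.COLLAPSE)); // [b, d, e]
    }
}
```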



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames

2013-10-22 Thread Simon Endele (JIRA)

Simon Endele created SOLR-5375:
--

 Summary: Param literalsOverride for ExtractingRequestHandler / 
SolrCell does not consider lowernames
 Key: SOLR-5375
 URL: https://issues.apache.org/jira/browse/SOLR-5375
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele
Priority: Minor


Can be reproduced with the following command and the example configuration 
shipped with Solr:

cd exampledocs
curl -F file=@hd.xml "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=true&literal.content_type=mytype"

The added doc contains both values:
http://localhost:8983/solr/collection1/select?q=id%3Amyid&wt=xml&indent=true
{code:xml}<arr name="content_type">
<str>mytype</str>
<str>application/xml</str>
</arr>{code}

If the corresponding field is not multi-valued, the request raises an 
org.apache.solr.common.SolrException: ERROR: multiple values encountered for 
non multiValued field content_type: 

Debugging the code (Solr 4.4.0) I found out that the parameter lowernames is 
not considered at several places in 
org.apache.solr.handler.extraction.SolrContentHandler looking like:
{code}if (literalsOverride && literalFieldNames.contains(name))
continue;
{code}

The same problem occurs for the following command (though its correctness could 
be discussed):
curl -F file=@hd.xml "http://localhost:8983/solr/update/extract?commit=true&literal.id=myid&literalsOverride=true&lowernames=false&fmap.Content-Type=content_type&literal.content_type=mytype"
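
To illustrate the first case, here is a rough sketch of a check that compares the normalized name instead of the raw Tika name. It is not the attached patch and not the actual SolrContentHandler code: the class and method names are made up, and the normalization rule (lower-case, every non-alphanumeric character replaced by an underscore) is an assumption about what lowernames does, based on Content-Type becoming content_type. The fmap case from the second command is not covered.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the real SolrContentHandler.
final class LiteralOverrideSketch {

    // Assumed lowernames behaviour: lower-case, non-alphanumerics become '_'.
    static String normalize(String name, boolean lowernames) {
        if (!lowernames) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name.length());
        for (char c : name.toLowerCase(Locale.ROOT).toCharArray()) {
            sb.append(Character.isLetterOrDigit(c) ? c : '_');
        }
        return sb.toString();
    }

    // True if the Tika value should be skipped because a literal
    // with the same (normalized) field name was supplied.
    static boolean skipTikaValue(String tikaName, boolean literalsOverride,
                                 boolean lowernames, Set<String> literalFieldNames) {
        return literalsOverride && literalFieldNames.contains(normalize(tikaName, lowernames));
    }

    public static void main(String[] args) {
        Set<String> literals = Collections.singleton("content_type");
        System.out.println(skipTikaValue("Content-Type", true, true, literals));
        System.out.println(skipTikaValue("Content-Type", true, false, literals));
    }
}
```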




[jira] [Commented] (SOLR-1856) In Solr Cell, literals should override Tika-parsed values

2013-10-22 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801643#comment-13801643
 ] 

Simon Endele commented on SOLR-1856:


Did so, see SOLR-5375.

 In Solr Cell, literals should override Tika-parsed values
 -

 Key: SOLR-1856
 URL: https://issues.apache.org/jira/browse/SOLR-1856
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Chris Harris
Assignee: Jan Høydahl
 Fix For: 4.0-BETA, 5.0

 Attachments: SOLR-1856.patch, SOLR-1856.patch


 I propose that ExtractingRequestHandler / SolrCell literals should take 
 precedence over Tika-parsed metadata in all situations, including where 
 multiValued=true. (Compare SOLR-1633?)
 My personal motivation is that I have several fields (e.g. title, date) 
 where my own metadata is much superior to what Tika offers, and I want to 
 throw those Tika values away. (I actually wouldn't mind throwing away _all_ 
 Tika-parsed values, but let's set that aside.) SOLR-1634 is one potential 
 approach to this, but the fix here might be simpler.
 I'll attach a patch shortly.




[jira] [Comment Edited] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames

2013-10-22 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801752#comment-13801752
 ] 

Simon Endele edited comment on SOLR-5375 at 10/22/13 12:08 PM:
---

It's not as easy as I thought in the first place as there's another issue that 
bothers me and touches this one:
From my expectation, fmap should only be applied to the values returned from 
Tika and not to literals. So currently it is not possible to declare the 
following mapping (assuming lowernames=true):
literal.content_type = schema field content_type
content_type from Tika = schema field content_type_tika

This is what the following request should do IMO: 
literal.content_type=mytype&fmap.content_type=content_type_tika
Instead both values are stored to content_type_tika.
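
A small sketch of the behaviour I would expect, in plain Java. This is not Solr code: the class and method names are invented, and both the normalization rule for lowernames and the order (lowernames first, then fmap) are assumptions for the sake of the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the expected mapping, not the real implementation.
final class FmapExpectationSketch {

    // Assumed lowernames rule: lower-case, non-alphanumerics become '_'.
    static String lower(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (char c : name.toLowerCase(Locale.ROOT).toCharArray()) {
            sb.append(Character.isLetterOrDigit(c) ? c : '_');
        }
        return sb.toString();
    }

    // Schema field for a value coming from Tika: lowernames, then fmap.
    static String tikaTarget(String name, boolean lowernames, Map<String, String> fmap) {
        String normalized = lowernames ? lower(name) : name;
        return fmap.getOrDefault(normalized, normalized);
    }

    // Schema field for a literal: under the expectation described here,
    // neither lowernames nor fmap is applied.
    static String literalTarget(String name) {
        return name;
    }

    public static void main(String[] args) {
        Map<String, String> fmap = new HashMap<>();
        fmap.put("content_type", "content_type_tika");
        System.out.println("Tika value goes to: " + tikaTarget("Content-Type", true, fmap));
        System.out.println("Literal goes to:    " + literalTarget("content_type"));
    }
}
```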

The same problem exists for lowernames. If enabled it is not possible to fill 
schema fields containing upper-case letters using a ContentStreamUpdateRequest.

But this is a question of expected behavior and I'm afraid this would cause 
backwards compatibility issues.
What do you think?


was (Author: simon.endele):
It's not as easy as I thought in the first place as there's another issue that 
bothers me and touches this one:
From my expectation, fmap should only be applied to the values returned from 
Tika and not to literals. So currently it is not possible to declare the 
following mapping (assuming lowernames=true):
literal.content_type = schema field content_type
content_type from Tika = schema field content_type_tika
what the following request should do IMO: 
literal.content_type=mytype&fmap.content_type=content_type_tika
Instead both values are stored to content_type_tika.

The same problem exists for lowernames. If enabled it is not possible to fill 
schema fields containing upper-case letters using a ContentStreamUpdateRequest.

But this is a question of expected behavior and I'm afraid this would cause 
backwards compatibility issues.
What do you think?

 Param literalsOverride for ExtractingRequestHandler / SolrCell does not 
 consider lowernames
 ---

 Key: SOLR-5375
 URL: https://issues.apache.org/jira/browse/SOLR-5375
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Simon Endele
Priority: Minor
 Fix For: 4.6






[jira] [Commented] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames

2013-10-22 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801752#comment-13801752
 ] 

Simon Endele commented on SOLR-5375:


It's not as easy as I thought in the first place as there's another issue that 
bothers me and touches this one:
From my expectation, fmap should only be applied to the values returned from 
Tika and not to literals. So currently it is not possible to declare the 
following mapping (assuming lowernames=true):
literal.content_type = schema field content_type
content_type from Tika = schema field content_type_tika
what the following request should do IMO: 
literal.content_type=mytype&fmap.content_type=content_type_tika
Instead both values are stored to content_type_tika.

The same problem exists for lowernames. If enabled it is not possible to fill 
schema fields containing upper-case letters using a ContentStreamUpdateRequest.

But this is a question of expected behavior and I'm afraid this would cause 
backwards compatibility issues.
What do you think?

 Param literalsOverride for ExtractingRequestHandler / SolrCell does not 
 consider lowernames
 ---

 Key: SOLR-5375
 URL: https://issues.apache.org/jira/browse/SOLR-5375
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Simon Endele
Priority: Minor
 Fix For: 4.6






[jira] [Updated] (SOLR-5375) Param literalsOverride for ExtractingRequestHandler / SolrCell does not consider lowernames

2013-10-22 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-5375:
---

Attachment: SolrContentHandler.java.patch

Added a patch for trunk that addresses only this specific issue.

 Param literalsOverride for ExtractingRequestHandler / SolrCell does not 
 consider lowernames
 ---

 Key: SOLR-5375
 URL: https://issues.apache.org/jira/browse/SOLR-5375
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Simon Endele
Priority: Minor
 Fix For: 4.6

 Attachments: SolrContentHandler.java.patch






[jira] [Commented] (SOLR-1856) In Solr Cell, literals should override Tika-parsed values

2013-10-21 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800850#comment-13800850
 ] 

Simon Endele commented on SOLR-1856:


Debugging the code (Solr 4.4.0) I found out that the parameter lowernames is 
not considered.
The request lowernames=true&literalsOverride=true&literal.url=myurl still 
raises an org.apache.solr.common.SolrException: ERROR: multiple values 
encountered for non multiValued field url: [.., ..], if a URL is extracted 
from the metadata of the binary.

 In Solr Cell, literals should override Tika-parsed values
 -

 Key: SOLR-1856
 URL: https://issues.apache.org/jira/browse/SOLR-1856
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Chris Harris
Assignee: Jan Høydahl
 Fix For: 4.0-BETA, 5.0

 Attachments: SOLR-1856.patch, SOLR-1856.patch






[jira] [Commented] (SOLR-5027) Result Set Collapse and Expand Plugins

2013-09-25 Thread Simon Endele (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777303#comment-13777303
]

Simon Endele commented on SOLR-5027:

Sounds good.

I propose to add an additional parameter expand.fq to restrict the expanded
documents to a certain filter query.
Sometimes the complete groups are very large and should only be expanded by one
or a few representatives of that group. Other group members that are not hit by
the main query are not interesting (at least in the first place).

Note that this is different from adding a basic filter query, since documents
that are hit by the main query but not by expand.fq are kept.
Example: Group consisting of: representative A, more group members B and
C.
Query hits B, group is expanded by A, but not C (due to expand.fq) =>
Result: A, B
A filter query before expanding would filter out B and thus yield no results
for this group.
A filter query after expanding would filter out B and C thus keep only A.
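
The intended semantics could be sketched as follows. This is plain Java over sets of document ids, only meant to pin down the proposal. It is not how the ExpandComponent works internally; the class name, the method name and the use of a Predicate to stand in for expand.fq are assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch of the proposed expand.fq semantics.
final class ExpandFqSketch {

    // A group member is kept if the main query hit it, or if it matches expand.fq.
    static Set<String> expandGroup(Set<String> groupMembers, Set<String> mainHits,
                                   Predicate<String> expandFq) {
        Set<String> result = new LinkedHashSet<>();
        for (String doc : groupMembers) {
            if (mainHits.contains(doc) || expandFq.test(doc)) {
                result.add(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Group: representative A, further members B and C.
        Set<String> group = new LinkedHashSet<>(Arrays.asList("A", "B", "C"));
        // The main query hits only B; expand.fq matches only the representative A.
        Set<String> hits = Collections.singleton("B");
        System.out.println(expandGroup(group, hits, "A"::equals)); // prints [A, B]
    }
}
```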

Is that technically possible? Maybe this is worth a separate issue...

Result Set Collapse and Expand Plugins
--

Key: SOLR-5027
URL: https://issues.apache.org/jira/browse/SOLR-5027
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch

This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin*
and the *ExpandComponent*.
The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.
This allows field collapsing to be done within the normal search flow.
Initial syntax:
fq={!collapse field=field_name}
All documents in a group will be collapsed to the highest ranking document in
the group.
The *ExpandComponent* is a search component that takes the collapsed docList
and expands the groups for a single page based on parameters provided.
Initial syntax:
expand=true - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default
is score.
expand.rows=500 - The max number of expanded results to bring back. Default
is 500.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-5027) Result Set Collapse and Expand Plugins

2013-09-25 Thread Simon Endele (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777303#comment-13777303
]

Simon Endele edited comment on SOLR-5027 at 9/25/13 9:19 AM:
-

Sounds good.

I propose to add an additional parameter expand.fq to restrict the expanded
documents to a certain filter query.
Sometimes the complete groups are very large and should only be expanded by one
or a few representatives of that group (which can be addressed with a filter
query). Other group members that are not hit by the main query are not
interesting (at least in the first place).

Note that this is different from adding a basic filter query, since documents
that are hit by the main query but not by expand.fq are kept.
Example: Group consisting of: representative A, more group members B and
C.
Query hits B, group is expanded by A (due to expand.fq), but not C =>
Result: A, B
A filter query before expanding would filter out B and thus yield no results
for this group.
A filter query after expanding would filter out B and C thus keep only A.

Is that technically possible? Maybe this is worth a separate issue...

was (Author: simon.endele):
Sounds good.

Is that technically possible? Maybe this is worth a separate issue...

Result Set Collapse and Expand Plugins
--


[jira] [Created] (SOLR-5270) lastModified not updating when selecting another core in Core Admin

2013-09-25 Thread Simon Endele (JIRA)

Simon Endele created SOLR-5270:
--

 Summary: lastModified not updating when selecting another core in 
Core Admin
 Key: SOLR-5270
 URL: https://issues.apache.org/jira/browse/SOLR-5270
 Project: Solr
  Issue Type: Bug
  Components: web gui
Reporter: Simon Endele
Priority: Minor


When selecting a core in the section Core Admin in the Solr Admin web UI, 
data like dataDir, version, numDocs, maxDoc are updated via JavaScript, but 
lastModified is not. A refresh of the page does the trick.

Had a look into the network traffic of my browser and it seems that the JSON 
fetched via AJAX contains the correct information.

Can be reproduced in different browsers with the example by cloning collection1 
into a collection2 and indexing collection2 anew by calling java -jar post.jar 
*.xml in the exampledocs directory.

Tested with Solr 4.4.0.


[jira] [Commented] (SOLR-2216) Highlighter query exceeds maxBooleanClause limit due to range query

2013-09-25 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13777691#comment-13777691
 ] 

Simon Endele commented on SOLR-2216:


Am I right in assuming that this isn't a problem when using the 
FastVectorHighlighter or the PostingsHighlighter?

 Highlighter query exceeds maxBooleanClause limit due to range query
 ---

 Key: SOLR-2216
 URL: https://issues.apache.org/jira/browse/SOLR-2216
 Project: Solr
  Issue Type: Bug
  Components: highlighter
Affects Versions: 1.4.1
 Environment: Linux solr-2.bizjournals.int 2.6.18-194.3.1.el5 #1 SMP 
 Thu May 13 13:08:30 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.6.0_21
 Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
 Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
 JAVA_OPTS=-client -Dcom.sun.management.jmxremote=true 
 -Dcom.sun.management.jmxremote.port= 
 -Dcom.sun.management.jmxremote.authenticate=true 
 -Dcom.sun.management.jmxremote.access.file=/root/.jmxaccess 
 -Dcom.sun.management.jmxremote.password.file=/root/.jmxpasswd 
 -Dcom.sun.management.jmxremote.ssl=false -XX:+UseCompressedOops 
 -XX:MaxPermSize=512M -Xms10240M -Xmx15360M -XX:+UseParallelGC 
 -XX:+AggressiveOpts -XX:NewRatio=5
 top - 11:38:49 up 124 days, 22:37,  1 user,  load average: 5.20, 4.35, 3.90
 Tasks: 220 total,   1 running, 219 sleeping,   0 stopped,   0 zombie
 Cpu(s): 47.5%us,  2.9%sy,  0.0%ni, 49.5%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
 Mem:  24679008k total, 18179980k used,  6499028k free,   125424k buffers
 Swap: 26738680k total,29276k used, 26709404k free,  8187444k cached
Reporter: Ken Stanley

 For a full detail of the issue, please see the mailing list: 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201011.mbox/%3CAANLkTimE8z8yOni+u0Nsbgct1=ef7e+su0_waku2c...@mail.gmail.com%3E
 The nutshell version of the issue is that when I have a query that contains 
 ranges on a specific (non-highlighted) field, the highlighter component is 
 attempting to create a query that exceeds the value of maxBooleanClauses set 
 from solrconfig.xml. This is despite my explicit setting of hl.field, 
 hl.requireFieldMatch, and various other highlight options in the query. 
 As suggested by Koji in the follow-up response, I removed the range queries 
 from my main query, and SOLR and highlighting were happy to fulfill my 
 request. It was suggested that if removing the range queries worked that this 
 might potentially be a bug, hence my filing this JIRA ticket. For what it is 
 worth, if I move my range queries into an fq, I do not get the exception 
 about exceeding maxBooleanClauses, and I get the effect that I was looking 
 for. 


[jira] [Commented] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml

2013-09-20 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773140#comment-13773140
 ] 

Simon Endele commented on SOLR-5249:


Wow, thanks for your quick and detailed response!

I'm using Eclipse with default settings, so I thought this might bother some 
more people like me.

Eclipse inserts line-breaks and white-spaces at other places in the 
solrconfig.xml, which are ignored, for example in the defaults-section of a 
request handler:
{code}<str name="hl.fl">content title field1 field2 field3
field4
</str>{code}
Ok, this is maybe a bad example as the field list is parsed.

As far as I know class names are Java identifiers, which cannot contain any 
white-spaces. This particular code fragment only handles class names and no files, 
doesn't it?

 ClassNotFoundException due to white-spaces in solrconfig.xml
 

 Key: SOLR-5249
 URL: https://issues.apache.org/jira/browse/SOLR-5249
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele
Priority: Minor
 Attachments: SolrResourceLoader.java.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Due to auto-formatting by a text editor/IDE there may be line-breaks after 
 class names in the solrconfig.xml, for example:
 {code:xml}<searchComponent class="solr.SpellCheckComponent" name="suggest">
   <lst name="spellchecker">
     <str name="name">suggest</str>
     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
     <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
     </str>
     [...]
   </lst>
 </searchComponent>{code}
 This will raise an exception in SolrResourceLoader as the white-spaces are 
 not stripped from the class name:
 {code}Caused by: org.apache.solr.common.SolrException: Error loading class 
 'org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
   '
   at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:449)
   at 
 org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:471)
   at 
 org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:467)
   at org.apache.solr.spelling.suggest.Suggester.init(Suggester.java:102)
   at 
 org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:623)
   at 
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:601)
   at org.apache.solr.core.SolrCore.init(SolrCore.java:830)
   ... 13 more
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
   
   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
   at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:264)
   at 
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
   ... 19 more{code}


[jira] [Comment Edited] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml

2013-09-20 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773140#comment-13773140
 ] 

Simon Endele edited comment on SOLR-5249 at 9/20/13 4:18 PM:
-

Wow, thanks for your quick and detailed response!

I'm using Eclipse with default settings, so I thought this might bother some 
more people like me.

Eclipse inserts line-breaks and white-spaces at other places in the 
solrconfig.xml, which are ignored, for example in the defaults-section of a 
request handler:
{code}<str name="hl.fl">content title field1 field2 field3
field4
</str>{code}
Ok, this is maybe a bad example as the field list is parsed.

As far as I know class names are Java identifiers, which cannot contain any 
white-spaces. This particular code fragment only handles class names and no files, 
doesn't it?

  was (Author: simon.endele):
Wow, thanks for your quick and detailed response!

I'm using Eclipse with default settings, so I thought this might bother some 
more people like me.

Eclipse inserts line-breaks and white-spaces at other places in the 
solrconfig.xml, which are ignored, for example in the defaults-section of a 
request handler:
{code}<str name="hl.fl">content title field1 field2 field3
field4
</str>{code}
Ok, this is maybe a bad example as the field list ist parsed.

As far I know class names are Java identifiers, which cannot contain any 
white-spaces. This certain code fragment only handles class names and no files, 
doesn't it?
  
 ClassNotFoundException due to white-spaces in solrconfig.xml
 

 Key: SOLR-5249
 URL: https://issues.apache.org/jira/browse/SOLR-5249
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele
Priority: Minor
 Attachments: SolrResourceLoader.java.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Created] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml

2013-09-18 Thread Simon Endele (JIRA)

Simon Endele created SOLR-5249:
--

 Summary: ClassNotFoundException due to white-spaces in 
solrconfig.xml
 Key: SOLR-5249
 URL: https://issues.apache.org/jira/browse/SOLR-5249
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Reporter: Simon Endele
Priority: Minor


Due to auto-formatting by a text editor/IDE there may be line-breaks after 
class names in the solrconfig.xml, for example:

{code:xml}<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
    </str>
    [...]
  </lst>
</searchComponent>{code}

This will raise an exception in SolrResourceLoader as the white-spaces are not 
stripped from the class name:
{code}Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:449)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:471)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:467)
    at org.apache.solr.spelling.suggest.Suggester.init(Suggester.java:102)
    at org.apache.solr.handler.component.SpellCheckComponent.inform(SpellCheckComponent.java:623)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:601)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:830)
    ... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.spelling.suggest.fst.WFSTLookupFactory

    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433)
    ... 19 more{code}
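
The failure mode above can be reproduced outside Solr. The following is only a minimal sketch of the idea of trimming a configured class name before lookup; it is not the attached SolrResourceLoader patch, and the class and method names (ClassNameLoader, normalize, loadTrimmed) are hypothetical.

```java
// Hypothetical helper for illustration; NOT the attached SolrResourceLoader patch.
final class ClassNameLoader {

    private ClassNameLoader() {}

    /** Strips leading and trailing whitespace, including line breaks and indentation. */
    static String normalize(String rawName) {
        return rawName.trim();
    }

    /** Looks up a class by its configured name after normalizing it. */
    static Class<?> loadTrimmed(String rawName) throws ClassNotFoundException {
        return Class.forName(normalize(rawName));
    }

    public static void main(String[] args) throws Exception {
        // A class name as it might come out of an auto-formatted <str> element.
        String raw = "java.util.ArrayList\n    ";
        try {
            Class.forName(raw); // lookup with the raw, untrimmed name
            System.out.println("untrimmed: found");
        } catch (ClassNotFoundException e) {
            System.out.println("untrimmed: not found");
        }
        System.out.println("trimmed: " + loadTrimmed(raw).getName());
    }
}
```

Whether trimming belongs in the resource loader or in the config parsing is a design choice this sketch does not settle.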


[jira] [Updated] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml

2013-09-18 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-5249:
---

Attachment: SolrResourceLoader.java.patch

Uploaded a patch for trunk.

 ClassNotFoundException due to white-spaces in solrconfig.xml
 

 Key: SOLR-5249
 URL: https://issues.apache.org/jira/browse/SOLR-5249
 Project: Solr
  Issue Type: Bug
  Components: SearchComponents - other
Reporter: Simon Endele
Priority: Minor
 Attachments: SolrResourceLoader.java.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Updated] (SOLR-5249) ClassNotFoundException due to white-spaces in solrconfig.xml

2013-09-18 Thread Simon Endele (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Endele updated SOLR-5249:
---

Component/s: (was: SearchComponents - other)

 ClassNotFoundException due to white-spaces in solrconfig.xml
 

 Key: SOLR-5249
 URL: https://issues.apache.org/jira/browse/SOLR-5249
 Project: Solr
  Issue Type: Bug
Reporter: Simon Endele
Priority: Minor
 Attachments: SolrResourceLoader.java.patch

   Original Estimate: 1h
  Remaining Estimate: 1h



[jira] [Commented] (SOLR-5027) Result Set Collapse and Expand Plugins

2013-09-12 Thread Simon Endele (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765381#comment-13765381
]

Simon Endele commented on SOLR-5027:

What exactly do you mean by "there is no concept of ngroups or group facets"?
Does that mean there will be no way to return the number of
groups, as the request parameter group.ngroups currently does?

Will it still be possible to decide whether faceting is done before or after
collapsing, similar to group.facet?

Result Set Collapse and Expand Plugins
--


[jira] [Commented] (SOLR-5230) Call DelegatingCollector.finish() during grouping

2013-09-11 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764157#comment-13764157
 ] 

Simon Endele commented on SOLR-5230:


Applied the patch and it seems to work at first glance.
Thank you very much for your quick reaction to 
https://issues.apache.org/jira/browse/SOLR-5020, Joel!

But for some scenarios (e.g. expensive post-filters) it might be a drawback 
that the phases cannot be distinguished in the finish() method.
What do you think about introducing a second method 
DelegatingCollector.finishAfterGrouping() or similar that is called in the 
second phase instead of finish()?
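
To make the proposal concrete, here is a minimal sketch of a collector that can tell the two phases apart. It assumes the grouping flow runs the collector over the same documents in two passes; PhaseAwareCollector and GroupingFlowDemo are hypothetical stand-ins and do not extend Solr's DelegatingCollector, and finishAfterGrouping() is the suggested method, not an existing API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a delegating collector; it does not use the Solr API.
class PhaseAwareCollector {
    final List<String> events = new ArrayList<>();
    private int collected;

    void collect(int doc) {
        collected++;
    }

    /** Called at the end of the first pass. */
    void finish() {
        events.add("finish:" + collected);
        collected = 0;
    }

    /** Hypothetical hook called at the end of the second pass instead of finish(). */
    void finishAfterGrouping() {
        events.add("finishAfterGrouping:" + collected);
        collected = 0;
    }
}

final class GroupingFlowDemo {
    /** Simulates an assumed two-pass grouping flow over the same documents. */
    static List<String> run(int[] docs) {
        PhaseAwareCollector collector = new PhaseAwareCollector();
        for (int doc : docs) {
            collector.collect(doc);
        }
        collector.finish();
        for (int doc : docs) {
            collector.collect(doc);
        }
        collector.finishAfterGrouping();
        return collector.events;
    }

    public static void main(String[] args) {
        System.out.println(run(new int[] {4, 8, 15}));
    }
}
```

With separate hooks, an expensive post-filter could do its heavy work in only one of the two phases.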

 Call DelegatingCollector.finish() during grouping
 -

 Key: SOLR-5230
 URL: https://issues.apache.org/jira/browse/SOLR-5230
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.4
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-5230.patch


 This is an add-on to SOLR-5020 to call the new DelegatingCollector.finish() 
 method from inside the grouping flow. 


[jira] [Comment Edited] (SOLR-5230) Call DelegatingCollector.finish() during grouping

2013-09-11 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764157#comment-13764157
 ] 

Simon Endele edited comment on SOLR-5230 at 9/11/13 10:26 AM:
--

Applied the patch and it seems to work at first glance.
Thank you very much for your quick reaction to SOLR-5020, Joel!

But for some scenarios (e.g. expensive post-filters) it might be a drawback 
that the phases cannot be distinguished in the finish() method.
What do you think about introducing a second method 
DelegatingCollector.finishAfterGrouping() or similar that is called in the 
second phase instead of finish()?

  was (Author: simon.endele):
Applied the patch and it seems to work at first glance.
Thank you very much for your quick reaction to 
https://issues.apache.org/jira/browse/SOLR-5020, Joel!

But for some scenarios (e.g. expensive post-filters) it might be a drawback 
that the phases cannot be distinguished in the finish() method.
What do you think about introducing a second method 
DelegatingCollector.finishAfterGrouping() or similar that is called in the 
second phase instead of finish()?
  


[jira] [Commented] (SOLR-5230) Call DelegatingCollector.finish() during grouping

2013-09-11 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764390#comment-13764390
 ] 

Simon Endele commented on SOLR-5230:


Hm, I'm quite sure that collect() is called for all docs in both phases.

Excerpt from my final result:
<lst name="grouped">
  <lst name="group_id">
    <int name="matches">61</int>
    <int name="ngroups">35</int>
    <arr name="groups">
      [...]

And collect() is called 61 times in each of the two phases, and each phase is followed by a call to finish().



[jira] [Commented] (SOLR-5020) Add finish() method to DelegatingCollector

2013-09-10 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762988#comment-13762988
 ] 

Simon Endele commented on SOLR-5020:


It looks like this isn't working in combination with grouping. Is that possible?

I applied the attached patch to my Solr 4.4.0 workspace containing an 
AclQParserPlugin as described here:
http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/

It works without grouping, but if grouping is activated, the collect() method 
is still called, but finish() is not.

 Add finish() method to DelegatingCollector
 --

 Key: SOLR-5020
 URL: https://issues.apache.org/jira/browse/SOLR-5020
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 4.5, 5.0

 Attachments: SOLR-5020.patch


 This issue adds a finish() method to the DelegatingCollector class so that it 
 can be notified when collection is complete. 
 The current collect() method assumes that the delegating collector will 
 either forward on the document or not with each call. The finish() method 
 will allow DelegatingCollectors to have more sophisticated behavior.
 For example a Field Collapsing delegating collector could collapse the 
 documents as the collect() method is being called. Then when the finish() 
 method is called it could pass the collapsed documents to the delegate 
 collectors.
 This would allow grouping to be implemented within the PostFilter framework.
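
The collapsing example in the description can be sketched without any Lucene or Solr classes. This is only an illustration under stated assumptions: DocSink, RecordingSink, CollapsingSink and CollapseDemo are hypothetical names, the group key lookup is a plain in-memory array, and "keep the first document per group" is an arbitrary collapse policy chosen for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal collector contract; NOT the Lucene/Solr Collector API.
interface DocSink {
    void collect(int doc);
    void finish();
}

// End of the chain: records every document that reaches it.
class RecordingSink implements DocSink {
    final List<Integer> docs = new ArrayList<>();
    boolean finished;

    @Override public void collect(int doc) { docs.add(doc); }
    @Override public void finish() { finished = true; }
}

// Buffers during collect() and forwards one document per group in finish().
class CollapsingSink implements DocSink {
    private final DocSink delegate;
    private final String[] groupKeyByDoc; // assumed lookup: doc id -> group key
    private final Map<String, Integer> firstDocPerGroup = new LinkedHashMap<>();

    CollapsingSink(DocSink delegate, String[] groupKeyByDoc) {
        this.delegate = delegate;
        this.groupKeyByDoc = groupKeyByDoc;
    }

    @Override public void collect(int doc) {
        // Nothing is forwarded yet; only the first document of each group is remembered.
        firstDocPerGroup.putIfAbsent(groupKeyByDoc[doc], doc);
    }

    @Override public void finish() {
        // Collection is complete, so the collapsed documents can be passed on.
        for (int doc : firstDocPerGroup.values()) {
            delegate.collect(doc);
        }
        delegate.finish();
    }
}

final class CollapseDemo {
    static List<Integer> collapse(String[] groupKeyByDoc) {
        RecordingSink sink = new RecordingSink();
        CollapsingSink collapsing = new CollapsingSink(sink, groupKeyByDoc);
        for (int doc = 0; doc < groupKeyByDoc.length; doc++) {
            collapsing.collect(doc);
        }
        collapsing.finish();
        return sink.docs;
    }

    public static void main(String[] args) {
        System.out.println(collapse(new String[] {"a", "b", "a", "c", "b"}));
    }
}
```

The point of the sketch is the split of responsibilities: collect() only accumulates state, and the delegate sees nothing until finish() is called.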


[jira] [Commented] (LUCENE-5021) NextDoc NPE safety when bulk collecting

2013-09-09 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761664#comment-13761664
 ] 

Simon Endele commented on LUCENE-5021:
--

I think what you were originally looking for is SOLR-5020.

 NextDoc NPE safety when bulk collecting
 ---

 Key: LUCENE-5021
 URL: https://issues.apache.org/jira/browse/LUCENE-5021
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/other
Affects Versions: 3.6.2
 Environment: Any with custom filters
Reporter: Alexis Torres Paderewski
  Labels: NPE, Null-Safety, Scorer

 Hello,
 I would like to apply ACLs once as a PostFilter, so I need to batch this 
 call, since round trips would severely decrease performance.
 I tried to just stack them on the DelegatingCollector using this collect():
 @Override
 public void collect(int doc) throws IOException {
     while ((doc = scorer.nextDoc()) != DocIdSetIterator.NO_MORE_DOCS) {
         docs.put(getDocumentId(doc), doc);
     }
     batchCollect();
 }
 Depending on the Scorer, it may or may not work. It works when the Scorer is 
 "safe", that is, when it handles the case in which nextDoc() is called again 
 after the scorer is exhausted.
 This is the case for e.g. DisjunctionMaxScorer and ConstantScorer:
 if (numScorers == 0) return doc = NO_MORE_DOCS;
 On the other hand, DisjunctionSumScorer either asserts on NO_MORE_DOCS or 
 throws an NPE.
 Shouldn't we copy the DisjunctionMaxScorer mechanism to protect nextDoc() on an 
 exhausted iterator, using either the current doc or the number of subScorers?
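
The "check the current doc" guard asked for above can be illustrated with a small stand-alone iterator. This is a sketch only: ArrayDocIterator is a hypothetical class, not Lucene's Scorer or DocIdSetIterator, and it releases its backing array on exhaustion purely so that an unguarded second call would fail with an NPE. The sentinel uses Integer.MAX_VALUE, the same value as DocIdSetIterator.NO_MORE_DOCS.

```java
// Hypothetical stand-in for a doc-id iterator; not a Lucene class.
class ArrayDocIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private int[] docs;   // released (set to null) once the iterator is exhausted
    private int pos = -1;
    private int doc = -1;

    ArrayDocIterator(int... sortedDocs) {
        this.docs = sortedDocs;
    }

    int nextDoc() {
        // Guard on the current doc: repeated calls after exhaustion stay safe.
        if (doc == NO_MORE_DOCS) {
            return NO_MORE_DOCS;
        }
        pos++;
        if (pos < docs.length) {
            doc = docs[pos];
        } else {
            doc = NO_MORE_DOCS;
            docs = null; // without the guard, the next call would throw an NPE here
        }
        return doc;
    }
}

final class ExhaustionDemo {
    public static void main(String[] args) {
        ArrayDocIterator it = new ArrayDocIterator(2, 5, 9);
        int doc;
        while ((doc = it.nextDoc()) != ArrayDocIterator.NO_MORE_DOCS) {
            System.out.println("doc " + doc);
        }
        // Calling again after exhaustion is harmless thanks to the guard.
        System.out.println(it.nextDoc() == ArrayDocIterator.NO_MORE_DOCS);
    }
}
```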


[jira] [Commented] (SOLR-4645) Missing Adobe XMP library can abort DataImportHandler process

2013-06-25 Thread Simon Endele (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692961#comment-13692961
 ] 

Simon Endele commented on SOLR-4645:


Had the same problem. Worked for me. Thanks.

When building solr.war with integrated Solr Cell using Maven, one can also use:
<dependency>
  <groupId>com.adobe.xmp</groupId>
  <artifactId>xmpcore</artifactId>
  <version>5.1.2</version>
</dependency>
See http://mvnrepository.com/artifact/com.adobe.xmp/xmpcore

 Missing Adobe XMP library can abort DataImportHandler process
 -

 Key: SOLR-4645
 URL: https://issues.apache.org/jira/browse/SOLR-4645
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler, contrib - Solr Cell (Tika 
 extraction)
Affects Versions: 4.2
Reporter: Alexandre Rafalovitch
Priority: Minor
 Fix For: 5.0


 The Solr distribution is missing the Adobe XMP library ( 
 http://www.adobe.com/devnet/xmp.html ). In a particular code path, DIH+Tika 
 tries to load an XMPException and fails with ClassNotFound. The library is 
 present in Tika's distribution.

