[jira] [Commented] (LUCENE-4208) Spatial distance relevancy should use score of 1/distance
[ https://issues.apache.org/jira/browse/LUCENE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453746#comment-13453746 ]

Chris Male commented on LUCENE-4208:

I disagree that makeQuery shouldn't exist. There are optimizations to be had in Query code, such as using BooleanQuery and its associated highly optimized scorer algorithms. I think it should continue to exist, but it should have a default implementation that creates a ConstantScoreQuery by calling makeFilter.

Spatial distance relevancy should use score of 1/distance

Key: LUCENE-4208
URL: https://issues.apache.org/jira/browse/LUCENE-4208
Project: Lucene - Core
Issue Type: New Feature
Components: modules/spatial
Reporter: David Smiley
Fix For: 4.0

The SpatialStrategy.makeQuery() at the moment uses the distance as the score (although some strategies -- TwoDoubles, if I recall -- might not do anything, which would be a bug). The distance is a poor value to use as the score because the score should be related to relevancy, and the distance itself is inversely related to that. A score of 1/distance would be nice. Another alternative is earthCircumference/2 - distance, although I like 1/distance better. Maybe use a different constant than 1. Credit: this is Chris Male's idea.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
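The proposal reads naturally as a reciprocal transform. A minimal standalone sketch (not Lucene's SpatialStrategy API; the constant c stands in for the "different constant than 1" mentioned above, and c/(c + distance) avoids dividing by zero at distance 0):

```java
// Illustrative only: shows why raw distance is a poor relevancy score and how
// a reciprocal transform inverts it so that closer documents score higher,
// with a maximum score of 1 at distance 0.
public class ReciprocalDistanceScore {

    /** Reciprocal transform: monotonically decreasing in distance. */
    public static double score(double distance, double c) {
        return c / (c + distance);
    }

    public static void main(String[] args) {
        System.out.println(score(0.0, 1.0)); // 1.0: document at the query point
        System.out.println(score(9.0, 1.0)); // 0.1: far away, low relevancy
    }
}
```

With raw distance as the score, the far document would have "scored" 9x higher than the near one; the transform restores the expected ordering.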
Re: Collator-based facet sorting in Solr
On Tue, 2012-09-11 at 17:23 +0200, Robert Muir wrote:
> Just a concern where things could act a little funky: today, for example, if I set strength=primary, then it's going to fold Test and test to the same unique term, but under this scheme you would have bytesTest and bytestest as two terms. This could be undesirable in the typical case that you just want case-insensitive facets: but we don't provide any way to preprocess the text to avoid this.

I seem to be missing something here. The ICUCollationKeyFilter can be at the end of the analyzer chain, so why can't the input be normalized before entering this filter?

> Really a lot of this is because factory-based analysis chains have no way to specify the AttributeFactory, e.g. I guess if we really wanted to fix this right we would need to pass in the AttributeFactory to TokenizerFactory's create() method.

Sounds like a larger change.

> But for now from Solr it would be a little hacky, e.g. someone is gonna have to fold the case client-side or whatever if they don't want these problems.

That would be a serious impediment. For some of our uncontrolled fields, the same word can be cased very differently: CD, cd, Cd. To be on the safe side, the client would have to ask for 3 times the wanted amount of facet information. But if we cannot normalize at index time, de-duplication on the server would require changes to the faceting code.

Regardless, it sounds like the idea passes the initial sanity check. Should I open a JIRA issue for it?
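The case-folding behavior under discussion can be reproduced with the JDK's own collator (the thread concerns the ICU variant, but the strength semantics are analogous; this standalone sketch is not Solr's faceting code):

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch: a PRIMARY-strength collator already treats "CD", "cd"
// and "Cd" as equal, so if facet terms were deduplicated by collation key
// (rather than by raw term bytes) the three casings would collapse into one
// bucket -- which is the behavior the scheme above would lose.
public class CaseInsensitiveFacets {

    /** Counts distinct facet buckets when terms are deduplicated by collation key. */
    public static int distinctBuckets(List<String> terms, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY); // fold case (and accent) differences
        Set<String> keys = new HashSet<>();
        for (String term : terms) {
            // the collation key's bytes are what a collation key filter would index
            keys.add(Arrays.toString(collator.getCollationKey(term).toByteArray()));
        }
        return keys.size();
    }
}
```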
[jira] [Commented] (LUCENE-4376) Add Query subclasses for selecting documents where a field is empty or not
[ https://issues.apache.org/jira/browse/LUCENE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453811#comment-13453811 ]

Uwe Schindler commented on LUCENE-4376:

The filter is already there; only the QueryParser does not support it. To make this work for your use case, you can override Lucene's/Solr's QueryParser to return a ConstantScoreQuery wrapping the LUCENE-3593 filter as a replacement for the field:* only query. The positive and negative variants work by passing the boolean to the filter. To conclude: the Query is already there, so there is no need for the 2 new classes. The wanted functionality is:

{code:java}
new ConstantScoreQuery(new FieldValueFilter(String field, boolean negate))
{code}

To find all documents with any term in the field use negate=false, otherwise negate=true. There is absolutely no need for a new Query class.

bq. Okay, so would it be straightforward and super-efficient for PrefixQuery to do exactly that if the prefix term is zero-length?

That's super-slow, as it will search for all terms in the field. This is what e.g. Solr currently does for field:* queries. Solr should use the filter, too; that would make this much more efficient.

Add Query subclasses for selecting documents where a field is empty or not

Key: LUCENE-4376
URL: https://issues.apache.org/jira/browse/LUCENE-4376
Project: Lucene - Core
Issue Type: Improvement
Components: core/query/scoring
Reporter: Jack Krupansky
Fix For: 5.0

Users frequently wish to select documents based on whether a specified sparsely-populated field has a value or not. Lucene should provide specific Query subclasses that optimize for these two cases, rather than force users to guess which workaround might be most efficient. It is simplest for users to use a pure wildcard term to check for non-empty fields, or a negated pure wildcard term to check for empty fields, but it has been suggested that this can be rather inefficient, especially for text fields with many terms.

1. Add NonEmptyFieldQuery - selects all documents that have a value for the specified field.
2. Add EmptyFieldQuery - selects all documents that do not have a value for the specified field.

The query parsers could turn a pure wildcard query (asterisk only) into a NonEmptyFieldQuery, and a negated pure wildcard query into an EmptyFieldQuery. Alternatively, maybe PrefixQuery could detect a pure wildcard and automatically rewrite it into a NonEmptyFieldQuery. My assumption is that if the actual values of the field are not needed, Lucene can much more efficiently simply detect whether values are present, rather than, for example, the user having to create a separate boolean "has value" field that they would query for true or false.
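The negate semantics Uwe describes can be sketched standalone (illustrative only; this is not the real org.apache.lucene.search.FieldValueFilter, which operates on index structures rather than document maps):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the field-presence selection discussed above:
// select documents by whether a field has a value, optionally negated.
public class FieldPresence {

    /** negate=false keeps docs that have the field; negate=true keeps docs that don't. */
    public static List<Map<String, String>> filter(
            List<Map<String, String>> docs, String field, boolean negate) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            boolean hasValue = doc.get(field) != null;
            if (hasValue != negate) {
                out.add(doc);
            }
        }
        return out;
    }
}
```

The point of the filter-based approach is exactly this single presence check per document, instead of enumerating every term of the field as a pure-wildcard PrefixQuery would.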
[jira] [Comment Edited] (LUCENE-4376) Add Query subclasses for selecting documents where a field is empty or not
[ https://issues.apache.org/jira/browse/LUCENE-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453811#comment-13453811 ]

Uwe Schindler edited comment on LUCENE-4376 at 9/12/12 6:50 PM:

The filter is already there; only the QueryParser does not support it. To make this work for your use case, you can override Lucene's/Solr's QueryParser to return a ConstantScoreQuery wrapping the LUCENE-3593 filter as a replacement for the field:* only query. The positive and negative variants work by passing the boolean to the filter. To conclude: the Query is already there, so there is no need for the 2 new classes. The wanted functionality is:

{code:java}
new ConstantScoreQuery(new FieldValueFilter(String field, boolean negate))
{code}

To find all documents with any term in the field use negate=false, otherwise negate=true. There is absolutely no need for a new Query class.

bq. Okay, so would it be straightforward and super-efficient for PrefixQuery to do exactly that if the prefix term is zero-length?

It would be straightforward, but we should not do this as the default (although PrefixQuery could rewrite to that). The problem is that it implicitly needs to build the FieldCache for that field, so such automatism is a no-go here. If you need that functionality, modify QueryParser.

was (Author: thetaphi):

The filter is already there; only the QueryParser does not support it. To make this work for your use case, you can override Lucene's/Solr's QueryParser to return a ConstantScoreQuery wrapping the LUCENE-3593 filter as a replacement for the field:* only query. The positive and negative variants work by passing the boolean to the filter. To conclude: the Query is already there, so there is no need for the 2 new classes. The wanted functionality is:

{code:java}
new ConstantScoreQuery(new FieldValueFilter(String field, boolean negate))
{code}

To find all documents with any term in the field use negate=false, otherwise negate=true. There is absolutely no need for a new Query class.

bq. Okay, so would it be straightforward and super-efficient for PrefixQuery to do exactly that if the prefix term is zero-length?

That's super-slow, as it will search for all terms in the field. This is what e.g. Solr currently does for field:* queries. Solr should use the filter, too; that would make this much more efficient.
[jira] [Resolved] (LUCENE-4252) Detect/Fail tests when they leak RAM in static fields
[ https://issues.apache.org/jira/browse/LUCENE-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss resolved LUCENE-4252.

Resolution: Fixed
Fix Version/s: 5.0, 4.0

Detect/Fail tests when they leak RAM in static fields

Key: LUCENE-4252
URL: https://issues.apache.org/jira/browse/LUCENE-4252
Project: Lucene - Core
Issue Type: Test
Components: general/test
Reporter: Robert Muir
Assignee: Dawid Weiss
Fix For: 5.0, 4.0
Attachments: LUCENE-4252.patch, LUCENE-4252.patch, sfi.patch

We run our JUnit tests without firing up a new JVM each time. But some tests initialize lots of stuff in @BeforeClass and don't properly null it out in an @AfterClass, which can cause a subsequent test in the same JVM to OOM, which is difficult to debug. Inspiration for this was me committing Mike's cool TestPostingsFormat, which forgot to do this: we were then seeing OOMs in several Jenkins runs. We should try to detect these leaks in LuceneTestCase with RAMUsageEstimator and fail the test.
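The check described above can be sketched with plain reflection (the actual implementation lives in LuceneTestCase and uses RAMUsageEstimator to measure byte sizes; class and method names here are illustrative):

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: after a test class finishes, scan its static reference
// fields and flag any that were left non-null (i.e. not cleaned up in
// @AfterClass, so they keep their object graphs reachable in the shared JVM).
public class StaticLeakCheck {

    /** Returns names of static reference fields still holding objects. */
    public static List<String> leakedFields(Class<?> testClass) throws IllegalAccessException {
        List<String> leaks = new ArrayList<>();
        for (Field f : testClass.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) continue;
            if (f.getType().isPrimitive()) continue; // primitives can't pin an object graph
            f.setAccessible(true);
            if (f.get(null) != null) {
                leaks.add(f.getName());
            }
        }
        return leaks;
    }

    /** Hypothetical fixture: a test class that forgot to null its static state. */
    static class LeakyTest {
        static byte[] bigBuffer = new byte[1024]; // leaked: never nulled in @AfterClass
        static int counter = 42;                  // primitive, ignored by the check
    }
}
```

The real check goes further and fails the test only when the leaked static state exceeds a size threshold, which is where RAMUsageEstimator comes in.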
[jira] [Commented] (LUCENE-4345) Create a Classification module
[ https://issues.apache.org/jira/browse/LUCENE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453853#comment-13453853 ]

Tommaso Teofili commented on LUCENE-4345:

bq. Can we remove the ClassificationException? It only seems to box IOException... we can just throw IOException directly instead?

Sure, we can keep IOException for now.

bq. What is the scale that you expect this bayesian classifier to handle? How many training documents does it need?

I'm doing some benchmarking these days, so I should be able to say something about this shortly.

Create a Classification module

Key: LUCENE-4345
URL: https://issues.apache.org/jira/browse/LUCENE-4345
Project: Lucene - Core
Issue Type: New Feature
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
Priority: Minor
Attachments: LUCENE-4345_2.patch, LUCENE-4345.patch, SOLR-3700_2.patch, SOLR-3700.patch

Lucene/Solr can host huge sets of documents containing lots of information in fields, so these can be used as training examples (with features) in order to very quickly create classifiers to use on new documents and/or to provide an additional service. So the idea is to create a contrib module (called 'classification') to host a ClassificationComponent that will use already-seen data (the indexed documents/fields) to classify new documents/text fragments. The first version will contain a (simplistic) Lucene-based Naive Bayes classifier, but more implementations should be added in the future.
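For illustration, the kind of classifier the first version describes can be sketched as a standalone multinomial naive Bayes with add-one smoothing (the real module trains against indexed Lucene fields; all names here are illustrative, not the module's API):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Minimal multinomial naive Bayes sketch: trains on in-memory token lists and
// classifies by log prior + summed log likelihoods with Laplace smoothing.
public class NaiveBayes {
    private final Map<String, Map<String, Integer>> tokenCounts = new HashMap<>();
    private final Map<String, Integer> classTotals = new HashMap<>(); // tokens per class
    private final Map<String, Integer> docCounts = new HashMap<>();   // docs per class
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    public void train(String label, List<String> tokens) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts = tokenCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String t : tokens) {
            counts.merge(t, 1, Integer::sum);
            classTotals.merge(label, 1, Integer::sum);
            vocabulary.add(t);
        }
    }

    public String classify(List<String> tokens) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : tokenCounts.keySet()) {
            double score = Math.log(docCounts.get(label) / (double) totalDocs); // log prior
            for (String t : tokens) {
                int count = tokenCounts.get(label).getOrDefault(t, 0);
                // add-one (Laplace) smoothing so unseen tokens don't zero out the class
                score += Math.log((count + 1.0) / (classTotals.get(label) + vocabulary.size()));
            }
            if (score > bestScore) { bestScore = score; best = label; }
        }
        return best;
    }
}
```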
Re: Collator-based facet sorting in Solr
On Wed, Sep 12, 2012 at 3:44 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
> I seem to be missing something here. The ICUCollationKeyFilter can be at the end of the analyzer chain, so why can't the input be normalized before entering this filter?

ICUCollationKeyFilter is gone.

--
lucidworks.com
[jira] [Commented] (LUCENE-4345) Create a Classification module
[ https://issues.apache.org/jira/browse/LUCENE-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453899#comment-13453899 ]

Tommaso Teofili commented on LUCENE-4345:

Side note: it seems a bit old, but I just realized something similar had been done in LUCENE-1039; maybe both implementations could then be added in the future.
Re: Collator-based facet sorting in Solr
On Wed, Sep 12, 2012 at 3:44 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
> That would be a serious impediment. For some of our uncontrolled fields, the same word can be cased very differently: CD, cd, Cd. To be on the safe side, the client would have to ask for 3 times the wanted amount of facet information. But if we cannot normalize at index time, de-duplication on the server would require changes to the faceting code.

I'll open an issue for this. We should at least fix the analysis factory APIs to support it, even if the Solr configuration XML doesn't yet have syntax.

> Regardless, it sounds like the idea passes the initial sanity check. Should I open a JIRA issue for it?

I think you should. As an ugly workaround to the above problem: you could actually construct a Lucene Analyzer with KeywordTokenizer(ICUCollationAtt) followed by LowerCase/etc/etc and load that up with analyzer class= in Solr. I think that will work fine.

--
lucidworks.com
[jira] [Created] (LUCENE-4379) Add AttributeFactory parameter to TokenizerFactory.create()
Robert Muir created LUCENE-4379:

Summary: Add AttributeFactory parameter to TokenizerFactory.create()
Key: LUCENE-4379
URL: https://issues.apache.org/jira/browse/LUCENE-4379
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir

Currently the analysis factories don't support using a different attribute factory.
[jira] [Created] (SOLR-3828) Query Elevation component boosts excluded results in markExcludes mode
Alexey Serba created SOLR-3828:

Summary: Query Elevation component boosts excluded results in markExcludes mode
Key: SOLR-3828
URL: https://issues.apache.org/jira/browse/SOLR-3828
Project: Solr
Issue Type: Bug
Components: SearchComponents - other
Affects Versions: 4.0-BETA
Reporter: Alexey Serba
Priority: Trivial
Fix For: 4.0

The Query Elevation component boosts excluded results in markExcludes=true mode, causing them to rank higher in the results than they should.
[jira] [Updated] (SOLR-3828) Query Elevation component boosts excluded results in markExcludes mode
[ https://issues.apache.org/jira/browse/SOLR-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serba updated SOLR-3828:

Attachment: SOLR-3828.patch

Attached patch (fix + test).
[jira] [Commented] (LUCENE-4377) consolidate various copyBytes() methods
[ https://issues.apache.org/jira/browse/LUCENE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453901#comment-13453901 ]

Michael McCandless commented on LUCENE-4377:

+1

consolidate various copyBytes() methods

Key: LUCENE-4377
URL: https://issues.apache.org/jira/browse/LUCENE-4377
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
Fix For: 5.0, 4.0
Attachments: LUCENE-4377.patch

Spinoff of LUCENE-4371:

{quote}
I don't think the default impl (SlicedIndexInput) should override BII's copyBytes? Seems ... spooky.
{quote}

There are copyBytes methods everywhere, mostly not really being used: particularly DataOutput.copyBytes(DataInput) versus IndexInput.copyBytes(IndexOutput). Bulk merging already uses DataOutput.copyBytes(DataInput); it's the most general (as it works on DataInput/Output), and it's in dst, src order. I think we should remove IndexInput.copyBytes; it's not necessary.
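The dst, src, stream-level copy the issue argues for can be sketched as follows (illustrative signature on java.io streams, not Lucene's DataOutput API):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch of a single general-purpose copy in (dst, src) order,
// working on abstract streams so one implementation serves every concrete
// input/output pair -- the shape of consolidation proposed above.
public class CopyBytes {

    public static void copyBytes(OutputStream dst, InputStream src, long numBytes)
            throws IOException {
        byte[] buffer = new byte[8192];
        while (numBytes > 0) {
            int chunk = (int) Math.min(buffer.length, numBytes);
            int read = src.read(buffer, 0, chunk);
            if (read == -1) {
                throw new EOFException("read past EOF with " + numBytes + " bytes left");
            }
            dst.write(buffer, 0, read);
            numBytes -= read;
        }
    }
}
```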
[jira] [Commented] (LUCENE-4377) consolidate various copyBytes() methods
[ https://issues.apache.org/jira/browse/LUCENE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453902#comment-13453902 ]

Uwe Schindler commented on LUCENE-4377:

+1, this has annoyed me for a long time!
[jira] [Resolved] (LUCENE-2163) Remove synchronized from DirReader.reopen/clone
[ https://issues.apache.org/jira/browse/LUCENE-2163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2163.

Resolution: Fixed
Fix Version/s: (was: 4.1) 5.0, 4.0

Remove synchronized from DirReader.reopen/clone

Key: LUCENE-2163
URL: https://issues.apache.org/jira/browse/LUCENE-2163
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Reporter: Michael McCandless
Priority: Minor
Fix For: 5.0, 4.0
Attachments: LUCENE-2163.patch

Spinoff from LUCENE-2161, where the fact that DirReader.reopen is sync'd was dangerous in the context of NRT (it could block all searches against that reader when CMS was throttling). So, with LUCENE-2161, we're removing the synchronization when it's an NRT reader that you're reopening. But... why should we sync even for a normal reopen? There are various sync'd methods on IndexReader/DirReader (we are reducing that, with LUCENE-2161 and also LUCENE-2156), but in general it doesn't seem like a normal reopen really needs to be sync'd. Performing a reopen shouldn't incur any chance of blocking a search...
[jira] [Resolved] (LUCENE-2925) modules/* are excluded from the versioned site javadocs
[ https://issues.apache.org/jira/browse/LUCENE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-2925.

Resolution: Fixed
Fix Version/s: (was: 4.1) 5.0, 4.0

This is working now.

modules/* are excluded from the versioned site javadocs

Key: LUCENE-2925
URL: https://issues.apache.org/jira/browse/LUCENE-2925
Project: Lucene - Core
Issue Type: Bug
Components: general/website, modules/analysis, modules/benchmark
Affects Versions: 4.0-ALPHA
Reporter: Steven Rowe
Fix For: 5.0, 4.0

The {{javadocs}} target in {{lucene/build.xml}} builds javadocs for the versioned website, including for Lucene core and all contribs under {{lucene/contrib/}}. Nothing under {{modules/}} is included, but all modules there should be.
[jira] [Created] (SOLR-3829) Admin UI Logging events broken if schema.xml defines a catch-all dynamicField with type ignored
Andreas Hubold created SOLR-3829:

Summary: Admin UI Logging events broken if schema.xml defines a catch-all dynamicField with type ignored
Key: SOLR-3829
URL: https://issues.apache.org/jira/browse/SOLR-3829
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 4.0-BETA
Reporter: Andreas Hubold

The Solr Admin page does not show any log events. There are JavaScript errors:

{noformat}
TypeError: doc.logger.esc is not a function
... 'abbr title=' + doc.logger.esc() + '' + doc.logger.split( '.' ).pop().esc()...
{noformat}

This is because the response of the LoggingHandler added unexpected {{[ ... ]}} characters around the values for time, level, logger and message:

{noformat}
... history:{numFound:2,start:0,docs:[{time:[2012-09-11T15:07:05.453Z],level:[WARNING],logger:[org.apache.solr.core.SolrCore],message:[New index directory detected: ...
{noformat}

This is caused by the way the JSON is created. org.apache.solr.logging.LogWatcher#toSolrDocument creates a SolrDocument which is then formatted with an org.apache.solr.response.JSONResponseWriter. But the JSONResponseWriter uses the index schema to decide how to format the JSON. We have the following field declaration in schema.xml:

{noformat}
dynamicField name=* type=ignored /
{noformat}

The field type ignored has the attribute multiValued set to true. Because of this, JSONResponseWriter adds [] characters in org.apache.solr.response.JSONWriter#writeSolrDocument. The formatting should be independent of schema.xml.
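The bracket-wrapping behavior can be sketched in isolation (a hypothetical writer, not Solr's JSONWriter):

```java
// Illustrative sketch of the formatting difference described above: when a
// schema-driven writer treats a field as multiValued, every value is wrapped
// in a JSON array, which is what broke the admin UI's expectations for
// single-valued log fields like level and logger.
public class JsonFieldWriter {

    public static String writeField(String name, String value, boolean multiValued) {
        String quoted = "\"" + value + "\"";
        return "\"" + name + "\":" + (multiValued ? "[" + quoted + "]" : quoted);
    }
}
```

With the catch-all `dynamicField name=* type=ignored` in place, every log field matches a multiValued type, so each value comes back as `["..."]` instead of `"..."`.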
[jira] [Commented] (SOLR-3367) Show Logging Events in Admin UI
[ https://issues.apache.org/jira/browse/SOLR-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453919#comment-13453919 ]

Andreas Hubold commented on SOLR-3367:

This feature is broken in Solr 4.0-BETA - at least with certain schema.xml files. See SOLR-3829.

Show Logging Events in Admin UI

Key: SOLR-3367
URL: https://issues.apache.org/jira/browse/SOLR-3367
Project: Solr
Issue Type: New Feature
Components: web gui
Reporter: Ryan McKinley
Assignee: Stefan Matheis (steffkes)
Fix For: 4.0-ALPHA
Attachments: SOLR-3367.patch, SOLR-3367.patch, SOLR-3367.patch, SOLR-3367.patch, SOLR-3367.png

We can show logging events in the Admin UI.
[jira] [Resolved] (LUCENE-3000) Lucene release artifacts should be named apache-lucene-*
[ https://issues.apache.org/jira/browse/LUCENE-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-3000.

Resolution: Won't Fix

Lucene release artifacts should be named apache-lucene-*

Key: LUCENE-3000
URL: https://issues.apache.org/jira/browse/LUCENE-3000
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 4.0-ALPHA
Reporter: Grant Ingersoll
Priority: Minor
Fix For: 4.1

Our artifact names should be prefixed with apache-, as in apache-lucene-4.0-src.tar.gz (or whatever).
[jira] [Resolved] (LUCENE-4377) consolidate various copyBytes() methods
[ https://issues.apache.org/jira/browse/LUCENE-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-4377.

Resolution: Fixed
[jira] [Updated] (LUCENE-4371) consider refactoring slicer to indexinput.slice
[ https://issues.apache.org/jira/browse/LUCENE-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4371:

Attachment: LUCENE-4371.patch

Just syncing the patch up to trunk. Part of the funkiness I don't like is e.g. that NIOFSIndexInput extends SimpleFSIndexInput. This is not good. I will see if I can clear that up in a separate issue.

consider refactoring slicer to indexinput.slice

Key: LUCENE-4371
URL: https://issues.apache.org/jira/browse/LUCENE-4371
Project: Lucene - Core
Issue Type: Task
Reporter: Robert Muir
Attachments: LUCENE-4371.patch, LUCENE-4371.patch

From LUCENE-4364:

{quote}
In my opinion, we should maybe check if we can remove the whole Slicer in all IndexInputs? Just make the slice(...) method return the current BufferedIndexInput-based one. This could be another issue, once this is in.
{quote}
[jira] [Created] (LUCENE-4380) fix simplefs/niofs hierarchy
Robert Muir created LUCENE-4380: --- Summary: fix simplefs/niofs hierarchy Key: LUCENE-4380 URL: https://issues.apache.org/jira/browse/LUCENE-4380 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Attachments: LUCENE-4380.patch spinoff from LUCENE-4371: Currently NIOFSDirectory.NIOFSIndexInput extends SimpleFSDirectory.SimpleFSIndexInput, but this isn't an is-a relationship at all. Additionally SimpleFSDirectory has a funky Descriptor class that extends RandomAccessFile that is useless: {noformat} /** * Extension of RandomAccessFile that tracks if the file is * open. */ ... // remember if the file is open, so that we don't try to close it // more than once {noformat} RandomAccessFile is closeable, this is not necessary and I don't think we should be subclassing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4380) fix simplefs/niofs hierarchy
[ https://issues.apache.org/jira/browse/LUCENE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4380: Attachment: LUCENE-4380.patch Here's a patch: I factored the shared logic into an FSIndexInput (parallel with FSIndexOutput) instead. fix simplefs/niofs hierarchy --- Key: LUCENE-4380 URL: https://issues.apache.org/jira/browse/LUCENE-4380 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Attachments: LUCENE-4380.patch spinoff from LUCENE-4371: Currently NIOFSDirectory.NIOFSIndexInput extends SimpleFSDirectory.SimpleFSIndexInput, but this isn't an is-a relationship at all. Additionally SimpleFSDirectory has a funky Descriptor class that extends RandomAccessFile that is useless: {noformat} /** * Extension of RandomAccessFile that tracks if the file is * open. */ ... // remember if the file is open, so that we don't try to close it // more than once {noformat} RandomAccessFile is closeable, this is not necessary and I don't think we should be subclassing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3823) Parentheses in a boost query cause errors
[ https://issues.apache.org/jira/browse/SOLR-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454028#comment-13454028 ] James Dyer commented on SOLR-3823: -- Hoss, I appreciate you fixing this, but I would rather get a fix that preserves the negative boost support (SOLR-3278). I guess I don't understand the bug this issue was addressing. Is it simply that bq would fail if extra whitespace was in the query? Could we write a failing testcase for that? Do you see a reason why it would be difficult to fix this and retain the negative boosts? The discussion of LUCENE-4378 is pertinent: we have products in our index that we either do not sell or we know most of our customers do not want. Yet they often score very high. The only way I can reliably prevent these from becoming top hits is to use a negative boost. I would imagine this is a frequent requirement. I'm more than willing to contribute for this, but I couldn't tell that this issue was an actual problem or a case of users putting whitespace where it doesn't belong and prior versions being more forgiving. Parentheses in a boost query cause errors - Key: SOLR-3823 URL: https://issues.apache.org/jira/browse/SOLR-3823 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0-BETA Environment: Mac, jdk 1.6, Chrome Reporter: Mathos Marcer Assignee: Hoss Man Fix For: 4.0, 5.0 When using a boost query (bq) that contains parentheses (like this example from the Relevancy Cookbook section of the wiki): {noformat} ? defType = dismax q = foo bar bq = (*:* -xxx)^999 {noformat} You get the following error: org.apache.lucene.queryparser.classic.ParseException: Cannot parse '-xxx)': Encountered ) ) at line 1, column 12. Was expecting one of: EOF AND ... OR ... NOT ... + ... - ... BAREOPER ... ( ... * ... ^ ... QUOTED ... TERM ... FUZZY_SLOP ... PREFIXTERM ... WILDTERM ... REGEXPTERM ... [ ... { ... NUMBER ... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4373) BBoxStrategy should support query shapes of any type
[ https://issues.apache.org/jira/browse/LUCENE-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454036#comment-13454036 ] David Smiley commented on LUCENE-4373: -- As part of this, I think a makeValueSource() might be modified to alter the area similarity to consider the query shape's percentage of the bbox that it fills. Perhaps something like this:
{code:java}
public ValueSource makeValueSource(SpatialArgs args) {
  Shape shape = args.getShape();
  double queryPowerFactor = 1;
  if (!(shape instanceof Rectangle)) {
    double queryBBoxArea = shape.getBoundingBox().getArea(ctx);
    double queryArea = shape.getArea(ctx);
    if (queryBBoxArea != 0)
      queryPowerFactor = queryArea / queryBBoxArea;
  }
  return new BBoxSimilarityValueSource(
      this,
      new AreaSimilarity(shape.getBoundingBox(), queryPower * queryPowerFactor, targetPower));
}
{code}
BBoxStrategy should support query shapes of any type Key: LUCENE-4373 URL: https://issues.apache.org/jira/browse/LUCENE-4373 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Priority: Minor It's great that BBoxStrategy has sophisticated shape area similarity based on bounding box, but I think that doesn't have to preclude having a non-rectangular query shape. The bbox to bbox query implemented already is probably pretty fast, as it can work via numeric range queries, but I'd like this to be the first stage, with the second being a FieldCache-based comparison to the query shape if it's not a rectangle. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
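[Editorial note: as a quick sanity check on the factor in the snippet above, for a circular query shape the ratio queryArea / queryBBoxArea works out to pi/4, since a circle of radius r covers pi*r^2 of its (2r x 2r) bounding box. A standalone illustration with a hypothetical helper, not part of the patch:]

```java
// Illustrative helper mirroring the queryPowerFactor computation in the snippet above:
// the fraction of a query shape's bounding box that the shape itself covers.
class AreaFactorSketch {
    public static double queryPowerFactor(double shapeArea, double bboxArea) {
        if (bboxArea == 0) return 1;  // degenerate bbox: leave the power unchanged
        return shapeArea / bboxArea;
    }

    public static void main(String[] args) {
        // a circle of radius r inside its 2r x 2r bounding box covers pi/4 of it
        double r = 3.0;
        double factor = queryPowerFactor(Math.PI * r * r, (2 * r) * (2 * r));
        System.out.println(factor); // ~0.7854
    }
}
```

So a shape that nearly fills its bbox leaves the query power essentially unchanged, while a thin diagonal shape (tiny area, large bbox) would shrink the power factor accordingly.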
[jira] [Updated] (LUCENE-4173) Remove IgnoreIncompatibleGeometry for SpatialStrategys
[ https://issues.apache.org/jira/browse/LUCENE-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-4173: - Attachment: LUCENE-4173_remove_IgnoreIncompatibleGeometry,_fail_when_given_the_exact_shape_needed.patch Updated the patch: * renamed the test method with the underscore to be convertShapeFromGetDocuments instead * In BBoxStrategy.makeValueSource, I moved my TODO bbox shape similarity idea to a comment on a JIRA issue. And I modified this makeValueSource to fail if a rectangle is not given, instead of coalescing via getBoundingBox(). Remove IgnoreIncompatibleGeometry for SpatialStrategys -- Key: LUCENE-4173 URL: https://issues.apache.org/jira/browse/LUCENE-4173 Project: Lucene - Core Issue Type: Bug Components: modules/spatial Reporter: Chris Male Assignee: David Smiley Fix For: 4.0 Attachments: LUCENE-4173.patch, LUCENE-4173_remove_ignoreIncompatibleGeometry,_fail_when_given_the_exact_shape_needed.patch, LUCENE-4173_remove_IgnoreIncompatibleGeometry,_fail_when_given_the_exact_shape_needed.patch Silently not indexing anything for a Shape is not okay. Users should get an Exception and then they can decide how to proceed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-3830) Rename LFUCache to FastLFUCache
Adrien Grand created SOLR-3830: -- Summary: Rename LFUCache to FastLFUCache Key: SOLR-3830 URL: https://issues.apache.org/jira/browse/SOLR-3830 Project: Solr Issue Type: Bug Affects Versions: 4.0-BETA Reporter: Adrien Grand Priority: Minor I find it a little disturbing that LFUCache shares most of its behavior (not strictly bounded size, good at concurrent reads, slow at writes unless eviction is performed in a separate thread) with FastLRUCache while it sounds like it is the LFU equivalent of LRUCache (strictly bounded size, synchronized reads, fast writes) so I'd like to rename it to FastLFUCache. Maybe we should also rename these Fast*Cache to Concurrent*Cache so that people don't think that they are better than their non Fast alternatives in every way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
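[Editorial note: the "strictly bounded size, synchronized reads, fast writes" design attributed to LRUCache above can be sketched in a few lines. This is an illustration of the trade-off only, not Solr's implementation; the class name is hypothetical.]

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of an LRUCache-style design: strictly bounded size, eviction on
// insert, and a single lock that every read must take. Not Solr's code.
class BoundedLru<K, V> {
    private final Map<K, V> map;

    public BoundedLru(final int maxSize) {
        // access-order LinkedHashMap keeps entries ordered by recency of use
        map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;  // evict the least-recently-used entry
            }
        };
    }

    public synchronized V get(K key) { return map.get(key); }              // reads contend on one lock
    public synchronized void put(K key, V value) { map.put(key, value); }  // writes are cheap
}
```

The reads-contend-on-one-lock line is the whole story: a ConcurrentHashMap-based "Fast" variant trades that lock away for cheap concurrent reads, at the cost of approximate sizing and more expensive eviction, which is exactly the behavioral split the issue description draws.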
[jira] [Commented] (SOLR-3823) Parentheses in a boost query cause errors
[ https://issues.apache.org/jira/browse/SOLR-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454059#comment-13454059 ] Erick Erickson commented on SOLR-3823: -- James: The problem was quite the opposite. When there was NO space in the bq clause it'd fail like this, i.e. bq=(stuff). And when there was space, I don't think it worked at all But yeah, it'd be good to have both parens and negative boosts... Parentheses in a boost query cause errors - Key: SOLR-3823 URL: https://issues.apache.org/jira/browse/SOLR-3823 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0-BETA Environment: Mac, jdk 1.6, Chrome Reporter: Mathos Marcer Assignee: Hoss Man Fix For: 4.0, 5.0 When using a boost query (bq) that contains a parentheses (like this example from the Relevancy Cookbook section of the wiki): {noformat} ? defType = dismax q = foo bar bq = (*:* -xxx)^999 {noformat} You get the following error: org.apache.lucene.queryparser.classic.ParseException: Cannot parse '-xxx)': Encountered ) ) at line 1, column 12. Was expecting one of: EOF AND ... OR ... NOT ... + ... - ... BAREOPER ... ( ... * ... ^ ... QUOTED ... TERM ... FUZZY_SLOP ... PREFIXTERM ... WILDTERM ... REGEXPTERM ... [ ... { ... NUMBER ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3830) Rename LFUCache to FastLFUCache
[ https://issues.apache.org/jira/browse/SOLR-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454067#comment-13454067 ] Yonik Seeley commented on SOLR-3830: bq. so I'd like to rename it to FastLFUCache. +1, it hasn't been used in the default solrconfig.xml, so this change shouldn't really affect anyone. I don't think we should rename the other ones that people are likely to have in their configs already though. There is a note right next to where one would configure these caches that tries to explain the difference. We should update that if it's not sufficient. Rename LFUCache to FastLFUCache --- Key: SOLR-3830 URL: https://issues.apache.org/jira/browse/SOLR-3830 Project: Solr Issue Type: Bug Affects Versions: 4.0-BETA Reporter: Adrien Grand Priority: Minor I find it a little disturbing that LFUCache shares most of its behavior (not strictly bounded size, good at concurrent reads, slow at writes unless eviction is performed in a separate thread) with FastLRUCache while it sounds like it is the LFU equivalent of LRUCache (strictly bounded size, synchronized reads, fast writes) so I'd like to rename it to FastLFUCache. Maybe we should also rename these Fast*Cache to Concurrent*Cache so that people don't think that they are better than their non Fast alternatives in every way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1384000 - /lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java
Thanks! This was overlooked... - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: rm...@apache.org [mailto:rm...@apache.org] Sent: Wednesday, September 12, 2012 5:35 PM To: comm...@lucene.apache.org Subject: svn commit: r1384000 - /lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java Author: rmuir Date: Wed Sep 12 15:34:56 2012 New Revision: 1384000 URL: http://svn.apache.org/viewvc?rev=1384000&view=rev Log: LUCENE-4306: dont upgrade this method to public in BaseCompositeReader Modified: lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java Modified: lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java?rev=1384000&r1=1383999&r2=1384000&view=diff == --- lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java (original) +++ lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/BaseCompositeReader.java Wed Sep 12 15:34:56 2012 @@ -151,7 +151,7 @@ public abstract class BaseCompositeReade } @Override - public final List<? extends R> getSequentialSubReaders() { + protected final List<? extends R> getSequentialSubReaders() { return subReadersList; } } - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3830) Rename LFUCache to FastLFUCache
[ https://issues.apache.org/jira/browse/SOLR-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454085#comment-13454085 ] Hoss Man commented on SOLR-3830: -1. Repeating my comment from SOLR-3393... {quote} #OhDearGodPleaseNotAnotherClassWithFastInTheName Please, please, please lets end the madness of subjective adjectives in class names ... if it's an LFU cache wrapped around a hawtdb why don't we just call it HawtDbLFUCache ? {quote} we should not be adding new names with Fast in front of them - it does nothing to help the user understand the value of the class. {quote} Maybe we should also rename these Fast*Cache to Concurrent*Cache so that people don't think that they are better than their non Fast alternatives in every way. {quote} I would much rather rename FastLRUCache to something else (with a deprecated FastLRUCache stub subclass still provided for config backcompat) then see any more a new Fast*Foo class. Rename LFUCache to FastLFUCache --- Key: SOLR-3830 URL: https://issues.apache.org/jira/browse/SOLR-3830 Project: Solr Issue Type: Bug Affects Versions: 4.0-BETA Reporter: Adrien Grand Priority: Minor I find it a little disturbing that LFUCache shares most of its behavior (not strictly bounded size, good at concurrent reads, slow at writes unless eviction is performed in a separate thread) with FastLRUCache while it sounds like it is the LFU equivalent of LRUCache (strictly bounded size, synchronized reads, fast writes) so I'd like to rename it to FastLFUCache. Maybe we should also rename these Fast*Cache to Concurrent*Cache so that people don't think that they are better than their non Fast alternatives in every way. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Error when Integrating wordnet to Lucene
Hi all, I want to integrate WordNet 3.0 with Lucene 4.0. This is my code: 1. String op = new Scanner(new File("E:\\...\\WNprolog-3.0\\prolog\\wn_s.pl")).useDelimiter("\\Z").next(); 2. WordnetSynonymParser parser = new WordnetSynonymParser(true, true, new StandardAnalyzer(Version.LUCENE_40)); 3. parser.add(new StringReader(op)); 4. SynonymMap map = parser.build(); But when the 3rd line was executed, I got this error: "Invalid synonym rule at line 109". I don't know what the cause is. Could you please help me with this problem? Thank you so much. -- View this message in context: http://lucene.472066.n3.nabble.com/Error-when-Integrating-wordnet-to-Lucene-tp4007141.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
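[Editorial note: for context on what the parser is reading, wn_s.pl entries are Prolog facts of the form s(100002137,1,'abstraction',n,6,0)., and a rule fails to parse when a line does not match that shape; checking line 109 of the file itself is the obvious first step. A minimal sketch of extracting the quoted word from such a line, illustrative only and not WordnetSynonymParser's code (the class and method names are made up):]

```java
// Illustrative only -- not WordnetSynonymParser's code. Pulls the quoted word
// out of a WordNet prolog fact such as: s(100002137,1,'abstraction',n,6,0).
class WnLineSketch {
    public static String parseWord(String line) {
        int start = line.indexOf('\'');
        int end = line.lastIndexOf('\'');
        if (start < 0 || end <= start) {
            // mirrors the kind of failure reported above when a line has no quoted word
            throw new IllegalArgumentException("Invalid synonym rule: " + line);
        }
        // Prolog doubles quotes inside quoted atoms, e.g. 'o''clock' for o'clock
        return line.substring(start + 1, end).replace("''", "'");
    }
}
```

Entries whose word contains punctuation such as an embedded quote are a plausible place for a line-specific parse failure, so inspecting the exact line the exception names usually identifies the culprit quickly.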
[jira] [Commented] (SOLR-3823) Parentheses in a boost query cause errors
[ https://issues.apache.org/jira/browse/SOLR-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454104#comment-13454104 ] Hoss Man commented on SOLR-3823: bq. I couldn't tell that this issue was an actual problem or a case of users putting whitespace where it doesn't belong and prior versions being more forgiving. James: the core of the bug was your use of SolrPluginUtils.parseFieldBoosts to try and parse the bq params. This is not safe -- if you look at the method it is an extremely trivial utility that is specific for parsing qf/pf style strings containing a list of field names and boosts. It's _not_ a safe way to parse an arbitrary query string, and any non-trivial query string can cause problems with it. As you noted in SOLR-3278, parseFieldBoosts is used for parsing the bf param and that's actually a long-standing unsafe bug as well (SOLR-2014) but since functions tend to be much simpler, it's historically been less problematic. When people run into problems with it, the workaround is to use bq={!func}... instead. bq. I would rather get a fix that preserves the negative boost support Since SOLR-3278 had not been released publicly outside of the ALPHA/BETA, my first priority was to fix the regression compared to 3.x where non-trivial bq queries worked fine. The documented method of dealing with negative boosting in Solr is actually the type of query that was the crux of this bug report, and I updated the tests you added in SOLR-3278 to use that pattern... 
https://wiki.apache.org/solr/SolrRelevancyFAQ#How_do_I_give_a_negative_.28or_very_low.29_boost_to_documents_that_match_a_query.3F I have no objections to supporting true negative boosts, but I think the right way to do it is in the query parsers / QParsers themselves (so that the boosts can be on any clause) and not just as a special hack for bq/bf (the fact that it works in bf is actually just a fluke of its buggy implementation) but as you can see in LUCENE-4378 this is a contentious idea. Parentheses in a boost query cause errors - Key: SOLR-3823 URL: https://issues.apache.org/jira/browse/SOLR-3823 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0-BETA Environment: Mac, jdk 1.6, Chrome Reporter: Mathos Marcer Assignee: Hoss Man Fix For: 4.0, 5.0 When using a boost query (bq) that contains parentheses (like this example from the Relevancy Cookbook section of the wiki): {noformat} ? defType = dismax q = foo bar bq = (*:* -xxx)^999 {noformat} You get the following error: org.apache.lucene.queryparser.classic.ParseException: Cannot parse '-xxx)': Encountered ) ) at line 1, column 12. Was expecting one of: EOF AND ... OR ... NOT ... + ... - ... BAREOPER ... ( ... * ... ^ ... QUOTED ... TERM ... FUZZY_SLOP ... PREFIXTERM ... WILDTERM ... REGEXPTERM ... [ ... { ... NUMBER ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4173) Remove IgnoreIncompatibleGeometry for SpatialStrategys
[ https://issues.apache.org/jira/browse/LUCENE-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved LUCENE-4173. -- Resolution: Fixed I received Chris's blessing on these changes in chat and I committed now. Trunk: r1384026, 4x: r1384028 Remove IgnoreIncompatibleGeometry for SpatialStrategys -- Key: LUCENE-4173 URL: https://issues.apache.org/jira/browse/LUCENE-4173 Project: Lucene - Core Issue Type: Bug Components: modules/spatial Reporter: Chris Male Assignee: David Smiley Fix For: 4.0 Attachments: LUCENE-4173.patch, LUCENE-4173_remove_ignoreIncompatibleGeometry,_fail_when_given_the_exact_shape_needed.patch, LUCENE-4173_remove_IgnoreIncompatibleGeometry,_fail_when_given_the_exact_shape_needed.patch Silently not indexing anything for a Shape is not okay. Users should get an Exception and then they can decide how to proceed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3823) Parentheses in a boost query cause errors
[ https://issues.apache.org/jira/browse/SOLR-3823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454126#comment-13454126 ] James Dyer commented on SOLR-3823: -- Hoss, Thank you for working through this and opening Lucene-4378 to at least investigate changing the parser grammar. I understand the issue with what I had done initially and appreciate your help on this. Parentheses in a boost query cause errors - Key: SOLR-3823 URL: https://issues.apache.org/jira/browse/SOLR-3823 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0-BETA Environment: Mac, jdk 1.6, Chrome Reporter: Mathos Marcer Assignee: Hoss Man Fix For: 4.0, 5.0 When using a boost query (bq) that contains a parentheses (like this example from the Relevancy Cookbook section of the wiki): {noformat} ? defType = dismax q = foo bar bq = (*:* -xxx)^999 {noformat} You get the following error: org.apache.lucene.queryparser.classic.ParseException: Cannot parse '-xxx)': Encountered ) ) at line 1, column 12. Was expecting one of: EOF AND ... OR ... NOT ... + ... - ... BAREOPER ... ( ... * ... ^ ... QUOTED ... TERM ... FUZZY_SLOP ... PREFIXTERM ... WILDTERM ... REGEXPTERM ... [ ... { ... NUMBER ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1223) Query Filter fq with OR operator
[ https://issues.apache.org/jira/browse/SOLR-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454135#comment-13454135 ] Ron Buchanan commented on SOLR-1223: If you care for input from a nobody that's fairly new to Solr, I like Hoss Man's idea - and I very, very much want this Though my thought was that it would make sense to use the v=$paramName facility and just add multiple instances of paramName Query Filter fq with OR operator Key: SOLR-1223 URL: https://issues.apache.org/jira/browse/SOLR-1223 Project: Solr Issue Type: New Feature Components: search Reporter: Brian Pearson See this [issue|http://lucene.472066.n3.nabble.com/Query-Filter-fq-with-OR-operator-td499172.html] for some background. Today, all of the Query filters specified with the fq parameter are AND'd together. This issue is about allowing a set of filters to be OR'd together (in addition to having another set of filters that are AND'd). The OR'd filters would of course be applied before any scoring is done. The advantage of this feature is that you will be able to break up complex filters into simple, more cacheable filters, which should improve performance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
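[Editorial note: the proposal can be pictured with plain bitsets: each cached filter is a set of matching doc ids, the OR'd group is their union, and that union intersects the main query's matches before scoring. A sketch of the idea only, not Solr code.]

```java
import java.util.BitSet;

// Sketch of the feature request (not Solr's implementation): individually cached
// filter bitsets are OR'd into one composite filter, which is then applied to
// the main query's matches before any scoring happens.
class OrFilters {
    public static BitSet union(BitSet... filters) {
        BitSet acc = new BitSet();
        for (BitSet f : filters) acc.or(f);  // OR the cached filters together
        return acc;
    }

    public static void main(String[] args) {
        BitSet inStock = new BitSet(); inStock.set(0); inStock.set(2);
        BitSet onSale  = new BitSet(); onSale.set(1);
        BitSet either = union(inStock, onSale);   // docs 0, 1, 2

        BitSet queryMatches = new BitSet(); queryMatches.set(1); queryMatches.set(3);
        queryMatches.and(either);                 // the OR'd filter restricts the query
        System.out.println(queryMatches);         // {1}
    }
}
```

The caching win the issue describes falls out of this decomposition: `inStock` and `onSale` are each simple and highly reusable across requests, whereas a single pre-OR'd filter query would be cached as one monolithic, less reusable entry.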
[jira] [Commented] (SOLR-3830) Rename LFUCache to FastLFUCache
[ https://issues.apache.org/jira/browse/SOLR-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454154#comment-13454154 ] Yonik Seeley commented on SOLR-3830: OK, let's leave things as they are then. Documentation is the key if we need to clarify anything. Rename LFUCache to FastLFUCache --- Key: SOLR-3830 URL: https://issues.apache.org/jira/browse/SOLR-3830 Project: Solr Issue Type: Bug Affects Versions: 4.0-BETA Reporter: Adrien Grand Priority: Minor I find it a little disturbing that LFUCache shares most of its behavior (not strictly bounded size, good at concurrent reads, slow at writes unless eviction is performed in a separate thread) with FastLRUCache while it sounds like it is the LFU equivalent of LRUCache (strictly bounded size, synchronized reads, fast writes) so I'd like to rename it to FastLFUCache. Maybe we should also rename these Fast*Cache to Concurrent*Cache so that people don't think that they are better than their non Fast alternatives in every way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454167#comment-13454167 ] Robert Muir commented on LUCENE-4369: - How about WholeTextField? thats fine with me. Does anyone object? StringFields name is unintuitive and not helpful Key: LUCENE-4369 URL: https://issues.apache.org/jira/browse/LUCENE-4369 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-4369.patch There's a huge difference between TextField and StringField, StringField screws up scoring and bypasses your Analyzer. (see java-user thread Custom Analyzer Not Called When Indexing as an example.) The name we use here is vital, otherwise people will get bad results. I think we should rename StringField to MatchOnlyField. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454172#comment-13454172 ] Robert Muir commented on LUCENE-4369: - ok just a few downsides of 'whole': * it seems similar to full, like full-text field. but StringField is not that. * then what is TextField, only partial? Guys i realistically dont think we are going to come up with a perfect name here that everyone likes. But I think enough people agree that StringField is bad. I seriously propose ASDFGHIJField in the interim, we gotta make some incremental progress. StringFields name is unintuitive and not helpful Key: LUCENE-4369 URL: https://issues.apache.org/jira/browse/LUCENE-4369 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-4369.patch There's a huge difference between TextField and StringField, StringField screws up scoring and bypasses your Analyzer. (see java-user thread Custom Analyzer Not Called When Indexing as an example.) The name we use here is vital, otherwise people will get bad results. I think we should rename StringField to MatchOnlyField. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454175#comment-13454175 ] Uwe Schindler commented on LUCENE-4369: --- WholeTextField sounds like Starbucks... I would like UntokenizedField. StringFields name is unintuitive and not helpful Key: LUCENE-4369 URL: https://issues.apache.org/jira/browse/LUCENE-4369 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-4369.patch There's a huge difference between TextField and StringField, StringField screws up scoring and bypasses your Analyzer. (see java-user thread Custom Analyzer Not Called When Indexing as an example.) The name we use here is vital, otherwise people will get bad results. I think we should rename StringField to MatchOnlyField. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454186#comment-13454186 ] Steven Rowe commented on LUCENE-4369: - Some more choices: AsIsTextField, IntactTextField, UnSoiledTextField, HalfCaffLatteField StringFields name is unintuitive and not helpful Key: LUCENE-4369 URL: https://issues.apache.org/jira/browse/LUCENE-4369 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-4369.patch There's a huge difference between TextField and StringField, StringField screws up scoring and bypasses your Analyzer. (see java-user thread Custom Analyzer Not Called When Indexing as an example.) The name we use here is vital, otherwise people will get bad results. I think we should rename StringField to MatchOnlyField. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454194#comment-13454194 ] Shai Erera commented on LUCENE-4369: bq. I would like UntokenizedField +1 for that. I don't think we should underestimate Lucene users to the point that they don't understand what an Analyzer is, or what tokenization means. When they create an IWC, they need to specify an Analyzer. I think, seriously, that Analyzer is as basic as Document.
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454200#comment-13454200 ] Robert Muir commented on LUCENE-4369: - I am +1 for UntokenizedField too. This is much more intuitive than StringField!
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454204#comment-13454204 ] Hoss Man commented on LUCENE-4369: -- Didn't we specifically get rid of enums called TOKENIZED and UN_TOKENIZED because they convoluted the concept of tokenization with analysis? Weren't there users who wanted keyword tokenization combined with other token filters who thought UN_TOKENIZED was what they wanted? Perhaps TextField should be renamed AnalyzedTextField and StringField should be NonAnalyzedTextField?
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454205#comment-13454205 ] Shai Erera commented on LUCENE-4369: Great, then do we have a winner? :)
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454236#comment-13454236 ] Uwe Schindler commented on LUCENE-4369: --- I never understood the difference and why this was renamed in 2.4. For me the issue explains nothing, and the mailing list thread referenced from there is, in my opinion, unrelated. I am also fine with replacing tokenized with analyzed. Inert question: why is it called Tokenizer and not Analyzerator?
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454242#comment-13454242 ] Erick Erickson commented on LUCENE-4369: Shai: bq: ...I don't think we should underestimate Lucene users to the point that they don't understand what an Analyzer... I absolutely agree with you about _Lucene_ users, but I disagree when we're talking about _Solr_ users who are just using the schema.xml file. I flat-out guarantee that they don't always look under the covers. I've seen way more than one site with solr rocks as the firstSearcher/newSearcher queries. But all that said, I'm not doing the work, so whatever gets chosen is fine with me.
[jira] [Commented] (SOLR-2608) TestReplicationHandler is flakey
[ https://issues.apache.org/jira/browse/SOLR-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454243#comment-13454243 ] Hoss Man commented on SOLR-2608: I can't comment on the specific exceptions mentioned above, but having recently looked at TestReplicationHandler because of SOLR-3809 I noticed a few things I thought I'd comment on here... At some point it was annotated as @Slow - I believe the crux of why it can be very slow for some people is that the majority of the functionality being tested relies on the slave polling the master for replication, and the rQuery method used throughout the test will retry queries over and over (up to 30 seconds) until they pass. While we should definitely have some test that the polling works, a lot of the non-polling-specific functionality could probably be tested more reliably using on-demand snappull commands to the slave. TestReplicationHandler is flakey Key: SOLR-2608 URL: https://issues.apache.org/jira/browse/SOLR-2608 Project: Solr Issue Type: Bug Reporter: selckin I've been running some while(1) tests on trunk, and TestReplicationHandler is very flakey: it fails about every 10th run.
Probably not a bug, but the test not waiting correctly. {code}
[junit] Testsuite: org.apache.solr.handler.TestReplicationHandler
[junit] Testcase: org.apache.solr.handler.TestReplicationHandler: FAILED
[junit] ERROR: SolrIndexSearcher opens=48 closes=47
[junit] junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=48 closes=47
[junit] at org.apache.solr.SolrTestCaseJ4.endTrackingSearchers(SolrTestCaseJ4.java:131)
[junit] at org.apache.solr.SolrTestCaseJ4.afterClassSolrTestCase(SolrTestCaseJ4.java:74)
[junit]
[junit] Tests run: 8, Failures: 1, Errors: 0, Time elapsed: 40.772 sec
[junit]
[junit] - Standard Error -
[junit] 19-Jun-2011 21:26:44 org.apache.solr.handler.SnapPuller fetchLatestIndex
[junit] SEVERE: Master at: http://localhost:51817/solr/replication is not available. Index fetch failed. Exception: Connection refused
[junit] 19-Jun-2011 21:26:49 org.apache.solr.common.SolrException log
[junit] SEVERE: java.util.concurrent.RejectedExecutionException
[junit] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768)
[junit] at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
[junit] at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
[junit] at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:92)
[junit] at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:603)
[junit] at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1149)
[junit] at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:346)
[junit] at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:483)
[junit] at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:332)
[junit] at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:267)
[junit] at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:166)
[junit] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
[junit] at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
[junit] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
[junit] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit]
[junit] 19-Jun-2011 21:26:51 org.apache.solr.update.SolrIndexWriter finalize
[junit] SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
[junit] 19-Jun-2011 21:26:51 org.apache.solr.common.util.ConcurrentLRUCache finalize
[junit] SEVERE: ConcurrentLRUCache was not destroyed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
{code}
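The retry-queries-until-they-pass behavior Hoss describes (the test's rQuery method) is a generic poll-until-timeout loop. The sketch below is illustrative only, with hypothetical names, not the actual Solr test code:

```java
import java.util.function.Supplier;

// Sketch of a retry-until-pass helper: repeat a check until it succeeds
// or a timeout expires, which is why a polling-based test can burn up to
// 30 seconds per query when the slave hasn't replicated yet.
public class RetryUntil {
    static boolean retryUntil(Supplier<Boolean> check, long timeoutMs, long sleepMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.get()) return true;   // condition met: stop early
            try {
                Thread.sleep(sleepMs);      // back off before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return check.get();                 // one last attempt at the deadline
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Condition becomes true on the third poll, well before the deadline.
        boolean ok = retryUntil(() -> ++calls[0] >= 3, 5000, 10);
        System.out.println(ok); // true
    }
}
```

An on-demand snappull, by contrast, makes the replication point deterministic, so the check can run once instead of polling.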
[jira] [Updated] (SOLR-3809) Replication of config files fails when using sub directories
[ https://issues.apache.org/jira/browse/SOLR-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-3809: --- Attachment: SOLR-3809.patch I modified TestReplicationHandler to demonstrate the original bug Emmanuel mentioned, and then merged in his patch to show that it fixed the problem -- however I then modified the fix quite a bit, as it was doing some wonky stuff (like equality comparisons between a string path and a File object). I think this patch is good to go. Replication of config files fails when using sub directories Key: SOLR-3809 URL: https://issues.apache.org/jira/browse/SOLR-3809 Project: Solr Issue Type: Bug Reporter: Emmanuel Espina Assignee: Hoss Man Fix For: 4.0 Attachments: SOLR-3809.patch, SOLR-3809.patch If you want to replicate a configuration file inside a subdirectory of the conf directory (e.g. conf/stopwords/english.txt), Solr fails because it cannot find the subdirectory.
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454252#comment-13454252 ] Steven Rowe commented on LUCENE-4369: - bq. I never understood the difference and why this was renamed in 2.4. For me the issue explains nothing and the mailing list thread referenced from there is in my opinion unrelated. Yeah, no. Totally related, see e.g. http://mail-archives.apache.org/mod_mbox/lucene-java-user/200808.mbox/%3c184419b1-6589-41cb-b5d4-3ea9c4215...@mikemccandless.com%3E
[jira] [Commented] (LUCENE-4369) StringFields name is unintuitive and not helpful
[ https://issues.apache.org/jira/browse/LUCENE-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454254#comment-13454254 ] Hoss Man commented on LUCENE-4369: -- bq. the mailing list thread referenced from there is in my opinion unrelated. Did you read the whole thread? It's littered with comments about confusion over how UN_TOKENIZED related to the Analyzer configured on the IndexWriter -- some people thought it meant the *tokenizer* in the Analyzer wouldn't be used, but the rest of their analyzer would. It's very representative of lots of other threads I'd seen over the years. bq. I disagree when we're talking about Solr users who are just using the schema.xml file I don't think anyone is talking about changing solr.StrField and solr.TextField -- this issue is about the convenience subclasses of oal.document.Field... https://lucene.apache.org/core/4_0_0-BETA/core/org/apache/lucene/document/Field.html
[jira] [Commented] (SOLR-3830) Rename LFUCache to FastLFUCache
[ https://issues.apache.org/jira/browse/SOLR-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454291#comment-13454291 ] Adrien Grand commented on SOLR-3830: bq. we should not be adding new names with Fast in front of them This is why I also suggested renaming FastLRUCache to ConcurrentLRUCache in my second paragraph (or something else, I'm open to other ideas). bq. OK, let's leave things as they are then. Documentation is the key if we need to clarify anything. Why don't you like renaming FastLRUCache to something else and adding a deprecated FastLRUCache subclass for backward compatibility, as Chris suggests? Rename LFUCache to FastLFUCache --- Key: SOLR-3830 URL: https://issues.apache.org/jira/browse/SOLR-3830 Project: Solr Issue Type: Bug Affects Versions: 4.0-BETA Reporter: Adrien Grand Priority: Minor I find it a little disturbing that LFUCache shares most of its behavior (not strictly bounded size, good at concurrent reads, slow at writes unless eviction is performed in a separate thread) with FastLRUCache, while it sounds like it is the LFU equivalent of LRUCache (strictly bounded size, synchronized reads, fast writes), so I'd like to rename it to FastLFUCache. Maybe we should also rename these Fast*Cache to Concurrent*Cache so that people don't think that they are better than their non-Fast alternatives in every way.
[jira] [Commented] (SOLR-3830) Rename LFUCache to FastLFUCache
[ https://issues.apache.org/jira/browse/SOLR-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454299#comment-13454299 ] Yonik Seeley commented on SOLR-3830: I have a higher bar for renaming things in config files and APIs. Solr has a large user base with tons of people who know what things do, and we often overlook the downside of destroying collective knowledge by renaming things that are only a slight improvement. I personally think Lucene has gone rename-crazy and wouldn't do many of those renames if it were up to me...
[jira] [Resolved] (SOLR-3830) Rename LFUCache to FastLFUCache
[ https://issues.apache.org/jira/browse/SOLR-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved SOLR-3830. Resolution: Won't Fix Given that we can neither rename LFUCache to FastLFUCache nor rename FastLRUCache to something else, I am marking this issue as won't fix since there is no way to have a consistent name for these two classes.
[jira] [Updated] (SOLR-3815) add hash range to shard
[ https://issues.apache.org/jira/browse/SOLR-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-3815: --- Attachment: SOLR-3815_addrange.patch Here's a start on adding ranges to shard properties. Seems to work at first but then currently gets lost on an update. Example: {code}
{collection1:{
  shard1:{replicas:{Rogue:8983_solr_collection1:{
      shard:shard1,
      leader:true,
      roles:null,
      state:active,
      core:collection1,
      collection:collection1,
      node_name:Rogue:8983_solr,
      base_url:http://Rogue:8983/solr}}},
  shard2:{
    range:0-7fff,
    replicas:{
{code} add hash range to shard --- Key: SOLR-3815 URL: https://issues.apache.org/jira/browse/SOLR-3815 Project: Solr Issue Type: Sub-task Reporter: Yonik Seeley Attachments: SOLR-3815_addrange.patch, SOLR-3815.patch
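The range property above marks the slice of the hash space a shard owns (shard2 getting the non-negative half in the example). As a rough sketch of the idea, not Solr's actual routing code, dividing the signed 32-bit hash space into N equal shard ranges looks like:

```java
// Sketch (illustrative names, not Solr code): map a document's 32-bit hash
// into one of N equal slices of the signed int range, the way a shard's
// "range" property describes which hashes it owns.
public class HashRangeSketch {
    // Returns the shard index (0..numShards-1) whose range contains hash.
    static int shardFor(int hash, int numShards) {
        long min = Integer.MIN_VALUE;
        long span = 0x100000000L / numShards;      // width of each slice
        long idx = (hash - min) / span;            // offset from range start
        return (int) Math.min(idx, numShards - 1); // clamp the last slice
    }

    public static void main(String[] args) {
        // With two shards, negative hashes land in shard 0 and
        // non-negative hashes (the 0..7fffffff half) in shard 1.
        System.out.println(shardFor(-1, 2)); // 0
        System.out.println(shardFor(0, 2));  // 1
    }
}
```

Persisting the range alongside the replicas (as the patch attempts) means routing survives even when the shard count or layout changes later.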
[jira] [Created] (SOLR-3831) atomic updates to fields of type payloads do not distribute correctly
Jim Musil created SOLR-3831: --- Summary: atomic updates to fields of type payloads do not distribute correctly Key: SOLR-3831 URL: https://issues.apache.org/jira/browse/SOLR-3831 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-BETA Environment: linux Reporter: Jim Musil After setting up two independent Solr nodes using the SolrCloud tutorial, atomic updates to a field of type payloads give an error when updating the destination node. The error is: SEVERE: java.lang.NumberFormatException: For input string: 100} The input sent to the first node is in the expected default format for a payload field (e.g. foo|100) and that update succeeds. I've found that the update always works for the first node, but never the second. I've tested each server running independently and found that this update works as expected.
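For context, a payloads field expects each token in token|payload form (the foo|100 in the report), with the part after the delimiter parsed as a number. The sketch below is illustrative only (not Solr's DelimitedPayloadTokenFilter); it shows how a stray trailing brace from a serialization envelope would surface as exactly the NumberFormatException above:

```java
// Hypothetical parser for the "token|payload" wire format: splits on the
// last '|' and parses the payload as a float. A trailing '}' leaked in
// from the forwarded update (as in the reported "100}") makes the numeric
// parse throw NumberFormatException.
public class PayloadParse {
    static float parsePayload(String tokenAndPayload) {
        int bar = tokenAndPayload.lastIndexOf('|');
        if (bar < 0) {
            throw new IllegalArgumentException("no delimiter: " + tokenAndPayload);
        }
        return Float.parseFloat(tokenAndPayload.substring(bar + 1));
    }

    public static void main(String[] args) {
        System.out.println(parsePayload("foo|100")); // parses cleanly
        try {
            parsePayload("foo|100}"); // mirrors the reported failure
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

That the first node succeeds while the forwarded update fails suggests the distributed-update serialization, not the field type itself, is adding the extra character.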
[jira] [Commented] (SOLR-3796) I am getting 404 when accessing http://localhost:7101/wcoe-solr/admin
[ https://issues.apache.org/jira/browse/SOLR-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454405#comment-13454405 ] Erick Erickson commented on SOLR-3796: -- Is this still a problem? This is probably better raised on the user's list before making this a JIRA. I am getting 404 when accessing http://localhost:7101/wcoe-solr/admin - Key: SOLR-3796 URL: https://issues.apache.org/jira/browse/SOLR-3796 Project: Solr Issue Type: Bug Components: Build, web gui Affects Versions: 3.6.1 Environment: Windows XP/WebLogic Reporter: Sridharan I deployed solr.war successfully in WebLogic 9. I got the welcome page when I access http://localhost:7101/wcoe-solr/ but I get a 404 error when I access the admin page http://localhost:7101/wcoe-solr/admin. Please help.
Admin UI, schema browser, numbers squished together...
When I go into the new Admin UI schema browser and select a field that has lots of terms in it, the display isn't correct. The count field to the left of each term value is cut off, making it very hard to actually see the term counts. I'm in a situation where I have many thousands of docs that have a particular term. Worth a JIRA? I didn't see any relevant ones on a fast scan. Erick
[jira] [Updated] (LUCENE-4208) Spatial distance relevancy should use score of 1/distance
[ https://issues.apache.org/jira/browse/LUCENE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-4208: - Attachment: LUCENE-4208_makeQuery_return_ConstantScoreQuery_and_remake_TwoDoublesStrategy.patch This patch is the start of something I hope to finish tonight. makeValueSource is renamed to makeDistanceValueSource to make its purpose abundantly clear. TwoDoubles is getting overhauled to support the dateline and any query shape -- that should probably go into another issue. Spatial distance relevancy should use score of 1/distance - Key: LUCENE-4208 URL: https://issues.apache.org/jira/browse/LUCENE-4208 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Fix For: 4.0 Attachments: LUCENE-4208_makeQuery_return_ConstantScoreQuery_and_remake_TwoDoublesStrategy.patch The SpatialStrategy.makeQuery() at the moment uses the distance as the score (although some strategies -- TwoDoubles, if I recall -- might not do anything, which would be a bug). The distance is a poor value to use as the score because the score should be related to relevancy, and the distance itself is inversely related to that. A score of 1/distance would be nice. Another alternative is earthCircumference/2 - distance, although I like 1/distance better. Maybe use a different constant than 1. Credit: this is Chris Male's idea.
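The two scoring formulas discussed in this issue can be compared with a small standalone sketch. This is not the Lucene ValueSource implementation; the c/(c+d) form is a variant of 1/distance that avoids dividing by zero at distance 0 (in the spirit of "maybe use a different constant than 1"):

```java
// Illustrative comparison of reciprocal vs. linear distance-to-score
// mappings (NOT Lucene's actual spatial scoring code). Both make closer
// documents score higher; reciprocal is bounded in (0, 1] and never
// negative, while the linear form hits exactly 0 at the antipode.
public class DistanceScore {
    static final double HALF_EARTH_CIRCUMFERENCE_KM = 20037.5; // approx.

    // Reciprocal: 1.0 at distance 0, decays smoothly toward 0.
    static double reciprocal(double distanceKm, double c) {
        return c / (c + distanceKm);
    }

    // Linear alternative from the issue: earthCircumference/2 - distance.
    static double linear(double distanceKm) {
        return HALF_EARTH_CIRCUMFERENCE_KM - distanceKm;
    }

    public static void main(String[] args) {
        System.out.println(reciprocal(0, 10));  // 1.0
        System.out.println(reciprocal(90, 10)); // 0.1
        System.out.println(linear(0));          // 20037.5
    }
}
```

The constant c also controls how sharply the reciprocal decays: a small c heavily favors very close documents, while a large c spreads scores more evenly.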
[jira] [Created] (LUCENE-4381) support unicode 6.2
Robert Muir created LUCENE-4381: --- Summary: support unicode 6.2 Key: LUCENE-4381 URL: https://issues.apache.org/jira/browse/LUCENE-4381 Project: Lucene - Core Issue Type: Task Components: modules/analysis Reporter: Robert Muir Fix For: 4.1, 5.0 ICU will release a new version in about a month. They have a version for testing (http://site.icu-project.org/download/milestone) already out with some interesting features, e.g. dictionary-based CJK segmentation. This issue is just to test it out/integrate the new stuff/etc. We should try out the automation Steve did as well.
[jira] [Updated] (LUCENE-4381) support unicode 6.2
[ https://issues.apache.org/jira/browse/LUCENE-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4381: Attachment: LUCENE-4381.patch A hacked-up patch for testing: I think it's nice to offer the CJK dictionary-based stuff as an option? I'm not sure how good results will be on average yet (maybe I can enlist Christian to help investigate). So as a test I just added a boolean option which, if enabled, keeps all han/hiragana/katakana marked as Chinese/Japanese (it uses the ISO 15924 Japanese code, but I overrode the toString to try to prevent confusion). Seems to work ok: some trivial snippets from smartcn and kuromoji are analyzed fine, and testRandomStrings is happy :)
[jira] [Commented] (LUCENE-4380) fix simplefs/niofs hierarchy
[ https://issues.apache.org/jira/browse/LUCENE-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454495#comment-13454495 ] Michael McCandless commented on LUCENE-4380: +1, this is a very nice simplification. fix simplefs/niofs hierarchy --- Key: LUCENE-4380 URL: https://issues.apache.org/jira/browse/LUCENE-4380 Project: Lucene - Core Issue Type: Task Reporter: Robert Muir Attachments: LUCENE-4380.patch Spinoff from LUCENE-4371: Currently NIOFSDirectory.NIOFSIndexInput extends SimpleFSDirectory.SimpleFSIndexInput, but this isn't an is-a relationship at all. Additionally, SimpleFSDirectory has a funky Descriptor class that extends RandomAccessFile that is useless: {noformat}
/**
 * Extension of RandomAccessFile that tracks if the file is
 * open.
 */
...
// remember if the file is open, so that we don't try to close it
// more than once
{noformat} RandomAccessFile is closeable, so this is not necessary, and I don't think we should be subclassing it.
[jira] [Created] (LUCENE-4382) Unicode escape no longer works for non-prefix wildcard terms
Jack Krupansky created LUCENE-4382:
-----------------------------------
Summary: Unicode escape no longer works for non-prefix wildcard terms
Key: LUCENE-4382
URL: https://issues.apache.org/jira/browse/LUCENE-4382
Project: Lucene - Core
Issue Type: Bug
Components: core/queryparser
Affects Versions: 4.0-BETA
Reporter: Jack Krupansky
Fix For: 4.0

LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. Support Unicode de-escaping in WildcardQuery.

A suffix wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser:

{code}
assertQ("expected doc is missing (using escaped edismax w/field)",
    req("q", "t_special:literal\\:\\u0063olo*n", "defType", "edismax"),
    "//doc[1]/str[@name='id'][.='46']");
{code}
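The de-escaping step the issue says went missing can be sketched in isolation. Below is a minimal, hypothetical de-escaper (the class and method names are mine, not Lucene's) handling both simple backslash escapes and the four-hex-digit \u form, roughly the behavior QueryParserBase applies before calling getPrefixQuery:

```java
// Hypothetical sketch of query-parser-style de-escaping; not Lucene's actual code.
public class UnicodeDeEscape {
    static String deEscape(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 5 < s.length() && s.charAt(i + 1) == 'u') {
                // \uXXXX: decode four hex digits into a single char
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else if (c == '\\' && i + 1 < s.length()) {
                // simple escape (e.g. \: or \*): keep the escaped char literally
                out.append(s.charAt(i + 1));
                i += 2;
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // the \u0063 from the test case decodes to 'c', leaving the * as a wildcard
        System.out.println(deEscape("\\u0063olo*n")); // colo*n
    }
}
```

With this in place, the pattern from the test case, `literal\:\u0063olo*n`, de-escapes to `literal:colo*n` with only the unescaped `*` left for WildcardQuery to interpret.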
[jira] [Updated] (LUCENE-4382) Unicode escape no longer works for non-prefix wildcard terms
[ https://issues.apache.org/jira/browse/LUCENE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Krupansky updated LUCENE-4382: --- Description: LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. Support Unicode de-escaping in WildcardQuery. A suffix wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser: {code} assertQ(expected doc is missing (using escaped edismax w/field), req(q, t_special:literal\\:\\u0063olo*n, defType, edismax), //doc[1]/str[@name='id'][.='46']); {code} Note: That test case was only used to debug into WildcardQuery to see that the Unicode escape was not processed correctly. It fails in all cases, but that's because of how the field type is analyzed. Here is a Lucene-level test case that can also be debugged to see that WildcardQuery is not processing the Unicode escape properly. I added it at the start of TestMultiAnalyzer.testMultiAnalyzer: {code} assertEquals(literal\\:\\u0063olo*n, qp.parse(literal\\:\\u0063olo*n).toString()); {code} Note: This case will always run correctly since it is only checking the input pattern string for WildcardQuery and not how the de-escaping was performed within WildcardQuery. was: LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. 
Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. Support Unicode de-escaping in WildcardQuery. A suffix wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser: {code} assertQ(expected doc is missing (using escaped edismax w/field), req(q, t_special:literal\\:\\u0063olo*n, defType, edismax), //doc[1]/str[@name='id'][.='46']); {code} Unicode escape no longer works for non-prefix wildcard terms Key: LUCENE-4382 URL: https://issues.apache.org/jira/browse/LUCENE-4382 Project: Lucene - Core Issue Type: Bug Components: core/queryparser Affects Versions: 4.0-BETA Reporter: Jack Krupansky Fix For: 4.0 LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. Support Unicode de-escaping in WildcardQuery. A suffix wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser: {code} assertQ(expected doc is missing (using escaped edismax w/field), req(q, t_special:literal\\:\\u0063olo*n, defType, edismax), //doc[1]/str[@name='id'][.='46']); {code} Note: That test case was only used to debug into WildcardQuery to see that the Unicode escape was not processed correctly. It fails in all cases, but that's because of how the field type is analyzed. Here is a Lucene-level test case that can also be debugged to see that WildcardQuery is not processing the Unicode escape properly. 
I added it at the start of TestMultiAnalyzer.testMultiAnalyzer:

{code}
assertEquals("literal\\:\\u0063olo*n", qp.parse("literal\\:\\u0063olo*n").toString());
{code}

Note: This case will always run correctly since it is only checking the input pattern string for WildcardQuery and not how the de-escaping was performed within WildcardQuery.
[jira] [Updated] (LUCENE-4382) Unicode escape no longer works for non-suffix-only wildcard terms
[ https://issues.apache.org/jira/browse/LUCENE-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jack Krupansky updated LUCENE-4382: --- Description: LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. Support Unicode de-escaping in WildcardQuery. A suffix-only wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser: {code} assertQ(expected doc is missing (using escaped edismax w/field), req(q, t_special:literal\\:\\u0063olo*n, defType, edismax), //doc[1]/str[@name='id'][.='46']); {code} Note: That test case was only used to debug into WildcardQuery to see that the Unicode escape was not processed correctly. It fails in all cases, but that's because of how the field type is analyzed. Here is a Lucene-level test case that can also be debugged to see that WildcardQuery is not processing the Unicode escape properly. I added it at the start of TestMultiAnalyzer.testMultiAnalyzer: {code} assertEquals(literal\\:\\u0063olo*n, qp.parse(literal\\:\\u0063olo*n).toString()); {code} Note: This case will always run correctly since it is only checking the input pattern string for WildcardQuery and not how the de-escaping was performed within WildcardQuery. was: LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. 
Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. Support Unicode de-escaping in WildcardQuery. A suffix wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser: {code} assertQ(expected doc is missing (using escaped edismax w/field), req(q, t_special:literal\\:\\u0063olo*n, defType, edismax), //doc[1]/str[@name='id'][.='46']); {code} Note: That test case was only used to debug into WildcardQuery to see that the Unicode escape was not processed correctly. It fails in all cases, but that's because of how the field type is analyzed. Here is a Lucene-level test case that can also be debugged to see that WildcardQuery is not processing the Unicode escape properly. I added it at the start of TestMultiAnalyzer.testMultiAnalyzer: {code} assertEquals(literal\\:\\u0063olo*n, qp.parse(literal\\:\\u0063olo*n).toString()); {code} Note: This case will always run correctly since it is only checking the input pattern string for WildcardQuery and not how the de-escaping was performed within WildcardQuery. Summary: Unicode escape no longer works for non-suffix-only wildcard terms (was: Unicode escape no longer works for non-prefix wildcard terms) Unicode escape no longer works for non-suffix-only wildcard terms - Key: LUCENE-4382 URL: https://issues.apache.org/jira/browse/LUCENE-4382 Project: Lucene - Core Issue Type: Bug Components: core/queryparser Affects Versions: 4.0-BETA Reporter: Jack Krupansky Fix For: 4.0 LUCENE-588 added support for escaping of wildcard characters, but when the de-escaping logic was pushed down from the query parser (QueryParserBase) into WildcardQuery, support for Unicode escaping (backslash, u, and the four-digit hex Unicode code) was not included. Two solutions: 1. Do the Unicode de-escaping in the query parser before calling getWildcardQuery. 2. 
Support Unicode de-escaping in WildcardQuery. A suffix-only wildcard does not exhibit this problem because full de-escaping is performed in the query parser before calling getPrefixQuery. My test case, added at the beginning of TestExtendedDismaxParser.testFocusQueryParser: {code} assertQ(expected doc is missing (using escaped edismax w/field), req(q, t_special:literal\\:\\u0063olo*n, defType, edismax), //doc[1]/str[@name='id'][.='46']); {code} Note: That test case was only used to debug into WildcardQuery to see that the Unicode escape was not processed correctly. It fails in all cases, but that's because of how the field type is analyzed.
[jira] [Commented] (SOLR-3589) Edismax parser does not honor mm parameter if analyzer splits a token
[ https://issues.apache.org/jira/browse/SOLR-3589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454533#comment-13454533 ] Naomi Dushay commented on SOLR-3589:

I may have stumbled into something. Try setting q.op explicitly. (baseurl)/select?q=fire-fly gives me a lot more results than (baseurl)/select?q=fire-fly&q.op=AND. Oddly, q.op=OR gives me the same results as setting it to AND.

Why did I stumble into this? From http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29:

In Solr 1.4 and prior, you should basically set mm=0 if you want the equivalent of q.op=OR, and mm=100% if you want the equivalent of q.op=AND. In 3.x and trunk the default value of mm is dictated by the q.op param (q.op=AND => mm=100%; q.op=OR => mm=0%). Keep in mind the default operator is affected by your schema.xml <solrQueryParser defaultOperator="xxx"/> entry. In older versions of Solr the default value is 100% (all clauses must match).

I have q.op set in my schema, thus: <solrQueryParser defaultOperator="AND"/> but when I use the q.op parameter, I experience something different. Wild! Does this give us any insights?

Edismax parser does not honor mm parameter if analyzer splits a token
----------------------------------------------------------------------
Key: SOLR-3589
URL: https://issues.apache.org/jira/browse/SOLR-3589
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 3.6, 4.0-BETA
Reporter: Tom Burton-West
Attachments: testSolr3589.xml.gz, testSolr3589.xml.gz

With edismax mm set to 100%, if one of the tokens is split into two tokens by the analyzer chain (i.e. fire-fly => fire fly), the mm parameter is ignored and the equivalent of an OR query for "fire OR fly" is produced. This is particularly a problem for languages that do not use white space to separate words, such as Chinese or Japanese.
See these messages for more discussion:
http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-hypenated-words-WDF-splitting-etc-tc3991911.html
http://lucene.472066.n3.nabble.com/edismax-parser-ignores-mm-parameter-when-tokenizer-splits-tokens-i-e-CJK-tc3991438.html
http://lucene.472066.n3.nabble.com/Why-won-t-dismax-create-multiple-DisjunctionMaxQueries-when-autoGeneratePhraseQueries-is-false-tc3992109.html
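The wiki rule Naomi quotes (when mm is absent, q.op=AND implies mm=100% and q.op=OR implies mm=0%) is mechanical enough to sketch. The class and method below are an illustrative stand-in of my own, not Solr's actual implementation:

```java
// Hypothetical sketch of the 3.x/trunk default-mm rule quoted from the wiki;
// names and structure are illustrative, not Solr's real code.
public class DefaultMm {
    static String effectiveMm(String mmParam, String qOp) {
        if (mmParam != null) {
            return mmParam;  // an explicit mm parameter always wins
        }
        // no mm given: derive the default from q.op
        return "AND".equalsIgnoreCase(qOp) ? "100%" : "0%";
    }

    public static void main(String[] args) {
        System.out.println(effectiveMm(null, "AND"));    // 100%
        System.out.println(effectiveMm(null, "OR"));     // 0%
        System.out.println(effectiveMm("2<75%", "AND")); // 2<75%
    }
}
```

Under this rule, setting q.op=AND (or defaultOperator="AND" in schema.xml) should behave like mm=100%, which is why the bug report's observation that q.op=OR and q.op=AND return the same results looks surprising.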
[jira] [Commented] (LUCENE-4208) Spatial distance relevancy should use score of 1/distance
[ https://issues.apache.org/jira/browse/LUCENE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454581#comment-13454581 ] Chris Male commented on LUCENE-4208:

bq. TwoDoubles is getting overhauled to support the dateline and any query shape--should probably go into another issue.

Yes please!

Spatial distance relevancy should use score of 1/distance
----------------------------------------------------------
Key: LUCENE-4208
URL: https://issues.apache.org/jira/browse/LUCENE-4208
Project: Lucene - Core
Issue Type: New Feature
Components: modules/spatial
Reporter: David Smiley
Fix For: 4.0
Attachments: LUCENE-4208_makeQuery_return_ConstantScoreQuery_and_remake_TwoDoublesStrategy.patch

The SpatialStrategy.makeQuery() at the moment uses the distance as the score (although some strategies -- TwoDoubles if I recall -- might not do anything, which would be a bug). The distance is a poor value to use as the score because the score should be related to relevancy, and the distance itself is inversely related to that. A score of 1/distance would be nice. Another alternative is earthCircumference/2 - distance, although I like 1/distance better. Maybe use a different constant than 1. Credit: this is Chris Male's idea.
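For intuition, the two scoring shapes the issue compares can be sketched directly. The k/(k+d) form below is a common reciprocal variant that avoids the divide-by-zero a literal 1/distance would hit at distance 0; the constant k, the 40075 km circumference figure, and all names here are my assumptions, not the patch's:

```java
// Illustrative sketch of the two relevancy shapes discussed in LUCENE-4208;
// not the actual SpatialStrategy code.
public class DistanceScore {
    // Reciprocal relevancy: closer docs score higher, score is 1.0 at distance 0
    // and decays toward 0; k controls how fast it decays.
    static double reciprocal(double distanceKm, double k) {
        return k / (k + distanceKm);
    }

    // The linear alternative from the issue: earthCircumference/2 - distance.
    static double linear(double distanceKm) {
        double halfCircumference = 40075.0 / 2;  // Earth's circumference in km (approx.)
        return halfCircumference - distanceKm;
    }

    public static void main(String[] args) {
        System.out.println(reciprocal(0, 10));   // 1.0
        System.out.println(reciprocal(10, 10));  // 0.5
    }
}
```

Both shapes are monotonically decreasing in distance, so either fixes the original inversion; the reciprocal form also bounds scores to (0, 1], which composes more naturally with other query scores.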
[JENKINS] Lucene-Solr-4.x-Linux (64bit/ibm-j9-jdk7) - Build # 1061 - Failure!
Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/1061/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.lucene.index.TestTypePromotion Error Message: Clean up static fields (in @AfterClass?), your test seems to hang on to approximately 12,023,120 bytes (threshold is 10,485,760): - 2,409,168 bytes, public static org.junit.rules.TestRule org.apache.lucene.util.LuceneTestCase.classRules - 2,403,728 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.SORTED_BYTES - 2,403,568 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.UNSORTED_BYTES - 2,403,408 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.FLOATS - 2,403,248 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.INTEGERS Stack Trace: junit.framework.AssertionFailedError: Clean up static fields (in @AfterClass?), your test seems to hang on to approximately 12,023,120 bytes (threshold is 10,485,760): - 2,409,168 bytes, public static org.junit.rules.TestRule org.apache.lucene.util.LuceneTestCase.classRules - 2,403,728 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.SORTED_BYTES - 2,403,568 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.UNSORTED_BYTES - 2,403,408 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.FLOATS - 2,403,248 bytes, private static java.util.EnumSet org.apache.lucene.index.TestTypePromotion.INTEGERS at __randomizedtesting.SeedInfo.seed([C2434F9AAE110129]:0) at com.carrotsearch.randomizedtesting.rules.StaticFieldsInvariantRule$1.afterAlways(StaticFieldsInvariantRule.java:119) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:777) Build Log: [...truncated 787 lines...] [junit4:junit4] Suite: org.apache.lucene.index.TestTypePromotion [junit4:junit4] 2 NOTE: test params are: codec=Lucene40: {id=PostingsFormat(name=Lucene40WithOrds)}, sim=DefaultSimilarity, locale=kn_IN, timezone=Atlantic/Reykjavik [junit4:junit4] 2 NOTE: Linux 3.2.0-29-generic amd64/IBM Corporation 1.7.0 (64-bit)/cpus=8,threads=1,free=44841080,total=536805376 [junit4:junit4] 2 NOTE: All tests run in this JVM: [TestIndexWriterMergePolicy, TestPostingsOffsets, TestSizeBoundedForceMerge, TestToken, TestThreadedForceMerge, TestParallelTermEnum, TestRegexpRandom2, TestTermsEnum, TestPerFieldPostingsFormat, TestNumericRangeQuery32, TestSpanMultiTermQueryWrapper, TestNRTCachingDirectory, TestVersionComparator, TestNoMergePolicy, TestNRTManager, TestLockFactory, TestDuelingCodecs, TestIndexWriterDelete, TestSpansAdvanced, TestNorms, TestCopyBytes, TestBooleanMinShouldMatch, TestIOUtils, TestCachingTokenFilter, TestParallelAtomicReader, TestDateSort, TestIsCurrent, TestTopDocsCollector, TestPrefixQuery, TestDocument, TestLookaheadTokenFilter, TestTransactions, TestIndexWriterUnicode, TestPrefixCodedTerms, TestQueryWrapperFilter, TestMultiValuedNumericRangeQuery, TestFilteredSearch, TestTopDocsMerge, TestFSTs, TestConstantScoreQuery, TestPrefixRandom, TestTermRangeQuery, TestIndexWriterOnDiskFull, TestSentinelIntSet, 
Test2BTerms, TestBytesRefHash, TestStressIndexing, TestComplexExplanationsOfNonMatches, TestDocumentWriter, TestIndexInput, TestTransactionRollback, TestMultiThreadTermVectors, TestSpanFirstQuery, TestCheckIndex, TestBooleanQueryVisitSubscorers, TestIndexWriterLockRelease, TestFuzzyQuery, TestExplanations, TestBasics, TestMockDirectoryWrapper, TestTermVectors, TestSameTokenSamePosition, TestLucene40PostingsReader, TestSimilarityBase, TestDateTools, TestPackedInts, TestVersion, TestSpansAdvanced2, TestCharTermAttributeImpl, TestForceMergeForever, TestFilterIterator, TestPositionIncrement, TestDocumentsWriterStallControl, TestConcurrentMergeScheduler, TestRamUsageEstimatorOnWildAnimals, TestHugeRamFile, TestStressAdvance, TestMultiPhraseQuery, TestAtomicUpdate, TestIndexWriterMerging, TestOpenBitSet, TestSort, TestSearchWithThreads, TestLongPostings, TestIndexWriterCommit,
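The fix the leak detector asks for in this failure is mechanical: null out the offending static fields once the suite finishes. A stdlib-only sketch, with field names modeled on the failing test (in a real LuceneTestCase subclass the cleanup method would be annotated @AfterClass):

```java
import java.util.EnumSet;

// Sketch of the static-field cleanup StaticFieldsInvariantRule is asking for;
// the enum and field names are simplified stand-ins for TestTypePromotion's.
public class TypePromotionCleanup {
    enum Type { INT, FLOAT, SORTED_BYTES, UNSORTED_BYTES }

    // Static fields like these keep their contents reachable after the suite
    // finishes, which is what trips the ~10 MB threshold in the error message.
    static EnumSet<Type> INTEGERS = EnumSet.of(Type.INT);
    static EnumSet<Type> FLOATS = EnumSet.of(Type.FLOAT);

    // In a LuceneTestCase subclass this would be an @AfterClass method.
    static void afterClass() {
        INTEGERS = null;  // release the references so the rule's check passes
        FLOATS = null;
    }

    public static void main(String[] args) {
        afterClass();
        System.out.println(INTEGERS == null && FLOATS == null);  // true
    }
}
```

The rule only measures what is still reachable from static fields after the class runs, so nulling the fields (rather than clearing the collections) is sufficient.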
[jira] [Updated] (SOLR-3815) add hash range to shard
[ https://issues.apache.org/jira/browse/SOLR-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-3815:

Attachment: SOLR-3815_clusterState_immutable.patch

Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on?

add hash range to shard
-----------------------
Key: SOLR-3815
URL: https://issues.apache.org/jira/browse/SOLR-3815
Project: Solr
Issue Type: Sub-task
Reporter: Yonik Seeley
Attachments: SOLR-3815_addrange.patch, SOLR-3815_clusterState_immutable.patch, SOLR-3815.patch
[jira] [Comment Edited] (SOLR-3815) add hash range to shard
[ https://issues.apache.org/jira/browse/SOLR-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454608#comment-13454608 ] Yonik Seeley edited comment on SOLR-3815 at 9/13/12 2:38 PM: - Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on? edit: the test that failed was LeaderElectionIntegrationTest. Not sure if it caused other failures. was (Author: ysee...@gmail.com): Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on? add hash range to shard --- Key: SOLR-3815 URL: https://issues.apache.org/jira/browse/SOLR-3815 Project: Solr Issue Type: Sub-task Reporter: Yonik Seeley Attachments: SOLR-3815_addrange.patch, SOLR-3815_clusterState_immutable.patch, SOLR-3815.patch -- This message is automatically generated by JIRA. 
[jira] [Comment Edited] (SOLR-3815) add hash range to shard
[ https://issues.apache.org/jira/browse/SOLR-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454608#comment-13454608 ] Yonik Seeley edited comment on SOLR-3815 at 9/13/12 2:49 PM: - Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on? edit: the test that failed was LeaderElectionIntegrationTest. Not sure if it caused other failures. edit: in Overseer.run() we have ClusterState clusterState = reader.getClusterState(); and that is the state that is accidentally being modified (that accidentally makes things work). I assume this is OK, as the reader is supposed to update it's state via zookeeper - which means there is perhaps something wrong with reader.updateClusterState(true)? was (Author: ysee...@gmail.com): Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on? edit: the test that failed was LeaderElectionIntegrationTest. Not sure if it caused other failures. 
add hash range to shard
-----------------------
Key: SOLR-3815
URL: https://issues.apache.org/jira/browse/SOLR-3815
Project: Solr
Issue Type: Sub-task
Reporter: Yonik Seeley
Attachments: SOLR-3815_addrange.patch, SOLR-3815_clusterState_immutable.patch, SOLR-3815.patch
[jira] [Comment Edited] (SOLR-3815) add hash range to shard
[ https://issues.apache.org/jira/browse/SOLR-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13454608#comment-13454608 ] Yonik Seeley edited comment on SOLR-3815 at 9/13/12 2:49 PM: - Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on? edit: the test that failed was LeaderElectionIntegrationTest. Not sure if it caused other failures. edit: in Overseer.run() we have ClusterState clusterState = reader.getClusterState(); and that is the state that is accidentally being modified (that accidentally makes things work). I assume the reader is supposed to update it's state via zookeeper - which means there is perhaps something wrong with reader.updateClusterState(true)? was (Author: ysee...@gmail.com): Folks, while working to add the replicas level to shards (to make room for other properties), I noticed that the Overseer.updateSlice() method changed the existing ClusterState (which is advertised as being immutable). I re-wrote the method to be much shorter, and immutable with respect to the existing ClusterState, and started getting a test failure. I eventually tried just adding back the part of the code that erroneously modified the existing ClusterState, and the test passed again (see the nocommit block in Overseer). Any idea what's going on? edit: the test that failed was LeaderElectionIntegrationTest. Not sure if it caused other failures. 
edit: in Overseer.run() we have ClusterState clusterState = reader.getClusterState(); and that is the state that is accidentally being modified (that accidentally makes things work). I assume this is OK, as the reader is supposed to update its state via ZooKeeper - which means there is perhaps something wrong with reader.updateClusterState(true)?

add hash range to shard
-----------------------
Key: SOLR-3815
URL: https://issues.apache.org/jira/browse/SOLR-3815
Project: Solr
Issue Type: Sub-task
Reporter: Yonik Seeley
Attachments: SOLR-3815_addrange.patch, SOLR-3815_clusterState_immutable.patch, SOLR-3815.patch
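The immutability contract Yonik describes (updateSlice must not mutate the shared snapshot; it should build and return a new state) can be illustrated with a stdlib-only sketch. The class and field names here are simplified stand-ins for Solr's ClusterState, not its real API:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Copy-on-write sketch of an immutable cluster-state update; a stand-in for
// Solr's ClusterState/Overseer.updateSlice(), not the actual classes.
public class ClusterStateSketch {
    final Map<String, String> slices;  // stand-in for slice name -> slice data

    ClusterStateSketch(Map<String, String> slices) {
        // wrap so nothing can mutate this snapshot after construction
        this.slices = Collections.unmodifiableMap(slices);
    }

    // Returns a NEW state with the slice applied; the receiver is untouched,
    // so callers (like Overseer.run()'s local clusterState) must use the
    // return value rather than expecting their snapshot to change.
    ClusterStateSketch updateSlice(String name, String props) {
        Map<String, String> copy = new HashMap<>(slices);
        copy.put(name, props);
        return new ClusterStateSketch(copy);
    }

    public static void main(String[] args) {
        ClusterStateSketch before = new ClusterStateSketch(new HashMap<>());
        ClusterStateSketch after = before.updateSlice("shard1", "range=0-7fffffff");
        System.out.println(before.slices.isEmpty());      // true: old snapshot untouched
        System.out.println(after.slices.get("shard1"));   // range=0-7fffffff
    }
}
```

The failure mode in the comment thread follows from this design: if any reader holds on to the old snapshot and the code accidentally mutated it in place, tests can pass for the wrong reason, and a correct copy-on-write rewrite then exposes whoever was depending on the stale reference being updated.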
Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/ibm-j9-jdk7) - Build # 1061 - Failure!
I'll fix this one.

D.

On Thu, Sep 13, 2012 at 4:37 AM, Policeman Jenkins Server jenk...@sd-datasolutions.de wrote:
> Build: http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux/1061/
> [...]