[jira] [Updated] (LUCENE-5141) CheckIndex.fixIndex doesn't need a Codec
[ https://issues.apache.org/jira/browse/LUCENE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5141: - Fix Version/s: 4.5 5.0 CheckIndex.fixIndex doesn't need a Codec Key: LUCENE-5141 URL: https://issues.apache.org/jira/browse/LUCENE-5141 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Fix For: 5.0, 4.5 Attachments: LUCENE-5141.patch CheckIndex.fixIndex takes a codec as an argument although it doesn't need one. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4985: --- Attachment: LUCENE-4985.patch

Patch addresses the following:
* Added FacetRequest.createFacetsAggregator(FacetIndexingParams). All requests implement it except RangeFacetRequest, which returns null. The method is abstract and documents that you are allowed to return null.
* TaxonomyFacetsAccumulator: if a FacetRequest returns null from createFacetsAggregator, it throws an exception. Otherwise, it groups the requests into category lists and ensures that categories are not over-counted. It uses MultiFacetsAggregator (new) and PerCategoryListAggregator (existing) to achieve that.
** That allows passing a combination of requests, e.g. Count(A), Count(B), Count(C), SumScore(A), SumScore(F), SumIntAssociation(D)... and works correctly when e.g. A+B were indexed in the same category list, but C, D and F weren't.
* Added FacetsAccumulator.create() variants which support RangeAccumulator and either TaxonomyFacetsAccumulator or SortedSetDocValuesAccumulator. Differences are in the method signatures.
** Renamed RangeFacetsAccumulatorWrapper to MultiFacetsAccumulator. Also, the FacetResults are returned in the order of the given accumulators.
** FacetsAccumulator.create documents that you may receive the List<FacetResult> in a different order than you passed in, guaranteeing that all RangeFacetRequests come last.
* Modified DrillSideways to take either TaxonomyReader or SortedSetDVReaderState because otherwise it cannot be used with SortedSetDV facets. Mike, can you please review it?

These changes simplified e.g. the associations examples, as now FacetsAccumulator.create() takes care of them too, since they implement createFacetsAggregator. Also, any future FacetRequest which supports FacetsAggregator will be supported automatically.
Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well.
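The split/delegate/zip behavior described above can be sketched generically. This is only an illustration: Request, Result, and the Function-based accumulators below are hypothetical stand-ins, not Lucene's FacetRequest/FacetResult API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins for FacetRequest / FacetResult.
record Request(String dim, boolean isRange) {}
record Result(Request request, long value) {}

public class MultiAccumulatorSketch {
    // Split requests into range and non-range, delegate each group to its own
    // accumulator, then zip the results back into the original request order.
    static List<Result> accumulate(List<Request> requests,
                                   Function<List<Request>, List<Result>> rangeAcc,
                                   Function<List<Request>, List<Result>> otherAcc) {
        List<Request> range = new ArrayList<>(), other = new ArrayList<>();
        for (Request r : requests) (r.isRange() ? range : other).add(r);

        // Key the merged results by request so order can be restored.
        Map<Request, Result> byRequest = new HashMap<>();
        for (Result res : rangeAcc.apply(range)) byRequest.put(res.request(), res);
        for (Result res : otherAcc.apply(other)) byRequest.put(res.request(), res);

        List<Result> ordered = new ArrayList<>();
        for (Request r : requests) ordered.add(byRequest.get(r)); // original order
        return ordered;
    }

    public static void main(String[] args) {
        List<Request> reqs = List.of(new Request("price", true),
                                     new Request("author", false),
                                     new Request("date", true));
        // Dummy accumulators that just produce a value of 1 per request.
        Function<List<Request>, List<Result>> dummy =
            rs -> rs.stream().map(r -> new Result(r, 1L)).toList();
        for (Result r : accumulate(reqs, dummy, dummy)) {
            System.out.println(r.request().dim());
        }
    }
}
```

Keying the merged results by request, rather than concatenating the two result lists, is what lets the wrapper hand results back in the caller's original request order.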
[jira] [Created] (LUCENE-5144) Nuke FacetRequest.createAggregator
Shai Erera created LUCENE-5144: -- Summary: Nuke FacetRequest.createAggregator Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera

Aggregator was replaced by FacetsAggregator. FacetRequest has createAggregator(), which by default throws a UOE. It was left there until we migrated the aggregators to FacetsAggregator -- now all of our requests support FacetsAggregator. Aggregator is used only by StandardFacetsAccumulator, which too needs to vanish at some point. But currently it's the only one which handles sampling, complement aggregation and partitions. What I'd like to do is remove FacetRequest.createAggregator and in StandardFacetsAccumulator support only CountFacetRequest and SumScoreFacetRequest, which are the only ones that make sense for sampling and partitions. SumScore does not even support complements (which only work for counting). I'll also rename StandardFA to OldStandardFA. The plan is to eventually implement a SamplingAccumulator, PartitionsAccumulator/Aggregator and ComplementsAggregator, removing that class entirely. Until then ...
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722338#comment-13722338 ] Michael McCandless commented on LUCENE-4985: Could you post a patch with --show-copies-as-adds? (The current patch isn't easily applied since there were svn mvs involved...). Thanks.
[jira] [Updated] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4985: --- Attachment: LUCENE-4985.patch Patch with --show-copies-as-adds
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722370#comment-13722370 ] Michael McCandless commented on LUCENE-4985:

This is a nice cleanup!

It's still hard to mix all three kinds of facet requests? E.g. I think it's realistic for an app to use SSDV for flat fields (less RAM usage than taxo, important if there are lots of ords), range for volatile numeric fields (e.g. time-delta based), and taxo for hierarchies. It seems like we could have a FacetsAccumulator.create that took both SSDVReaderState and TaxoReader and created the right FacetsAccumulator ... and I guess we'd need a SSDVFacetRequest. Or I guess I can just create the MultiFacetsAccumulator directly myself ... FA.create is just sugar. This all can wait for a follow-on issue ... these improvements are already great.

Should we move MultiFacetsAccumulator somewhere else (out of the .range package)? It's more generic now?

bq. FacetsAccumulator.create documents that you may receive List<FacetResult> in a different order than you passed in, guaranteeing that all RangeFacetRequests come last.

Hmm, can we fix that? (So that the order of the results matches the order of the requests).

bq. Modified DrillSideways to take either TaxonomyReader or SortedSetDVReaderState because otherwise it cannot be used with SortedSetDV facets. Mike, can you please review it?

Those changes look good! I think we can now simplify TestDrillSideways (previously it had to @Override getDrillDown/SidewaysAccumulator to use sorted set)?
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722375#comment-13722375 ] Shai Erera commented on LUCENE-4985:

Adding State to .create() does not simplify life for an app I think, because someone (on the app side) will need to figure out if State should be null or not. I'm worried that users will end up creating State even if they don't need it? And since MultiFacetsAccumulator lets you wrap any accumulator yourself, I think it's fine that these are separate methods, as a first step. I'm worried about adding SortedSetDVFacetRequest, because unlike Count/SumScore/SumIntAssociation, this request is solely about the underlying source? And it also implies only counting ...

bq. Should we move MultiFacetsAccumulator somewhere else

You're right! It was left there by mistake because I renamed RangeAccumulatorWrapper. Will move.

{quote} Hmm, can we fix that? (So that the order of the results matches the order of the requests). {quote}

I don't know how important it is ... none of our tests depend on it, and it's not clear to me how to fix it at all. FA.create() is a factory method. If it returns a single Accumulator, then it happens already (order is maintained). MultiFacetsAccumulator loses the order. Maybe if we passed it the list of facet requests it could re-order them after accumulation, but I don't know how important it is ... an app can put the List<FacetResult> in a Map and do lookups? Also, as a generic MultiFA, it's not easy to determine from which FA a source FacetRequest came?

bq. I think we can now simplify TestDrillSideways

You're right. Done.
[jira] [Updated] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4985: --- Attachment: LUCENE-4985.patch Patch with fixed comments.
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722377#comment-13722377 ] Michael McCandless commented on LUCENE-4985:

{quote} I don't know how important it is ... none of our tests depend on it, and it's not clear to me how to fix it at all. FA.create() is a factory method. If it returns a single Accumulator, then it happens already (order is maintained). MultiFacetsAccumulator loses the order. Maybe if we passed it the list of facet requests it could re-order them after accumulation, but I don't know how important it is ... an app can put the List<FacetResult> in a Map and do lookups? Also, as a generic MultiFA, it's not easy to determine from which FA a source FacetRequest came? {quote}

OK ... But, I think we should not document that range facet requests come last? Let's leave it defined as undefined? Maybe we should return Collection not List?
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722381#comment-13722381 ] Shai Erera commented on LUCENE-4985:

bq. But, I think we should not document that range facet requests come last?

Ok, I will remove that comment. As soon as we add more accumulators, this comment is not important anyway.

bq. Maybe we should return Collection not List?

Why? I prefer that we don't change that, since that will change tests. Many of the tests do results.get(idx). If we don't need to, let's not complicate the users? If an app does pass the requests in a known order, it shouldn't suffer. It's only Multi that loses order.
[jira] [Created] (SOLR-5086) The OR operator works incorrectly in XPathEntityProcessor
shenzhuxi created SOLR-5086: --- Summary: The OR operator works incorrectly in XPathEntityProcessor Key: SOLR-5086 URL: https://issues.apache.org/jira/browse/SOLR-5086 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.4 Reporter: shenzhuxi

I'm trying to use DataImportHandler to index RSS/ATOM feeds and found bizarre behaviour of the OR operator in XPathEntityProcessor. Here is the configuration:

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="rss" processor="FileListEntityProcessor" baseDir="${solr.solr.home}/feed/rss"
            fileName="^.*\.xml$" recursive="true" rootEntity="false" dataSource="null">
      <entity name="feed" url="${rss.fileAbsolutePath}" processor="XPathEntityProcessor"
              forEach="/rss/channel/item|/feed/entry" transformer="DateFormatTransformer">
        <field column="link" xpath="/rss/channel/item/link|/feed/entry/link/@href"/>
      </entity>
    </entity>
  </document>
</dataConfig>

The first OR operator, in /rss/channel/item|/feed/entry, works correctly. But the second one, in /rss/channel/item/link|/feed/entry/link/@href, doesn't work. If I rewrite it to either /rss/channel/item/link or /feed/entry/link/@href, it works correctly.
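For reference, the union expression itself is valid XPath 1.0; XPathEntityProcessor implements its own streaming subset of XPath rather than a full engine, which is likely where the second union breaks down. A minimal sketch with the JDK's javax.xml.xpath, using toy documents with namespaces ignored, so this only demonstrates the expected XPath semantics, not DIH behavior:

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathUnionDemo {
    public static void main(String[] args) throws Exception {
        // Toy stand-ins for an RSS document and an Atom document.
        String rss = "<rss><channel><item><link>http://a</link></item></channel></rss>";
        String atom = "<feed><entry><link href=\"http://b\"/></entry></feed>";
        // The exact union expression from the data-import config.
        String expr = "/rss/channel/item/link|/feed/entry/link/@href";

        XPath xp = XPathFactory.newInstance().newXPath();
        List<String> links = new ArrayList<>();
        for (String xml : new String[] { rss, atom }) {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            // With a full XPath 1.0 engine, the union matches the <link> element
            // in the RSS doc and the @href attribute in the Atom doc.
            NodeList nodes = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                links.add(nodes.item(i).getTextContent());
            }
        }
        if (!links.equals(List.of("http://a", "http://b"))) throw new AssertionError(links);
        System.out.println("union matched: " + links);
    }
}
```

Splitting the union into two field definitions, as the reporter did, sidesteps whatever the streaming evaluator cannot handle.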
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722387#comment-13722387 ] Michael McCandless commented on LUCENE-4985: I just think it's a dangerous API if sometimes the order matches and sometimes it doesn't ... but we can pursue this separately ...
[jira] [Created] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
Boaz Leskes created LUCENE-5145: --- Summary: Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval) Key: LUCENE-5145 URL: https://issues.apache.org/jira/browse/LUCENE-5145 Project: Lucene - Core Issue Type: Improvement Reporter: Boaz Leskes

* Made acceptableOverheadRatio configurable.
* Added bulk get to the AbstractAppendingLongBuffer classes, for faster retrieval.
* Introduced a new variant, AppendingPackedLongBuffer, which solely relies on PackedInts as a back-end. This new class is useful where people have non-negative numbers with a fairly uniform distribution over a fixed (limited) range. Ex. facet ordinals.
* To distinguish it from AppendingPackedLongBuffer, the delta-based AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer.
* Fixed an issue with NullReader where it didn't respect its valueCount in bulk gets.
[jira] [Commented] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722388#comment-13722388 ] Boaz Leskes commented on LUCENE-5145: -

While making the above changes I did some measurements which I feel are also useful to share. PackedInts trades CPU for memory: tighter packing means less memory and better CPU cache usage. PackedInts gives you an acceptableOverheadRatio parameter to control the trade-off, but it is not exposed in the AbstractAppendingLongBuffer family, which is based on PackedInts. This is especially important when you do not rely on AbstractAppendingLongBuffer.iterator() to extract your data.

Here are some experiments I ran on my laptop, using BenchmarkAppendLongBufferRead, which is included in the patch. The program allows you to play with different read strategies and data sizes and measure reading times.

This is the result of using AppendingDeltaPackedLongBuffer (previously called AppendingLongBuffer) to sequentially read an array of 50 elements, using its get method. The data was uniformly distributed numbers between 0 and 7. The program measures 10,000 such reads. The total time is the time it took to perform all of them. You also see in the output the number of bits used to store the elements and the storage class used.

---
Storage: DELTA_PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 22.18s avg: 2.22ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 19.14s avg: 1.91ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)

As you can see, when retrieving elements one by one, the byte-based implementation is slightly faster.
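The acceptableOverheadRatio trade-off above can be illustrated with a toy version of the two storage strategies: tight 3-bit packing in a long[] (a stand-in for Packed64, where a get needs shifts and masks and may span two words) versus rounding up to 8 bits in a byte[] (a stand-in for Direct8, where a get is a single array index). This is a sketch of the idea, not Lucene's actual code.

```java
import java.util.Random;

public class PackedTradeoffDemo {
    static final int BITS = 3; // enough for values 0..7

    // "Packed64"-style read: values packed back-to-back in a long[].
    static long getPacked(long[] blocks, int index) {
        long bitPos = (long) index * BITS;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        if (shift + BITS > 64) { // value spans two longs
            value |= blocks[block + 1] << (64 - shift);
        }
        return value & ((1L << BITS) - 1);
    }

    static void setPacked(long[] blocks, int index, long value) {
        long bitPos = (long) index * BITS;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] |= value << shift;
        if (shift + BITS > 64) {
            blocks[block + 1] |= value >>> (64 - shift);
        }
    }

    public static void main(String[] args) {
        int n = 1000;
        long[] packed = new long[(n * BITS + 63) / 64 + 1]; // +1 slack for spanning reads
        byte[] direct = new byte[n]; // "Direct8"-style: byte-aligned, 8 bits per value
        long[] expected = new long[n];
        Random r = new Random(42);
        for (int i = 0; i < n; i++) {
            expected[i] = r.nextInt(8); // uniform 0..7, fits in 3 bits
            setPacked(packed, i, expected[i]);
            direct[i] = (byte) expected[i];
        }
        for (int i = 0; i < n; i++) {
            if (getPacked(packed, i) != expected[i] || direct[i] != expected[i])
                throw new AssertionError("mismatch at " + i);
        }
        System.out.println("packed bytes: " + packed.length * 8
                + ", direct bytes: " + direct.length + " -> all reads OK");
    }
}
```

The packed variant uses roughly 3/8 of the memory, while the byte-aligned variant answers each get with one array index; a ratio of 7.00 tells PackedInts the byte-aligned layout's overhead is acceptable.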
For comparison, the new AppendingPackedLongBuffer with the same setup:

---
Storage: PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 16.69s avg: 1.67ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 13.47s avg: 1.35ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

Besides being faster, it shows the same behavior. For random reads, the classes display similar behavior:

---
Storage: DELTA_PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 23.13s avg: 2.31ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 19.38s avg: 1.94ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)

---
Storage: PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 19.23s avg: 1.92ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 15.95s avg: 1.60ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

Next I looked at the effect of exposing the bulk reads offered by the PackedInts structures in the AppendingLongBuffer family. Here are some results from the new packed implementation, this time reading 4 and 16 consecutive elements in a single read.
---
Storage: PACKED, Read: SEQUENTIAL, Read size: 4
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 11.16s avg: 1.12ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 24.22s avg: 2.42ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 8.35s avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 8.44s avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

---
Storage: PACKED, Read: SEQUENTIAL, Read size: 16
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 9.63s avg: 0.96ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 12.52s avg: 1.25ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 7.46s avg: 0.75ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 3.22s avg: 0.32ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

As you can see the bulk read api for the
[jira] [Updated] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Leskes updated LUCENE-5145: Attachment: LUCENE-5145.patch
[jira] [Comment Edited] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722388#comment-13722388 ] Boaz Leskes edited comment on LUCENE-5145 at 7/29/13 12:23 PM: ---
[jira] [Comment Edited] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722388#comment-13722388 ] Boaz Leskes edited comment on LUCENE-5145 at 7/29/13 12:25 PM: --- While making the above changes I did some measurements which I feel are also useful to share. PackedInts trades CPU for better memory and CPU-cache usage. PackedInts exposes an acceptableOverheadRatio parameter to control that trade-off, but the parameter is not exposed in the AbstractAppendingLongBuffer family, which is based on those classes. This is especially important when you do not rely on AbstractAppendingLongBuffer.iterator() to extract your data. Here are some experiments I ran on my laptop, using BenchmarkAppendLongBufferRead, which is included in the patch. The program lets you play with different read strategies and data sizes and measures reading times. This is the result of using AppendingDeltaPackedLongBuffer (previously called AppendingLongBuffer) to sequentially read an array of 50 elements, using its get method. The data was uniformly distributed numbers between 0 and 7. The program measures 10,000 such reads; the total time is the time it took to perform all of them. The output also shows the number of bits used to store the elements and the storage class used.
--- Storage: DELTA_PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 22.18s, avg: 2.22ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 19.14s, avg: 1.91ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)
As you can see, when retrieving elements one by one, the byte-based implementation is slightly faster.
For comparison, the new AppendingPackedLongBuffer with the same setup:
--- Storage: PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 16.69s, avg: 1.67ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 13.47s, avg: 1.35ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
Besides being faster, it shows the same behavior. For random reads, the classes behave similarly:
--- Storage: DELTA_PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 23.13s, avg: 2.31ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 19.38s, avg: 1.94ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)
--- Storage: PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 19.23s, avg: 1.92ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 15.95s, avg: 1.60ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
Next I looked at the effect of exposing the bulk reads offered by the PackedInts structures in the AppendingLongBuffer family. Here are some results from the new packed implementation, this time reading 4 or 16 consecutive elements in a single read.
--- Storage: PACKED, Read: SEQUENTIAL, Read size: 4
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 11.16s, avg: 1.12ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 24.22s, avg: 2.42ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 8.35s, avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 8.44s, avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
--- Storage: PACKED, Read: SEQUENTIAL, Read size: 16
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 9.63s, avg: 0.96ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 12.52s, avg: 1.25ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 7.46s, avg: 0.75ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 3.22s, avg: 0.32ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8,
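The Packed64 vs. Direct8 numbers above boil down to one trade-off: tightly packing 3-bit values into long words saves memory but makes every get() do shift-and-mask work, while an acceptableOverheadRatio of 7.00 rounds 3 bits up to a whole byte so a get() becomes a plain array read. The following is a hedged, self-contained sketch of that packing arithmetic; the class and method names are invented for the demo, and this is not Lucene's actual PackedInts code.

```java
// Illustrative only: mimics the idea behind Packed64 (tight packing into
// long blocks) and Direct8 (one byte per value). Not Lucene code.
public class PackingDemo {
    // Tightly pack bitsPerValue-bit values into a long[] (Packed64-style).
    public static long[] pack(int[] values, int bitsPerValue) {
        long[] blocks = new long[(values.length * bitsPerValue + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long bit = (long) i * bitsPerValue;
            int block = (int) (bit >>> 6), shift = (int) (bit & 63);
            blocks[block] |= ((long) values[i]) << shift;
            if (shift + bitsPerValue > 64) { // value spans two blocks
                blocks[block + 1] |= ((long) values[i]) >>> (64 - shift);
            }
        }
        return blocks;
    }

    // Reading back requires shifts and masks on every call: the CPU cost
    // that the byte-aligned variant avoids.
    public static int get(long[] blocks, int index, int bitsPerValue) {
        long bit = (long) index * bitsPerValue;
        int block = (int) (bit >>> 6), shift = (int) (bit & 63);
        long raw = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {
            raw |= blocks[block + 1] << (64 - shift);
        }
        return (int) (raw & ((1L << bitsPerValue) - 1));
    }

    // The ratio-7.00 variant: one byte per 3-bit value (Direct8-style),
    // trading roughly 2.7x the storage for a direct array read.
    public static byte[] packBytes(int[] values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) out[i] = (byte) values[i];
        return out;
    }
}
```

This is why the Direct8 runs above are consistently faster per read while using more than twice the storage of the Packed64 runs.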
[jira] [Updated] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5145: - Assignee: Adrien Grand Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval) --- Key: LUCENE-5145 URL: https://issues.apache.org/jira/browse/LUCENE-5145 Project: Lucene - Core Issue Type: Improvement Reporter: Boaz Leskes Assignee: Adrien Grand Attachments: LUCENE-5145.patch Made acceptableOverheadRatio configurable. Added bulk get to the AbstractAppendingLongBuffer classes for faster retrieval. Introduced a new variant, AppendingPackedLongBuffer, which solely relies on PackedInts as a back-end. This new class is useful where people have non-negative numbers with a fairly uniform distribution over a fixed (limited) range, e.g. facet ordinals. To distinguish it from AppendingPackedLongBuffer, the delta-based AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an issue with NullReader where it didn't respect its valueCount in bulk gets. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
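The split between the two appending variants described here is plain packing versus delta packing against a per-block base. A hedged, self-contained illustration of when each wins (an invented helper, not the real classes):

```java
// Illustrative helper, not Lucene code: computes how many bits per value
// a block would need with and without delta encoding against the block
// minimum. Clustered large values need far fewer bits once the base is
// subtracted; small uniform values gain nothing from the extra step.
public class DeltaPackDemo {
    // Bits per value when packing raw values (plain PACKED strategy).
    public static int bitsRequiredNoDelta(long[] values) {
        long max = 0;
        for (long v : values) max = Math.max(max, v);
        return max == 0 ? 1 : 64 - Long.numberOfLeadingZeros(max);
    }

    // Bits per value when packing deltas against the block minimum
    // (DELTA_PACKED strategy): only the spread within the block matters.
    public static int bitsRequiredDelta(long[] values) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long v : values) { min = Math.min(min, v); max = Math.max(max, v); }
        long range = max - min;
        return range == 0 ? 1 : 64 - Long.numberOfLeadingZeros(range);
    }
}
```

For the uniformly distributed 0..7 data in the benchmark above both strategies need 3 bits, which is why the new packed variant wins purely on its simpler read path; delta packing earns its extra add per read when values are large but clustered.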
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722430#comment-13722430 ] ASF subversion and git services commented on LUCENE-4985: - Commit 1508043 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1508043 ] LUCENE-4985: Make it easier to mix different kinds of FacetRequests Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722436#comment-13722436 ] ASF subversion and git services commented on LUCENE-4985: - Commit 1508046 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508046 ] LUCENE-4985: Make it easier to mix different kinds of FacetRequests Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-4985. Resolution: Fixed Assignee: Shai Erera Lucene Fields: New,Patch Available (was: New) Committed to trunk and 4x. I think we can change .accumulate to return a Map<FacetRequest,FacetResult>, but this affects many of the tests, so let's do that separately. Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #401: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/401/ 2 tests failed. FAILED: org.apache.solr.cloud.AliasIntegrationTest.org.apache.solr.cloud.AliasIntegrationTest Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.AliasIntegrationTest: 1) Thread[id=8074, name=recoveryCmdExecutor-4930-thread-1, state=RUNNABLE, group=TGRP-AliasIntegrationTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385) at java.net.Socket.connect(Socket.java:546) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) Stack 
Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.AliasIntegrationTest: 1) Thread[id=8074, name=recoveryCmdExecutor-4930-thread-1, state=RUNNABLE, group=TGRP-AliasIntegrationTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385) at java.net.Socket.connect(Socket.java:546) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) at __randomizedtesting.SeedInfo.seed([CB7D2916453BB11E]:0) FAILED: 
org.apache.solr.cloud.AliasIntegrationTest.org.apache.solr.cloud.AliasIntegrationTest Error Message: There are still zombie threads that couldn't be terminated: 1) Thread[id=8074, name=recoveryCmdExecutor-4930-thread-1, state=RUNNABLE, group=TGRP-AliasIntegrationTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
[jira] [Created] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight
Simon Willnauer created LUCENE-5146: --- Summary: AnalyzingSuggester sort order doesn't respect the actual weight Key: LUCENE-5146 URL: https://issues.apache.org/jira/browse/LUCENE-5146 Project: Lucene - Core Issue Type: Bug Components: modules/spellchecker Affects Versions: 4.4 Reporter: Simon Willnauer Fix For: 5.0, 4.5 Uwe would say: sorry but your code is wrong. We don't actually read the weight value in AnalyzingComparator, which can cause really odd suggestions since we read parts of the input as the weight. None of our tests catch that, so I will go ahead and add some tests for it as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5147) Consider returning a Map<FacetRequest,FacetResult> from FacetsAccumulator
Shai Erera created LUCENE-5147: -- Summary: Consider returning a Map<FacetRequest,FacetResult> from FacetsAccumulator Key: LUCENE-5147 URL: https://issues.apache.org/jira/browse/LUCENE-5147 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Today the API returns a List which suggests there's an ordering going on. This may be confusing if one uses FacetsAccumulator.create which results in a MultiFacetsAccumulator, and then the order of the results does not correspond to the order of the requests. Rather than trying to enforce ordering, a simple mapping may be better even for consuming apps since they will be able to easily look up desired results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
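The proposal can be sketched in a few lines with stub classes (a hedged illustration, not the real facet API): once results are keyed by the request itself, a consumer looks up exactly the result it asked for and never depends on which accumulator produced which list position.

```java
import java.util.*;

// Stub classes for illustration; not the real FacetRequest/FacetResult.
public class FacetMapDemo {
    public static class FacetRequest {
        final String dim;
        public FacetRequest(String dim) { this.dim = dim; }
        // value equality so a request can serve as a map key
        @Override public boolean equals(Object o) {
            return o instanceof FacetRequest && ((FacetRequest) o).dim.equals(dim);
        }
        @Override public int hashCode() { return dim.hashCode(); }
    }

    public static class FacetResult {
        public final int count;
        public FacetResult(int count) { this.count = count; }
    }

    // Keying results by request makes accumulation order irrelevant to
    // the consumer. The "aggregation" here is a dummy stand-in value.
    public static Map<FacetRequest, FacetResult> accumulate(List<FacetRequest> requests) {
        Map<FacetRequest, FacetResult> results = new HashMap<>();
        for (FacetRequest r : requests) {
            results.put(r, new FacetResult(r.dim.length())); // dummy value
        }
        return results;
    }
}
```

Note that this only works if FacetRequest has sensible equals/hashCode, which would become part of its contract under the proposed change.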
[jira] [Commented] (SOLR-4761) add option to plug in mergedsegmentwarmer
[ https://issues.apache.org/jira/browse/SOLR-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722440#comment-13722440 ] Markus Jelsma commented on SOLR-4761: - This option reduces latency but is not enabled by default. Is there any reason not to enable it (by default)? Thanks add option to plug in mergedsegmentwarmer - Key: SOLR-4761 URL: https://issues.apache.org/jira/browse/SOLR-4761 Project: Solr Issue Type: New Feature Reporter: Robert Muir Fix For: 5.0, 4.4 Attachments: SOLR-4761.patch, SOLR-4761.patch This is pretty expert, but can be useful in some cases. We can also provide a simple minimalist implementation that just ensures datastructures are primed so the first queries aren't e.g. causing norms to be read from disk etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight
[ https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5146: Attachment: LUCENE-5146.patch Here is a patch. AnalyzingSuggester sort order doesn't respect the actual weight --- Key: LUCENE-5146 URL: https://issues.apache.org/jira/browse/LUCENE-5146 Project: Lucene - Core Issue Type: Bug Components: modules/spellchecker Affects Versions: 4.4 Reporter: Simon Willnauer Fix For: 5.0, 4.5 Attachments: LUCENE-5146.patch Uwe would say: sorry but your code is wrong. We don't actually read the weight value in AnalyzingComparator, which can cause really odd suggestions since we read parts of the input as the weight. None of our tests catch that, so I will go ahead and add some tests for it as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5148) SortedSetDocValues caching / state
Adrien Grand created LUCENE-5148: Summary: SortedSetDocValues caching / state Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state
[ https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722449#comment-13722449 ] Simon Willnauer commented on LUCENE-5148: - +1 on removing the trap. Yet, it would be nice to make this object entirely stateless if possible. I can think of 2 options: {noformat} public LongsRef getOrds(int docId, LongsRef spare) {noformat} This has the advantage that we can easily reuse a LongsRef on top, which is consistent with other APIs in Lucene. Or maybe add an OrdsIterator like this: {noformat} public OrdsIter getOrds(int docId, OrdsIter spare) // Iterate like this: long ord; while ((ord = ordsIter.nextOrd()) != NO_MORE_ORDS) { ... } {noformat} I'm mainly thinking about consistency with other APIs here, but I don't like the stateful API we have right now. SortedSetDocValues caching / state -- Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
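A hedged sketch of the second option, with invented names (the real SortedSetDocValues API differs): the iteration cursor lives in a small iterator object handed back to the caller, so the doc-values instance itself stays stateless and the ords of two documents can be walked in parallel, avoiding the trap described in the issue.

```java
// Illustrative stand-in, not the real SortedSetDocValues: per-lookup
// state lives in the returned OrdsIter, so the enclosing object can be
// shared freely across consumers.
public class OrdsDemo {
    public static final long NO_MORE_ORDS = -1;

    // Reusable cursor: the "spare" pattern lets callers avoid allocation.
    public static class OrdsIter {
        private long[] ords;
        private int pos;
        OrdsIter reset(long[] ords) { this.ords = ords; this.pos = 0; return this; }
        public long nextOrd() { return pos < ords.length ? ords[pos++] : NO_MORE_ORDS; }
    }

    private final long[][] ordsPerDoc; // toy backing store for the demo

    public OrdsDemo(long[][] ordsPerDoc) { this.ordsPerDoc = ordsPerDoc; }

    // All mutable state is in the iterator, never in this object.
    public OrdsIter getOrds(int docId, OrdsIter spare) {
        return (spare == null ? new OrdsIter() : spare).reset(ordsPerDoc[docId]);
    }
}
```

Two iterators obtained from the same instance can now advance independently, which is exactly what a per-thread cached stateful instance cannot offer.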
[jira] [Updated] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high low freq terms
[ https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5149: Component/s: modules/other Affects Version/s: 4.4 Fix Version/s: 4.5 5.0 Assignee: Simon Willnauer CommonTermsQuery should allow minNrShouldMatch for high low freq terms Key: LUCENE-5149 URL: https://issues.apache.org/jira/browse/LUCENE-5149 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.4 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.5 Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency part of the query. Yet, we should also allow this for the high-frequency part to have better control over scoring. Here is a related ES issue: https://github.com/elasticsearch/elasticsearch/issues/3188 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high low freq terms
Simon Willnauer created LUCENE-5149: --- Summary: CommonTermsQuery should allow minNrShouldMatch for high low freq terms Key: LUCENE-5149 URL: https://issues.apache.org/jira/browse/LUCENE-5149 Project: Lucene - Core Issue Type: Improvement Reporter: Simon Willnauer Priority: Minor Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency part of the query. Yet, we should also allow this for the high-frequency part to have better control over scoring. Here is a related ES issue: https://github.com/elasticsearch/elasticsearch/issues/3188 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high low freq terms
[ https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5149: Attachment: LUCENE-5149.patch Here is a patch. CommonTermsQuery should allow minNrShouldMatch for high low freq terms Key: LUCENE-5149 URL: https://issues.apache.org/jira/browse/LUCENE-5149 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.4 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-5149.patch Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency part of the query. Yet, we should also allow this for the high-frequency part to have better control over scoring. Here is a related ES issue: https://github.com/elasticsearch/elasticsearch/issues/3188 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
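The essence of the improvement can be sketched with a simplified matcher (invented stand-in logic, not CommonTermsQuery's actual boolean rewrite against an index): query terms are split by document frequency at a cutoff, and each group then enforces its own minimum-should-match.

```java
import java.util.*;

// Simplified stand-in for CommonTermsQuery's matching, illustration only:
// the real query rewrites to boolean clauses against an index. Here the
// doc-frequency map and the matching are toy versions of that logic.
public class CommonTermsDemo {
    public static boolean matches(Set<String> docTerms, List<String> queryTerms,
                                  Map<String, Integer> docFreq, int cutoff,
                                  int lowFreqMinMatch, int highFreqMinMatch) {
        int lowHits = 0, lowTotal = 0, highHits = 0, highTotal = 0;
        for (String t : queryTerms) {
            boolean high = docFreq.getOrDefault(t, 0) > cutoff;
            if (high) { highTotal++; if (docTerms.contains(t)) highHits++; }
            else      { lowTotal++;  if (docTerms.contains(t)) lowHits++; }
        }
        // The improvement: BOTH groups get their own minimum-should-match
        // (capped at the group size), not just the low-frequency group.
        return lowHits >= Math.min(lowFreqMinMatch, lowTotal)
            && highHits >= Math.min(highFreqMinMatch, highTotal);
    }
}
```

With only the existing low-frequency minimum, the highFreqMinMatch parameter above would be fixed at a default, which is the gap the patch closes.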
[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state
[ https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722475#comment-13722475 ] Robert Muir commented on LUCENE-5148: - these other options have downsides too. LongsRef has all the disadvantages of the *Ref APIs (e.g. reuse bugs), also requires reading all the ordinals into RAM at once. Adding an additional iterator just pushes the problem into a different place to me, and makes the api more complex. The current threadlocal + state is at least simple, consistent with all of the other docvalues, and documented that it works this way. If we want to change the API, then I think we need to consider all of these issues. SortedSetDocValues caching / state -- Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state
[ https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722481#comment-13722481 ] Robert Muir commented on LUCENE-5148: - {quote} This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? {quote} An auto-clone could also cause traps, e.g. if someone is calling this method multiple times and it's refilling buffers and so on. But adding clone to the API (so someone could do this explicitly for these expert cases) might be a good solution too. SortedSetDocValues caching / state -- Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-5144: --- Attachment: LUCENE-5144.patch Patch removes FacetRequest.createAggregator (NOTE: *not* createFacetsAggregator) and replaces it by StandardFacetsAccumulator.createAggregator(FacetRequest). I also renamed SFA to OldFacetsAccumulator and moved it and all associated classes under o.a.l.facet.old, with the intention of removing them one day. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5144.patch Aggregator was replaced by FacetsAggregator. FacetRequest has createAggregator(), which by default throws a UOE. It was left there until we migrated the aggregators to FacetsAggregator -- now all of our requests support FacetsAggregator. Aggregator is used only by StandardFacetsAccumulator, which too needs to vanish at some point, but currently it's the only one which handles sampling, complement aggregation and partitions. What I'd like to do is remove FacetRequest.createAggregator and in StandardFacetsAccumulator support only CountFacetRequest and SumScoreFacetRequest, which are the only ones that make sense for sampling and partitions. SumScore does not even support complements (which only work for counting). I'll also rename StandardFA to OldStandardFA. The plan is to eventually implement a SamplingAccumulator, PartitionsAccumulator/Aggregator and ComplementsAggregator, removing that class entirely. Until then ... -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722498#comment-13722498 ] Shai Erera commented on LUCENE-5144: Tests pass; if there are no objections, I intend to commit this shortly. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722505#comment-13722505 ] ASF subversion and git services commented on LUCENE-5144: - Commit 1508085 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1508085 ] LUCENE-5144: remove FacetRequest.createAggregator, rename StandardFacetsAccumulator to OldFA and move it and associated classes under o.a.l.facet.old Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722509#comment-13722509 ] ASF subversion and git services commented on LUCENE-5144: - Commit 1508087 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508087 ] LUCENE-5144: remove FacetRequest.createAggregator, rename StandardFacetsAccumulator to OldFA and move it and associated classes under o.a.l.facet.old Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Resolved] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5144. Resolution: Fixed Fix Version/s: 4.5, 5.0 Committed to trunk and 4x. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-5144.patch
[jira] [Resolved] (SOLR-5086) The OR operator works incorrectly in XPathEntityProcessor
[ https://issues.apache.org/jira/browse/SOLR-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5086. - Resolution: Not A Problem The XPathEntityProcessor does not support the OR operator in field xpaths. The OR operator is supported only in the forEach attribute of the entity. See the supported xpath types here: http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1 The OR operator works incorrectly in XPathEntityProcessor - Key: SOLR-5086 URL: https://issues.apache.org/jira/browse/SOLR-5086 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.4 Reporter: shenzhuxi I was trying to use DataImportHandler to index RSS/ATOM feeds and found bizarre behaviour of the OR operator in XPathEntityProcessor. Here is the configuration:

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="rss" processor="FileListEntityProcessor" baseDir="${solr.solr.home}/feed/rss"
            fileName="^.*\.xml$" recursive="true" rootEntity="false" dataSource="null">
      <entity name="feed" url="${rss.fileAbsolutePath}" processor="XPathEntityProcessor"
              forEach="/rss/channel/item|/feed/entry" transformer="DateFormatTransformer">
        <field column="link" xpath="/rss/channel/item/link|/feed/entry/link/@href"/>
      </entity>
    </entity>
  </document>
</dataConfig>

The first OR operator, in /rss/channel/item|/feed/entry, works correctly. But the second one, in /rss/channel/item/link|/feed/entry/link/@href, doesn't. If I rewrite it as either /rss/channel/item/link or /feed/entry/link/@href alone, it works correctly.
Solr realtime get vs. direct get
The Solr realtime get feature is currently documented as “Realtime-get currently relies on the update log feature”, which is certainly true for the realtime aspect of the operation. But it happens that the /get handler works just fine when the update log feature is turned off, or when a requested ID is not among the uncommitted documents – it simply fetches committed documents rather than uncommitted ones. So, is “direct get” a non-feature or mis-feature that should not be used when the update log is disabled, or should it be a fully advertised first-class feature: a convenient and efficient way to directly access committed documents via a list of IDs? I think at least a couple of committers should weigh in on whether this “apparent feature” is a true feature or a non-feature to be discouraged – and then clearly document it as such. If it is a non-feature, the code should throw a clear exception. My vote is that it be given first-class feature status – that it be advertised as “direct get” (or similar) and that realtime get be treated as a more specialized sub-feature of it. -- Jack Krupansky
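For reference, the behavior described above falls out of how the two pieces are configured independently in solrconfig.xml: the /get handler is registered on its own, while the update log is an option of the update handler. A sketch of the relevant pieces, in Solr 4.x style (exact directives and defaults vary per install):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Remove this block and /get still answers, but only from committed data:
       the "realtime" part is gone while the "direct get" part remains. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>

<!-- Registered independently of the update log. -->
<requestHandler name="/get" class="solr.RealTimeGetHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
  </lst>
</requestHandler>
```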
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 6785 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/6785/ Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 14980 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:389: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 56 minutes 6 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4188 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4188/ All tests passed Build Log: [...truncated 15115 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:389: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 68 minutes 33 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5127: Attachment: LUCENE-5127.patch Patch with RAMOutputStream approach (so we don't compress/uncompress/recompress). FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-5127.patch, LUCENE-5127.patch, LUCENE-5127.patch, LUCENE-5127.patch For the addresses in the big in-memory byte[] and disk blocks, we could save a good deal of RAM here. I think this codec just never got upgraded when we added these new packed improvements, but it might be interesting to try to use it for the terms data of sorted/sortedset DV implementations. The patch works, but has nocommits and currently ignores the divisor. The annoying problem there is that we have the shared interface with get(int) for PackedInts.Mutable/Reader, but no equivalent base class for monotonics' get(long)... Still, it's enough that we could benchmark/compare for now.
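For readers following along: "monotonic compression" here means storing a non-decreasing sequence (such as term-index file addresses) as a linear model plus small packed per-entry deviations, which need far fewer bits than the absolute values. A rough standalone sketch of the idea, not Lucene's actual MonotonicBlockPackedWriter/Reader:

```java
// Sketch of monotonic compression: encode a non-decreasing long[] as
// origin + slope*i plus a small deviation per entry. The deviations, not the
// absolute addresses, are what would be bit-packed, hence the RAM savings.
public class MonotonicSketch {
    final long origin;
    final float slope;
    final long[] deviations; // in real code these would be bit-packed

    MonotonicSketch(long[] values) {
        origin = values[0];
        slope = values.length > 1
            ? (float) (values[values.length - 1] - values[0]) / (values.length - 1)
            : 0f;
        deviations = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = values[i] - expected(i);
        }
    }

    private long expected(int i) { return origin + (long) (slope * i); }

    long get(int i) { return expected(i) + deviations[i]; }

    /** Bits needed to pack a signed value of magnitude up to maxAbs. */
    static int bitsRequired(long maxAbs) {
        return 64 - Long.numberOfLeadingZeros(Math.max(1, 2 * maxAbs));
    }

    public static void main(String[] args) {
        // Term-index addresses: large absolute values, small jitter around a trend.
        long[] addrs = new long[1000];
        for (int i = 0; i < addrs.length; i++) {
            addrs[i] = 1_000_000L + 130L * i + (i % 7);
        }
        MonotonicSketch m = new MonotonicSketch(addrs);
        long maxDev = 0;
        for (int i = 0; i < addrs.length; i++) {
            if (m.get(i) != addrs[i]) throw new AssertionError("lossy!");
            maxDev = Math.max(maxDev, Math.abs(m.deviations[i]));
        }
        System.out.println("bits/value, raw packed: "
            + bitsRequired(addrs[addrs.length - 1])
            + " vs monotonic deviations: " + bitsRequired(maxDev));
    }
}
```

Reconstruction is exact by construction (deviation = value - model), so the scheme is lossless regardless of how well the linear model fits; the fit only affects how many bits the deviations cost.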
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b99) - Build # 6707 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6707/ Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC All tests passed Build Log: [...truncated 15151 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:395: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 44 minutes 21 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722649#comment-13722649 ] Adrien Grand commented on LUCENE-5127: -- +1 FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127
[jira] [Reopened] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reopened LUCENE-5144: -- Shai: your committed changes for this issue included a nocommit comment. rmuir changed it to a TODO in these commits... http://svn.apache.org/r1508137 http://svn.apache.org/r1508139 ...if this is an appropriate change and your goal was to address this on a more long-term basis, then just re-resolve, but I wanted to make sure it was on your radar in case this is a genuine "this code should not have been committed as is" situation. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722653#comment-13722653 ] Robert Muir commented on LUCENE-5144: - Thanks Hoss, I almost forgot! I changed the nocommit to a TODO temporarily just to unbreak jenkins. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Commented] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722697#comment-13722697 ] ASF subversion and git services commented on LUCENE-5127: - Commit 1508147 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1508147 ] LUCENE-5127: FixedGapTermsIndex should use monotonic compression FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b99) - Build # 6786 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/6786/ Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC All tests passed Build Log: [...truncated 15029 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:389: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 41 minutes 28 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5127. - Resolution: Fixed Fix Version/s: 5.0 Resolving for trunk only. I think the situation is already confusing in 4.x and backporting seems risky... FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127 Fix For: 5.0
[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k
[ https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722740#comment-13722740 ] David Smiley commented on LUCENE-4583: -- Cool; I didn't know of the Facet42 codec with its support for large doc values. Looks like I can use it without faceting. I'll have to try that. +1 to commit. StraightBytesDocValuesField fails if bytes > 32k Key: LUCENE-4583 URL: https://issues.apache.org/jira/browse/LUCENE-4583 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.0, 4.1, 5.0 Reporter: David Smiley Priority: Critical Fix For: 5.0, 4.5 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch I didn't observe any limitations on the size of a bytes-based DocValues field value in the docs. It appears that the limit is 32k, although I didn't get any friendly error telling me that was the limit. 32k is kind of small IMO; I suspect this limit is unintended and as such is a bug. The following test fails:
{code:java}
public void testBigDocValue() throws IOException {
  Directory dir = newDirectory();
  IndexWriter writer = new IndexWriter(dir, writerConfig(false));
  Document doc = new Document();
  BytesRef bytes = new BytesRef((4 + 4) * 4097); // 4096 works
  bytes.length = bytes.bytes.length; // byte data doesn't matter
  doc.add(new StraightBytesDocValuesField("dvField", bytes));
  writer.addDocument(doc);
  writer.commit();
  writer.close();
  DirectoryReader reader = DirectoryReader.open(dir);
  DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
  // FAILS IF BYTES IS BIG!
  docValues.getSource().getBytes(0, bytes);
  reader.close();
  dir.close();
}
{code}
-- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722770#comment-13722770 ] Adrien Grand commented on LUCENE-5145: -- Thanks Boaz, the patch looks very good! - I like the fact that the addition of the new bulk API helped make fillValues final! - OrdinalMap.subIndexes, SortedDocValuesWriter.pending and SortedSetDocValuesWriter.pending are 0-based, so they could use the new {{AppendingPackedLongBuffer}} instead of {{AppendingDeltaPackedLongBuffer}}; can you update the patch? Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval) --- Key: LUCENE-5145 URL: https://issues.apache.org/jira/browse/LUCENE-5145 Project: Lucene - Core Issue Type: Improvement Reporter: Boaz Leskes Assignee: Adrien Grand Attachments: LUCENE-5145.patch Made acceptableOverheadRatio configurable. Added bulk get to the AbstractAppendingLongBuffer classes, for faster retrieval. Introduced a new variant, AppendingPackedLongBuffer, which relies solely on PackedInts as a back-end. This new class is useful where people have non-negative numbers with a fairly uniform distribution over a fixed (limited) range, e.g. facet ordinals. To distinguish it from AppendingPackedLongBuffer, the delta-based AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an issue with NullReader where it didn't respect its valueCount in bulk gets.
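The distinction behind Adrien's suggestion can be sketched quickly: a delta-packed buffer subtracts a per-block minimum before packing, which only pays off when values don't already start near zero. A toy illustration (standalone, not the actual Lucene classes):

```java
// Why 0-based ordinals can use plain packing: delta packing stores
// (value - blockMin), but when min == 0 the bit width is identical, so the
// extra delta layer (and the stored minimum) buys nothing.
public class PackedVsDeltaPacked {
    /** Bits needed to pack a non-negative value up to max. */
    static int bitsRequired(long max) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(max));
    }

    public static void main(String[] args) {
        long[] ords = {0, 3, 1, 7, 2, 5}; // 0-based ordinals, as in OrdinalMap.subIndexes
        long min = Long.MAX_VALUE, max = 0;
        for (long o : ords) { min = Math.min(min, o); max = Math.max(max, o); }

        int directBits = bitsRequired(max);       // plain packed buffer
        int deltaBits = bitsRequired(max - min);  // delta packed: same when min == 0
        System.out.println("direct: " + directBits + " bits, delta: " + deltaBits + " bits");
    }
}
```

Delta packing would win for something like large, clustered file offsets (min far from 0); ordinals that start at 0 get nothing from it.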
[jira] [Created] (LUCENE-5150) WAH8DocIdSet: dense sets compression
Adrien Grand created LUCENE-5150: Summary: WAH8DocIdSet: dense sets compression Key: LUCENE-5150 URL: https://issues.apache.org/jira/browse/LUCENE-5150 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5150) WAH8DocIdSet: dense sets compression
[ https://issues.apache.org/jira/browse/LUCENE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5150: - Description: In LUCENE-5101, Paul Elschot mentioned that it would be interesting to be able to encode the inverse set to also compress very dense sets. WAH8DocIdSet: dense sets compression Key: LUCENE-5150 URL: https://issues.apache.org/jira/browse/LUCENE-5150 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial In LUCENE-5101, Paul Elschot mentioned that it would be interesting to be able to encode the inverse set to also compress very dense sets. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5122) DiskDV probably shouldn't use BlockPackedReader for SortedDV doc-to-ord
[ https://issues.apache.org/jira/browse/LUCENE-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5122: Attachment: LUCENE-5122.patch Here's a patch. I'll do some benchmarking. DiskDV probably shouldn't use BlockPackedReader for SortedDV doc-to-ord -- Key: LUCENE-5122 URL: https://issues.apache.org/jira/browse/LUCENE-5122 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-5122.patch I don't think blocking provides any benefit here in general. We can assume the ordinals are essentially random, and since SortedDV is single-valued, it's probably better to just use the simpler PackedInts directly? I guess the only case where it would help is if you sorted your segments by that DV field. But it seems kind of weird/esoteric to sort your index by a deref'ed string value; e.g. I don't think it's even supported by SortingMP. For the SortedSet ord stream, this can exceed 2B values, so for now I think it should stay as BlockPackedReader, but it could use a large block size...
[jira] [Updated] (LUCENE-5150) WAH8DocIdSet: dense sets compression
[ https://issues.apache.org/jira/browse/LUCENE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5150: - Attachment: LUCENE-5150.patch Here is a patch. It reserves an additional bit in the header to say whether the encoding should be inverted (meaning clean words are actually 0xFF instead of 0x00). It should reduce the amount of memory required to build and store dense sets. In spite of this change, compression ratios remain the same for sparse sets. For random dense sets, I observed compression ratios of 87% when the load factor is 90% and 20% when the load factor is 99% (vs. 100% before). WAH8DocIdSet: dense sets compression Key: LUCENE-5150 URL: https://issues.apache.org/jira/browse/LUCENE-5150 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Attachments: LUCENE-5150.patch In LUCENE-5101, Paul Elschot mentioned that it would be interesting to be able to encode the inverse set to also compress very dense sets.
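A minimal sketch of the inverse-encoding idea, assuming nothing about WAH8DocIdSet's real internals (the class and method names below are made up for illustration): when 0xFF words dominate a bitmap, complement it before encoding and record that choice in a header bit, so runs of all-set words compress the same way runs of all-clear words do in sparse sets.

```java
/**
 * Sketch of the "inverse encoding" idea from the patch: if most words of a
 * bitmap are 0xFF (all bits set), store the complement so the dense set's
 * clean words become 0x00 and compress like a sparse set's. Purely
 * illustrative — not WAH8DocIdSet's actual code.
 */
public class InverseEncodingSketch {
    /** Decide whether to store the complement: true when 0xFF words dominate. */
    public static boolean shouldInvert(byte[] words) {
        int allSet = 0, allClear = 0;
        for (byte w : words) {
            if (w == (byte) 0xFF) allSet++;
            else if (w == 0) allClear++;
        }
        return allSet > allClear;
    }

    /**
     * Complement every word; applied before encoding when shouldInvert is
     * true. The extra header bit mentioned above would record this so a
     * reader knows to undo it while iterating.
     */
    public static byte[] invert(byte[] words) {
        byte[] out = new byte[words.length];
        for (int i = 0; i < words.length; i++) out[i] = (byte) ~words[i];
        return out;
    }
}
```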
[jira] [Resolved] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5144. Resolution: Fixed Thanks Hoss and Rob. Sorry for letting this nocommit slip through. I removed the TODO as the intention was to remove that piece of code. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-5144.patch Aggregator was replaced by FacetsAggregator. FacetRequest has createAggregator(), which by default throws a UOE. It was left there until we migrate the aggregators to FacetsAggregator -- now all of our requests support FacetsAggregator. Aggregator is used only by StandardFacetsAccumulator, which too needs to vanish at some point, but currently it's the only one which handles sampling, complements aggregation and partitions. What I'd like to do is remove FacetRequest.createAggregator and in StandardFacetsAccumulator support only CountFacetRequest and SumScoreFacetRequest, which are the only ones that make sense for sampling and partitions. SumScore does not even support complements (which only work for counting). I'll also rename StandardFA to OldStandardFA. The plan is to eventually implement a SamplingAccumulator, PartitionsAccumulator/Aggregator and ComplementsAggregator, removing that class entirely. Until then ...
[jira] [Resolved] (LUCENE-5147) Consider returning a Map&lt;FacetRequest,FacetResult&gt; from FacetsAccumulator
[ https://issues.apache.org/jira/browse/LUCENE-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5147. Resolution: Won't Fix I started to do it, but this has a large impact on tests. I don't see how much value it brings, plus an app can easily put the results in a map and look up requests: {code} Map<FacetRequest,FacetResult> results = new HashMap<>(); for (FacetResult fres : facetResults) { results.put(fres.getFacetRequest(), fres); } {code} Resolving as Won't Fix for now; if this becomes a problem we can reopen. Consider returning a Map<FacetRequest,FacetResult> from FacetsAccumulator - Key: LUCENE-5147 URL: https://issues.apache.org/jira/browse/LUCENE-5147 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Today the API returns a List, which suggests there's an ordering going on. This may be confusing if one uses FacetsAccumulator.create, which results in a MultiFacetsAccumulator, and then the order of the results does not correspond to the order of the requests. Rather than trying to enforce ordering, a simple mapping may be better even for consuming apps, since they will be able to easily look up desired results.
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #923: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/923/ 2 tests failed. FAILED: org.apache.solr.cloud.BasicDistributedZkTest.org.apache.solr.cloud.BasicDistributedZkTest Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=4826, name=recoveryCmdExecutor-2303-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:724) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=4826, name=recoveryCmdExecutor-2303-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) at __randomizedtesting.SeedInfo.seed([2F2D4C74C67902F4]:0) FAILED: 
org.apache.solr.cloud.BasicDistributedZkTest.org.apache.solr.cloud.BasicDistributedZkTest Error Message: There are still zombie threads that couldn't be terminated: 1) Thread[id=4826, name=recoveryCmdExecutor-2303-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at
[jira] [Commented] (SOLR-4981) BasicDistributedZkTest fails on FreeBSD jenkins due to thread leak.
[ https://issues.apache.org/jira/browse/SOLR-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722891#comment-13722891 ] Mark Miller commented on SOLR-4981: --- Just tried tweaking the connect timeout - it was fairly high at 45 seconds and the thread linger may just not have been long enough. I dropped it to 15s and will see how that goes. BasicDistributedZkTest fails on FreeBSD jenkins due to thread leak. --- Key: SOLR-4981 URL: https://issues.apache.org/jira/browse/SOLR-4981 Project: Solr Issue Type: Test Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor
[jira] [Commented] (SOLR-5059) 4.4 refguide pages on schemaless schema rest api for adding fields
[ https://issues.apache.org/jira/browse/SOLR-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722961#comment-13722961 ] Steve Rowe commented on SOLR-5059: -- bq. I just happened to be looking at the FAQ yesterday and noticed that it has a question Does Solr support schemaless mode? that probably needs to reflect this new support. Thanks Jack, I've updated the answer. 4.4 refguide pages on schemaless schema rest api for adding fields Key: SOLR-5059 URL: https://issues.apache.org/jira/browse/SOLR-5059 Project: Solr Issue Type: Sub-task Components: documentation Reporter: Hoss Man Assignee: Steve Rowe breaking off from parent... * https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design ** SOLR-4897: Add solr/example/example-schemaless/, an example config set for schemaless mode. (Steve Rowe) *** CT: Schemaless in general needs to be added. The most likely place today is a new page under https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design * https://cwiki.apache.org/confluence/display/solr/Schema+API ** SOLR-3251: Dynamically add fields to schema. (Steve Rowe, Robert Muir, yonik) *** CT: Add to https://cwiki.apache.org/confluence/display/solr/Schema+API ** SOLR-5010: Add support for creating copy fields to the Fields REST API (gsingers) *** CT: Add to https://cwiki.apache.org/confluence/display/solr/Schema+API
[jira] [Commented] (LUCENE-4335) Builds should regenerate all generated sources
[ https://issues.apache.org/jira/browse/LUCENE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722974#comment-13722974 ] Steve Rowe commented on LUCENE-4335: bq. I don't want to set up a fixed JFlex on Jenkins, I want to download it with IVY, so before resolving this issue we should have a JFlex version available. If Steve Rowe is not able to release the version on Maven, we should maybe fork jflex on Google Code and make a release including the ANT task. I can't promise I'll release JFlex anytime soon, sorry. If you want to fork, you can certainly do that. FYI, Gerwin Klein, the JFlex founder, has done some work (maybe all that needs to be done? not sure at this point) to convert JFlex to a BSD license. I'll review the source and see what state that effort is in - BSD licensing should simplify forking, I think. Builds should regenerate all generated sources -- Key: LUCENE-4335 URL: https://issues.apache.org/jira/browse/LUCENE-4335 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4335.patch, LUCENE-4335.patch, LUCENE-4335.patch We have more and more sources that are generated programmatically (query parsers, fuzzy levN tables from Moman, packed ints specialized decoders, etc.), and it's dangerous because developers may directly edit the generated sources and forget to edit the meta-source. It's happened to me several times ... most recently just after landing the BlockPostingsFormat branch. I think we should re-gen all of these in our builds and fail the build if this creates a difference. I know some generators (eg JavaCC) embed timestamps and so always create mods ... we can leave them out of this for starters (or maybe post-process the sources to remove the timestamps) ...
[jira] [Commented] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight
[ https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722981#comment-13722981 ] Uwe Schindler commented on LUCENE-5146: --- Sorry but your code is of course wrong :-) AnalyzingSuggester sort order doesn't respect the actual weight --- Key: LUCENE-5146 URL: https://issues.apache.org/jira/browse/LUCENE-5146 Project: Lucene - Core Issue Type: Bug Components: modules/spellchecker Affects Versions: 4.4 Reporter: Simon Willnauer Fix For: 5.0, 4.5 Attachments: LUCENE-5146.patch Uwe would say: sorry but your code is wrong. We don't actually read the weight value in AnalyzingComparator, which can cause really odd suggestions since we read parts of the input as the weight. None of our tests catch that, so I will go ahead and add some tests for it as well.
[jira] [Commented] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722988#comment-13722988 ] Uwe Schindler commented on SOLR-5082: - [~elyograg]: Are you fine with this code? From my tests here I have seen no slowdown for query-string parsing, it is as fast as before; any slowdown is smaller than measurable. In any case, the current URLDecoder is much more efficient than the one embedded into Jetty (the one with broken UTF8 in earlier versions). The slowest part in the whole code is MultiMapSolrParams#add, because it reallocates arrays all the time on duplicate keys... Implement ie=charset parameter -- Key: SOLR-5082 URL: https://issues.apache.org/jira/browse/SOLR-5082 Project: Solr Issue Type: Improvement Affects Versions: 4.4 Reporter: Shawn Heisey Assignee: Uwe Schindler Priority: Minor Fix For: 5.0, 4.5 Attachments: SOLR-5082.patch, SOLR-5082.patch Allow a user to send a query or update to Solr in a character set other than UTF-8 and inform Solr what charset to use with an ie parameter, for input encoding. This was discussed in SOLR-4265 and SOLR-4283. Changing the default charset is a bad idea because distributed search (SolrCloud) relies on UTF-8.
[jira] [Updated] (SOLR-4953) solrconfig.xml parsing should fail hard if there are multiple indexConfig/ blocks
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4953: --- Attachment: SOLR-4953.patch it occurred to me last night that instead of just dealing explicitly with indexConfig here, we could probably help improve the validation of a lot of config parsing with a relatively simple change to Config.getNode: throw an error in any case where Solr is looking for a single Node/String/Int/Boolean and multiple values are found instead. I wasn't sure how badly this might break things, but i've been testing it out today, and except for a few cases where the text() xpath expression was getting abused (instead of a simple node check), it seems fairly straightforward. So here's a patch that broadens the scope of the issue to fail hard if any single-valued config option is found more than once in the config. solrconfig.xml parsing should fail hard if there are multiple indexConfig/ blocks --- Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it.
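The stricter lookup described above can be sketched with the JDK's own XPath API (the class and method names here are illustrative, not Solr's actual Config class): evaluate the expression as a node set and fail hard when more than one node matches, instead of silently taking the first.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/**
 * Sketch of a Config.getNode-style lookup that fails hard on ambiguity.
 * Names are illustrative — this is not Solr's real Config code.
 */
public class StrictConfigSketch {
    /** Returns the single matching node, null if none, and throws if several match. */
    public static Node getSingleNode(Document doc, String xpath) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        if (nodes.getLength() > 1) {
            throw new IllegalStateException(
                    nodes.getLength() + " matches for " + xpath + ", expected at most 1");
        }
        return nodes.getLength() == 0 ? null : nodes.item(0);
    }

    /** Helper to parse an XML string into a DOM document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
    }
}
```

With this shape, a solrconfig.xml containing two indexConfig blocks would raise an error at load time rather than having one silently win.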
[jira] [Updated] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4953: --- Description: while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. was: while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. Issue Type: Improvement (was: Bug) Summary: Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found (was: solrconfig.xml parsing should fail hard if there are multiple indexConfig/ blocks) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected.
Re: [Solr Wiki] Update of UsingMailingLists by HossMan
Just happened to notice at the end of that update: Lucid Imagination and Sematext also maintain SOLR-powered archives at: http://www.lucidimagination.com/search/ and ... s.b. LucidWorks and http://find.searchhub.org/, although the latter actually says Welcome to the temporary SearchHub. To learn more about this site, please click here and has a bad image URL. In any case, the name/text is outdated. And, on the searchhub.org menu for Reference Materials it has Solr Reference Guide with this link: http://searchhub.org/category/reference-materials/solr-reference-guide-2/ That doesn't mention or link to the new Apache Solr Reference Guide. Maybe you could pass these comments over to whoever works with SearchHub. -- Jack Krupansky -Original Message- From: Apache Wiki Sent: Monday, July 29, 2013 6:06 PM To: Apache Wiki Subject: [Solr Wiki] Update of UsingMailingLists by HossMan Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The UsingMailingLists page has been changed by HossMan: https://wiki.apache.org/solr/UsingMailingLists?action=diff&rev1=9&rev2=10 Comment: ref guide links == Some general guidelines == *First and foremost: Try to find the answer before posting. There's no faster way to get the answer to your question than finding it's already been answered. Some of the places to look are: - *The SOLR wiki at: http://lucene.apache.org/solr/. + * The Official Solr Documentation: https://lucene.apache.org/solr/documentation.html + * In particular, check the Solr Reference Guide for the version of Solr you are using, or check the [[https://cwiki.apache.org/confluence/display/solr/|the live draft]] of the next version of the guide for the latest updates. + * The Solr Community Wiki: https://wiki.apache.org/solr/ - *Search the users' list archives. Try the nabble searchable archive at: http://old.nabble.com/Solr-f14479.html. 
Lucid Imagination and Sematext also maintain SOLR-powered archives at: http://www.lucidimagination.com/search/ and http://search-lucene.com/. + * Search the users' list archives. Try the nabble searchable archive at: http://old.nabble.com/Solr-f14479.html. Lucid Imagination and Sematext also maintain SOLR-powered archives at: http://www.lucidimagination.com/search/ and http://search-lucene.com/. *And, of course, web searches (Google, Cuil, or other favorite web search engine). *Be aware of all the advice in the extremely well written: [[http://catb.org/~esr/faqs/smart-questions.html|How to ask questions the smart way]]
[jira] [Created] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
Patrick Hunt created SOLR-5087: -- Summary: CoreAdminHandler.handleMergeAction generating NullPointerException Key: SOLR-5087 URL: https://issues.apache.org/jira/browse/SOLR-5087 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 5.0, 4.5 CoreAdminHandler.handleMergeAction is generating NullPointerException If directoryFactory.get(...) in handleMergeAction throws an exception the original error is lost as the finally clause will attempt to clean up and generate an NPE. (notice that dirsToBeReleased is pre-allocated with nulls that are not filled in) {noformat} ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException at org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430) at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180) {noformat}
[jira] [Updated] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated SOLR-5087: --- Attachment: SOLR-5087.patch This patch fixes the problem by catching/logging/rethrowing the original problem. I've also made some changes to the code to make it less likely that the cleanup (finally clause) will fail. The test I added fails w/o the fix applied. This patch applies/passes for me on both trunk and branch4x. CoreAdminHandler.handleMergeAction generating NullPointerException -- Key: SOLR-5087 URL: https://issues.apache.org/jira/browse/SOLR-5087 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 5.0, 4.5 Attachments: SOLR-5087.patch CoreAdminHandler.handleMergeAction is generating NullPointerException If directoryFactory.get(...) in handleMergeAction throws an exception the original error is lost as the finally clause will attempt to clean up and generate an NPE. (notice that dirsToBeReleased is pre-allocated with nulls that are not filled in) {noformat} ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException at org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430) at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180) {noformat}
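The failure mode here is a general Java pattern, sketched below with made-up names (not the actual CoreAdminHandler code): an exception thrown from a finally block replaces the exception already in flight, so iterating a null-filled cleanup array turns the original error into an NPE. A null check in the cleanup lets the original propagate.

```java
/**
 * Sketch of the bug shape described in SOLR-5087: a resource array
 * pre-allocated with nulls is cleaned up in a finally block, and when the
 * acquire step throws before filling it, the cleanup NPEs and masks the
 * real error. Names are illustrative, not Solr's actual code.
 */
public class FinallyNpeSketch {
    interface Resource { void release(); }

    /** Buggy shape: the finally clause assumes every slot was filled. */
    static void buggy(int n) {
        Resource[] acquired = new Resource[n]; // pre-allocated with nulls
        try {
            throw new RuntimeException("original failure"); // acquire fails early
        } finally {
            for (Resource r : acquired) {
                r.release(); // NPE here replaces "original failure"
            }
        }
    }

    /** Fixed shape: null-check during cleanup so the original exception escapes. */
    static void fixed(int n) {
        Resource[] acquired = new Resource[n];
        try {
            throw new RuntimeException("original failure");
        } finally {
            for (Resource r : acquired) {
                if (r != null) r.release();
            }
        }
    }
}
```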
[jira] [Commented] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723126#comment-13723126 ] ASF subversion and git services commented on SOLR-5082: --- Commit 1508237 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508237 ] Merged revision(s) 1508236 from lucene/dev/trunk: SOLR-5082: The encoding of URL-encoded query parameters can be changed with the ie (input encoding) parameter, e.g. select?q=m%FCller&ie=ISO-8859-1. The default is UTF-8. To change the encoding of POSTed content, use the Content-Type HTTP header Implement ie=charset parameter -- Key: SOLR-5082 URL: https://issues.apache.org/jira/browse/SOLR-5082 Project: Solr Issue Type: Improvement Affects Versions: 4.4 Reporter: Shawn Heisey Assignee: Uwe Schindler Priority: Minor Fix For: 5.0, 4.5 Attachments: SOLR-5082.patch, SOLR-5082.patch Allow a user to send a query or update to Solr in a character set other than UTF-8 and inform Solr what charset to use with an ie parameter, for input encoding. This was discussed in SOLR-4265 and SOLR-4283. Changing the default charset is a bad idea because distributed search (SolrCloud) relies on UTF-8.
[jira] [Commented] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723124#comment-13723124 ] ASF subversion and git services commented on SOLR-5082: --- Commit 1508236 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1508236 ] SOLR-5082: The encoding of URL-encoded query parameters can be changed with the ie (input encoding) parameter, e.g. select?q=m%FCller&ie=ISO-8859-1. The default is UTF-8. To change the encoding of POSTed content, use the Content-Type HTTP header Implement ie=charset parameter -- Key: SOLR-5082 URL: https://issues.apache.org/jira/browse/SOLR-5082 Project: Solr Issue Type: Improvement Affects Versions: 4.4 Reporter: Shawn Heisey Assignee: Uwe Schindler Priority: Minor Fix For: 5.0, 4.5 Attachments: SOLR-5082.patch, SOLR-5082.patch Allow a user to send a query or update to Solr in a character set other than UTF-8 and inform Solr what charset to use with an ie parameter, for input encoding. This was discussed in SOLR-4265 and SOLR-4283. Changing the default charset is a bad idea because distributed search (SolrCloud) relies on UTF-8.
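For illustration, the effect of the ie parameter can be approximated with the JDK's java.net.URLDecoder (a hedged sketch only — Solr ships its own, more efficient decoder rather than this class): the same percent-encoded bytes decode to different strings depending on which charset the caller names.

```java
import java.net.URLDecoder;

/**
 * Sketch of what a caller-selected input encoding means: decode
 * percent-encoded query-parameter bytes with the given charset instead of
 * assuming UTF-8. Illustrative only — not Solr's actual parameter parser.
 */
public class InputEncodingSketch {
    public static String decode(String percentEncoded, String charset) throws Exception {
        // %FC is a valid "ü" in ISO-8859-1 but an invalid byte sequence in UTF-8,
        // which is exactly why the charset must be chosen by the sender.
        return URLDecoder.decode(percentEncoded, charset);
    }
}
```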
[jira] [Resolved] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-5082. - Resolution: Fixed
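To see why the server needs to be told the input encoding, here is a small self-contained sketch (class name `InputEncodingDemo` is made up for illustration): the same query term percent-encodes to different byte sequences under ISO-8859-1 and UTF-8, so a server that assumes UTF-8 would mis-decode an ISO-8859-1 request unless the client adds `ie=ISO-8859-1`.

```java
import java.net.URLEncoder;

// Demonstrates why the ie parameter exists: the same term percent-encodes
// differently depending on the charset the client used.
public class InputEncodingDemo {
    public static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (java.io.UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // ISO-8859-1 encodes u-umlaut as the single byte 0xFC...
        System.out.println("ISO-8859-1: " + encode("m\u00FCller", "ISO-8859-1")); // m%FCller
        // ...while UTF-8 uses two bytes, 0xC3 0xBC.
        System.out.println("UTF-8:      " + encode("m\u00FCller", "UTF-8"));      // m%C3%BCller
        // A client sending the first form should append &ie=ISO-8859-1 so the
        // server decodes %FC correctly instead of assuming UTF-8.
    }
}
```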
[jira] [Assigned] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey reassigned SOLR-3284: -- Assignee: Shawn Heisey StreamingUpdateSolrServer swallows exceptions - Key: SOLR-3284 URL: https://issues.apache.org/jira/browse/SOLR-3284 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 3.5, 4.0-ALPHA Reporter: Shawn Heisey Assignee: Shawn Heisey Attachments: SOLR-3284.patch StreamingUpdateSolrServer eats exceptions thrown by lower-level code, such as HttpClient, when doing adds. It may happen with other methods too, though I know that query and deleteByQuery will throw exceptions. I believe that this is a result of the queue/Runner design. That's what makes SUSS perform better, but it means you sacrifice the ability to programmatically determine that there was a problem with your update. All errors are logged via slf4j, but that's not terribly helpful except for determining what went wrong after the fact. When using CommonsHttpSolrServer, I've been able to rely on getting an exception thrown by pretty much any error, letting me use try/catch to detect problems. There's probably enough dependent code out there that it would not be a good idea to change the design of SUSS, unless there were alternate constructors or additional methods available to configure new/old behavior. Fixing this is probably not trivial, so it's probably a better idea to come up with a new server object based on CHSS. This is outside my current skillset.
[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723187#comment-13723187 ] Shawn Heisey commented on SOLR-3284: I have a proposed patch that is very likely to need updating because it is so old. There is an issue for CloudSolrServer (the one to route documents to the correct shard) that has a concurrent mode which apparently still throws exceptions. Can that be adapted for use here?
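The queue/Runner design described above does not have to swallow errors. The following is a generic sketch of the pattern, not SUSS's actual code (`ErrorCapturingRunner` is a made-up name): record the first background failure in an `AtomicReference` so the caller can re-throw it, instead of only logging it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the pattern the issue asks for (not Solr's actual API):
// a queue/runner that records the first background failure so the caller
// can detect it programmatically, rather than only logging it.
public class ErrorCapturingRunner implements AutoCloseable {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final AtomicReference<Throwable> firstError = new AtomicReference<>();

    public void submit(Runnable task) {
        pool.execute(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                firstError.compareAndSet(null, t); // keep only the first failure
            }
        });
    }

    // Callers can poll this between batches to detect failures early.
    public Throwable firstError() { return firstError.get(); }

    @Override
    public void close() throws Exception {
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        Throwable t = firstError.get();
        if (t != null) {
            throw new RuntimeException("background update failed", t);
        }
    }
}
```

A caller wraps its adds in `submit(...)` and gets an exception on `close()` instead of silence, which restores the try/catch workflow the reporter relied on with CommonsHttpSolrServer.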
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723190#comment-13723190 ] Mark Miller commented on SOLR-5087: --- Looks good to me - there is a little back-compat breakage in the merge command, but I think that's fine. Just calling it out in case anyone else has a concern there. CoreAdminHandler.handleMergeAction generating NullPointerException -- Key: SOLR-5087 URL: https://issues.apache.org/jira/browse/SOLR-5087 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 5.0, 4.5 Attachments: SOLR-5087.patch CoreAdminHandler.handleMergeAction is generating a NullPointerException. If directoryFactory.get(...) in handleMergeAction throws an exception, the original error is lost because the finally clause will attempt to clean up and generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls that are not filled in.) {noformat} ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException at org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430) at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180) {noformat}
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723200#comment-13723200 ] Patrick Hunt commented on SOLR-5087: Oh, yes. I forgot about that; it seemed like an internal operation though. LMK if it should be reverted. (It was cleaner to push the List usage through, but not critical.)
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723203#comment-13723203 ] Mark Miller commented on SOLR-5087: --- bq. it seemed like an internal operation though Technically it's part of the UpdateProcessor chain user plugin point APIs - but we are kind of ad hoc with back compat in these APIs - I think it's rare enough to do something custom with the merge command that I'm not personally worried about it though.
[jira] [Commented] (SOLR-5057) queryResultCache should not related with the order of fq's list
[ https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723268#comment-13723268 ] Feihong Huang commented on SOLR-5057: - So, can anyone make a final decision on this feature? Hi Erickson, if we decide to fix this, who is responsible for submitting the patch? Can I do it? queryResultCache should not related with the order of fq's list --- Key: SOLR-5057 URL: https://issues.apache.org/jira/browse/SOLR-5057 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0, 4.1, 4.2, 4.3 Reporter: Feihong Huang Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5057.patch, SOLR-5057.patch Original Estimate: 48h Remaining Estimate: 48h There are two queries with the same meaning below, but case 2 can't use the queryResultCache after case 1 is executed. case1: q=*:*&fq=field1:value1&fq=field2:value2 case2: q=*:*&fq=field2:value2&fq=field1:value1 I think the queryResultCache should not be related to the order of the fq list.
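The fix the issue asks for amounts to canonicalizing the fq clauses before building the cache key. This is a self-contained sketch with a made-up helper (`FqCacheKey`), not Solr's actual cache code: sort a copy of the filter list so both orderings map to the same entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch (hypothetical helper, not Solr's actual queryResultCache code):
// make the cache key insensitive to fq order by sorting a copy of the
// filter list before building the key.
public class FqCacheKey {
    public static String key(String q, List<String> fqs) {
        List<String> sorted = new ArrayList<>(fqs);
        Collections.sort(sorted); // canonical order: fq order no longer matters
        return q + "|" + sorted;
    }

    public static void main(String[] args) {
        String k1 = key("*:*", Arrays.asList("field1:value1", "field2:value2"));
        String k2 = key("*:*", Arrays.asList("field2:value2", "field1:value1"));
        System.out.println(k1.equals(k2)); // both orderings hit the same entry
    }
}
```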
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723362#comment-13723362 ] Shalin Shekhar Mangar commented on SOLR-5087: - bq. there is a little back compat breakage in the merge command, but I think that's fine. That should be fine. Patch looks good. Thanks Patrick!
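The bug pattern behind SOLR-5087 is worth spelling out. Below is a minimal stand-alone reproduction with stand-in types (`Dir`, `acquire` are made up; this is the assumed shape, not the actual CoreAdminHandler code): a finally block that blindly releases a pre-allocated array of nulls throws an NPE that masks the original exception from the failed acquire, and a null check restores the original error.

```java
// Minimal reproduction of the "NPE in finally masks the real error" pattern
// described in the issue, using stand-in types.
public class FinallyMaskingDemo {
    interface Dir { void release(); }

    static Dir acquire() { throw new IllegalStateException("original failure"); }

    // Buggy: the NPE from releasing nulls replaces the original exception.
    public static void buggy() {
        Dir[] dirs = new Dir[2]; // pre-allocated with nulls, like dirsToBeReleased
        try {
            dirs[0] = acquire();
            dirs[1] = acquire();
        } finally {
            for (Dir d : dirs) {
                d.release(); // NPE when acquire() failed before filling the slots
            }
        }
    }

    // Fixed: only release what was actually acquired.
    public static void fixed() {
        Dir[] dirs = new Dir[2];
        try {
            dirs[0] = acquire();
            dirs[1] = acquire();
        } finally {
            for (Dir d : dirs) {
                if (d != null) {
                    d.release();
                }
            }
        }
    }
}
```

With the null check, the caller sees the original `IllegalStateException` instead of the misleading `NullPointerException` from the cleanup path.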
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723378#comment-13723378 ] Noble Paul commented on SOLR-5081: -- Can you please throw some more light on the system?
# numShards
# Replication factor
# maxShardsPerNode (I guess it is 1)
# Average size per doc
# VM startup params (-Xmx, -Xms, GC params etc.)
# How are you indexing? Are you using SolrJ and the CloudSolrServer? How many clients are used to index the data?
Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18-node SolrCloud cluster, I can deadlock Solr every time. The ulimits on the nodes are:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1031181
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 515590
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers.
[jira] [Created] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.
Pavel Yaskevich created SOLR-5088: - Summary: ClassCastException is thrown when trying to use custom SearchHandler. Key: SOLR-5088 URL: https://issues.apache.org/jira/browse/SOLR-5088 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Pavel Yaskevich Hi guys, I'm trying to replace solr.SearchHandler with a custom one in solrconfig.xml for one of the stores, and it throws the following exception: {noformat} Caused by: org.apache.solr.common.SolrException: RequestHandler init failure at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:167) at org.apache.solr.core.SolrCore.init(SolrCore.java:772) ... 13 more Caused by: org.apache.solr.common.SolrException: Error Instantiating Request Handler, org.my.solr.index.CustomSearchHandler failed to instantiate org.apache.solr.request.SolrRequestHandler at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551) at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:603) at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:153) ... 14 more Caused by: java.lang.ClassCastException: class org.my.solr.index.CustomSearchHandler at java.lang.Class.asSubclass(Class.java:3116) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:530) ... 16 more {noformat} I actually tried extending SearchHandler and implementing SolrRequestHandler, as well as extending RequestHandlerBase, and it's all the same ClassCastException result... org.my.solr.index.CustomSearchHandler is definitely in the class path and recompiled on every retry. Maybe I'm doing something terribly wrong?
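The stack trace points at `Class.asSubclass` inside `SolrResourceLoader.findClass`, which throws `ClassCastException` whenever the loaded class does not implement the requested type *as seen by that classloader*. A plausible cause here (an assumption about this report, not a confirmed diagnosis) is a second copy of the Solr jars on the plugin classpath, so the custom handler implements a different `SolrRequestHandler` class than the one the core sees. The mechanism can be shown with stand-in types:

```java
// Demonstrates the mechanism behind the reported stack trace:
// Class.asSubclass throws ClassCastException when the candidate class does
// not implement the requested type. Handler/OtherHandler are stand-ins for
// two distinct copies of SolrRequestHandler loaded by different
// classloaders (an assumed cause, shown here with ordinary types).
public class AsSubclassDemo {
    interface Handler {}                  // the copy Solr's loader sees
    interface OtherHandler {}             // the copy the plugin was compiled against
    static class CustomHandler implements OtherHandler {} // "wrong" interface

    public static boolean castFails() {
        try {
            // Same call SolrResourceLoader.findClass makes before instantiating.
            CustomHandler.class.asSubclass(Handler.class);
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```

If that is the cause, the fix is to ensure the plugin jar contains only the custom classes and resolves Solr's classes from the server's own classpath.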
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723402#comment-13723402 ] Mike Schrag commented on SOLR-5081: ---
1. numShards=20
2. RF=3
3. maxShardsPerNode=1000 (aka just a big number ... we overcommit shards in this environment)
4. not very big ... maybe 0.5-1k
5. -Xms10g -Xmx10g -XX:MaxPermSize=1G -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=60 -XX:-OmitStackTraceInFastThrow
6. SolrJ + CloudSolrServer + when you say clients, do you mean threads, or actual client JVM instances? Talking more generically in terms of threads, I know it works at around 15-20 threads, but 100 threads makes it go sadfaced.
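Since the cluster is reported stable at roughly 15-20 concurrent indexing threads, a client-side mitigation while the deadlock is investigated is to cap the number of in-flight requests. This is a hypothetical sketch (`BoundedIndexer` is a made-up name, not part of SolrJ), not a fix for the underlying deadlock:

```java
import java.util.concurrent.Semaphore;

// Hypothetical client-side mitigation sketch (not a fix for the reported
// deadlock): bound the number of in-flight indexing requests with a
// semaphore, since the cluster is reported stable at ~15-20 threads.
public class BoundedIndexer {
    private final Semaphore inFlight;

    public BoundedIndexer(int maxConcurrent) {
        this.inFlight = new Semaphore(maxConcurrent);
    }

    public void index(Runnable sendBatch) {
        inFlight.acquireUninterruptibly(); // blocks once maxConcurrent are outstanding
        try {
            sendBatch.run(); // e.g. one add/commit batch against the cluster
        } finally {
            inFlight.release();
        }
    }
}
```

Each Hadoop task would route its batches through one shared `BoundedIndexer`, keeping total concurrency under the threshold regardless of how many mapper threads exist.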
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 335 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/335/ 2 tests failed. FAILED: org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.testEmptyField Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([6D64DFCE9911F67B:B07B8389FB8EACE0]:0) at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.lucene.index.BasePostingsFormatTestCase.testEmptyField(BasePostingsFormatTestCase.java:1154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:724) FAILED: org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.testEmptyFieldAndEmptyTerm Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([6D64DFCE9911F67B:EF1EF4C8B9869F55]:0) at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.lucene.index.BasePostingsFormatTestCase.testEmptyFieldAndEmptyTerm(BasePostingsFormatTestCase.java:1177)
[jira] [Updated] (SOLR-4951) randomize merge policy testing in solr
[ https://issues.apache.org/jira/browse/SOLR-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4951: --- Attachment: SOLR-4951.patch here's a patch showing what i had in mind. Some of the final getter/setter methods in MergePolicy make writing a true proxy class challenging, but i think this works out well enough, and i included some reflection based tests to try and help future proof against the risk of changes being made to the API that result in the class not behaving the same as whatever random impl it wraps. randomize merge policy testing in solr -- Key: SOLR-4951 URL: https://issues.apache.org/jira/browse/SOLR-4951 Project: Solr Issue Type: Sub-task Reporter: Hoss Man Attachments: SOLR-4951.patch split off from SOLR-4942... * add a new RandomMergePolicy that implements MergePolicy by proxying to another instance selected at creation using one of the LuceneTestCase.new...MergePolicy methods * updated test configs to refer to this new MergePolicy * borrow the tests.shardhandler.randomSeed logic in SolrTestCaseJ4 to give our RandomMergePolicy a consistent seed at runtime. 
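The seeded-selection idea behind RandomMergePolicy can be sketched with stand-in types (`RandomPolicyPicker` is made up; the real patch proxies Lucene's `MergePolicy`): pick one of several policy factories from a fixed seed, so a test run is random across runs but reproducible from the logged seed.

```java
import java.util.Random;
import java.util.function.Supplier;

// Sketch of the "consistent seed at runtime" idea from the issue, with
// stand-in types: the same seed always selects the same delegate, so a
// failing run can be reproduced from its logged seed.
public class RandomPolicyPicker {
    @SafeVarargs
    public static <T> T pick(long seed, Supplier<T>... factories) {
        Random r = new Random(seed); // fixed seed => deterministic choice
        return factories[r.nextInt(factories.length)].get();
    }
}
```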
[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.
[ https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723438#comment-13723438 ] Pavel Yaskevich commented on SOLR-5088: --- The same (ClassCastException) happens when I try to extend QueryComponent; I don't declare any methods in the extending class though - could that be a problem? <searchComponent name="query" class="org.my.solr.index.CustomQueryComponent" /> Here is the class definition I use: {noformat} package org.my.solr.index; import org.apache.solr.handler.component.QueryComponent; public class CustomQueryComponent extends QueryComponent {} {noformat} It throws the following: {noformat} org.apache.solr.common.SolrException: Error Instantiating SearchComponent, org.my.solr.index.CustomQueryComponent failed to instantiate org.apache.solr.handler.component.SearchComponent at org.apache.solr.core.SolrCore.init(SolrCore.java:835) at org.apache.solr.core.SolrCore.init(SolrCore.java:629) at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:622) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.solr.common.SolrException: Error Instantiating SearchComponent, org.my.solr.index.CustomQueryComponent failed to instantiate org.apache.solr.handler.component.SearchComponent at 
org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551) at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:586) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2173) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2167) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2200) at org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1231) at org.apache.solr.core.SolrCore.init(SolrCore.java:766) ... 13 more Caused by: java.lang.ClassCastException: class org.my.solr.index.CustomQueryComponent at java.lang.Class.asSubclass(Class.java:3116) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:530) ... 19 more ERROR 22:49:54,923 null:org.apache.solr.common.SolrException: Unable to create core: test.users at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1150) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:666) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {noformat} ClassCastException is thrown when trying to use custom SearchHandler. 
- Key: SOLR-5088 URL: https://issues.apache.org/jira/browse/SOLR-5088 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Pavel Yaskevich