[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760008#comment-13760008 ]

Shalin Shekhar Mangar commented on SOLR-4817:
---------------------------------------------

Just FYI, the copyMinConf, copyMinFullSetup and copySolrHomeToTemp methods throw the following exception with Solrj tests:

{quote}
[junit4] ERROR 0.69s | MultiCoreExampleJettyTest.testDeleteInstanceDir
[junit4] Throwable #1: java.lang.RuntimeException: Cannot find resource: /Users/shalinmangar/work/oss/solr-trunk/solr/build/solr-solrj/test/J0/solr/collection1
[junit4]    at __randomizedtesting.SeedInfo.seed([2AFBC83FDA207BB2:4160F4A68E96AEF0]:0)
[junit4]    at org.apache.solr.SolrTestCaseJ4.getFile(SolrTestCaseJ4.java:1571)
[junit4]    at org.apache.solr.SolrTestCaseJ4.TEST_HOME(SolrTestCaseJ4.java:1576)
[junit4]    at org.apache.solr.SolrTestCaseJ4.copyMinConf(SolrTestCaseJ4.java:1618)
[junit4]    at org.apache.solr.SolrTestCaseJ4.copyMinConf(SolrTestCaseJ4.java:1603)
[junit4]    at org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDeleteInstanceDir(MultiCoreExampleJettyTest.java:117)
{quote}

You can reproduce the error above with the patch in SOLR-5023.

Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
---------------------------------------------------------------------------------
                Key: SOLR-4817
                URL: https://issues.apache.org/jira/browse/SOLR-4817
            Project: Solr
         Issue Type: Bug
         Components: SolrCloud
           Reporter: Mark Miller
           Assignee: Erick Erickson
           Priority: Minor
            Fix For: 4.5, 5.0
        Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch

A hard error is much more useful, and this built-in solr.xml is not very good for SolrCloud: with the old-style solr.xml with cores in it you won't have persistence, and with the new style it's not really ideal either. I think failing on this instead makes solr.home easier to debug - but only in SolrCloud mode for now, due to back compat. We might want to pull the whole internal solr.xml for 5.0.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module
[ https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760019#comment-13760019 ]

Ajay Bhat commented on LUCENE-2562:
-----------------------------------

The TokenStream reset call was needed to display the tokens generated by the Analyzer. I think that's the only change that was required.

The main problem for me is that the analyzers above are not giving the result, which I've been looking into. I had figured that since PatternAnalyzer is deprecated it would not give the result, so it might be a good idea to remove it from the list of analyzers. But there are also some analyzers that aren't deprecated, like SnowballAnalyzer and QueryAutoStopWordAnalyzer.

Also, as per the schedule of my proposal, I've done some work on the themes of the application. I'll contribute another patch for that soon.

Make Luke a Lucene/Solr Module
------------------------------
                Key: LUCENE-2562
                URL: https://issues.apache.org/jira/browse/LUCENE-2562
            Project: Lucene - Core
         Issue Type: Task
           Reporter: Mark Miller
             Labels: gsoc2013
        Attachments: LUCENE-2562.patch, luke1.jpg, luke2.jpg, luke3.jpg, Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, Luke-ALE-4.png, Luke-ALE-5.png

see "RE: Luke - in need of maintainer": http://markmail.org/message/m4gsto7giltvrpuf
Web-based Luke: http://markmail.org/message/4xwps7p7ifltme5q

I think it would be great if there was a version of Luke that always worked with trunk - and it would also be great if it was easier to match Luke jars with Lucene versions. While I'd like to get GWT Luke into the mix as well, I think the easiest starting point is to straight-port Luke to another UI toolkit before abstracting out DTO objects that both GWT Luke and Pivot Luke could share.

I've started slowly converting Luke's use of Thinlet to Apache Pivot. I haven't had, and don't have, a lot of time for this at the moment, but I've plugged away here and there over the past week or two. There is still a *lot* to do.
[jira] [Commented] (SOLR-3765) Wrong handling of documents with same id in cross collection searches
[ https://issues.apache.org/jira/browse/SOLR-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760023#comment-13760023 ]

Furkan KAMACI commented on SOLR-3765:
-------------------------------------

Has anything been done for this issue?

Wrong handling of documents with same id in cross collection searches
---------------------------------------------------------------------
                Key: SOLR-3765
                URL: https://issues.apache.org/jira/browse/SOLR-3765
            Project: Solr
         Issue Type: Bug
         Components: search, SolrCloud
   Affects Versions: 4.0
        Environment: Self-built version of Solr from 4.x branch (revision )
           Reporter: Per Steffensen
             Labels: collections, inconsistency, numFound, search

Dialog with myself from the solr-users mailing list. Per Steffensen wrote:

{quote}
Hi

Due to what we have seen in recent tests, I am in doubt about how Solr search is actually supposed to behave.
* Searching with distrib=true&q=*:*&rows=10&collection=x,y,z&sort=timestamp asc
** Is Solr supposed to return the 10 documents with the lowest timestamp across all documents in all slices of collections x, y and z, or is it supposed to just pick 10 random documents from those slices and sort only those 10 randomly selected documents?
** Put another way: is this search supposed to be consistent, returning exactly the same set of documents when performed several times (with no documents updated between consecutive searches)?
{quote}

Fortunately I believe the answer is that it ought to return the 10 documents with the lowest timestamp across all documents in all slices of collections x, y and z. The reason I asked was that I got different responses for consecutive similar requests. Now I believe it can be explained by the bug described below.

I guess that when you do cross-collection/shard searches, the request-handling Solr forwards the query to all involved shards simultaneously and merges sub-results into the final result as they are returned from the shards. Because of the "consider documents with the same id as the same document even though they come from different collections" bug, it is somewhat random (depending on which shards respond first/last), for a given id, which collection the document with that specific id is taken from. And if documents with the same id from different collections have different timestamps, it is random where that document ends up in the final sorted result. So I believe this inconsistency can be explained by the bug described below.

{quote}
* A search returns a numFound field telling how many documents in all match the search criteria, even though not all of those documents are returned by the search. It is a crazy question to ask, but I will ask it anyway because we actually see a problem with this: isn't it correct that two searches which differ only in the rows number (documents to be returned) should always return the same value for numFound?
{quote}

Well, I found out myself what the problem is (or seems to be) - see:
http://lucene.472066.n3.nabble.com/Changing-value-of-start-parameter-affects-numFound-td2460645.html
http://lucene.472066.n3.nabble.com/numFound-inconsistent-for-different-rows-param-td3997269.html
http://lucene.472066.n3.nabble.com/Solr-v3-5-0-numFound-changes-when-paging-through-results-on-8-shard-cluster-td3990400.html

Until 4.0 this bug could be ignored, because it was OK for a cross-shards search to consider documents with identical ids as duplicates and therefore only return/count one of them. It is still OK in 4.0 within the same collection, but across collections identical ids should not be considered duplicates and should not reduce the documents returned/counted. So I believe this feature has now become a bug in 4.0 when it comes to cross-collection searches.

{quote}
Thanks!
Regards, Steff
{quote}
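The merge behaviour Steffensen describes can be sketched as a toy model. This is plain Java, not Solr code; the class and method names are hypothetical stand-ins for the coordinator's merge step, and the "keep the first hit per id" rule models the reported bug:

```java
import java.util.*;

// Hypothetical, simplified model of the merge step described above: each shard
// returns (id, timestamp) hits, and the coordinator merges them into one sorted
// result, collapsing hits that share an id. Which duplicate survives depends on
// shard response order, which is what makes cross-collection results unstable.
public class CrossCollectionMerge {

    // One hit from one shard: document id plus the sort value (timestamp).
    record Hit(String id, long timestamp) {}

    // Merge shard responses in arrival order, keeping only the first hit seen
    // for each id (the "same id == same document" assumption that is wrong
    // across collections).
    static List<Hit> mergeCollapsingById(List<List<Hit>> shardResponses) {
        Map<String, Hit> byId = new LinkedHashMap<>();
        for (List<Hit> shard : shardResponses) {
            for (Hit h : shard) {
                byId.putIfAbsent(h.id(), h);
            }
        }
        List<Hit> merged = new ArrayList<>(byId.values());
        merged.sort(Comparator.comparingLong(Hit::timestamp));
        return merged;
    }

    public static void main(String[] args) {
        // Collections x and y both contain a document with id "doc1",
        // but with different timestamps.
        List<Hit> shardX = List.of(new Hit("doc1", 100), new Hit("doc2", 50));
        List<Hit> shardY = List.of(new Hit("doc1", 10), new Hit("doc3", 75));

        // Same data, different arrival order => a different "doc1" survives,
        // hence a different sort position for it in the final result.
        List<Hit> xFirst = mergeCollapsingById(List.of(shardX, shardY));
        List<Hit> yFirst = mergeCollapsingById(List.of(shardY, shardX));
        System.out.println(xFirst); // the surviving doc1 has timestamp 100 here
        System.out.println(yFirst); // the surviving doc1 has timestamp 10 here
    }
}
```

With rows smaller than the merged total, the collapsed duplicates also shrink the returned/counted set by a shard-order-dependent amount, which matches the inconsistent numFound observations in the linked threads.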
[jira] [Commented] (LUCENE-5057) Hunspell stemmer generates multiple tokens
[ https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760034#comment-13760034 ]

Lukas Vlcek commented on LUCENE-5057:
-------------------------------------

Agreed, Chris. Thanks.

Hunspell stemmer generates multiple tokens
------------------------------------------
                Key: LUCENE-5057
                URL: https://issues.apache.org/jira/browse/LUCENE-5057
            Project: Lucene - Core
         Issue Type: Improvement
   Affects Versions: 4.3
           Reporter: Luca Cavanna
           Assignee: Adrien Grand

The Hunspell stemmer seems to be generating multiple tokens: the original token plus the available stems. That might be a good thing in some cases, but it is a different behaviour compared to the other stemmers and causes problems as well. I would rather have an option to decide whether it should output only the available stems, or the stems plus the original token. I'm not sure, though, whether it's possible to have only a single stem indexed, which would be even better in my opinion. When I look at how Snowball works, only one token is indexed (the stem), and that works great. Probably there's something I'm missing in how Hunspell works.

Here is my issue: I have a query composed of multiple terms, which is analyzed using stemming, and a boolean query is generated out of it. All is fine when adding all clauses as SHOULD (OR operator), but if I add all clauses as MUST (AND operator), then I can get back only the documents that contain the stem originated by exactly the same original word.

Example for the Dutch language I'm working with: "fiets" (bicycle in Dutch); its plural is "fietsen". If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index "fiets" I get only "fiets" indexed. When I query for "fietsen whatever" I get the following boolean query: field:fiets field:fietsen field:whatever. If I apply the AND operator and use MUST clauses for each subquery, then I can only find the documents that originally contained "fietsen", not the ones that originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I also wonder whether it could be a dictionary issue, since I see that different words with "fiets" as their root don't get the same stems, and using the AND operator at query time is a big issue. I would love to contribute on this and am looking forward to your feedback.
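The AND-clause failure mode Cavanna reports can be reduced to a toy model. This is plain Java, not Lucene's analysis API; analyze() is a hard-coded stand-in for the Hunspell filter's "original token plus stems" output, using the Dutch example from the report:

```java
import java.util.*;

// Hypothetical toy model of the indexing/matching behaviour described above.
// "Analysis" here is a hard-coded stand-in for the Hunspell stemmer: it emits
// the original token plus its stems, the way the report says the filter does.
public class MultiStemAndQuery {

    static Set<String> analyze(String term) {
        // Hunspell-style: original token + stem. Hard-coded for the Dutch
        // example from the report; not a real dictionary lookup.
        if (term.equals("fietsen")) return Set.of("fietsen", "fiets");
        return Set.of(term);
    }

    // AND semantics with one MUST clause per emitted token: a document must
    // contain every token the query analysis produced.
    static boolean matchesAllTokens(Set<String> indexedTokens, String queryTerm) {
        return indexedTokens.containsAll(analyze(queryTerm));
    }

    public static void main(String[] args) {
        Set<String> docFietsen = analyze("fietsen"); // indexed as {fietsen, fiets}
        Set<String> docFiets   = analyze("fiets");   // indexed as {fiets} only

        // Querying "fietsen" with MUST clauses: only the document that
        // originally contained "fietsen" matches, defeating stemming.
        System.out.println(matchesAllTokens(docFietsen, "fietsen")); // true
        System.out.println(matchesAllTokens(docFiets, "fietsen"));   // false
    }
}
```

The asymmetry is the point: the "fiets" document never gets a "fietsen" token at index time, so any MUST clause on "fietsen" excludes it, exactly as described.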
[jira] [Commented] (SOLR-5201) UIMAUpdateRequestProcessor should reuse the AnalysisEngine
[ https://issues.apache.org/jira/browse/SOLR-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760038#comment-13760038 ]

Jun Ohtani commented on SOLR-5201:
----------------------------------

Thanks Tommaso. Sorry, I misunderstood the relationship between UIMAUpdateRequestProcessorFactory and AnalysisEngine. My co-worker uses this patch; it works without problems. Will you commit the above patch to branch_4x?

UIMAUpdateRequestProcessor should reuse the AnalysisEngine
----------------------------------------------------------
                Key: SOLR-5201
                URL: https://issues.apache.org/jira/browse/SOLR-5201
            Project: Solr
         Issue Type: Improvement
         Components: contrib - UIMA
   Affects Versions: 4.4
           Reporter: Tommaso Teofili
           Assignee: Tommaso Teofili
            Fix For: 4.5, 5.0
        Attachments: SOLR-5201-ae-cache-every-request_branch_4x.patch, SOLR-5201-ae-cache-only-single-request_branch_4x.patch

As reported in http://markmail.org/thread/2psiyl4ukaejl4fx, UIMAUpdateRequestProcessor instantiates an AnalysisEngine for each request, which is bad for performance; therefore it'd be nice if such AEs could be reused whenever that's possible.
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760081#comment-13760081 ]

Simon Willnauer commented on LUCENE-4734:
-----------------------------------------

bq. The real question is: does it make more sense to invest time in LUCENE-2878 rather than further complicating FVH? FVH works great for simple phrase and single term queries but it has so many corner cases..

+1 let's do it

+1 to revert the change!

FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
--------------------------------------------------------------------
                Key: LUCENE-4734
                URL: https://issues.apache.org/jira/browse/LUCENE-4734
            Project: Lucene - Core
         Issue Type: Bug
         Components: modules/highlighter
   Affects Versions: 4.0, 4.1, 5.0
           Reporter: Ryan Lauck
           Assignee: Adrien Grand
             Labels: fastvectorhighlighter, highlighter
            Fix For: 5.0, 4.5
        Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch

If a proximity phrase query overlaps with any other query term, it will not be highlighted.

Example text: A B C D E F G

Example queries:
"B E"~10 D (D will be highlighted instead of "B C D E")
"B E"~10 "C F"~10 (nothing will be highlighted)

This can be traced to the FieldPhraseList constructor's inner while loop. For the first example query, the first TermInfo popped off the stack will be B. The second TermInfo will be D, which will not be found in the submap for "B E"~10 and will trigger a failed match.
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760102#comment-13760102 ]

ASF subversion and git services commented on LUCENE-5101:
---------------------------------------------------------

Commit 1520525 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1520525 ]
LUCENE-5101: Make it easier to plugin different bitset implementations to CachingWrapperFilter.

make it easier to plugin different bitset implementations to CachingWrapperFilter
---------------------------------------------------------------------------------
                Key: LUCENE-5101
                URL: https://issues.apache.org/jira/browse/LUCENE-5101
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Robert Muir
        Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:

{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}

Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like:

{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
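The template-method shape proposed above can be sketched without Lucene at all. This is a plain-Java stand-in, not the actual CachingWrapperFilter API: java.util.BitSet and an int stream play the roles of FixedBitSet and DocIdSetIterator, and the null/sentinel plumbing is elided:

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Sketch (plain-Java stand-in, not the Lucene API) of the proposed refactor:
// the null/cacheable/sentinel plumbing lives in a final method, and subclasses
// override only the small cacheImpl hook that decides which bitset
// implementation backs the cache.
public class CachingFilterSketch {

    // Stand-in for docIdSetToCache(DocIdSet, AtomicReader): takes a stream of
    // matching doc ids and the segment's maxDoc. Final, so the fixed plumbing
    // (elided here) cannot be broken by subclasses.
    protected final BitSet docIdSetToCache(IntStream docIds, int maxDoc) {
        // The "interesting part" is delegated to the overridable hook.
        return cacheImpl(docIds.iterator(), maxDoc);
    }

    // The overridable hook, mirroring the proposed cacheImpl(iterator, reader).
    // The default behaves like the FixedBitSet path in the original code.
    protected BitSet cacheImpl(PrimitiveIterator.OfInt it, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        while (it.hasNext()) {
            bits.set(it.nextInt());
        }
        return bits;
    }

    public static void main(String[] args) {
        // A subclass would override cacheImpl to plug in, say, a compressed
        // set; callers never see the plumbing.
        CachingFilterSketch f = new CachingFilterSketch();
        BitSet cached = f.docIdSetToCache(IntStream.of(1, 5, 7), 10);
        System.out.println(cached); // {1, 5, 7}
    }
}
```

The design point is the same as in the issue: subclasses get exactly one small, well-defined extension point instead of having to re-implement (and possibly get wrong) the null and sentinel handling.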
[jira] [Created] (SOLR-5217) CachedSqlEntity fails with stored procedure
Hardik Upadhyay created SOLR-5217:
----------------------------------

Summary: CachedSqlEntity fails with stored procedure
    Key: SOLR-5217
    URL: https://issues.apache.org/jira/browse/SOLR-5217
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Reporter: Hardik Upadhyay

When using DIH with CachedSqlEntityProcessor and importing data from MS SQL using stored procedures, it imports data for nested entities only once, and then every call with different arguments for nested entities is served only from the cache. My db-data-config is attached.
[jira] [Updated] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hardik Upadhyay updated SOLR-5217:
----------------------------------
Attachment: db-data-config.xml

CachedSqlEntity fails with stored procedure
-------------------------------------------
                Key: SOLR-5217
                URL: https://issues.apache.org/jira/browse/SOLR-5217
            Project: Solr
         Issue Type: Bug
         Components: contrib - DataImportHandler
           Reporter: Hardik Upadhyay
        Attachments: db-data-config.xml

When using DIH with CachedSqlEntityProcessor and importing data from MS SQL using stored procedures, it imports data for nested entities only once, and then every call with different arguments for nested entities is served only from the cache. My db-data-config is attached.
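The reported behaviour can be sketched as a toy model. This is plain Java, not DIH code, and it models only what the report describes (the stored procedure runs once and later calls with different arguments are answered from the cache); the class and method names are hypothetical:

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical toy model of the caching behaviour described in the report:
// a cached entity runs its (stored-procedure) query once, then serves every
// later request from that cache, even when the procedure arguments differ.
public class CachedEntitySketch {
    private Map<String, List<String>> cache; // populated on first use only
    private final Function<String, List<String>> runProcedure;

    CachedEntitySketch(Function<String, List<String>> runProcedure) {
        this.runProcedure = runProcedure;
    }

    List<String> rowsFor(String arg) {
        if (cache == null) {
            // First call: execute the procedure with the first argument
            // and cache the result under that argument.
            cache = new HashMap<>();
            cache.put(arg, runProcedure.apply(arg));
        }
        // Later calls never re-run the procedure; a different argument
        // simply misses the cache and gets no rows.
        return cache.getOrDefault(arg, List.of());
    }

    public static void main(String[] args) {
        CachedEntitySketch entity =
            new CachedEntitySketch(arg -> List.of("row-for-" + arg));
        System.out.println(entity.rowsFor("A")); // [row-for-A]
        System.out.println(entity.rowsFor("B")); // [] - served from stale cache
    }
}
```

The fix direction implied by the report would be for the cache key to incorporate the procedure arguments, so that a call with new arguments executes the procedure instead of missing (or wrongly hitting) the cache.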
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 372 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/372/

1 tests failed.

REGRESSION: org.apache.lucene.index.TestRollingUpdates.testRollingUpdates

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at __randomizedtesting.SeedInfo.seed([5535725D2A0C4F09:DBCE73249992EAC2]:0)
    at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:366)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:301)
    at org.apache.lucene.codecs.memory.MemoryPostingsFormat$TermsReader.<init>(MemoryPostingsFormat.java:799)
    at org.apache.lucene.codecs.memory.MemoryPostingsFormat.fieldsProducer(MemoryPostingsFormat.java:861)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:233)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:128)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:56)
    at org.apache.lucene.index.ReadersAndLiveDocs.getReader(ReadersAndLiveDocs.java:111)
    at org.apache.lucene.index.ReadersAndLiveDocs.getReadOnlyClone(ReadersAndLiveDocs.java:166)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:97)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:377)
    at org.apache.lucene.index.TestRollingUpdates.testRollingUpdates(TestRollingUpdates.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)

Build Log:
[...truncated 282 lines...]
[junit4] Suite: org.apache.lucene.index.TestRollingUpdates
[junit4]   2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
[junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TestRollingUpdates -Dtests.method=testRollingUpdates -Dtests.seed=5535725D2A0C4F09 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/hudson/lucene-data/enwiki.random.lines.txt -Dtests.locale=cs -Dtests.timezone=Etc/GMT -Dtests.file.encoding=US-ASCII
[junit4] ERROR 21.9s J0 | TestRollingUpdates.testRollingUpdates
[junit4] Throwable #1: java.lang.OutOfMemoryError: Java heap space
[junit4]    at __randomizedtesting.SeedInfo.seed([5535725D2A0C4F09:DBCE73249992EAC2]:0)
[junit4]    at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62)
[junit4]    at org.apache.lucene.util.fst.FST.<init>(FST.java:366)
[junit4]    at org.apache.lucene.util.fst.FST.<init>(FST.java:301)
[junit4]    at org.apache.lucene.codecs.memory.MemoryPostingsFormat$TermsReader.<init>(MemoryPostingsFormat.java:799)
[junit4]    at org.apache.lucene.codecs.memory.MemoryPostingsFormat.fieldsProducer(MemoryPostingsFormat.java:861)
[junit4]    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194)
[junit4]    at
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760104#comment-13760104 ]

ASF subversion and git services commented on LUCENE-5101:
---------------------------------------------------------

Commit 1520527 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1520527 ]
LUCENE-5101: Make it easier to plugin different bitset implementations to CachingWrapperFilter.

make it easier to plugin different bitset implementations to CachingWrapperFilter
---------------------------------------------------------------------------------
                Key: LUCENE-5101
                URL: https://issues.apache.org/jira/browse/LUCENE-5101
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Robert Muir
        Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:

{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}

Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like:

{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
[jira] [Resolved] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-5101.
----------------------------------
    Resolution: Fixed
 Fix Version/s: 4.5
                5.0

Committed, thanks Robert!

make it easier to plugin different bitset implementations to CachingWrapperFilter
---------------------------------------------------------------------------------
                Key: LUCENE-5101
                URL: https://issues.apache.org/jira/browse/LUCENE-5101
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Robert Muir
            Fix For: 5.0, 4.5
        Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:

{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}

Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like:

{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760118#comment-13760118 ]

ASF subversion and git services commented on LUCENE-4734:
---------------------------------------------------------

Commit 1520536 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1520536 ]
Revert LUCENE-4734.

FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
--------------------------------------------------------------------
                Key: LUCENE-4734
                URL: https://issues.apache.org/jira/browse/LUCENE-4734
            Project: Lucene - Core
         Issue Type: Bug
         Components: modules/highlighter
   Affects Versions: 4.0, 4.1, 5.0
           Reporter: Ryan Lauck
           Assignee: Adrien Grand
             Labels: fastvectorhighlighter, highlighter
            Fix For: 5.0, 4.5
        Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch

If a proximity phrase query overlaps with any other query term, it will not be highlighted.

Example text: A B C D E F G

Example queries:
"B E"~10 D (D will be highlighted instead of "B C D E")
"B E"~10 "C F"~10 (nothing will be highlighted)

This can be traced to the FieldPhraseList constructor's inner while loop. For the first example query, the first TermInfo popped off the stack will be B. The second TermInfo will be D, which will not be found in the submap for "B E"~10 and will trigger a failed match.
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760138#comment-13760138 ]

ASF subversion and git services commented on LUCENE-4734:
---------------------------------------------------------

Commit 1520544 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1520544 ]
Revert LUCENE-4734.

FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
--------------------------------------------------------------------
                Key: LUCENE-4734
                URL: https://issues.apache.org/jira/browse/LUCENE-4734
            Project: Lucene - Core
         Issue Type: Bug
         Components: modules/highlighter
   Affects Versions: 4.0, 4.1, 5.0
           Reporter: Ryan Lauck
           Assignee: Adrien Grand
             Labels: fastvectorhighlighter, highlighter
            Fix For: 5.0, 4.5
        Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch

If a proximity phrase query overlaps with any other query term, it will not be highlighted.

Example text: A B C D E F G

Example queries:
"B E"~10 D (D will be highlighted instead of "B C D E")
"B E"~10 "C F"~10 (nothing will be highlighted)

This can be traced to the FieldPhraseList constructor's inner while loop. For the first example query, the first TermInfo popped off the stack will be B. The second TermInfo will be D, which will not be found in the submap for "B E"~10 and will trigger a failed match.
[jira] [Updated] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated SOLR-5202:
------------------------------
Attachment: SOLR-5202.patch

Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
--------------------------------------------------------------------------------------------------------
                Key: SOLR-5202
                URL: https://issues.apache.org/jira/browse/SOLR-5202
            Project: Solr
         Issue Type: New Feature
           Reporter: Dawid Weiss
           Assignee: Dawid Weiss
            Fix For: 4.5, 5.0
        Attachments: SOLR-5202.patch
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760145#comment-13760145 ] Dawid Weiss commented on SOLR-5202: --- Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760145#comment-13760145 ] Dawid Weiss edited comment on SOLR-5202 at 9/6/13 11:36 AM: Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Another thing is that LEXICAL_RESOURCES_DIR no longer reflects the true purpose of that folder... perhaps it should be aliased to something more sensible. was (Author: dweiss): Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
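The "preconfigured by name" idea in the comment above might look roughly like the following solrconfig.xml fragment. This is an illustrative sketch only — the engine names and layout follow the 4.x clustering contrib's conventions, and the Carrot2 algorithm class names are the upstream defaults, but none of this is taken from the attached patch:

```xml
<!-- Hypothetical example config: one named engine per default algorithm,
     with a conventional location for overridden attribute/resource XMLs. -->
<searchComponent name="clustering" class="solr.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">lingo</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="carrot.resourcesDir">clustering/carrot2</str>
  </lst>
  <lst name="engine">
    <str name="name">stc</str>
    <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
  </lst>
  <lst name="engine">
    <str name="name">kmeans</str>
    <str name="carrot.algorithm">org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm</str>
  </lst>
</searchComponent>
```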
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760160#comment-13760160 ] Han Jiang commented on LUCENE-3069: --- Mike, thanks for the review! bq. In general, couldn't the writer re-use the reader's TermState? I'm afraid this somewhat makes codes longer? I'll make a patch to see this. {quote} Have you run first do no harm perf tests? Ie, compare current trunk w/ default Codec to branch w/ default Codec? Just to make sure there are no surprises... {quote} Yes, no surprise yet. bq. Why does Lucene41PostingsWriter have impersonation code? Yeah, these should be removed. {quote} I forget: why does the postings reader/writer need to handle delta coding again (take an absolute boolean argument)? Was it because of pulsing or sep? It's fine for now (progress not perfection) ... but not clean, since delta coding is really an encoding detail so in theory the terms dict should own that ... {quote} Ah, yes, because of pulsing. This is because.. PulsingPostingsBase is more than a PostingsBaseFormat. It somewhat acts like a term dict, e.g. it needs to understand how terms are structured in one block (term No.1 uses absolute value, term No.x use delta value) then judge how to restruct the inlined and wrapped block (No.1 still uses absolute value, but the first-non-pulsed term will need absolute encoding as well). Without the argument 'absolute', the real term dictionary will do the delta encoding itself, then PulsingPostingsBase will be confused, and all wrapped PostingsBase have to encode metadata values without delta-format. {quote} The new .smy file for Pulsing is sort of strange ... but necessary since it always uses 0 longs, so we have to store this somewhere ... you could put it into FieldInfo attributes instead? {quote} Yeah, it is another hairy thing... the reason is, we don't have a 'PostingsTrailer' for PostingsBaseFormat. 
Pulsing will not know the longs size for each field, until all the fields are consumed... and it should not write those longsSize to termsOut in close() since the term dictionary will use the DirTrailer hack here. (maybe every term dictionary should close postingsWriter first, then write field summary and close itself? I'm not sure though). bq. Should we backport this to 4.x? Yeah, OK! Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
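The "absolute" flag debated above is easier to follow with a toy model. This is illustrative only, not the branch's actual on-disk encoding: per-term metadata longs are delta-coded against the previous term within a block, and any term written with absolute=true re-bases the stream — which is what a wrapper like Pulsing needs for the first non-pulsed term:

```java
import java.util.Arrays;

// Toy delta coding of per-term metadata longs within a block.
public class DeltaBlock {
  // absoluteAt[i] == true means term i is written as an absolute value
  static long[] encode(long[] values, boolean[] absoluteAt) {
    long[] out = new long[values.length];
    long prev = 0;
    for (int i = 0; i < values.length; i++) {
      out[i] = absoluteAt[i] ? values[i] : values[i] - prev;
      prev = values[i];
    }
    return out;
  }

  static long[] decode(long[] encoded, boolean[] absoluteAt) {
    long[] out = new long[encoded.length];
    long prev = 0;
    for (int i = 0; i < encoded.length; i++) {
      out[i] = absoluteAt[i] ? encoded[i] : prev + encoded[i];
      prev = out[i];
    }
    return out;
  }

  public static void main(String[] args) {
    long[] filePointers = {100, 130, 165, 400};
    boolean[] absoluteAt = {true, false, false, true}; // term 3 re-based
    long[] enc = encode(filePointers, absoluteAt);
    System.out.println(Arrays.toString(enc)); // [100, 30, 35, 400]
    System.out.println(Arrays.equals(decode(enc, absoluteAt), filePointers)); // true
  }
}
```

If the terms dictionary delta-coded on its own without exposing the flag, the wrapper could not re-base — which is the confusion Han describes.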
[jira] [Updated] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-2548: - Attachment: SOLR-2548.patch Final patch, including CHANGES.txt entry. Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760226#comment-13760226 ] Erick Erickson commented on SOLR-4817: -- bq: I think it's all a bit of a mess right now Yeah, it certainly is but I haven't had the energy to try to straighten it out either. Maybe we can share some of the work Solr should not fall back to the back compat built in solr.xml in SolrCloud mode. - Key: SOLR-4817 URL: https://issues.apache.org/jira/browse/SOLR-4817 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Erick Erickson Priority: Minor Fix For: 4.5, 5.0 Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch A hard error is much more useful, and this built in solr.xml is not very good for solrcloud - with the old style solr.xml with cores in it, you won't have persistence and with the new style, it's not really ideal either. I think it makes it easier to debug solr.home to fail on this instead - but just in solrcloud mode for now due to back compat. We might want to pull the whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760221#comment-13760221 ] Mark Miller commented on SOLR-4817: --- I think it's all a bit of a mess right now (the test configs situation) - we should clean this up more. I intend to take a crack at it at some point. It's still too haphazard what is done in what tests and too difficult to understand and follow when writing new tests or debugging old ones. Solr should not fall back to the back compat built in solr.xml in SolrCloud mode. - Key: SOLR-4817 URL: https://issues.apache.org/jira/browse/SOLR-4817 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Erick Erickson Priority: Minor Fix For: 4.5, 5.0 Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch A hard error is much more useful, and this built in solr.xml is not very good for solrcloud - with the old style solr.xml with cores in it, you won't have persistence and with the new style, it's not really ideal either. I think it makes it easier to debug solr.home to fail on this instead - but just in solrcloud mode for now due to back compat. We might want to pull the whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5024) java client(solrj 4.1.0) can not get the ngroup number.
[ https://issues.apache.org/jira/browse/SOLR-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760239#comment-13760239 ] Sandro Mario Zbinden commented on SOLR-5024: This error exists in Solr 4.2 too. java client(solrj 4.1.0) can not get the ngroup number. --- Key: SOLR-5024 URL: https://issues.apache.org/jira/browse/SOLR-5024 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.1 Reporter: sun Priority: Minor Labels: none Original Estimate: 10m Remaining Estimate: 10m When adding these parameters (group=true&group.field=topicid&group.ngroups=true&group.format=simple) to SolrJ, I cannot get the group number. It's easy to fix: at line 221 of QueryResponse.java, an if-else should be added, just like the one at lines 203 to 208. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
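As a concrete illustration, the grouping request in question looks like the following (host, port, core name, and q value are hypothetical stand-ins; only the four group parameters come from the report):

```
http://localhost:8983/solr/collection1/select?q=*:*&group=true&group.field=topicid&group.ngroups=true&group.format=simple
```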
[jira] [Updated] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5216: -- Priority: Critical (was: Major) Document updates to SolrCloud can cause a distributed deadlock. --- Key: SOLR-5216 URL: https://issues.apache.org/jira/browse/SOLR-5216 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5216.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760259#comment-13760259 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1520592 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1520592 ] LUCENE-3069: remove impersonate codes, fix typo Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760325#comment-13760325 ] Han Jiang commented on LUCENE-3069: --- I think this is ready to commit to trunk now, and I'll wait for a day or two before committing it. :) Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760328#comment-13760328 ] Michael McCandless commented on LUCENE-3069: Thanks Han. I think we can just leave the .smy as is for now, and keep passing boolean absolute down. We can later improve these ... I think we should first land this on trunk and let jenkins chew on it for a while ... and if all seems good, then back port. Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760303#comment-13760303 ] Shalin Shekhar Mangar commented on SOLR-5217: - I don't think this is a bug. CachedSqlEntityProcessor will execute the query only once and that is its USP. If you don't want the caching, then just use SqlEntityProcessor. CachedSqlEntity fails with stored procedure --- Key: SOLR-5217 URL: https://issues.apache.org/jira/browse/SOLR-5217 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Reporter: Hardik Upadhyay Attachments: db-data-config.xml When using DIH with CachedSqlEntityProcessor and importing data from MS-sql using stored procedures, it imports data for nested entities only once and then every call with different arguments for nested entities are only served from cache.My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
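The distinction Shalin draws can be sketched in DIH config terms. The table and column names below are invented for illustration (the attached db-data-config.xml is not reproduced here): CachedSqlEntityProcessor is for the case where one result set, keyed by a lookup column, can legitimately answer every parent row; per-parent stored-procedure calls with different arguments want plain SqlEntityProcessor:

```xml
<!-- Illustrative only: a child entity queried once and served from cache
     via cacheKey/cacheLookup. If each parent needs a fresh call with its
     own arguments, use processor="SqlEntityProcessor" on the child instead. -->
<entity name="parent" processor="SqlEntityProcessor"
        query="select id, title from parent_table">
  <entity name="child" processor="CachedSqlEntityProcessor"
          query="select parent_id, detail from child_table"
          cacheKey="parent_id" cacheLookup="parent.id"/>
</entity>
```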
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760304#comment-13760304 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1520618 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1520618 ] LUCENE-3069: reuse customized TermState in PBF Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5200. - Resolution: Fixed Fix Version/s: 4.5 5.0 HighFreqTerms has confusing behavior with -t option --- Key: LUCENE-5200 URL: https://issues.apache.org/jira/browse/LUCENE-5200 Project: Lucene - Core Issue Type: Bug Components: modules/other Reporter: Robert Muir Fix For: 5.0, 4.5 Attachments: LUCENE-5200.patch {code} * <code>HighFreqTerms</code> class extracts the top n most frequent terms * (by document frequency) from an existing Lucene index and reports their * document frequency. * <p> * If the -t flag is given, both document frequency and total tf (total * number of occurrences) are reported, ordered by descending total tf. {code} Problem #1: It's tricky what happens with -t: if you ask for the top-100 terms, it requests the top-100 terms (by docFreq), then resorts the top-N by totalTermFreq. So it's not really the top 100 most frequently occurring terms. Problem #2: Using the -t option can be confusing and slow: the reported docFreq includes deletions, but totalTermFreq does not (it actually walks postings lists if there is even one deletion). I think this is a relic from 3.x days when Lucene did not support this statistic. I think we should just always output both TermsEnum.docFreq() and TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
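The proposal in the last paragraph of the description — always report both docFreq and totalTermFreq, with -t selecting only the ranking key — can be sketched as follows. Class and field names are illustrative, not Lucene's actual HighFreqTerms code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class TopTerms {
  static class TermStats {
    final String term; final int docFreq; final long totalTermFreq;
    TermStats(String term, int docFreq, long totalTermFreq) {
      this.term = term; this.docFreq = docFreq; this.totalTermFreq = totalTermFreq;
    }
  }

  // Both statistics stay attached to every returned term; the flag only
  // decides which one orders the top-N selection.
  static List<String> top(List<TermStats> stats, int n, boolean byTotalTermFreq) {
    Comparator<TermStats> cmp;
    if (byTotalTermFreq) {
      cmp = new Comparator<TermStats>() {
        public int compare(TermStats a, TermStats b) {
          return Long.compare(b.totalTermFreq, a.totalTermFreq); // descending
        }
      };
    } else {
      cmp = new Comparator<TermStats>() {
        public int compare(TermStats a, TermStats b) {
          return Integer.compare(b.docFreq, a.docFreq); // descending
        }
      };
    }
    List<TermStats> copy = new ArrayList<TermStats>(stats);
    Collections.sort(copy, cmp);
    List<String> result = new ArrayList<String>();
    for (int i = 0; i < Math.min(n, copy.size()); i++) result.add(copy.get(i).term);
    return result;
  }

  public static void main(String[] args) {
    List<TermStats> stats = Arrays.asList(
        new TermStats("rare-but-dense", 2, 100),  // few docs, many occurrences
        new TermStats("widespread", 50, 60));
    System.out.println(top(stats, 1, false)); // [widespread]
    System.out.println(top(stats, 1, true));  // [rare-but-dense]
  }
}
```

The point of the fix is visible in the two calls: the same data yields a different top term depending only on the comparator, and neither statistic is hidden from the output.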
[jira] [Commented] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760296#comment-13760296 ] ASF subversion and git services commented on LUCENE-5200: - Commit 1520616 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520616 ] LUCENE-5200: HighFreqTerms has confusing behavior with -t option HighFreqTerms has confusing behavior with -t option --- Key: LUCENE-5200 URL: https://issues.apache.org/jira/browse/LUCENE-5200 Project: Lucene - Core Issue Type: Bug Components: modules/other Reporter: Robert Muir Attachments: LUCENE-5200.patch {code} * <code>HighFreqTerms</code> class extracts the top n most frequent terms * (by document frequency) from an existing Lucene index and reports their * document frequency. * <p> * If the -t flag is given, both document frequency and total tf (total * number of occurrences) are reported, ordered by descending total tf. {code} Problem #1: It's tricky what happens with -t: if you ask for the top-100 terms, it requests the top-100 terms (by docFreq), then resorts the top-N by totalTermFreq. So it's not really the top 100 most frequently occurring terms. Problem #2: Using the -t option can be confusing and slow: the reported docFreq includes deletions, but totalTermFreq does not (it actually walks postings lists if there is even one deletion). I think this is a relic from 3.x days when Lucene did not support this statistic. I think we should just always output both TermsEnum.docFreq() and TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Solr-Artifacts-4.x - Build # 402 - Failure
Java 6 doesn't have this; I committed a fix. On Fri, Sep 6, 2013 at 10:12 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Solr-Artifacts-4.x/402/ No tests ran. Build Log: [...truncated 8808 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/solr/common-build.xml:374: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:573: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:507: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details.
Total time: 1 minute 14 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Publishing Javadoc Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
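The committed fix itself isn't shown in the thread, but Long.compare(long, long) and Integer.compare(int, int) only arrived in Java 7, so the standard Java 6-compatible replacement is a hand-rolled comparison along these lines (presented as the usual workaround, not necessarily the exact commit):

```java
public class LongCompare {
  // Same semantics as Java 7's Long.compare(a, b), in Java 6-safe code.
  // Note: "return (int) (a - b)" would be wrong here — the subtraction can
  // overflow int when cast, so the explicit three-way test is required.
  static int compare(long a, long b) {
    return a < b ? -1 : (a == b ? 0 : 1);
  }

  public static void main(String[] args) {
    System.out.println(compare(3L, 7L)); // -1
    System.out.println(compare(7L, 7L)); // 0
    System.out.println(compare(9L, 2L)); // 1
  }
}
```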
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #439: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/439/ No tests ran. Build Log: [...truncated 3328 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Solr-Artifacts-4.x - Build # 402 - Failure
Build: https://builds.apache.org/job/Solr-Artifacts-4.x/402/ No tests ran. Build Log: [...truncated 8808 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/solr/common-build.xml:374: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:573: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:507: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details. 
Total time: 1 minute 14 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Publishing Javadoc Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760294#comment-13760294 ] ASF subversion and git services commented on LUCENE-5200: - Commit 1520615 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1520615 ] LUCENE-5200: HighFreqTerms has confusing behavior with -t option HighFreqTerms has confusing behavior with -t option --- Key: LUCENE-5200 URL: https://issues.apache.org/jira/browse/LUCENE-5200 Project: Lucene - Core Issue Type: Bug Components: modules/other Reporter: Robert Muir Attachments: LUCENE-5200.patch {code} * <code>HighFreqTerms</code> class extracts the top n most frequent terms * (by document frequency) from an existing Lucene index and reports their * document frequency. * <p> * If the -t flag is given, both document frequency and total tf (total * number of occurrences) are reported, ordered by descending total tf. {code} Problem #1: It's tricky what happens with -t: if you ask for the top-100 terms, it requests the top-100 terms (by docFreq), then resorts the top-N by totalTermFreq. So it's not really the top 100 most frequently occurring terms. Problem #2: Using the -t option can be confusing and slow: the reported docFreq includes deletions, but totalTermFreq does not (it actually walks postings lists if there is even one deletion). I think this is a relic from 3.x days when Lucene did not support this statistic. I think we should just always output both TermsEnum.docFreq() and TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5197) Add a method to SegmentReader to get the current index heap memory size
[ https://issues.apache.org/jira/browse/LUCENE-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5197: Attachment: LUCENE-5197.patch Some minor cleanups / improvements: Fixed calculations for all-in-ram DV impls: for the esoteric/deprecated ones, it just uses RUE rather than making the code complicated. Facet42 is easy though and accounts correctly now. Added missing null check for VariableGapReader's FST (it can happen when there are no terms). Add a method to SegmentReader to get the current index heap memory size --- Key: LUCENE-5197 URL: https://issues.apache.org/jira/browse/LUCENE-5197 Project: Lucene - Core Issue Type: Improvement Components: core/codecs, core/index Reporter: Areek Zillur Attachments: LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch It would be useful to at least estimate the index heap size being used by Lucene. Ideally a method exposing this information at the SegmentReader level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
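A rough sketch of the accounting approach discussed in the patch, summing cheap per-component estimates instead of walking every object with RamUsageEstimator. HeapAccountable and SegmentHeapSketch are hypothetical names for illustration, not Lucene's actual classes:

```java
import java.util.List;

// Hypothetical: each index component (terms index, doc values, norms, ...)
// reports its own estimated heap usage in bytes.
interface HeapAccountable {
    long ramBytesUsed();
}

public class SegmentHeapSketch {
    // A reader-level estimate is then just the sum over its components.
    public static long totalRamBytesUsed(List<HeapAccountable> components) {
        long total = 0;
        for (HeapAccountable c : components) {
            total += c.ramBytesUsed();
        }
        return total;
    }
}
```

The design choice is that each codec component knows its own in-heap structures best, so per-component estimates stay cheap and only the esoteric or deprecated implementations need a generic object walk.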
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760427#comment-13760427 ] ASF subversion and git services commented on SOLR-2548: --- Commit 1520645 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1520645 ] SOLR-2548, Multithread faceting Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 368 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/368/ All tests passed Build Log: [...truncated 3511 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. 
[javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:409: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:382: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:39: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build.xml:551: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:1887: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/module-build.xml:58: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details. Total time: 39 minutes 17 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
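The two "cannot find symbol" errors above are consistent with Long.compare(long, long) and Integer.compare(int, int) only existing since Java 7, while branch_4x still compiles against a Java 6 source level. A Java 6-safe replacement can be written by hand, for example:

```java
public class CompareSketch {
    // Java 6-safe equivalent of Long.compare(long, long), which was added in Java 7.
    public static int compareLongs(long a, long b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    // Same idea for the docFreq (int) comparison. Note this form also avoids
    // the overflow bug of the common "return (int) (a - b)" shortcut.
    public static int compareInts(int a, int b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }
}
```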
[jira] [Resolved] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-2548. -- Resolution: Fixed Fix Version/s: 5.0 4.5 Thanks Janne and Gun! Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760516#comment-13760516 ] Tim Vaillancourt commented on SOLR-5216: Hey guys, We tested this patch and unfortunately encountered some serious issues after a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 Solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After a few hours of this testing, we see many of these stalled transactions, and the Solr instances start to see each other as down, flooding our Solr logs with Connection Refused exceptions, and otherwise no useful logs (that I could see). Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script normalizes the ERROR-severity stack traces and returns them in order of occurrence. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt Document updates to SolrCloud can cause a distributed deadlock. --- Key: SOLR-5216 URL: https://issues.apache.org/jira/browse/SOLR-5216 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5216.patch
[jira] [Comment Edited] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760516#comment-13760516 ] Tim Vaillancourt edited comment on SOLR-5216 at 9/6/13 7:01 PM: Hey guys, We tested this patch and unfortunately encountered some serious issues a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After about 6 hours of stress-testing this patch, we see many of these stalled transactions (below), and the Solr instances start to see each other as down, flooding our Solr logs with Connection Refused exceptions, and otherwise no obviously-useful logs that I could see. I did notice some stalled transactions on both /select and /update, however. This never occurred without this patch. Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script normalizes the ERROR-severity stack traces and returns them in order of occurrence. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt was (Author: tvaillancourt): Hey guys, We tested this patch and unfortunately encountered some serious issues a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). 
* Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After a few hours of this testing, we see many of these stalled transactions, and the solr instances start to see each other as down, flooding our solr logs with Connection Refused exceptions, and otherwise no useful logs (that I could see). Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script normalizes the ERROR-severity stack traces and returns them in order of ocurrance. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt Document updates to SolrCloud can cause a distributed deadlock. --- Key: SOLR-5216 URL: https://issues.apache.org/jira/browse/SOLR-5216 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5216.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760528#comment-13760528 ] ASF subversion and git services commented on SOLR-2548: --- Commit 1520670 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520670 ] SOLR-2548, Multithread faceting Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
Steve Davids created SOLR-5218: -- Summary: Unable to extend SolrJettyTestBase within a Parametrized test Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unknown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner.
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760569#comment-13760569 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520683 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520683 ] SOLR-5202: follow-up to CHANGES.txt Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5219) Refactor selection of the default clustering algorithm
Dawid Weiss created SOLR-5219: - Summary: Refactor selection of the default clustering algorithm Key: SOLR-5219 URL: https://issues.apache.org/jira/browse/SOLR-5219 Project: Solr Issue Type: Improvement Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.5, 5.0 This is currently quite messy: the user needs to explicitly name the 'default' algorithm. The logic should be: 1) if there's only one algorithm, it becomes the default, 2) if there's more than one algorithm, the first one becomes the default one. 3) for back-compat, if there is an algorithm called 'default', it does become the default one. The code will simplify a great deal too.
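The three rules above can be sketched as a small selection function. The names here are hypothetical and not the actual Solr clustering component API:

```java
import java.util.List;

public class DefaultEngineSketch {
    /**
     * Picks the default algorithm from the configured list:
     * rule 3 first (back-compat: an algorithm literally named "default" wins),
     * then rules 1 and 2, which both collapse to "take the first one".
     */
    public static String defaultAlgorithm(List<String> configured) {
        if (configured.isEmpty()) {
            throw new IllegalArgumentException("no clustering algorithms configured");
        }
        // Rule 3: explicit back-compat override.
        if (configured.contains("default")) {
            return "default";
        }
        // Rule 1 (single algorithm) and rule 2 (first of many) are the same branch.
        return configured.get(0);
    }
}
```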
[jira] [Resolved] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-5202. --- Resolution: Fixed Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760563#comment-13760563 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520678 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520678 ] SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Polished clustering configuration examples. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
[ https://issues.apache.org/jira/browse/SOLR-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760577#comment-13760577 ] Dawid Weiss commented on SOLR-5218: --- We use a runner that does not follow all of JUnit conventions (and there are reason why it doesn't). JUnit requires all hooks to be public methods but this leads to accidental overrides and missed super calls. In RandomizedRunner a private hook is always called, regardless of the shadowing/ override. If you want to use a parameterized test, use RandomizedRunner's factory instead, as is shown here: https://github.com/carrotsearch/randomizedtesting/blob/master/examples/maven/src/main/java/com/carrotsearch/examples/randomizedrunner/Test007ParameterizedTests.java Unable to extend SolrJettyTestBase within a Parametrized test - Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unkown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
[ https://issues.apache.org/jira/browse/SOLR-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-5218. --- Resolution: Won't Fix Assignee: Dawid Weiss Unable to extend SolrJettyTestBase within a Parametrized test - Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Assignee: Dawid Weiss Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unkown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760560#comment-13760560 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520677 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1520677 ] SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Polished clustering configuration examples. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760566#comment-13760566 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520681 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1520681 ] SOLR-5202: follow-up to CHANGES.txt Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 7347 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7347/ Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 31814 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:396: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:335: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:66: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:139: The following files are missing svn:eol-style (or binary svn:mime-type): * ./solr/contrib/clustering/src/test-files/clustering/solr/collection1/conf/clustering/carrot2/mock-external-attrs-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/default-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/kmeans-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/stc-attributes.xml Total time: 51 minutes 32 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4296 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4296/ All tests passed Build Log: [...truncated 35271 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:396: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:335: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:66: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:139: The following files are missing svn:eol-style (or binary svn:mime-type): * ./solr/contrib/clustering/src/test-files/clustering/solr/collection1/conf/clustering/carrot2/mock-external-attrs-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/default-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/kmeans-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/stc-attributes.xml Total time: 82 minutes 19 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5220) Marking server as zombie due to 4xx response is odd
Jessica Cheng created SOLR-5220: --- Summary: Marking server as zombie due to 4xx response is odd Key: SOLR-5220 URL: https://issues.apache.org/jira/browse/SOLR-5220 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.4 Reporter: Jessica Cheng In LBHttpSolrServer.request, a request is retried and the server marked as a zombie if the return code is 404, 403, 503, or 500, and the comment says we retry on 404 or 403 or 503 - you can see this on solr shutdown. I think returning a 503 on a shutdown is reasonable, but not 4xx, which is supposed to be a client error. But even if this can't be fixed systematically on the server-side, it seems like on the client side we can retry on another server without marking the current server as dead, because most likely when the server returns a 403 (Forbidden) or 404 (Not Found), it's not because it's dead.
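The suggested client-side policy, retry on another server but only mark a node dead for server-side failures, might look like the following sketch. This is an illustration of the proposal, not LBHttpSolrServer's actual code:

```java
public class RetryPolicySketch {
    // Status codes on which LBHttpSolrServer retries today, per the report above.
    public static boolean shouldRetry(int status) {
        return status == 403 || status == 404 || status == 500 || status == 503;
    }

    // Proposed refinement: only 5xx responses suggest a dead or overloaded node.
    // A 403 (Forbidden) or 404 (Not Found) is a client error and should not get
    // the server marked as a zombie, even though the request may be retried elsewhere.
    public static boolean shouldMarkZombie(int status) {
        return status == 500 || status == 503;
    }
}
```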
[jira] [Comment Edited] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760815#comment-13760815 ] Feihong Huang edited comment on SOLR-5215 at 9/7/13 12:45 AM: -- Thanks to Ricard to find the reason. I also encounter this issue in our production application servers. was (Author: ainihong001): Thanks to Ricard to finding the reason. I also encounter this issue in our production application servers. Deadlock in Solr Cloud ConnectionManager Key: SOLR-5215 URL: https://issues.apache.org/jira/browse/SOLR-5215 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.2.1 Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux java version 1.6.0_18 Java(TM) SE Runtime Environment (build 1.6.0_18-b07) Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode) Reporter: Ricardo Merizalde We are constantly seeing a deadlocks in our production application servers. The problem seems to be that a thread A: - tries to process an event and acquires the ConnectionManager lock - the update callback acquires connectionUpdateLock and invokes waitForConnected - waitForConnected tries to acquire the ConnectionManager lock (which already has) - waitForConnected calls wait and release the ConnectionManager lock (but still has the connectionUpdateLock) The a thread B: - tries to process an event and acquires the ConnectionManager lock - the update call back tries to acquire connectionUpdateLock but gets blocked holding the ConnectionManager lock and preventing thread A from getting out of the wait state. 
Here is part of the thread dump: http-0.0.0.0-8080-82-EventThread daemon prio=10 tid=0x59965800 nid=0x3e81 waiting for monitor entry [0x57169000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71) - waiting to lock 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) http-0.0.0.0-8080-82-EventThread daemon prio=10 tid=0x5ad4 nid=0x3e67 waiting for monitor entry [0x4dbd4000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) - waiting to lock 0x2aab1b0e0f78 (a java.lang.Object) at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) - locked 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) http-0.0.0.0-8080-82-EventThread daemon prio=10 tid=0x2aac4c2f7000 nid=0x3d9a waiting for monitor entry [0x42821000] java.lang.Thread.State: BLOCKED (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165) - locked 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) - locked 0x2aab1b0e0f78 (a java.lang.Object) at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) at 
org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) - locked 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) Found one Java-level deadlock: = http-0.0.0.0-8080-82-EventThread: waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager), which is held by http-0.0.0.0-8080-82-EventThread http-0.0.0.0-8080-82-EventThread: waiting to lock monitor 0x2aac4c314978 (object 0x2aab1b0e0f78, a java.lang.Object), which is held by
[jira] [Commented] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760815#comment-13760815 ] Feihong Huang commented on SOLR-5215:
-
Thanks to Ricardo for finding the reason. I have also encountered this issue on our production application servers.

Deadlock in Solr Cloud ConnectionManager

Key: SOLR-5215
URL: https://issues.apache.org/jira/browse/SOLR-5215
Project: Solr
Issue Type: Bug
Components: clients - java, SolrCloud
Affects Versions: 4.2.1
Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux; java version 1.6.0_18, Java(TM) SE Runtime Environment (build 1.6.0_18-b07), Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
Reporter: Ricardo Merizalde

We are constantly seeing deadlocks in our production application servers. The problem seems to be that a thread A:
- tries to process an event and acquires the ConnectionManager lock
- the update callback acquires connectionUpdateLock and invokes waitForConnected
- waitForConnected tries to acquire the ConnectionManager lock (which it already holds)
- waitForConnected calls wait, releasing the ConnectionManager lock (but still holding connectionUpdateLock)

Then a thread B:
- tries to process an event and acquires the ConnectionManager lock
- the update callback tries to acquire connectionUpdateLock but gets blocked, still holding the ConnectionManager lock and preventing thread A from getting out of the wait state.
Here is part of the thread dump:

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x59965800 nid=0x3e81 waiting for monitor entry [0x57169000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71)
	- waiting to lock <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x5ad4 nid=0x3e67 waiting for monitor entry [0x4dbd4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
	- waiting to lock <0x2aab1b0e0f78> (a java.lang.Object)
	at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
	at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
	- locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x2aac4c2f7000 nid=0x3d9a waiting for monitor entry [0x42821000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165)
	- locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
	- locked <0x2aab1b0e0f78> (a java.lang.Object)
	at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
	at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
	- locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

Found one Java-level deadlock:
=============================
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
  which is held by "http-0.0.0.0-8080-82-EventThread"
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x2aac4c314978 (object 0x2aab1b0e0f78, a java.lang.Object),
  which is held by "http-0.0.0.0-8080-82-EventThread"
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
  which is held by
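The failure mode described above is the classic nested-monitor problem: `Object.wait()` releases only the monitor of the object being waited on, not any other locks the waiting thread holds. A minimal standalone sketch of this behavior (the names `manager` and `updateLock` are illustrative stand-ins for ConnectionManager and connectionUpdateLock, not the actual Solr code):

```java
import java.util.concurrent.CountDownLatch;

public class WaitReleasesOneMonitor {
    static final Object manager = new Object();     // stands in for the ConnectionManager monitor
    static final Object updateLock = new Object();  // stands in for connectionUpdateLock

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch aboutToWait = new CountDownLatch(1);

        // Thread A: takes updateLock, then waits on `manager` (like
        // update() -> waitForConnected() -> wait() in the report above).
        Thread a = new Thread(() -> {
            synchronized (updateLock) {
                synchronized (manager) {
                    try {
                        aboutToWait.countDown();
                        manager.wait(500);   // releases `manager` only, NOT updateLock
                    } catch (InterruptedException ignored) { }
                }
            }
        });
        a.start();
        aboutToWait.await();
        Thread.sleep(50);  // give A time to actually enter wait()

        // The main thread CAN take `manager` -- wait() released it...
        boolean gotManager;
        synchronized (manager) { gotManager = true; }

        // ...but a second thread needing updateLock stays blocked,
        // which is exactly how thread B deadlocks in SOLR-5215.
        Thread b = new Thread(() -> { synchronized (updateLock) { } });
        b.start();
        b.join(100);
        boolean updateLockStillHeld = b.isAlive();

        System.out.println("manager acquired while A waits: " + gotManager);
        System.out.println("updateLock still held by A: " + updateLockStillHeld);

        a.join();  // A's timed wait expires, releasing both locks
        b.join();
    }
}
```

Here the timed `wait(500)` keeps the demo from hanging; in the real code the wait is what leaves connectionUpdateLock pinned while thread B spins up.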
[jira] [Updated] (LUCENE-5198) Strengthen the function of Min should match, making it select BooleanClause as Occur.MUST according to the weight of query
[ https://issues.apache.org/jira/browse/LUCENE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HeXin updated LUCENE-5198:
--
Description: In the current version, when we use BooleanQuery to do a disjunction, the top scorer selects documents that satisfy at least mm of the sub-scorers. But in some cases we would like any sub-scorer whose weight is larger than a threshold to be selected as Occur.MUST automatically. The threshold would be configurable, defaulting to the minimum integer. Any comments are welcome.

was: In some cases, we want min-should-match to select a BooleanClause as Occur.MUST according to the weight of the query. Only if the weight is larger than the threshold is the clause selected as Occur.MUST. The threshold would be configurable, defaulting to the minimum integer. Any comments are welcome.

Strengthen the function of Min should match, making it select BooleanClause as Occur.MUST according to the weight of query
--
Key: LUCENE-5198
URL: https://issues.apache.org/jira/browse/LUCENE-5198
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Affects Versions: 4.4
Reporter: HeXin
Priority: Trivial

In the current version, when we use BooleanQuery to do a disjunction, the top scorer selects documents that satisfy at least mm of the sub-scorers. But in some cases we would like any sub-scorer whose weight is larger than a threshold to be selected as Occur.MUST automatically. The threshold would be configurable, defaulting to the minimum integer. Any comments are welcome.
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
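For context, the minimum-should-match ("mm") constraint the issue builds on means a document matches a disjunction only when at least mm of its SHOULD clauses match it; the proposal would additionally promote high-weight clauses to required ones. A plain-Java sketch of the baseline mm semantics only (illustrative predicates standing in for sub-scorers, not Lucene's scorer code):

```java
import java.util.List;
import java.util.function.Predicate;

public class MinShouldMatchSketch {
    // A doc matches the disjunction when at least `mm` SHOULD clauses hit it.
    static boolean matches(List<Predicate<String>> shouldClauses, int mm, String doc) {
        long hits = shouldClauses.stream().filter(c -> c.test(doc)).count();
        return hits >= mm;
    }

    public static void main(String[] args) {
        List<Predicate<String>> clauses = List.of(
                d -> d.contains("lucene"),
                d -> d.contains("solr"),
                d -> d.contains("search"));
        // "lucene search engine" hits the first and third clauses (2 of 3).
        System.out.println(matches(clauses, 2, "lucene search engine")); // true
        System.out.println(matches(clauses, 3, "lucene search engine")); // false
    }
}
```

Under the proposal, a clause whose weight exceeded the configured threshold would instead behave like a MUST clause: its failure alone would reject the document regardless of mm.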
[JENKINS] Lucene-Solr-SmokeRelease-4.x - Build # 106 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/106/

No tests ran.

Build Log:
[...truncated 34242 lines...]
prepare-release-no-sign:
[mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease
[copy] Copying 416 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/lucene
[copy] Copying 194 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/solr
[exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6
[exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
[exec] NOTE: output encoding is US-ASCII
[exec]
[exec] Load release URL file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/...
[exec]
[exec] Test Lucene...
[exec]   test basics...
[exec]   get KEYS
[exec]     0.1 MB in 0.01 sec (10.1 MB/sec)
[exec]   check changes HTML...
[exec]   download lucene-4.5.0-src.tgz...
[exec]     27.1 MB in 0.04 sec (681.4 MB/sec)
[exec]     verify md5/sha1 digests
[exec]   download lucene-4.5.0.tgz...
[exec]     49.0 MB in 0.07 sec (690.3 MB/sec)
[exec]     verify md5/sha1 digests
[exec]   download lucene-4.5.0.zip...
[exec]     58.9 MB in 0.11 sec (516.1 MB/sec)
[exec]     verify md5/sha1 digests
[exec]   unpack lucene-4.5.0.tgz...
[exec]     verify JAR/WAR metadata...
[exec]     test demo with 1.6...
[exec]       got 5723 hits for query lucene
[exec]     test demo with 1.7...
[exec]       got 5723 hits for query lucene
[exec]     check Lucene's javadoc JAR
[exec]
[exec] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeReleaseTmp/unpack/lucene-4.5.0/docs/core/org/apache/lucene/util/AttributeSource.html
[exec]   broken details HTML: Method Detail: addAttributeImpl: closing </code> does not match opening <T>
[exec]   broken details HTML: Method Detail: getAttribute: closing </code> does not match opening <T>
[exec] Traceback (most recent call last):
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1450, in <module>
[exec]     main()
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1394, in main
[exec]     smokeTest(baseURL, svnRevision, version, tmpDir, isSigned, testArgs)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1431, in smokeTest
[exec]     unpackAndVerify('lucene', tmpDir, artifact, svnRevision, version, testArgs)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 607, in unpackAndVerify
[exec]     verifyUnpacked(project, artifact, unpackPath, svnRevision, version, testArgs)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 786, in verifyUnpacked
[exec]     checkJavadocpath('%s/docs' % unpackPath)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 904, in checkJavadocpath
[exec]     raise RuntimeError('missing javadocs package summaries!')
[exec] RuntimeError: missing javadocs package summaries!
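The "broken details HTML" messages come from the smoke tester's javadoc validation, which flags a closing tag that does not match the most recently opened one (here a stray generic `<T>` left unclosed before `</code>`). A hypothetical sketch of that style of stack-based check (not the actual smokeTestRelease.py logic, which is Python, and the regex here ignores attributes and self-closing tags):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagBalanceCheck {
    // Returns a message for the first mismatched closing tag, or null if balanced.
    static String firstMismatch(String html) {
        Pattern tag = Pattern.compile("</?([a-zA-Z]+)>");
        Deque<String> open = new ArrayDeque<>();
        Matcher m = tag.matcher(html);
        while (m.find()) {
            String name = m.group(1);
            if (!m.group().startsWith("</")) {
                open.push(name);                       // opening tag: remember it
            } else {
                String top = open.isEmpty() ? null : open.pop();
                if (!name.equals(top)) {               // closing tag must match top of stack
                    return "closing /" + name + " does not match opening " + top;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstMismatch("<code>getAttribute</code>"));  // balanced -> null
        System.out.println(firstMismatch("<T>addAttributeImpl</code>")); // the failure above
    }
}
```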
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/build.xml:321: exec returned: 1

Total time: 19 minutes 30 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure
Sending email for trigger: Failure