[jira] [Commented] (SOLR-6067) add buildAndRunCollectorChain method to reduce code duplication in SolrIndexSearcher
[ https://issues.apache.org/jira/browse/SOLR-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998760#comment-13998760 ] ASF GitHub Bot commented on SOLR-6067:
--
Github user cpoerschke commented on the pull request: https://github.com/apache/lucene-solr/pull/48#issuecomment-43208284
Extra commit above relates to [comments] on SOLR-6067 ticket.
[comments]: https://issues.apache.org/jira/browse/SOLR-6067?focusedCommentId=13998594&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13998594

add buildAndRunCollectorChain method to reduce code duplication in SolrIndexSearcher

Key: SOLR-6067
URL: https://issues.apache.org/jira/browse/SOLR-6067
Project: Solr
Issue Type: Improvement
Reporter: Christine Poerschke
Priority: Minor
Attachments: SOLR-6067.patch

https://github.com/apache/lucene-solr/pull/48 has the proposed change.

--
This message was sent by Atlassian JIRA (v6.2#6252)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
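The refactoring idea behind the ticket can be sketched roughly as follows. All names and signatures here are invented for illustration (the actual change is in the linked pull request, not reproduced here): several search methods each repeat the same "assemble a chain of collectors, then feed every hit through it" boilerplate, and the proposal extracts that into one helper.

```java
import java.util.List;
import java.util.function.IntConsumer;

class CollectorChainSketch {
    // Stand-in for a Lucene-style Collector: receives matching doc ids.
    interface Collector extends IntConsumer {}

    // The extracted helper: run the primary collector plus any optional
    // extra collectors over every hit, in one place instead of several.
    static void buildAndRunCollectorChain(Collector primary,
                                          List<Collector> extras,
                                          int[] hits) {
        for (int doc : hits) {
            primary.accept(doc);
            for (Collector c : extras) {
                c.accept(doc);
            }
        }
    }
}
```

Callers that previously duplicated the loop would instead pass their collector(s) to the single helper.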
[GitHub] lucene-solr pull request: solr: add buildAndRunCollectorChain meth...
Github user cpoerschke commented on the pull request: https://github.com/apache/lucene-solr/pull/48#issuecomment-43208284
Extra commit above relates to [comments] on SOLR-6067 ticket.
[comments]: https://issues.apache.org/jira/browse/SOLR-6067?focusedCommentId=13998594&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13998594
[jira] [Resolved] (SOLR-6013) Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility
[ https://issues.apache.org/jira/browse/SOLR-6013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-6013.
Resolution: Fixed

bq. Shalin, that's fine I suppose (sorry I didn't notice you changed them to public/final). I'm just wondering, though: wouldn't it make sense to access the bean properties using traditional getter methods instead of accessing them directly? Just curious as to the reasoning for not providing the getters. In either case, I'm fine with whatever you decide and re-closing this issue.

This is just a simple internal object with final values. There is no value added by getters here.

Fix method visibility of Evaluator, refactor DateFormatEvaluator for extensibility

Key: SOLR-6013
URL: https://issues.apache.org/jira/browse/SOLR-6013
Project: Solr
Issue Type: Improvement
Components: contrib - DataImportHandler
Affects Versions: 4.7
Reporter: Aaron LaBella
Assignee: Shalin Shekhar Mangar
Fix For: 4.9, 5.0
Attachments: 0001-add-getters-for-datemathparser.patch, 0001-change-method-access-to-protected.patch, 0001-change-method-variable-visibility-and-refactor-for-extensibility.patch, SOLR-6013.patch
Original Estimate: 1h
Remaining Estimate: 1h

This is similar to issue 5981: the Evaluator class is declared as abstract, yet the parseParams method is package-private? Surely this is an oversight, as I wouldn't expect everyone writing their own evaluators to have to deal with parsing the parameters. Similarly, I needed to refactor DateFormatEvaluator because I need to do some custom date math/parsing and it wasn't written in a way that I can extend it. Please review/apply my attached patch to the next version of Solr, i.e. 4.8, or 4.9 if I must wait. Thanks!
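The design point in the closing comment can be illustrated with a minimal example (class and field names here are invented, not the actual DataImportHandler code): a simple internal value object whose fields are public and final, where getters would add ceremony but no value.

```java
// Hypothetical internal value holder: immutable because all fields are
// final, so exposing them directly is as safe as exposing getters.
final class DateMathConfig {
    public final String pattern;   // e.g. "yyyy-MM-dd"
    public final String timezone;

    DateMathConfig(String pattern, String timezone) {
        this.pattern = pattern;
        this.timezone = timezone;
    }
}
```

Callers simply read `config.pattern`; since nothing can be reassigned, a getter would only restate the field.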
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998883#comment-13998883 ] ASF subversion and git services commented on LUCENE-5675:
Commit 1594971 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1594971 ]
LUCENE-5675: initial scaffolding for new IDVPF

ID postings format

Key: LUCENE-5675
URL: https://issues.apache.org/jira/browse/LUCENE-5675
Project: Lucene - Core
Issue Type: New Feature
Reporter: Robert Muir

Today the primary key lookup in Lucene is not that great for systems like Solr and Elasticsearch that have versioning in front of IndexWriter. To some extent BlockTree can sometimes help avoid seeks by telling you the term does not exist for a segment, but this technique (based on FST prefix) is fragile. The only other choice today is bloom filters, which use up huge amounts of memory. I don't think we are using everything we know: particularly the version semantics. Instead, if the FST for the terms index used an algebra that represents the max version for any subtree, we might be able to answer that there is no term T with version V in that segment very efficiently. Also, ID fields don't need postings lists, and they don't need stats like docfreq/totaltermfreq, etc.; this stuff is all implicit. As far as API, I think for users to provide IDs with versions to such a PF, a start would be to set a payload or whatever on the term field to get it through IndexWriter to the codec. And a consumer of the codec can just cast the Terms to a subclass that exposes the FST to do this version check efficiently.
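The "max version for any subtree" idea from the description can be illustrated with a toy term tree (the data structure below is invented for the example; it is not Lucene's FST): each node stores the maximum version anywhere beneath it, so a lookup asking "does term T exist with version ≥ V?" can stop as soon as a subtree's max version rules the answer out.

```java
import java.util.HashMap;
import java.util.Map;

class VersionedTermTree {
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        long maxVersion = -1;  // max version anywhere in this subtree
        long termVersion = -1; // version if a term ends here, else -1
    }

    final Node root = new Node();

    void add(String term, long version) {
        Node n = root;
        n.maxVersion = Math.max(n.maxVersion, version);
        for (char c : term.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
            n.maxVersion = Math.max(n.maxVersion, version);
        }
        n.termVersion = version;
    }

    // True only if the term exists with version >= minVersion; prunes the
    // walk as soon as a subtree's maxVersion is already too small.
    boolean mayContain(String term, long minVersion) {
        Node n = root;
        for (char c : term.toCharArray()) {
            if (n.maxVersion < minVersion) return false; // prune early
            n = n.children.get(c);
            if (n == null) return false;
        }
        return n.termVersion >= minVersion;
    }
}
```

The pruning check is what lets a versioned lookup answer "no newer term exists" without walking to a leaf.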
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999032#comment-13999032 ] ASF subversion and git services commented on LUCENE-5675:
Commit 1595006 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1595006 ]
LUCENE-5675: pull out FieldReader from BTTR
[jira] [Commented] (SOLR-5599) SolrCloud Admin UI Cluster Table View
[ https://issues.apache.org/jira/browse/SOLR-5599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998827#comment-13998827 ] Erick Erickson commented on SOLR-5599:
[~steffkes] But you should see the _really cool_ graph I saw of a bunch of nodes in the radial view. It'd make a great t-shirt!

SolrCloud Admin UI Cluster Table View

Key: SOLR-5599
URL: https://issues.apache.org/jira/browse/SOLR-5599
Project: Solr
Issue Type: New Feature
Components: SolrCloud, web gui
Reporter: Mark Miller

A new table view for viewing the cluster layout. Sortable by field would be best. I talked a bit about this with Stefan in Dublin. We really need a better view for large-scale clusters. A sortable table view seems like a great option to have.
[jira] [Commented] (SOLR-6058) Solr needs a new website
[ https://issues.apache.org/jira/browse/SOLR-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998709#comment-13998709 ] Grant Ingersoll commented on SOLR-6058:
[~L_Nino] Thanks for uploading. Next step would be to generate a patch for the current site using the instructions here: http://lucene.apache.org/site-instructions.html Let me know if you need help. I'm going to spin out some sub-tasks that will handle overhauling the content.

Solr needs a new website

Key: SOLR-6058
URL: https://issues.apache.org/jira/browse/SOLR-6058
Project: Solr
Issue Type: Task
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Attachments: HTML.rar

Solr needs a new website: better organization of content, less verbose, more pleasing graphics, etc.
[jira] [Comment Edited] (SOLR-6053) Inconsistent total document count in search results
[ https://issues.apache.org/jira/browse/SOLR-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998705#comment-13998705 ] Ahmet Arslan edited comment on SOLR-6053 at 5/15/14 11:55 AM:
Hi [~wgybzb], can you explain in detail what the problem is here? Is this a SolrJ thing? Is this a solr-cloud setup? Can you update the summary of the ticket in English?

was (Author: iorixxx): Hi [~wanggang], can you explain in detail what the problem is here? Is this a SolrJ thing? Is this a solr-cloud setup? Can you update the summary of the ticket in English?

Inconsistent total document count in search results

Key: SOLR-6053
URL: https://issues.apache.org/jira/browse/SOLR-6053
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 4.7
Environment: Centos
Reporter: wanggang
Priority: Critical
Fix For: 4.7
Original Estimate: 0.05h
Remaining Estimate: 0.05h

http://192.168.3.21:8901/sentiment/search?q=%E6%B2%A5%E9%9D%92%E7%BD%90%E8%B5%B7%E7%81%AB&hlfl=title,content&hlsimple=red&start=0&rows=10
Switching start between different values shows the effect. As my in-depth testing found, if rows=0, the result size is consistently the total sum of the documents on all shards, regardless of whether there are any duplicates. If rows is larger than the number of merged documents that should be returned, numFound is accurate and consistent; however, if rows is smaller than the merged result size, numFound is non-deterministic.
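The behavior described in that last comment can be modeled with a toy example (this is an assumption about the mechanism, not Solr's merge code): if each shard reports its own hit count and duplicates are only eliminated among the documents actually merged, the naive sum over-counts documents indexed on more than one shard.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ShardCountSketch {
    public static void main(String[] args) {
        List<String> shard1 = List.of("a", "b", "c");
        List<String> shard2 = List.of("c", "d"); // "c" is duplicated across shards

        // Sum of per-shard counts, with no deduplication (the rows=0 case).
        int naiveNumFound = shard1.size() + shard2.size();

        // Count after merging and dropping duplicates (the large-rows case).
        Set<String> unique = new HashSet<>(shard1);
        unique.addAll(shard2);

        System.out.println(naiveNumFound + " vs " + unique.size());
    }
}
```

With a rows value in between, which duplicates happen to fall inside the merged window varies per request, matching the reported non-determinism.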
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1576 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1576/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 11064 lines...] [junit4] JVM J0: stderr was not empty, see: /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140515_142109_109.syserr [junit4] JVM J0: stderr (verbatim) [junit4] java(458,0x148f97000) malloc: *** error for object 0x80148f85f20: pointer being freed was not allocated [junit4] *** set a breakpoint in malloc_error_break to debug [junit4] JVM J0: EOF [...truncated 1 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre/bin/java -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=1373EEECA2193851 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.leaveTemporary=false -Dtests.filterstacks=true -Dtests.disableHdfs=true -classpath
[jira] [Updated] (SOLR-6051) Field names beginning with numbers give different and incorrect results depending on placement in URL query
[ https://issues.apache.org/jira/browse/SOLR-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-6051:
Component/s: documentation

Field names beginning with numbers give different and incorrect results depending on placement in URL query

Key: SOLR-6051
URL: https://issues.apache.org/jira/browse/SOLR-6051
Project: Solr
Issue Type: Bug
Components: documentation
Affects Versions: 4.7.2
Environment: CentOS 6+
Reporter: Mark Ebbert
Priority: Minor
Labels: documentation, patch

I've looked all over for specific field name requirements and can't find any official documentation. Is there official documentation on field names? If not, *please* provide some! We created several field names that begin with numbers, but Solr doesn't seem to handle that well. Here are two identical URL queries that produce different output:
{quote} http://our_server:8080/solr/query?q=chr:19%20AND%20pos:1101&fl=chr,pos,ref,alt,1000G_freq,AFR_freq,ASN_freq {quote}
and
{quote} http://our_server:8080/solr/query?q=chr:19%20AND%20pos:1101&fl=chr,pos,ref,alt,AFR_freq,ASN_freq,1000G_freq {quote}
The only difference between the two queries is the location of '1000G_freq' (middle vs. end). The first query does not return the 1000G_freq value but the second does. Additionally, both return a value that does not exist (1000:1000). Seems to be doing something funky with the 1000 in the field name. The 1000:1000 disappears if I remove '1000G_freq' from the query. Here are the outputs from both queries:

h2. Query 1 Results
{code:title=Query 1 Results|borderStyle=solid}
{
  responseHeader:{
    status:0,
    QTime:1,
    params:{
      fl:chr,pos,ref,alt,1000G_freq,AFR_freq,ASN_freq,
      q:chr:19 AND pos:1101}},
  response:{numFound:5,start:0,docs:[
    { chr:19, pos:1101, ref:G, alt:C, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, AFR_freq:0.05, ASN_freq:0.55, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, AFR_freq:0.05, ASN_freq:0.55, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, 1000:1000}]
  }}
{code}
h2. Query 2 Results
{code:title=Query 2 Results|borderStyle=solid}
{
  responseHeader:{
    status:0,
    QTime:0,
    params:{
      fl:chr,pos,ref,alt,AFR_freq,ASN_freq,1000G_freq,
      q:chr:19 AND pos:1101}},
  response:{numFound:5,start:0,docs:[
    { chr:19, pos:1101, ref:G, alt:C, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, 1000G_freq:0.43, AFR_freq:0.05, ASN_freq:0.55, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, 1000G_freq:0.43, AFR_freq:0.05, ASN_freq:0.55, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, 1000:1000},
    { chr:19, pos:1101, ref:G, alt:C, 1000:1000}]
  }}
{code}
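For context on the report above: Solr's documentation recommends that field names consist only of alphanumeric or underscore characters and not start with a digit, which '1000G_freq' violates. A small check for that convention (the class and method names below are invented for illustration):

```java
import java.util.regex.Pattern;

class FieldNameCheck {
    // Recommended Solr field-name shape: letters/digits/underscores,
    // not starting with a digit.
    static final Pattern RECOMMENDED = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean isRecommended(String name) {
        return RECOMMENDED.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isRecommended("AFR_freq"));   // follows the convention
        System.out.println(isRecommended("1000G_freq")); // starts with a digit
    }
}
```

Renaming the field to something like 'freq_1000G' (hypothetical) would satisfy the convention, though the ticket's point stands that the requirement deserves official documentation.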
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999268#comment-13999268 ] ASF subversion and git services commented on LUCENE-5675:
Commit 1595052 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1595052 ]
LUCENE-5675: small cleanups
[jira] [Created] (SOLR-6081) Hook in Videos, Books, Reference work
Grant Ingersoll created SOLR-6081:
Summary: Hook in Videos, Books, Reference work
Key: SOLR-6081
URL: https://issues.apache.org/jira/browse/SOLR-6081
Project: Solr
Issue Type: Sub-task
Reporter: Grant Ingersoll

There is a ton of great content available on Solr on the interwebs; let's make it easy to highlight this content so users know there is a large community ready to assist.
Re: Solr Admin UI and SolrCloud
Stefan: Sure, any time! I'm on -8 UTC (California) now. You're +1 or some such? Just let me know when our schedules overlap. I've put up an umbrella JIRA (SOLR-6082) and made a couple of sub-tasks as a start. Here's the place I'm thinking of starting: follow the SolrCloud getting started guide (see Simple Two-Shard Cluster on the Same Machine in the reference guide), EXCEPT
1. remove the entire collection1 directory before you copy example to example2
2. start your two instances like this:
java -DzkRun -jar start.jar
java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
Now open the cloud link and there are no nodes shown, since there are no collections defined. What's a poor user to do?

On Wed, May 14, 2014 at 3:18 PM, Stefan Matheis matheis.ste...@gmail.com wrote:
Erick, it indeed does need a bit of work. your mail reminded me of your last mail and the collections api :~ especially regarding the whole cloud thing i'm having a hard time, since i'm not really using cloud - not even playing around with it. so i have a vague idea on how it might work for people .. but that probably isn't enough to start working on it. if we could have a short chat about it .. i guess that would help :) otherwise i'll try to read up on the current features and naming relations to get it sorted. just for the record, i've talked with mark at LSR in Dublin last year a bit about some helpful cloud-stuff, which he wrote down in SOLR-5599 - a bit basic perhaps, but might help as well. -Stefan

On Monday, May 12, 2014 at 6:58 PM, Erick Erickson wrote:
The admin UI (and kudos to _everyone_ who made the new version) could use more cloud awareness. There are cluster-wide operations and individual node operations, and they're intermixed at this point. Plus, we make people switch between a UI and the command line to accomplish what they need to. How can we restructure them? And should we? Straw-man proposal follows. NOTE: I have no real attachment to this layout, just looking to generate a discussion!
- split the cluster-wide operations and node-specific stuff into two pages (how to navigate?)
The rest of the points are really for the cloud-specific page:
- add a collections API interface, similar to the core admin bits: creating collections, adding nodes, all that stuff.
- querying should be do-able on a collection basis rather than after you've selected a node on a particular machine
- showing all the nodes on the system, even ones that don't host current shards, would be great
- super, especially wonderful, would be a way to select a node and add a replica right there for a particular shard, with drop-down lists showing available collections, available shards, and even suggesting a name for it and all that kind of thing. ditto with creating new collections. A drop-down listing the available configs would be very cool
- some UI way to upload a config set. How would we keep security issues around allowing file uploads from being a problem?
Thoughts?
[jira] [Commented] (LUCENE-5584) Allow FST read method to also recycle the output value when traversing FST
[ https://issues.apache.org/jira/browse/LUCENE-5584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998560#comment-13998560 ] Christian Ziech commented on LUCENE-5584:
The additional ctor would be a solution as well, yes. We then could keep the FSTs in some cache and use one per thread.

Allow FST read method to also recycle the output value when traversing FST

Key: LUCENE-5584
URL: https://issues.apache.org/jira/browse/LUCENE-5584
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Affects Versions: 4.7.1
Reporter: Christian Ziech
Attachments: fst-itersect-benchmark.tgz

The FST class heavily reuses Arc instances when traversing the FST. The output of an Arc, however, is not reused. This can be especially important when traversing large portions of an FST and using the ByteSequenceOutputs and CharSequenceOutputs. Those classes create a new byte[] or char[] for every node read (which has an output). In our use case we intersect a Lucene Automaton with an FST<BytesRef>, much like it is done in org.apache.lucene.search.suggest.analyzing.FSTUtil.intersectPrefixPaths(), and since the Automaton and the FST are both rather large, tens or even hundreds of thousands of temporary byte array objects are created. One possible solution to the problem would be to change the org.apache.lucene.util.fst.Outputs class to have two additional methods (if you don't want to change the existing methods for compatibility):
{code}
/** Decode an output value previously written with
 *  {@link #write(Object, DataOutput)}, reusing the object
 *  passed in if possible. */
public abstract T read(DataInput in, T reuse) throws IOException;

/** Decode an output value previously written with
 *  {@link #writeFinalOutput(Object, DataOutput)}. By default this
 *  just calls {@link #read(DataInput)}. This tries to reuse the
 *  object passed in if possible. */
public T readFinalOutput(DataInput in, T reuse) throws IOException {
  return read(in, reuse);
}
{code}
The new methods could then be used in the FST in the readNextRealArc() method, passing in the output of the reused Arc. For most inputs they could even just invoke the original read(in) method. If you should decide to make that change I'd be happy to supply a patch and/or tests for the feature.
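The reuse pattern the proposal describes can be sketched in isolation (the types below are invented for the example and are not the Lucene API): a decoder that fills a caller-supplied buffer instead of allocating a new array on every call.

```java
// Hypothetical output holder, standing in for a BytesRef-like object.
class ByteOutput {
    byte[] bytes = new byte[0];
    int length;
}

class ReusingDecoder {
    // Decode 'len' bytes from 'src' at 'pos' into 'reuse', growing its
    // backing array only when it is too small; later calls reuse it.
    static ByteOutput read(byte[] src, int pos, int len, ByteOutput reuse) {
        if (reuse.bytes.length < len) {
            reuse.bytes = new byte[len];
        }
        System.arraycopy(src, pos, reuse.bytes, 0, len);
        reuse.length = len;
        return reuse;
    }
}
```

Across tens of thousands of decode calls, this keeps the allocation count near one instead of one per call, which is the whole point of the ticket.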
Re: Solr Admin UI and SolrCloud
I think there was a discussion/JIRA on moving to AngularJS (or ReactJS?). Maybe this should be a part of that discussion. What is the process for discussing UI changes? Was this a heroic effort by one or two individuals, or was there a subgroup of some sort?

The entire current state of the admin UI represents the work of many, but I believe the overall design and initial implementation was done by Stefan Matheis, in SOLR-2399. The work earned him a committer role in the project. Now he is the owner of the admin UI code. If you are proposing something huge, it would be a good idea to start a discussion here. If it's relatively straightforward, especially if you have a patch, searching for an issue in JIRA and opening a new one if nothing exists already is a good first step.

Thanks,
Shawn
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999174#comment-13999174 ] ASF subversion and git services commented on LUCENE-5675:
Commit 1595026 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1595026 ]
LUCENE-5675: rename
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998922#comment-13998922 ] ASF subversion and git services commented on LUCENE-5675:
Commit 1594985 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1594985 ]
LUCENE-5675: add docs/AndPositionsEnums
[jira] [Updated] (SOLR-6085) Suggester crashes
[ https://issues.apache.org/jira/browse/SOLR-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Ferrández updated SOLR-6085: -- Fix Version/s: 5.0, 4.9, 4.8.1, 4.7.3 Suggester crashes - Key: SOLR-6085 URL: https://issues.apache.org/jira/browse/SOLR-6085 Project: Solr Issue Type: Bug Components: SearchComponents - other Affects Versions: 4.7.1 Reporter: Jorge Ferrández Fix For: 4.7.3, 4.8.1, 4.9, 5.0

The AnalyzingInfixSuggester class fails when it is queried with the ß character (eszett) used in German, but this doesn't happen for all data or for all words containing this character. The exception reported is the following:

{code:xml}
<response>
  <lst name="responseHeader">
    <int name="status">500</int>
    <int name="QTime">18</int>
  </lst>
  <lst name="error">
    <str name="msg">String index out of range: 5</str>
    <str name="trace">java.lang.StringIndexOutOfBoundsException: String index out of range: 5
	at java.lang.String.substring(String.java:1907)
	at org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.addPrefixMatch(AnalyzingInfixSuggester.java:575)
	at org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.highlight(AnalyzingInfixSuggester.java:525)
	at org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.createResults(AnalyzingInfixSuggester.java:479)
	at org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.lookup(AnalyzingInfixSuggester.java:437)
	at org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.lookup(AnalyzingInfixSuggester.java:338)
	at org.apache.solr.spelling.suggest.SolrSuggester.getSuggestions(SolrSuggester.java:181)
	at org.apache.solr.handler.component.SuggestComponent.process(SuggestComponent.java:232)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:217)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:368)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
	at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
	at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:744)</str>
    <int name="code">500</int>
  </lst>
</response>
{code}

This happens with the query http://localhost:8983/solr/suggest_de?suggest.q=gieß (for gießen, which is actually in the data). The problem seems to be that we use ASCIIFolding to unify ss and
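A hedged sketch of the likely failure mode (the method names mirror the stack trace above, but the bodies below are illustrative, not Lucene's actual code): the highlight prefix token comes from the *analyzed* query ("gieß" ASCII-folds to "giess", 5 chars), while the substring is taken from the *surface* form of the matched term (4 chars), so `index + prefix.length()` can run past the end of the surface text — matching the "String index out of range: 5" message.

```java
public class PrefixMatchSketch {

    // Naive version, mirroring the failing substring call in addPrefixMatch.
    static String naiveHighlight(String surface, String analyzedPrefix) {
        // Throws StringIndexOutOfBoundsException when the folded prefix is
        // longer than the surface text ("giess" vs. "gieß").
        return surface.substring(0, analyzedPrefix.length());
    }

    // Defensive version: clamp the end offset to the surface text length.
    static String clampedHighlight(String surface, String analyzedPrefix) {
        int end = Math.min(analyzedPrefix.length(), surface.length());
        return surface.substring(0, end);
    }

    public static void main(String[] args) {
        String surface = "gieß";   // 4 chars in the original text
        String analyzed = "giess"; // 5 chars after ASCII folding
        try {
            naiveHighlight(surface, analyzed);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("naive: " + e.getMessage());
        }
        System.out.println("clamped: " + clampedHighlight(surface, analyzed));
    }
}
```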
[jira] [Updated] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5673: -- Attachment: LUCENE-5673.patch Here is a new patch: - The additional information and resourceDescription are now used on *any* IOException while mapping. - If the cause is an OOM, the new exception does not get a cause anymore, just "Map failed" and the additional info. - In all other cases, the original message is preserved and annotated with our information. The original cause is preserved (initCause on the new exception). - The stack trace of the original exception is preserved. MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see "OutOfMemoryError: Map Failed", but their trigger doesn't fire. That's because in the JDK, when it maps files, it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
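A rough sketch of the re-annotation described in the patch notes above (method and class names are made up for illustration — this is not the attached LUCENE-5673.patch): if the IOException from FileChannel.map wraps an OutOfMemoryError, drop the cause and build a fresh "Map failed" message with the resource description; otherwise, keep the original message and cause and annotate.

```java
import java.io.IOException;

public class MapFailedSketch {

    // Re-wrap an IOException thrown while mmapping, per the rules in the patch notes.
    static IOException annotate(IOException ioe, String resourceDescription) {
        if (ioe.getCause() instanceof OutOfMemoryError) {
            // Deliberately no cause on the new exception, so OOM-triggered
            // JVM hooks don't fire for what is really an mmap failure.
            return new IOException("Map failed: " + resourceDescription
                + " (this may be caused by exhausted virtual address space)");
        }
        // Other I/O failures: preserve message, cause, and stack trace.
        IOException annotated = new IOException(ioe.getMessage() + ": " + resourceDescription);
        annotated.initCause(ioe.getCause());
        annotated.setStackTrace(ioe.getStackTrace());
        return annotated;
    }

    public static void main(String[] args) {
        IOException oomWrapped = new IOException("Map failed", new OutOfMemoryError("Map failed"));
        IOException out = annotate(oomWrapped, "MMapIndexInput(path=...)");
        System.out.println(out.getMessage());
        System.out.println("cause = " + out.getCause()); // null: OOM deliberately dropped
    }
}
```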
[jira] [Closed] (SOLR-6053) Inconsistent total document count in search results
[ https://issues.apache.org/jira/browse/SOLR-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey closed SOLR-6053. -- Resolution: Not a Problem Fix Version/s: (was: 4.7) Trying to read between the lines (because as Ahmet has noted, you haven't really given us anything to go on)... If you have a sharded index and the numFound changes when you run the same query more than once, then it is likely that you have documents with the same uniqueKey field value in more than one shard. Solr assumes that every document across all shards has a unique value in the uniqueKey field. If this is not the case, then Solr cannot guarantee correct results. Solr is smart enough to eliminate duplicates from any results that are returned, but in order for that to happen across the whole index, every document must be considered -- which is why it works properly when rows is larger than numFound. This should have been brought up on the solr-user mailing list, not as an issue in Jira. http://lucene.apache.org/solr/discussion.html Closing as Not a Problem. If further investigation via regular support avenues (like the mailing list or the IRC channel) reveals that there is a bug, we can reopen. 
Inconsistent total document count in search results - Key: SOLR-6053 URL: https://issues.apache.org/jira/browse/SOLR-6053 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.7 Environment: CentOS Reporter: wanggang Priority: Critical Original Estimate: 0.05h Remaining Estimate: 0.05h http://192.168.3.21:8901/sentiment/search?q=%E6%B2%A5%E9%9D%92%E7%BD%90%E8%B5%B7%E7%81%ABhlfl=title,contenthlsimple=redstart=0rows=10 Switch the start parameter between different values to see the effect. As my in-depth testing found: if rows=0, the result size is consistently the total sum of the documents on all shards, regardless of any duplicates; if rows is larger than the expected merged document count, numFound is accurate and consistent; however, if rows is smaller than the expected merged result size, numFound is non-deterministic. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
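Shawn's explanation can be illustrated with a toy model (an assumption for illustration, not Solr's actual distributed-merge code): each shard reports its own hit count, and the coordinator can only subtract duplicates it actually sees among the top `rows` documents pulled back — so numFound is stable only when rows covers all hits.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistribMergeSketch {

    // Toy merge: sum per-shard counts, then subtract only duplicates that
    // appear among each shard's returned top `rows` documents.
    static long mergedNumFound(List<List<String>> shardHits, int rows) {
        long numFound = 0;
        Set<String> seen = new HashSet<>();
        for (List<String> hits : shardHits) {
            numFound += hits.size(); // per-shard count includes duplicates
            for (String id : hits.subList(0, Math.min(rows, hits.size()))) {
                if (!seen.add(id)) {
                    numFound--; // duplicate detected only if it was returned
                }
            }
        }
        return numFound;
    }

    public static void main(String[] args) {
        // The uniqueKey "dup" was indexed on both shards.
        List<List<String>> shards = Arrays.asList(
            Arrays.asList("a", "b", "dup"),
            Arrays.asList("dup", "c"));
        System.out.println(mergedNumFound(shards, 10)); // rows >= all hits: duplicate caught -> 4
        System.out.println(mergedNumFound(shards, 1));  // small rows: duplicate missed -> 5
    }
}
```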
[jira] [Commented] (SOLR-6054) Log progress of transaction log replays
[ https://issues.apache.org/jira/browse/SOLR-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999279#comment-13999279 ] Yonik Seeley commented on SOLR-6054: +1... perhaps log progress every 10 seconds? Log progress of transaction log replays --- Key: SOLR-6054 URL: https://issues.apache.org/jira/browse/SOLR-6054 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shalin Shekhar Mangar Priority: Minor Fix For: 4.9, 5.0 There is zero logging of how a transaction log replay is progressing. We should add some simple checkpoint based progress information. Logging the size of the log file at the beginning would also be useful. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
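A minimal sketch of the checkpoint-based progress logging proposed above, with Yonik's 10-second interval (all names are hypothetical — this is not Solr's UpdateLog code; the clock is injectable so the interval logic can be tested without sleeping):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

public class ReplayProgressSketch {
    static final long INTERVAL_MS = 10_000; // log progress every 10 seconds

    // Replays `totalOps` operations; `clock` supplies milliseconds.
    static List<String> replay(int totalOps, LongSupplier clock) {
        List<String> log = new ArrayList<>();
        log.add("Starting log replay, " + totalOps + " operations in tlog");
        long lastReport = clock.getAsLong();
        for (int op = 1; op <= totalOps; op++) {
            // ... apply the logged update here ...
            long now = clock.getAsLong();
            if (now - lastReport >= INTERVAL_MS) {
                log.add(String.format("Log replay: %d/%d ops (%.0f%%)",
                        op, totalOps, 100.0 * op / totalOps));
                lastReport = now;
            }
        }
        log.add("Log replay finished");
        return log;
    }

    public static void main(String[] args) {
        long[] t = {0};
        // Fake clock: each call advances 4 seconds, so a checkpoint fires every 3 ops.
        List<String> out = replay(6, () -> t[0] += 4_000);
        out.forEach(System.out::println);
    }
}
```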
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5681: --- Attachment: SOLR-5681-2.patch Added a new test that tests that a short running task (OVERSEERSTATUS) fired after a long running SHARDSPLIT returns before the completion of the latter. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. 
The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead i.e. get the 2nd task from the queue when the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
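The proposed peekAfter() can be sketched over a plain in-memory queue (a speculative illustration of the semantics only — the real proposal targets the ZK-backed DistributedQueue, whose details are elided here): like peek(), but it returns the first element after a given marker, letting the dispatcher look past an in-flight head task without removing it.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class PeekAfterSketch {

    // Returns the first element after `lastElement` without removing anything;
    // a null marker degenerates to plain peek(). Returns null past the end.
    static <T> T peekAfter(Deque<T> queue, T lastElement) {
        boolean found = (lastElement == null);
        for (T item : queue) {
            if (found) {
                return item;
            }
            if (item.equals(lastElement)) {
                found = true;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Deque<String> work = new ArrayDeque<>(
            Arrays.asList("splitshard-coll1", "createcollection-coll2"));
        // Head task is running; look past it to find the next dispatchable task.
        System.out.println(peekAfter(work, null));
        System.out.println(peekAfter(work, "splitshard-coll1"));
    }
}
```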
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0_20-ea-b11) - Build # 10148 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10148/ Java: 32bit/jdk1.8.0_20-ea-b11 -server -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 20627 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:92: The following files contain @author tags, tabs or nocommits: * solr/core/src/test/org/apache/solr/handler/TestReplicationHandler.java Total time: 69 minutes 14 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.8.0_20-ea-b11 -server -XX:+UseConcMarkSweepGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999799#comment-13999799 ] Robert Muir commented on LUCENE-5673: - +1 ! this is going to save a lot of headaches. MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in java (not MMapDir), but i think we shoudl do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesnt fire. Thats because in the jdk, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. if it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our filechannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6058) Solr needs a new website
[ https://issues.apache.org/jira/browse/SOLR-6058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999806#comment-13999806 ] Jan Høydahl commented on SOLR-6058: --- Great contribution. At the bottom on the page you have a list of Solr Users with selected logos. How should those be selected or contributed? In one way the web page must be true to the non-commercial, neutral nature of being an Apache project. But on the other side our users are commercial and after having been sold on the nice web site and having tried Solr themselves, the next thing they look for on the site is Anyone I trust using it? and then Where can I find books, consulting, training and support?. Today we expose some books, but list of users is well hidden in https://wiki.apache.org/solr/PublicServers and Solr professionals also hide quite well in the oddly named wiki page https://wiki.apache.org/solr/Support - not very well maintained nor beautiful pages and horribly difficult to find. Could we fix this on the new home page e.g. with some sub-items under the Community dropdown menu? Instead of a boring outdated bullet list, we could have list of info cards, each one presenting a success-story or consulting firm based on some predefined content items and size-limit. Then perhaps the logo of 5 random users from that list could be presented on the front-page? Is this accepted by Apache rules? How to maintain? Solr needs a new website Key: SOLR-6058 URL: https://issues.apache.org/jira/browse/SOLR-6058 Project: Solr Issue Type: Task Reporter: Grant Ingersoll Assignee: Grant Ingersoll Attachments: HTML.rar Solr needs a new website: better organization of content, less verbose, more pleasing graphics, etc. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1543 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1543/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 10922 lines...] [junit4] JVM J0: stdout was not empty, see: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140516_003131_360.sysout [junit4] JVM J0: stdout (verbatim) [junit4] # [junit4] # A fatal error has been detected by the Java Runtime Environment: [junit4] # [junit4] # SIGSEGV (0xb) at pc=0x00010530e25b, pid=289, tid=19251 [junit4] # [junit4] # JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13) [junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode bsd-amd64 compressed oops) [junit4] # Problematic frame: [junit4] # C [libjava.dylib+0x925b] JNU_NewStringPlatform+0x1d3 [junit4] # [junit4] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try ulimit -c unlimited before starting Java again [junit4] # [junit4] # An error report file with more information is saved as: [junit4] # /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/J0/hs_err_pid289.log [junit4] # [junit4] # If you would like to submit a bug report, please visit: [junit4] # http://bugreport.sun.com/bugreport/crash.jsp [junit4] # The crash happened outside the Java Virtual Machine in native code. [junit4] # See problematic frame for where to report the bug. [junit4] # [junit4] JVM J0: EOF [...truncated 1 lines...] 
[junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre/bin/java -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=8953B5A95A523D42 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=4.9 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=4.9-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.leaveTemporary=false -Dtests.filterstacks=true -Dtests.disableHdfs=true -classpath
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999718#comment-13999718 ] Uwe Schindler commented on LUCENE-5673: --- The attached patch should bring only the Map failed. But in any case we can also hardcode the text, so we can remove the getMessage(). I just wanted to preserve the original message. The OutOfMemoryError comes from the wrapped exception, but is not part of the message (see FileChannelImpl of Java 7): {{throw new IOException(Map failed, oome);}}. My code takes the message of the IOException (Map failed), ignores the cause and adds more information like resourceDescription and the hint why it failed. I was thinking about the problem a bit more, we should always add the resource description, so have 2 exception reformats: - Change IOExceptions with OOM wrapped to have a hard-coded text Map failed: resourceDescription (this may be caused...) - All other IOExceptions maybe get the resourceDescription just appended? I am not sure about this, which is a more general issue of adding resourceDescription to all IOExceptions our DirectoryImpls throw. MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in java (not MMapDir), but i think we shoudl do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesnt fire. Thats because in the jdk, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. if it fails a second time it wraps the OOM in a generic IOException. 
I think we should add a try/catch to our filechannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6027) Replica assignments should try to take the host name into account so all replicas don't end up on the same host
[ https://issues.apache.org/jira/browse/SOLR-6027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998530#comment-13998530 ] Noble Paul commented on SOLR-6027: -- hi Mark, it would be great if you can post your patch (in whatever form) Replica assignments should try to take the host name into account so all replicas don't end up on the same host --- Key: SOLR-6027 URL: https://issues.apache.org/jira/browse/SOLR-6027 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Timothy Potter Priority: Minor Attachments: SOLR-6027.patch I have 18 SolrCloud nodes distributed across 3 Ec2 instances, so 6 per instance. One of my collections was created with all replicas landing on different SolrCloud nodes on the same instance. Ideally, SolrCloud would be a little smarter and ensure that at least one of the replicas was on one of the other hosts. shard4: { http://ec2-??-??-??-239.compute-1.amazonaws.com:8988/solr/med_collection_shard4_replica1/ LEADER http://ec2-??-??-??-239.compute-1.amazonaws.com:8984/solr/med_collection_shard4_replica3/ http://ec2-??-??-??-239.compute-1.amazonaws.com:8985/solr/med_collection_shard4_replica2/ } I marked this as minor for now as it could be argued that I shouldn't be running that many Solr nodes per instance, but I'm seeing plenty of installs that are using higher-end instance types / server hardware and then running multiple Solr nodes per host. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998573#comment-13998573 ] Anshum Gupta commented on SOLR-5681: [~noble.paul] Thanks for taking a look at it, but seems like you looked at an older patch. DistributedQueue.peekTopN issues are already fixed in the latest patch. I'll make changes to never return a null from peekTopN. I'll also change markTaskAsRunning objects final. shardHandlerOCP is also already removed in the last patch. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. 
The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead i.e. get the 2nd task from the queue when the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5674) A new token filter: SubSequence
Nitzan Shaked created LUCENE-5674: - Summary: A new token filter: SubSequence Key: LUCENE-5674 URL: https://issues.apache.org/jira/browse/LUCENE-5674 Project: Lucene - Core Issue Type: Improvement Components: core/other Reporter: Nitzan Shaked Priority: Minor Attachments: 0001-SubSeqFilter.patch A new configurable token filter which, given a token breaks it into sub-parts and outputs consecutive sub-sequences of those sub-parts. Useful for, for example, using during indexing to generate variations on domain names, so that www.google.com can be found by searching for google.com, or www.google.com. Parameters: sepRegexp: A regular expression used split incoming tokens into sub-parts. glue: A string used to concatenate sub-parts together when creating sub-sequences. minLen: Minimum length (in sub-parts) of output sub-sequences maxLen: Maximum length (in sub-parts) of output sub-sequences (0 for unlimited; negative numbers for token length in sub-parts minus specified length) anchor: Anchor.START to output only prefixes, or Anchor.END to output only suffixes, or Anchor.NONE to output any sub-sequence withOriginal: whether to output also the original token Note: tests will follow, I currently believe BaseTokenStreamTestCase is broken/mis-designed for such use cases, I will open a separate Jira for that. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
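The core of the proposed filter can be sketched as plain string manipulation (parameter names follow the issue description; the TokenFilter/attribute plumbing and the anchor/withOriginal options are elided, so this is an illustration, not the attached patch): split the token on sepRegexp, then emit every consecutive run of parts re-joined with glue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubSequenceSketch {

    // Emits all consecutive sub-sequences of the split token, minLen..maxLen
    // parts long (maxLen <= 0 means unlimited), joined with `glue`.
    static List<String> subSequences(String token, String sepRegexp, String glue,
                                     int minLen, int maxLen) {
        String[] parts = token.split(sepRegexp);
        int limit = (maxLen <= 0) ? parts.length : maxLen;
        List<String> out = new ArrayList<>();
        for (int start = 0; start < parts.length; start++) {
            for (int len = minLen; len <= limit && start + len <= parts.length; len++) {
                out.add(String.join(glue, Arrays.copyOfRange(parts, start, start + len)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "www.google.com" indexed this way is findable by "google.com" etc.
        System.out.println(subSequences("www.google.com", "\\.", ".", 1, 0));
    }
}
```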
[jira] [Created] (SOLR-6079) First week Docs
Grant Ingersoll created SOLR-6079: - Summary: First week Docs Key: SOLR-6079 URL: https://issues.apache.org/jira/browse/SOLR-6079 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Over the course of the week, we want to highlight the things users need to think about as they go from novice to production. The goal here is to provide just enough info along the way about the underpinnings of Solr while staying focused on the user's data and queries. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998539#comment-13998539 ] Noble Paul edited comment on SOLR-5681 at 5/15/14 7:42 AM: --- bq.There are unrelated changes in OCP.prioritizeOverseerNodes I made those changes. I would commit it to trunk anyway because this logging was unnecessary. you can remove it was (Author: noble.paul): bq.There are unrelated changes in OCP.prioritizeOverseerNodes I made those changes. I would commit it to trunk anyway because this logging was unnecessary. In this case this change was required because OCPTest would fail w/o that change . Just keep it around Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. 
An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead i.e. get the 2nd task from the queue when the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6082) Umbrella JIRA for Admin UI and SolrCloud.
Erick Erickson created SOLR-6082: Summary: Umbrella JIRA for Admin UI and SolrCloud. Key: SOLR-6082 URL: https://issues.apache.org/jira/browse/SOLR-6082 Project: Solr Issue Type: Improvement Components: web gui Affects Versions: 4.9, 5.0 Reporter: Erick Erickson Assignee: Stefan Matheis (steffkes) It would be very helpful if the admin UI were more cloud friendly. This is an umbrella JIRA so we can collect sub-tasks as necessary. I think there might be scattered JIRAs about this, let's link them in as we find them. [~steffkes] - I've taken the liberty of assigning it to you since you expressed some interest. Feel free to assign it back if you want... Let's imagine that a user has a cluster with _no_ collections assigned and start from there. Here's a simple way to set this up. Basically you follow the reference guide tutorial but _don't_ define a collection. 1 completely delete the collection1 directory from example 2 cp -r example example2 3 in example, execute java -DzkRun -jar start.jar 4 in example2, execute java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar Now the cloud link appears. If you expand the tree view, you see the two live nodes. But, there's nothing in the graph view, no cores are selectable, etc. First problem (need to solve before any sub-jiras, so including it here): You have to push a configuration directory to ZK. [~thetapi] The _last_ time Stefan and I started allowing files to be written to Solr from the UI it was...unfortunate. I'm assuming that there's something similar here. That is, we shouldn't allow pushing the Solr config _to_ ZooKeeper through the Admin UI, where they'd be distributed to all the solr nodes. Is that true? If this is a security issue, we can keep pushing the config dirs to ZK a manual step for now... Once we determine how to get configurations up, we can work on the various sub-jiras. 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998711#comment-13998711 ] Uwe Schindler commented on LUCENE-5673: --- Thanks Robert! I already improved the error message in my patch to also include the file name and the original message of the OOM. We can improve that. The message already contains some hints, but no direct references to sysctl, because this is too OS-specific. Of course we could add a switch statement with constants giving some hints per OS. The logic in the patch is ready; we can now just improve the error message. MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see "OutOfMemoryError: Map Failed", but their trigger doesn't fire. That's because in the JDK, when it maps files, it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6076) Using Solr with Windchill 10.0 and there is a need to index one single document
Priyanka Jadhav created SOLR-6076: - Summary: Using Solr with Windchill 10.0 and there is a need to index one single document Key: SOLR-6076 URL: https://issues.apache.org/jira/browse/SOLR-6076 Project: Solr Issue Type: Wish Reporter: Priyanka Jadhav Priority: Minor
[jira] [Updated] (SOLR-6075) CoreAdminHandler should synchronize while adding a task to the tracking map
[ https://issues.apache.org/jira/browse/SOLR-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-6075: Attachment: SOLR-6075.patch This patch removes the extra synchronization and makes the requestStatusMap non-static. Since the map was already synchronized, there is no thread safety issue: the map is never cleared, and addTask(String map, TaskObject o, boolean limit) is the only method which removes entries from it. Therefore this isn't a bug and there is no urgent need to fix it in 4.8.1.
CoreAdminHandler should synchronize while adding a task to the tracking map --- Key: SOLR-6075 URL: https://issues.apache.org/jira/browse/SOLR-6075 Project: Solr Issue Type: Bug Reporter: Anshum Gupta Assignee: Anshum Gupta Priority: Critical Fix For: 4.9, 5.0 Attachments: SOLR-6075.patch, SOLR-6075.patch
CoreAdminHandler should synchronize on the tracker maps when adding a task. It's a rather nasty bug and we should get this in ASAP.
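For background, the synchronization question above boils down to a standard pattern: a map wrapped with Collections.synchronizedMap() makes individual operations atomic, but a compound "evict oldest if full, then put" sequence still needs one lock around both steps. A simplified sketch (class name, method shape, and the size limit are invented for illustration, not Solr's actual code):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: single map operations are already atomic via synchronizedMap(),
// but the check-then-act eviction below must hold the map's own mutex so
// two threads cannot race between the size check and the put.
public class TaskTracker {
    private static final int MAX_TRACKED = 100; // illustrative limit

    // LinkedHashMap preserves insertion order, so the first key is the oldest.
    private final Map<String, Object> running =
        Collections.synchronizedMap(new LinkedHashMap<>());

    public void addTask(String id, Object task) {
        // synchronizedMap() documents that manual locking on the wrapper is
        // required for compound operations; we use the same mutex here.
        synchronized (running) {
            if (running.size() >= MAX_TRACKED) {
                String oldest = running.keySet().iterator().next();
                running.remove(oldest); // evict oldest tracked task
            }
            running.put(id, task);
        }
    }

    public int size() {
        return running.size();
    }
}
```

The same mutex trick is why the patch can drop "extra" synchronization elsewhere: any single get/put is already safe, and only the compound eviction needs the explicit block.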
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999744#comment-13999744 ] Robert Muir commented on LUCENE-5673: - OK, now we just need to add the practical advice to the message...
CONSTANTS.32BIT: get a new computer
CONSTANTS.WINDOWS: get a new operating system
CONSTANTS.LINUX: please review 'ulimit -v' and 'sysctl vm.max_map_count'
MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
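A sketch of what such a per-OS hint switch could look like. The class, method, and message texts here are invented for illustration and are not Lucene's actual API; they just encode the three cases listed above:

```java
// Illustrative sketch: choose a practical hint to append to the
// mmap-failure message, based on the JVM's reported OS and data model.
// None of these names exist in Lucene itself.
public class MapFailureHints {
    public static String hintFor(String osName, boolean is64Bit) {
        if (!is64Bit) {
            // 32-bit address space is the limit, nothing to tune
            return "consider running on a 64-bit JVM with more address space";
        }
        String os = osName.toLowerCase(java.util.Locale.ROOT);
        if (os.contains("windows")) {
            return "Windows limits the contiguous address space available for mapping";
        } else if (os.contains("linux")) {
            return "review 'ulimit -v' and 'sysctl vm.max_map_count'";
        }
        return "check the operating system's limits on memory-mapped files";
    }

    public static void main(String[] args) {
        // Print the hint that would apply to the current JVM (assumes 64-bit).
        System.out.println(hintFor(System.getProperty("os.name"), true));
    }
}
```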
[jira] [Commented] (SOLR-5473) Make one state.json per collection
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998695#comment-13998695 ] Noble Paul commented on SOLR-5473: -- bq.If we want to make such a distinction in the code, The external collection references are gone but for some internal method level variables. Make one state.json per collection -- Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Fix For: 5.0 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999830#comment-13999830 ] ASF subversion and git services commented on LUCENE-5673: - Commit 1595214 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1595214 ] Merged revision(s) 1595213 from lucene/dev/trunk: LUCENE-5673: MMapDirectory: Work around a bug in the JDK that throws a confusing OutOfMemoryError wrapped inside IOException if the FileChannel mapping failed because of lack of virtual address space. The IOException is rethrown with more useful information about the problem, omitting the incorrect OutOfMemoryError
MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Resolved] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-5673. --- Resolution: Fixed Reopen if we backport on respin!
MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998706#comment-13998706 ] Robert Muir commented on LUCENE-5673: - I think it's a good start, but we need to be careful here. First of all, my problem is with the OutOfMemoryError text. I do still think the message should start with Map failed instead of Memory mapping failed. We want users to be able to google the error and still find some assistance. If we are going to offer more explanation in addition to that, it would be good to try to add practical stuff: e.g. mention 'ulimit' and 'sysctl vm.max_map_count' and so on.
MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Commented] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999826#comment-13999826 ] ASF subversion and git services commented on LUCENE-5673: - Commit 1595213 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1595213 ] LUCENE-5673: MMapDirectory: Work around a bug in the JDK that throws a confusing OutOfMemoryError wrapped inside IOException if the FileChannel mapping failed because of lack of virtual address space. The IOException is rethrown with more useful information about the problem, omitting the incorrect OutOfMemoryError
MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998863#comment-13998863 ] Michael McCandless commented on LUCENE-5675: +1
ID postings format Key: LUCENE-5675 URL: https://issues.apache.org/jira/browse/LUCENE-5675 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir
Today the primary key lookup in Lucene is not that great for systems like Solr and Elasticsearch that have versioning in front of IndexWriter. To some extent, BlockTree can sometimes help avoid seeks by telling you the term does not exist for a segment. But this technique (based on the FST prefix) is fragile. The only other choice today is bloom filters, which use up huge amounts of memory. I don't think we are using everything we know: particularly the version semantics. Instead, if the FST for the terms index used an algebra that represents the max version for any subtree, we might be able to answer that there is no term T with version V in that segment very efficiently. Also, ID fields don't need postings lists, and they don't need stats like docfreq/totaltermfreq, etc.; this stuff is all implicit. As far as the API goes, I think for users to provide IDs with versions to such a PF, a start would be to set a payload or whatever on the term field to get it through IndexWriter to the codec. And a consumer of the codec can just cast the Terms to a subclass that exposes the FST to do this version check efficiently.
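The max-version pruning idea above can be sketched with a toy per-segment structure. All names here are invented for illustration; the real proposal folds the max version into the terms-index FST itself rather than storing a single per-segment number:

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of version pruning: each "segment" remembers the maximum
// version of any ID it contains, so a lookup for (id, wantedVersion) can
// skip a whole segment without any term-dictionary seek when its recorded
// max version is too low. That seek-free "no" answer is the point.
public class VersionedIdIndex {
    static class Segment {
        final Map<String, Long> idToVersion = new HashMap<>();
        long maxVersion = Long.MIN_VALUE;

        void add(String id, long version) {
            idToVersion.put(id, version);
            maxVersion = Math.max(maxVersion, version);
        }
    }

    // Does any segment contain `id` with a version >= wanted?
    static boolean exists(Iterable<Segment> segments, String id, long wanted) {
        for (Segment seg : segments) {
            if (seg.maxVersion < wanted) {
                continue; // pruned: nothing in this segment can match
            }
            Long v = seg.idToVersion.get(id); // stands in for a term lookup
            if (v != null && v >= wanted) {
                return true;
            }
        }
        return false;
    }
}
```

The FST-algebra version generalizes this from one number per segment to one number per subtree of the terms index, so pruning happens during the term walk itself.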
[jira] [Created] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
Robert Muir created LUCENE-5673: --- Summary: MmapDirectory shouldn't pass along OOM wrapped as IOException Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Commented] (SOLR-6057) Duplicate background-color in #content #analysis #analysis-result .match (analysis.css)
[ https://issues.apache.org/jira/browse/SOLR-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999291#comment-13999291 ] Shalin Shekhar Mangar commented on SOLR-6057: - In my experience, the current highlighting color does not show up well on projector screens, which makes it hard to demonstrate the analysis screen in Solr workshops :)
Duplicate background-color in #content #analysis #analysis-result .match (analysis.css) --- Key: SOLR-6057 URL: https://issues.apache.org/jira/browse/SOLR-6057 Project: Solr Issue Type: Bug Reporter: Al Krinker Priority: Trivial
Inside of solr/webapp/web/css/styles/analysis.css, you can find the #content #analysis #analysis-result .match element with the following content: #content #analysis #analysis-result .match { background-color: #e9eff7; background-color: #f2f2ff; } background-color is listed twice. Also, it was very hard for me to see the highlight. I recommend changing it to background-color: #FF;
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998736#comment-13998736 ] Da Huang commented on LUCENE-4396: -- Thanks for your suggestions! {quote} maybe we could test on fewer terms, for the Low/HighAndManyLow/High tasks? I think it's more common to have a handful (3-5 maybe) of terms. {quote} When terms are few, BooleanNovelScorer performs slower than BS (about -10%). However, I have to generate tasks with fewer terms and rerun the tasks to reconfirm the specific perf. difference. {quote} But maybe keep your current category and rename it to Tons instead of Many? {quote} OK, I will do so. {quote} Maybe we can improve the test so that it exercises BS and NBS? E.g., toggle the require docs in order via a custom collector? {quote} Yes, I think that's a good idea. {quote} Hmm do we know why the scores changed? {quote} Yes, it's because the order of score accumulation differs. BS adds up the scores of all SHOULD clauses, and then adds their sum to the final score. BNS adds the score of each SHOULD clause to the final score one by one. {quote} Are we comparing BS2 to NovelBS? {quote} Yes. {quote} I think BS and BS2 already have different scores today? {quote} Yes. Actually, the score accumulation order of BS is the same as BNS. {quote} but you commented this out in your patch in order to test NBS I guess? {quote} Yes, I did that in order to test BNS. Otherwise, luceneutil would throw an exception. {quote} Do you have any perf results of BS w/ required clauses (as a BulkScorer) vs BS2 (what trunk does today)? {quote} Hmm, I haven't carried out such an experiment yet. Checking the perf. results of BS vs BS2 is a good idea. I will do that. :)
BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, luceneutil-score-equal.patch
Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. If there is one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have very low hit count compared to the other clauses, BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST, so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as a proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, e.g. if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near-term time to work on this, so feel free to take it if you are inspired!
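The score drift discussed above is ordinary floating-point non-associativity: summing the SHOULD scores first and then adding the sum can give a slightly different float than accumulating clause by clause. A tiny standalone demonstration (the values are arbitrary, chosen only to make the rounding visible):

```java
// Demonstrates that float addition is not associative, which is why two
// boolean scorers that add the same clause scores in a different order can
// report slightly different final scores.
public class FloatOrderDemo {
    public static void main(String[] args) {
        float a = 1e8f, b = -1e8f, c = 1f;
        // Grouping b + c first: the 1f is lost to rounding at 1e8 magnitude.
        float sumFirst = a + (b + c);
        // Accumulating left to right: a + b cancels exactly, then + c survives.
        float oneByOne = (a + b) + c;
        System.out.println(sumFirst + " vs " + oneByOne);
    }
}
```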
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998973#comment-13998973 ] ASF subversion and git services commented on LUCENE-5675: - Commit 1594991 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1594991 ] LUCENE-5675: move BlockTree* under its own package
ID postings format Key: LUCENE-5675 URL: https://issues.apache.org/jira/browse/LUCENE-5675 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir
Today the primary key lookup in Lucene is not that great for systems like Solr and Elasticsearch that have versioning in front of IndexWriter. To some extent, BlockTree can sometimes help avoid seeks by telling you the term does not exist for a segment. But this technique (based on the FST prefix) is fragile. The only other choice today is bloom filters, which use up huge amounts of memory. I don't think we are using everything we know: particularly the version semantics. Instead, if the FST for the terms index used an algebra that represents the max version for any subtree, we might be able to answer that there is no term T with version V in that segment very efficiently. Also, ID fields don't need postings lists, and they don't need stats like docfreq/totaltermfreq, etc.; this stuff is all implicit. As far as the API goes, I think for users to provide IDs with versions to such a PF, a start would be to set a payload or whatever on the term field to get it through IndexWriter to the codec. And a consumer of the codec can just cast the Terms to a subclass that exposes the FST to do this version check efficiently.
[jira] [Created] (SOLR-6086) Replica active during Warming
ludovic Boutros created SOLR-6086: - Summary: Replica active during Warming Key: SOLR-6086 URL: https://issues.apache.org/jira/browse/SOLR-6086 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Reporter: ludovic Boutros
At least with Solr 4.6.1, replicas are considered active during the warming process. This means that if you restart a replica or create a new one, queries will be sent to it and will hang until the end of the warming process (if cold searchers are not used). You cannot add or restart a node silently anymore. I think that the fact that the replica is active is not a bad thing. But the HttpShardHandler and the CloudSolrServer class should take the warming process into account. Currently, I have developed a new, very simple component which checks that a searcher is registered. I am also developing custom HttpShardHandler and CloudSolrServer classes which will check the warming process in addition to the ACTIVE status in the cluster state. This seems to be more a workaround than a solution, but that's all I can do in this version.
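The workaround described could look roughly like this: besides the ACTIVE state from the cluster, consult a per-replica "is a searcher registered?" answer and route only to replicas that have finished warming. All names below are invented for illustration; in the real setup the boolean would come from the custom component mentioned above, not from a prebuilt map:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical routing filter sketching the workaround: a replica must be
// both ACTIVE in the cluster state AND report a registered searcher before
// queries are sent to it. Names and map-based inputs are illustrative only.
public class WarmAwareRouter {
    public static List<String> routableReplicas(
            Map<String, String> replicaState,          // url -> "active", "down", ...
            Map<String, Boolean> searcherRegistered) { // url -> finished warming?
        return replicaState.entrySet().stream()
            .filter(e -> "active".equals(e.getValue()))
            .filter(e -> searcherRegistered.getOrDefault(e.getKey(), false))
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }
}
```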
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998678#comment-13998678 ] Uwe Schindler commented on LUCENE-5673: --- Hi, this is indeed a problem. The existence of the OOM somewhere in the stack trace confuses users, because whenever they see OOM, they start to raise -Xmx and by that make the problem worse. The behaviour is correct from the API, because the javadocs of FileChannel.map specify that it throws IOException when mapping fails. But the implementation in OpenJDK is bullshit. We should do something like this:
{code:java}
... catch (IOException ioe) {
  if (ioe.getCause() instanceof OutOfMemoryError) {
    // rethrow without the OOM as root cause!!!
    throw new IOException("Memory mapping failed. There might not be enough unfragmented address space available to mmap the index file: " + ioe.getCause().getMessage());
  }
  throw ioe;
}
{code}
By that the user just gets a good IOException not referring to OOM. Indeed the OOM is the real bug, because it's caused by running out of address space and not out of memory :-)
MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch
The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time, it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5681: - Attachment: SOLR-5681-2.patch [~shalinmangar] your comments are taken care of in distributed queue Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. 
A queue is not the right data structure in the first place for looking ahead, i.e. getting the 2nd task from the queue while the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management:
* Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses peekAfter(last element) instead of peek(). peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is that as soon as a task from this queue is picked up for processing by a thread, it's removed from the queue. At the end, the cleanup is done from the workQueue.
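The peekAfter() proposal can be illustrated with a toy in-memory version. Class and method names are invented, and the real queue is a ZooKeeper distributed queue rather than a List; this only shows the look-ahead semantics:

```java
import java.util.List;

// Toy sketch of peekAfter(): return the task that follows a previously seen
// one, so the parent thread can hand out the 2nd task to a worker while the
// 1st is still running, without removing either from the queue.
public class LookAheadQueue {
    private final List<String> tasks;

    public LookAheadQueue(List<String> tasks) {
        this.tasks = tasks;
    }

    // Plain peek() semantics when lastSeen is null; otherwise the element
    // immediately after lastSeen, or null if there is none.
    public String peekAfter(String lastSeen) {
        if (lastSeen == null) {
            return tasks.isEmpty() ? null : tasks.get(0);
        }
        int i = tasks.indexOf(lastSeen);
        return (i >= 0 && i + 1 < tasks.size()) ? tasks.get(i + 1) : null;
    }
}
```

Because nothing is removed until a task completes, a failed Overseer's successor still finds every unfinished task at the head of the queue and can retry it.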
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5681: --- Attachment: SOLR-5681-2.patch Another patch, integrates the patch for SOLR-6075. Will remove this before committing once that goes into trunk. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. 
A queue is not the right data structure in the first place for looking ahead, i.e. getting the 2nd task from the queue while the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management:
* Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses peekAfter(last element) instead of peek(). peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is that as soon as a task from this queue is picked up for processing by a thread, it's removed from the queue. At the end, the cleanup is done from the workQueue.
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5681: --- Attachment: SOLR-5681-2.patch Patch that makes a few vars final (tracking related). Also, completedTasks is no longer a synchronizedHashMap but the synchronization is self-managed. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. 
A queue is not the right data structure for looking ahead, i.e. fetching the 2nd task from the queue while the 1st one is still in process. Deleting tasks that are not at the head of a queue is also not an 'intuitive' operation. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting the request and passing it to a new thread (or one from the pool). The parent method uses peekAfter(last element) instead of peek(); peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is that as soon as a task from this queue is picked up for processing by a thread, it is removed from that queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
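The peekAfter() proposal above can be sketched with a plain in-memory deque. This is an illustrative toy, not the ZK DistributedQueue implementation: the Task class, its fields, and the runningCollections set are assumptions made up for the example. It shows look-ahead past an in-flight task while enforcing one-task-per-collection mutual exclusivity, plus cleanup of a task that is no longer at the head:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

public class PeekAfterQueue {
    public static class Task {
        public final String id;
        public final String collection;
        public Task(String id, String collection) {
            this.id = id;
            this.collection = collection;
        }
    }

    private final Deque<Task> workQueue = new ArrayDeque<>();

    public void offer(Task t) { workQueue.addLast(t); }

    /**
     * Returns the first task strictly after lastId (or the head if lastId is
     * null) whose collection has no task in flight; null if none qualifies.
     */
    public Task peekAfter(String lastId, Set<String> runningCollections) {
        boolean seenLast = (lastId == null);
        for (Task t : workQueue) {
            if (!seenLast) {
                if (t.id.equals(lastId)) seenLast = true;
                continue;
            }
            if (!runningCollections.contains(t.collection)) return t;
        }
        return null;
    }

    /** Cleanup: remove a completed task wherever it sits in the queue. */
    public boolean remove(String id) {
        return workQueue.removeIf(t -> t.id.equals(id));
    }
}
```

Note that a real implementation over ZooKeeper would additionally have to keep the removal idempotent, since a new Overseer may re-consume a task that was completed but not yet cleaned up.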
[jira] [Updated] (SOLR-6076) Using Solr with Windchill 10.0 and there is a need to index one single document
[ https://issues.apache.org/jira/browse/SOLR-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Priyanka Jadhav updated SOLR-6076: -- Description: Is there any functionality to index a single document Using Solr with Windchill 10.0 and there is a need to index one single document Key: SOLR-6076 URL: https://issues.apache.org/jira/browse/SOLR-6076 Project: Solr Issue Type: Wish Reporter: Priyanka Jadhav Priority: Minor Labels: performance Is there any functionality to index a single document -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999615#comment-13999615 ] Varun Thacker commented on SOLR-5285: - [~Raveendra] It won't be part of the 4.8.* releases. There is no particular timeline for the 4.9 release but it could be out within a month. Solr response format should support child Docs -- Key: SOLR-5285 URL: https://issues.apache.org/jira/browse/SOLR-5285 Project: Solr Issue Type: New Feature Reporter: Varun Thacker Fix For: 4.9, 5.0 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, javabin_backcompat_child_docs.bin Solr has added support for taking childDocs as input (only XML till now). It's currently used for BlockJoinQuery. I feel that if a user indexes a document with child docs, even if he isn't using the BJQ features and is just searching which results in a hit on the parentDoc, its childDocs should be returned in the response format. [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would be the place to add childDocs to the response. Now given a docId one needs to find out all the childDoc ids. A couple of approaches which I could think of are 1. Maintain the relation between a parentDoc and its childDocs during indexing time in maybe a separate index? 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - Given a parentDoc it finds out all the childDocs but this requires a childScorer. Am I missing something obvious on how to find the relation between a parentDoc and its childDocs, because none of the above solutions look right. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
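The parent/child relation discussed above falls out of block indexing: children are indexed immediately before their parent, so given a bitset marking parent docIDs, the children of parent p are exactly the docIDs between the previous parent and p. The sketch below illustrates only that arithmetic; it is not the Lucene API (which uses a parent BitSet from a filter, not a boolean[]), and the class and method names are made up for the example:

```java
public class BlockJoinChildren {
    /**
     * parents[d] == true iff docID d is a parent document.
     * Returns the docIDs of parentDoc's children, in docID order.
     * Assumes block indexing: children immediately precede their parent.
     */
    public static int[] childIdsOf(boolean[] parents, int parentDoc) {
        if (!parents[parentDoc]) {
            throw new IllegalArgumentException("docID " + parentDoc + " is not a parent");
        }
        // Scan back to the previous parent; children live in between.
        int prevParent = parentDoc - 1;
        while (prevParent >= 0 && !parents[prevParent]) prevParent--;
        int first = prevParent + 1;
        int n = parentDoc - first;
        int[] children = new int[n];
        for (int i = 0; i < n; i++) children[i] = first + i;
        return children;
    }
}
```

For example, with the block layout [child, child, parent, child, parent, parent], the first parent (docID 2) owns docIDs 0 and 1, the second (docID 4) owns docID 3, and the third (docID 5) has no children. A DocTransformer could use exactly this range to load child documents into the response.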
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 85326 - Failure!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/85326/ 5 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2 Error Message: background merge hit exception: _0(5.0):C2/1:delGen=1 _1(5.0):C2/1:delGen=1 _2(5.0):C2/1:delGen=1 into _1d Stack Trace: java.io.IOException: background merge hit exception: _0(5.0):C2/1:delGen=1 _1(5.0):C2/1:delGen=1 _2(5.0):C2/1:delGen=1 into _1d at __randomizedtesting.SeedInfo.seed([54EC7EB684C6F5CB:F269191A6AE4C502]:0) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1807) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1847) at org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2(TestIndexWriterMerging.java:238) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at java.lang.Thread.run(Thread.java:745) REGRESSION: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes Error Message: background merge hit exception: _0(5.0):c2/1:delGen=1 into _5 Stack Trace: java.io.IOException:
[jira] [Commented] (LUCENE-5666) Add UninvertingReader
[ https://issues.apache.org/jira/browse/LUCENE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999871#comment-13999871 ] ASF subversion and git services commented on LUCENE-5666: - Commit 1595228 from [~rcmuir] in branch 'dev/branches/lucene5666' [ https://svn.apache.org/r1595228 ] LUCENE-5666: merge trunk Add UninvertingReader - Key: LUCENE-5666 URL: https://issues.apache.org/jira/browse/LUCENE-5666 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Fix For: 5.0 Attachments: LUCENE-5666.patch Currently the fieldcache is not pluggable at all. It would be better if everything used the docvalues apis. This would allow people to customize the implementation, extend the classes with custom subclasses with additional stuff, etc etc. FieldCache can be accessed via the docvalues apis, using the FilterReader api. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.7.0_60-ea-b15) - Build # 3956 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/3956/ Java: 32bit/jdk1.7.0_60-ea-b15 -server -XX:+UseConcMarkSweepGC 4 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestBugInSomething.testUnicodeShinglesAndNgrams Error Message: Test abandoned because suite timeout was reached. Stack Trace: java.lang.Exception: Test abandoned because suite timeout was reached. at __randomizedtesting.SeedInfo.seed([910A4664FDCA418A]:0) FAILED: junit.framework.TestSuite.org.apache.lucene.analysis.core.TestBugInSomething Error Message: Suite timeout exceeded (= 720 msec). Stack Trace: java.lang.Exception: Suite timeout exceeded (= 720 msec). at __randomizedtesting.SeedInfo.seed([910A4664FDCA418A]:0) FAILED: junit.framework.TestSuite.org.apache.lucene.analysis.core.TestBugInSomething Error Message: Captured an uncaught exception in thread: Thread[id=1003, name=Thread-759, state=RUNNABLE, group=TGRP-TestBugInSomething] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1003, name=Thread-759, state=RUNNABLE, group=TGRP-TestBugInSomething] Caused by: java.lang.AssertionError: actual mem: 36606768 byte, expected mem: 35383744 byte, flush mem: 16053992, active mem: 20552776, pending DWPT: 1, flushing DWPT: 0, blocked DWPT: 0, peakDelta mem: 1829312 byte at __randomizedtesting.SeedInfo.seed([910A4664FDCA418A]:0) at org.apache.lucene.index.DocumentsWriterFlushControl.assertMemory(DocumentsWriterFlushControl.java:127) at org.apache.lucene.index.DocumentsWriterFlushControl.doAfterDocument(DocumentsWriterFlushControl.java:194) at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:471) at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1541) at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211) at org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:148) at 
org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:109) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:622) at org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:61) at org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:483) FAILED: junit.framework.TestSuite.org.apache.lucene.analysis.core.TestBugInSomething Error Message: Captured an uncaught exception in thread: Thread[id=1004, name=Thread-760, state=RUNNABLE, group=TGRP-TestBugInSomething] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1004, name=Thread-760, state=RUNNABLE, group=TGRP-TestBugInSomething] Caused by: java.lang.AssertionError: actual mem: 38149188 byte, expected mem: 37213056 byte, flush mem: 25479100, active mem: 12670088, pending DWPT: 0, flushing DWPT: 2, blocked DWPT: 0, peakDelta mem: 1829312 byte at __randomizedtesting.SeedInfo.seed([910A4664FDCA418A]:0) at org.apache.lucene.index.DocumentsWriterFlushControl.assertMemory(DocumentsWriterFlushControl.java:127) at org.apache.lucene.index.DocumentsWriterFlushControl.doAfterDocument(DocumentsWriterFlushControl.java:194) at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:433) at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1310) at org.apache.lucene.index.IndexWriter.addDocuments(IndexWriter.java:1271) at org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:119) at org.apache.lucene.index.RandomIndexWriter.addDocument(RandomIndexWriter.java:109) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:622) at org.apache.lucene.analysis.BaseTokenStreamTestCase.access$000(BaseTokenStreamTestCase.java:61) at 
org.apache.lucene.analysis.BaseTokenStreamTestCase$AnalysisThread.run(BaseTokenStreamTestCase.java:483) Build Log: [...truncated 5629 lines...] [junit4] Suite: org.apache.lucene.analysis.core.TestBugInSomething [junit4] 2 TEST FAIL: useCharFilter=true text='\u4e5e pqyxwp i. ' [junit4] 2 Mai 16, 2014 5:08:10 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException [junit4] 2 ADVERTÊNCIA: Uncaught exception in thread: Thread[Thread-759,5,TGRP-TestBugInSomething] [junit4] 2 java.lang.AssertionError: actual mem: 36606768 byte, expected mem: 35383744 byte, flush mem: 16053992, active mem: 20552776, pending DWPT: 1, flushing DWPT: 0, blocked DWPT: 0, peakDelta mem: 1829312 byte [junit4] 2at __randomizedtesting.SeedInfo.seed([910A4664FDCA418A]:0) [junit4] 2at
[jira] [Commented] (LUCENE-5675) ID postings format
[ https://issues.apache.org/jira/browse/LUCENE-5675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999878#comment-13999878 ] ASF subversion and git services commented on LUCENE-5675: - Commit 1595229 from [~mikemccand] in branch 'dev/branches/lucene5675' [ https://svn.apache.org/r1595229 ] LUCENE-5675: add testRandom; sometimes fails ID postings format Key: LUCENE-5675 URL: https://issues.apache.org/jira/browse/LUCENE-5675 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Today the primary key lookup in lucene is not that great for systems like solr and elasticsearch that have versioning in front of IndexWriter. To some extend BlockTree can sometimes help avoid seeks by telling you the term does not exist for a segment. But this technique (based on FST prefix) is fragile. The only other choice today is bloom filters, which use up huge amounts of memory. I don't think we are using everything we know: particularly the version semantics. Instead, if the FST for the terms index used an algebra that represents the max version for any subtree, we might be able to answer that there is no term T with version V in that segment very efficiently. Also ID fields dont need postings lists, they dont need stats like docfreq/totaltermfreq, etc this stuff is all implicit. As far as API, i think for users to provide IDs with versions to such a PF, a start would to set a payload or whatever on the term field to get it thru indexwriter to the codec. And a consumer of the codec can just cast the Terms to a subclass that exposes the FST to do this version check efficiently. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
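The "max version per subtree" algebra described above can be illustrated with a toy character trie. This is not the LUCENE-5675 implementation (which targets an FST-based terms index); it is only a sketch of the pruning idea, and all names here are made up. Each node carries the maximum version of any term beneath it, so a lookup for (id, version) can answer "no term with version >= V in this subtree" without descending:

```java
import java.util.HashMap;
import java.util.Map;

public class VersionedTrie {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        long maxVersion = -1;   // max version anywhere in this subtree
        long version = -1;      // version at this exact term, -1 if none
    }

    private final Node root = new Node();

    public void put(String id, long version) {
        Node n = root;
        n.maxVersion = Math.max(n.maxVersion, version);
        for (char c : id.toCharArray()) {
            n = n.children.computeIfAbsent(c, k -> new Node());
            n.maxVersion = Math.max(n.maxVersion, version);
        }
        n.version = Math.max(n.version, version);
    }

    /** True iff a term equal to id exists with version >= minVersion. */
    public boolean mightHave(String id, long minVersion) {
        Node n = root;
        for (char c : id.toCharArray()) {
            if (n.maxVersion < minVersion) return false; // prune the subtree
            n = n.children.get(c);
            if (n == null) return false;                 // term absent
        }
        return n.version >= minVersion;
    }
}
```

The point for versioned primary-key lookup is the early return: a segment whose subtree max is below the candidate version can be skipped without a term seek, which is what bloom filters currently approximate at a much higher memory cost.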
[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 21653 - Still Failing!
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/21653/ 5 tests failed. FAILED: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2 Error Message: background merge hit exception: _0(4.9):c2/1:delGen=1 _1(4.9):c2/1:delGen=1 _2(4.9):c2/1:delGen=1 into _1d Stack Trace: java.io.IOException: background merge hit exception: _0(4.9):c2/1:delGen=1 _1(4.9):c2/1:delGen=1 _2(4.9):c2/1:delGen=1 into _1d at __randomizedtesting.SeedInfo.seed([982BE583FBB537FD:3EAE822F15970734]:0) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1915) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1955) at org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2(TestIndexWriterMerging.java:237) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at java.lang.Thread.run(Thread.java:745) FAILED: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes Error Message: background merge hit exception: _0(4.9):C2/1:delGen=1 into _5 Stack Trace: java.io.IOException: background
[jira] [Updated] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5673: -- Component/s: core/store Affects Version/s: 4.8 Fix Version/s: 5.0 4.9 Assignee: Uwe Schindler Summary: MMapDirectory shouldn't pass along OOM wrapped as IOException (was: MmapDirectory shouldn't pass along OOM wrapped as IOException) MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed, but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our filechannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
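The direction described in the issue can be sketched as below. This is an illustration of the unwrapping idea, not the actual LUCENE-5673 patch (the class and method names are made up): if the JDK has wrapped the OutOfMemoryError from FileChannel.map into a generic IOException, surface the OOM again so JVM-level OOM triggers still fire:

```java
import java.io.IOException;

public class MapHelper {
    /**
     * If ioe wraps the "Map failed" OutOfMemoryError that the JDK catches
     * and re-wraps during FileChannel.map retries, rethrow the original OOM;
     * otherwise propagate the IOException unchanged.
     */
    public static void rethrowMapFailure(IOException ioe) throws IOException {
        if (ioe.getCause() instanceof OutOfMemoryError) {
            // Errors are unchecked, so this compiles despite "throws IOException".
            throw (OutOfMemoryError) ioe.getCause();
        }
        throw ioe;
    }
}
```

A caller would wrap its FileChannel.map invocation in a try/catch and route the IOException through this helper, so an -XX:OnOutOfMemoryError handler configured by the user actually observes the mapping failure.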
[jira] [Comment Edited] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999807#comment-13999807 ] Uwe Schindler edited comment on LUCENE-5673 at 5/16/14 1:55 PM: Should I backport to 4.8.1? This does not break logic, it just cleans up the exception, so there is no risk of breaking something. was (Author: thetaphi): Should I backport to 4.8.1? MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed, but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our filechannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 85327 - Still Failing!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/85327/ 5 tests failed. FAILED: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes Error Message: background merge hit exception: _0(5.0):C2/1:delGen=1 into _5 Stack Trace: java.io.IOException: background merge hit exception: _0(5.0):C2/1:delGen=1 into _5 at __randomizedtesting.SeedInfo.seed([C2D758E55A7692F:3E5D25286A2DE900]:0) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1807) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1847) at org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes(TestIndexWriterMerging.java:171) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at java.lang.Thread.run(Thread.java:745) FAILED: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2 Error Message: background merge hit exception: _0(5.0):C2/1:delGen=1 _1(5.0):C2/1:delGen=1 _2(5.0):C2/1:delGen=1 into _1d Stack Trace: java.io.IOException: background merge hit exception: _0(5.0):C2/1:delGen=1
[jira] [Commented] (LUCENE-5650) createTempDir and associated functions no longer create java.io.tmpdir
[ https://issues.apache.org/jira/browse/LUCENE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999864#comment-13999864 ] Dawid Weiss commented on LUCENE-5650: - Awesome, thanks Ryan! I'll rerun the tests in the evening and then commit. createTempDir and associated functions no longer create java.io.tmpdir -- Key: LUCENE-5650 URL: https://issues.apache.org/jira/browse/LUCENE-5650 Project: Lucene - Core Issue Type: Improvement Components: general/test Reporter: Ryan Ernst Assignee: Dawid Weiss Priority: Minor Fix For: 4.9, 5.0 Attachments: LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch The recent refactoring to all the create temp file/dir functions (which is great!) has a minor regression from what existed before. With the old {{LuceneTestCase.TEMP_DIR}}, the directory was created if it did not exist. So, if you set {{java.io.tmpdir}} to {{./temp}}, then it would create that dir within the per jvm working dir. However, {{getBaseTempDirForClass()}} now does asserts that check the dir exists, is a dir, and is writeable. Lucene uses {{.}} as {{java.io.tmpdir}}. Then in the test security manager, the per jvm cwd has read/write/execute permissions. However, this allows tests to write to their cwd, which I'm trying to protect against (by setting cwd to read/execute in my test security manager). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
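The pre-refactoring behavior the issue asks to restore can be sketched as follows. This is illustrative, not the LUCENE-5650 patch; the class and method names are made up. The key difference from the current getBaseTempDirForClass() asserts is that the base dir is created if absent, so setting java.io.tmpdir to a relative path like ./temp just works:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TempDirHelper {
    /**
     * Resolve java.io.tmpdir (possibly a relative path like "./temp")
     * against the per-JVM working directory and create it if it does not
     * exist, mirroring the old LuceneTestCase.TEMP_DIR behavior.
     */
    public static Path baseTempDir() throws IOException {
        Path base = Paths.get(System.getProperty("java.io.tmpdir", "."))
                         .toAbsolutePath().normalize();
        // Old behavior: create rather than assert existence.
        Files.createDirectories(base);
        if (!Files.isDirectory(base) || !Files.isWritable(base)) {
            throw new IOException("temp dir not usable: " + base);
        }
        return base;
    }
}
```

Note this sketch sidesteps, rather than solves, the security-manager tension described above: creating the dir requires write permission on its parent, which is exactly what a read/execute-only cwd policy forbids.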
[jira] [Commented] (SOLR-6053) Total number of searched documents is inconsistent
[ https://issues.apache.org/jira/browse/SOLR-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998705#comment-13998705 ] Ahmet Arslan commented on SOLR-6053: Hi [~wanggang] can you explain in detail what the problem is here? Is this a SolrJ thing? Is this a solr-cloud setup? Can you update the summary of the ticket in English? Total number of searched documents is inconsistent - Key: SOLR-6053 URL: https://issues.apache.org/jira/browse/SOLR-6053 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.7 Environment: Centos Reporter: wanggang Priority: Critical Fix For: 4.7 Original Estimate: 0.05h Remaining Estimate: 0.05h http://192.168.3.21:8901/sentiment/search?q=%E6%B2%A5%E9%9D%92%E7%BD%90%E8%B5%B7%E7%81%ABhlfl=title,contenthlsimple=redstart=0rows=10 (switching the start parameter to different values shows the effect) My in-depth testing found that if rows=0, the result size is consistently the total sum of the documents on all shards, regardless of whether there are any duplicates; if rows is larger than the expected merged document count, the returned numFound is accurate and consistent; however, if rows is smaller than the expected merged result size, numFound is non-deterministic. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
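The rows-dependent behavior reported above can be modeled with a small toy merge. This is a hypothetical sketch, not Solr's actual distributed-merge code: the coordinator only sees the top `rows` ids from each shard, so it can only subtract duplicates that fall inside that fetched window.

```java
import java.util.*;

// Toy model of a distributed merge (illustrative; not Solr's merge code).
// Each shard reports its hit count and its top `rows` ids; numFound is the
// raw sum minus whatever duplicates are visible in the fetched windows.
class ShardMergeSketch {
    static long mergedNumFound(List<List<String>> shardHits, int rows) {
        long total = 0;
        for (List<String> hits : shardHits) total += hits.size();
        if (rows == 0) return total;            // nothing fetched: raw sum
        Set<String> seen = new HashSet<>();
        long dupes = 0;
        for (List<String> hits : shardHits) {
            for (String id : hits.subList(0, Math.min(rows, hits.size()))) {
                if (!seen.add(id)) dupes++;     // duplicate visible in window
            }
        }
        return total - dupes;
    }
}
```

With shard hits [a,b] and [b,c]: rows=0 gives 4 (raw sum), rows=2 gives the accurate 3 (the duplicate b is inside both windows), but rows=1 gives 4 again because the windows [a] and [b] never expose the duplicate, matching the inconsistency described in the ticket.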
[JENKINS] Lucene-4x-Linux-Java7-64-test-only - Build # 21652 - Failure!
Build: builds.flonkings.com/job/Lucene-4x-Linux-Java7-64-test-only/21652/ 5 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes Error Message: background merge hit exception: _0(4.9):c2/1:delGen=1 Stack Trace: java.io.IOException: background merge hit exception: _0(4.9):c2/1:delGen=1 at __randomizedtesting.SeedInfo.seed([9D5B5E4655F102B:3BA5E5425AD59004]:0) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1915) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1955) at org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes(TestIndexWriterMerging.java:170) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at java.lang.Thread.run(Thread.java:745) REGRESSION: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2 Error Message: background merge hit exception: _0(4.9):C2/1:delGen=1 _1(4.9):C2/1:delGen=1 _2(4.9):C2/1:delGen=1 into _1d Stack Trace: java.io.IOException: background merge hit exception: _0(4.9):C2/1:delGen=1
[jira] [Commented] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998844#comment-13998844 ] Raveendra Yerraguntl commented on SOLR-5285: Thanks Varun and all. Just in time. Is it possible to get this into the 4.8.* versions? It would reduce a lot of UI work. If not, when is 4.9 scheduled for a stable release? Solr response format should support child Docs -- Key: SOLR-5285 URL: https://issues.apache.org/jira/browse/SOLR-5285 Project: Solr Issue Type: New Feature Reporter: Varun Thacker Fix For: 4.9, 5.0 Attachments: SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, SOLR-5285.patch, javabin_backcompat_child_docs.bin Solr has added support for taking childDocs as input ( only XML till now ). It's currently used for BlockJoinQuery. I feel that if a user indexes a document with child docs, even if he isn't using the BJQ features and is just searching which results in a hit on the parentDoc, its childDocs should be returned in the response format. [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would be the place to add childDocs to the response. Now given a docId one needs to find out all the childDoc ids. A couple of approaches which I could think of are 1. Maintain the relation between a parentDoc and its childDocs during indexing time in maybe a separate index? 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - Given a parentDoc it finds out all the childDocs but this requires a childScorer. Am I missing something obvious on how to find the relation between a parentDoc and its childDocs because none of the above solutions for this look right. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1311#comment-1311 ] Anshum Gupta commented on SOLR-5681: I would like to move ahead with committing this patch if I don't receive any feedback soon. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. 
A queue is not the right data structure in the first place to look ahead i.e. get the 2nd task from the queue when the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
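The peekAfter() lookahead proposed above can be sketched in plain Java, with a TreeMap standing in for the ZK queue's ordered children. All names here are illustrative, not Solr's DistributedQueue API: the point is only that a sorted-id structure lets the parent thread step past an in-flight head task and delete arbitrary ids on completion.

```java
import java.util.*;

// Hypothetical sketch of the proposed peekAfter(): return the first task
// whose id sorts strictly after `lastId`, so the parent can look past a
// task that is already being processed. Not Solr's actual queue code.
class TaskQueueSketch {
    private final TreeMap<String, String> tasks = new TreeMap<>(); // id -> payload

    void offer(String id, String payload) { tasks.put(id, payload); }

    // peekAfter(null) behaves like peek(); otherwise skip ids <= lastId.
    Map.Entry<String, String> peekAfter(String lastId) {
        return lastId == null ? tasks.firstEntry() : tasks.higherEntry(lastId);
    }

    void remove(String id) { tasks.remove(id); } // cleanup on completion
}
```

This also shows why deletion away from the head stops being awkward: completion just removes the task's id, wherever it sits in the ordering.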
[jira] [Commented] (LUCENE-5656) IndexWriter leaks CFS handles in some exceptional cases
[ https://issues.apache.org/jira/browse/LUCENE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992685#comment-13992685 ] ASF subversion and git services commented on LUCENE-5656: - Commit 1593241 from [~rcmuir] in branch 'dev/branches/lucene_solr_4_8' [ https://svn.apache.org/r1593241 ] LUCENE-5656: don't leak dv producers if one of them throws exception IndexWriter leaks CFS handles in some exceptional cases --- Key: LUCENE-5656 URL: https://issues.apache.org/jira/browse/LUCENE-5656 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.8.1 Reporter: Robert Muir Fix For: 4.8.1, 4.9, 5.0 in trunk: ant test -Dtestcase=TestIndexWriterOutOfMemory -Dtests.method=testBasics -Dtests.seed=3D485DE153FCA22D -Dtests.nightly=true -Dtests.locale=no_NO -Dtests.timezone=CAT -Dtests.file.encoding=US-ASCII Seems to happen when an exception is thrown here: {noformat} [junit4] 1 java.lang.OutOfMemoryError: Fake OutOfMemoryError [junit4] 1 at org.apache.lucene.index.TestIndexWriterOutOfMemory$2.eval(TestIndexWriterOutOfMemory.java:117) [junit4] 1 at org.apache.lucene.store.MockDirectoryWrapper.maybeThrowDeterministicException(MockDirectoryWrapper.java:888) [junit4] 1 at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:575) [junit4] 1 at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:107) [junit4] 1 at org.apache.lucene.codecs.lucene45.Lucene45DocValuesProducer.init(Lucene45DocValuesProducer.java:84) [junit4] 1 at org.apache.lucene.codecs.lucene45.Lucene45DocValuesFormat.fieldsProducer(Lucene45DocValuesFormat.java:178) [junit4] 1 at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsReader.init(PerFieldDocValuesFormat.java:232) [junit4] 1 at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat.fieldsProducer(PerFieldDocValuesFormat.java:324) [junit4] 1 at org.apache.lucene.index.SegmentDocValues.newDocValuesProducer(SegmentDocValues.java:51) [junit4] 1 at 
org.apache.lucene.index.SegmentDocValues.getDocValuesProducer(SegmentDocValues.java:68) [junit4] 1 at org.apache.lucene.index.SegmentReader.initDocValuesProducers(SegmentReader.java:189) [junit4] 1 at org.apache.lucene.index.SegmentReader.init(SegmentReader.java:166) [junit4] 1 at org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:553) [junit4] 1 at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:230) [junit4] 1 at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3086) [junit4] 1 at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3077) [junit4] 1 at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2791) [junit4] 1 at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2940) [junit4] 1 at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2907) {noformat} and the leak is from here: {noformat} [junit4] Caused by: java.lang.RuntimeException: unclosed IndexInput: _0_Asserting_0.dvd [junit4] at org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:560) [junit4] at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:604) [junit4] at org.apache.lucene.codecs.lucene45.Lucene45DocValuesProducer.init(Lucene45DocValuesProducer.java:116) [junit4] at org.apache.lucene.codecs.lucene45.Lucene45DocValuesFormat.fieldsProducer(Lucene45DocValuesFormat.java:178) [junit4] at org.apache.lucene.codecs.asserting.AssertingDocValuesFormat.fieldsProducer(AssertingDocValuesFormat.java:61) [junit4] at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsReader.init(PerFieldDocValuesFormat.java:232) [junit4] at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat.fieldsProducer(PerFieldDocValuesFormat.java:324) [junit4] at org.apache.lucene.index.SegmentDocValues.newDocValuesProducer(SegmentDocValues.java:51) [junit4] at 
org.apache.lucene.index.SegmentDocValues.getDocValuesProducer(SegmentDocValues.java:68) [junit4] at org.apache.lucene.index.SegmentReader.initDocValuesProducers(SegmentReader.java:189) [junit4] at org.apache.lucene.index.SegmentReader.init(SegmentReader.java:116) [junit4] at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:133)
[jira] [Created] (SOLR-6083) Provide a way to list configurationsets in SolrCloud from the admin screen.
Erick Erickson created SOLR-6083: Summary: Provide a way to list configurationsets in SolrCloud from the admin screen. Key: SOLR-6083 URL: https://issues.apache.org/jira/browse/SOLR-6083 Project: Solr Issue Type: Improvement Reporter: Erick Erickson subtask of SOLR-6082. Set up a cluster with no collections (i.e. don't use the bootstrap convention of the getting started guide). Push a configuration set up to Solr via command-line. It would be nice to show a list of the available configuration sets. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999807#comment-13999807 ] Uwe Schindler commented on LUCENE-5673: --- Should I backport to 4.8.1? MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our filechannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
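The issue description can be made concrete: the JDK's FileChannel.map retries once after an OutOfMemoryError and then surfaces only a generic IOException wrapping the OOM. Below is a hedged sketch of detect-and-rewrap logic for that case; it is illustrative only, not the committed Lucene patch (which is in the attached files), and the message text is made up for the example.

```java
import java.io.IOException;

// Illustrative sketch: if the IOException coming back from a map attempt
// wraps an OutOfMemoryError, replace it with a more descriptive error so
// the real cause isn't hidden. Not Lucene's actual MMapDirectory code.
class MapErrorSketch {
    static IOException convertMapFailed(IOException ioe, long chunkSize) {
        if (ioe.getCause() instanceof OutOfMemoryError) {
            return new IOException("Map failed: likely not enough unfragmented "
                + "virtual address space to map a chunk of " + chunkSize
                + " bytes; use MMapDirectory only on 64-bit platforms", ioe);
        }
        return ioe; // unrelated I/O failure: pass through unchanged
    }
}
```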
[jira] [Reopened] (LUCENE-5644) ThreadAffinityDocumentsWriterThreadPool should clear the bindings on flush
[ https://issues.apache.org/jira/browse/LUCENE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-5644: Reopening ... I realized the simple LIFO logic is too simple, right after a flush. When that happens, we should try to pick a ThreadState that's already initialized, so that if no full-flushing (getReader) is happening, we don't tie up RAM indefinitely in the pending segments. ThreadAffinityDocumentsWriterThreadPool should clear the bindings on flush -- Key: LUCENE-5644 URL: https://issues.apache.org/jira/browse/LUCENE-5644 Project: Lucene - Core Issue Type: Bug Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.8.1, 4.9, 5.0 Attachments: LUCENE-5644.patch, LUCENE-5644.patch, LUCENE-5644.patch, LUCENE-5644.patch, LUCENE-5644.patch This class remembers which thread used which DWPT, but it never clears this affinity. It really should clear it on flush, this way if the number of threads doing indexing has changed we only use as many DWPTs as there are incoming threads. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6080) Getting Finished Docs
Grant Ingersoll created SOLR-6080: - Summary: Getting Finished Docs Key: SOLR-6080 URL: https://issues.apache.org/jira/browse/SOLR-6080 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll It's one thing to be easy to start; it's a whole other level to get finished, and getting to production and being stable is one of Solr's strongest suits, thanks to its maturity and testing. This section of the website should highlight what users need to know about getting to production and maintaining a happy cluster. This can dovetail with the Ref Guide's Well Configured Solr section -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998700#comment-13998700 ] Shalin Shekhar Mangar commented on SOLR-5681: - More comments: # DistributedQueue.peekTopN should count stats in the same way as peek() does by using “peekN_wait_forever” and “peekN_wait_” + wait. # DistributedQueue.peekTopN is still not correct. Suppose orderedChildren returns 0 nodes, the childWatcher.await will be called, thread will wait and immediately return 0 results even if children were available. So there was no point in waiting at all if we were going to return 0 results. # The same thing happens later in DQ.peekTopN after the loop. There’s no point in calling await if we’re going to return null anyway. {code} childWatcher.await(wait == Long.MAX_VALUE ? DEFAULT_TIMEOUT : wait); waitedEnough = wait != Long.MAX_VALUE; if (waitedEnough) { return null; } {code} # The DQ.getTailId (renamed from getLastElementId) still has an empty catch block for KeeperException. # We should probably add unit test for the DQ.peekTopN method. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single threaded i.e submitting anything long running would have it block processing of other mutually exclusive tasks. 
When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby, not processing a create collection task (which would stay queued in zk) though both the tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead i.e. get the 2nd task from the queue when the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is, as soon as a task from this is picked up for processing by the thread, it's removed from the queue. At the end, the cleanup is done from the workQueue. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
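The await concern in review comments #2 and #3 above amounts to: after the child watcher fires, the code should loop back and re-read the children rather than returning an empty result. A minimal plain-Java sketch of that loop shape follows; the ZooKeeper calls are replaced by stand-ins (an iterator of successive reads for orderedChildren(), a sleep-free retry for childWatcher.await), and it is not the actual patch.

```java
import java.util.*;

// Loop shape for peekTopN: an empty read means "wait, then RE-READ",
// never "wait, then return nothing". `reads` stands in for successive
// orderedChildren() calls; `maxRetries` stands in for the wait deadline.
class PeekTopNSketch {
    static List<String> peekTopN(Iterator<List<String>> reads, int n, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries && reads.hasNext(); attempt++) {
            List<String> children = reads.next();   // re-read the queue
            if (!children.isEmpty())
                return children.subList(0, Math.min(n, children.size()));
            // empty: a real impl would block on childWatcher.await(...)
            // here, then fall through and re-read on the next iteration
        }
        return List.of(); // deadline passed with nothing available
    }
}
```

The buggy shape the reviewer describes would return after the first await even though children had arrived in the meantime; the loop above returns them instead.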
[jira] [Commented] (SOLR-6062) Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than being combined in a single dismax query)
[ https://issues.apache.org/jira/browse/SOLR-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999275#comment-13999275 ] Michael Dodsworth commented on SOLR-6062: - all comments and feedback welcome. Phrase queries are created for each field supplied through edismax's pf, pf2 and pf3 parameters (rather than being combined in a single dismax query) - Key: SOLR-6062 URL: https://issues.apache.org/jira/browse/SOLR-6062 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 4.0 Reporter: Michael Dodsworth Priority: Minor Attachments: combined-phrased-dismax.patch https://issues.apache.org/jira/browse/SOLR-2058 subtly changed how phrase queries, created through the pf, pf2 and pf3 parameters, are merged into the main user query. For the query 'term1 term2' with pf2:[field1, field2, field3] we now get (omitting the non-phrase-query section for clarity): {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field2:term1 term2^1.0)~0.1) DisjunctionMaxQuery((field3:term1 term2^1.0)~0.1) {code} Prior to this change, we had: {code:java} main query DisjunctionMaxQuery((field1:term1 term2^1.0 | field2:term1 term2^1.0 | field3:term1 term2^1.0)~0.1) {code} The upshot is that if the phrase query term1 term2 appears in multiple fields, it will get a significant boost over the previous implementation. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
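The size of the boost difference falls out of DisjunctionMaxQuery's scoring rule, score = max(clause scores) + tie * sum(other clause scores), where the `~0.1` in the quoted queries is the tie value. A toy calculation (an illustrative sketch of that formula, not Lucene code) for a phrase matching all three fields with a per-field score of 1.0:

```java
// Toy model of DisjunctionMaxQuery scoring:
//   score = max(clauses) + tie * (sum(clauses) - max(clauses))
class DismaxScoreSketch {
    static double dismax(double tie, double... clauses) {
        double max = 0, sum = 0;
        for (double c : clauses) { max = Math.max(max, c); sum += c; }
        return max + tie * (sum - max);
    }
}
```

The old single three-clause dismax scores 1 + 0.1 * 2 = 1.2, while the new form (three single-clause dismax queries whose scores are summed into the main query) scores 3 * 1.0 = 3.0, which is the "significant boost" described in the report.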
[jira] [Updated] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5673: -- Attachment: (was: LUCENE-5673.patch) MmapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5673.patch, LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see OutOfMemoryError: Map Failed: but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our filechannel.map -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-trunk-Linux-Java7-64-test-only - Build # 85328 - Still Failing!
Build: builds.flonkings.com/job/Lucene-trunk-Linux-Java7-64-test-only/85328/ 5 tests failed. FAILED: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes Error Message: background merge hit exception: _0(5.0):C2/1:delGen=1 into _5 Stack Trace: java.io.IOException: background merge hit exception: _0(5.0):C2/1:delGen=1 into _5 at __randomizedtesting.SeedInfo.seed([77E3C2EF41ED5F6B:459392497E67DF44]:0) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1807) at org.apache.lucene.index.IndexWriter.forceMergeDeletes(IndexWriter.java:1847) at org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes(TestIndexWriterMerging.java:171) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:793) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:453) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:360) at java.lang.Thread.run(Thread.java:745) FAILED: org.apache.lucene.index.TestIndexWriterMerging.testForceMergeDeletes2 Error Message: background merge hit exception: _0(5.0):c2/1:delGen=1 _1(5.0):c2/1:delGen=1 _2(5.0):c2/1:delGen=1 into _1d Stack Trace: java.io.IOException: background merge hit exception: _0(5.0):c2/1:delGen=1
[jira] [Comment Edited] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999323#comment-13999323 ]

Anshum Gupta edited comment on SOLR-5681 at 5/15/14 10:55 PM:
--

Added a new test that verifies that a short-running task (OVERSEERSTATUS) fired after a long-running SHARDSPLIT returns before the completion of the latter. Also moved invokeCollectionApi() from the OverseerStatus test to the parent AbstractFullDistribZkTestBase, as that method is useful for other tests too.

was (Author: anshumg): Added a new test that tests that a short running task (OVERSEERSTATUS) fired after a long running SHARDSPLIT returns before the completion of the latter.

Make the OverseerCollectionProcessor multi-threaded
---
Key: SOLR-5681
URL: https://issues.apache.org/jira/browse/SOLR-5681
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch

Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting anything long-running would have it block processing of other mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby not process a create-collection task (which would stay queued in ZK) though both the tasks are mutually exclusive.

Here are a few of the challenges:
* Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time.
* ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead, i.e. get the 2nd task from the queue when the 1st one is in process. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing.

Proposed solutions for task management:
* Task funnel and peekAfter(): The parent thread is responsible for getting and passing the request to a new thread (or one from the pool). The parent method uses a peekAfter(last element) instead of a peek(). The peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is that as soon as a task from this queue is picked up for processing by a thread, it is removed from the queue. At the end, the cleanup is done from the workQueue.

--
This message was sent by Atlassian JIRA (v6.2#6252)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999776#comment-13999776 ]

Uwe Schindler commented on LUCENE-5673:
---

I tried the latest patch with Linux 32 bit and an {{ulimit -v 1_000_000}}:

{noformat}
[junit4] ERROR   132s | Test4GBStoredFields.test
[junit4] Throwable #1: java.io.IOException: Map failed: MMapIndexInput(path=/media/sf_Host/Projects/lucene/trunk-lusolr1/lucene/build/core/test/J0/lucene.index.Test4GBStoredFields-C16129C282E2746E-001/4GBStoredFields-001/_0.fdt) [this may be caused by lack of enough unfragmented virtual address space or too restrictive virtual memory limits enforced by the operating system, preventing us to map a chunk of 268435456 bytes. MMapDirectory should only be used on 64bit platforms, because the address space on 32bit operating systems is too small. More information: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html]
[junit4]    at __randomizedtesting.SeedInfo.seed([C16129C282E2746E:493516182C1E1996]:0)
[junit4]    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
[junit4]    at org.apache.lucene.store.MMapDirectory.map(MMapDirectory.java:271)
[junit4]    at org.apache.lucene.store.MMapDirectory$MMapIndexInput.<init>(MMapDirectory.java:221)
[junit4]    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:196)
[junit4]    at org.apache.lucene.store.Directory.copy(Directory.java:187)
[junit4]    at org.apache.lucene.store.MockDirectoryWrapper.copy(MockDirectoryWrapper.java:947)
[junit4]    at org.apache.lucene.store.TrackingDirectoryWrapper.copy(TrackingDirectoryWrapper.java:50)
[junit4]    at org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:4504)
[junit4]    at org.apache.lucene.index.DocumentsWriterPerThread.sealFlushedSegment(DocumentsWriterPerThread.java:485)
[junit4]    at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:452)
[junit4]    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:518)
[junit4]    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:629)
[junit4]    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3042)
[junit4]    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3018)
[junit4]    at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1671)
[junit4]    at org.apache.lucene.index.IndexWriter.forceMerge(IndexWriter.java:1647)
[junit4]    at org.apache.lucene.index.Test4GBStoredFields.test(Test4GBStoredFields.java:83)
[junit4]    at java.lang.Thread.run(Thread.java:744)
{noformat}

MmapDirectory shouldn't pass along OOM wrapped as IOException
---
Key: LUCENE-5673
URL: https://issues.apache.org/jira/browse/LUCENE-5673
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch

The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, and then see "OutOfMemoryError: Map Failed" but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map.
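The try/catch proposed in the issue description can be sketched as follows. This is an illustrative stdlib-only sketch, not Lucene's actual MMapDirectory code; the class and method names (MapHelper, describeMapFailure) are invented for the example. The idea: when the JDK has wrapped an OutOfMemoryError in a generic IOException, surface a clearer message (carrying the OOM as the cause) instead of passing the opaque exception along.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MapHelper {

    // Unwrapping logic extracted so it is visible (and testable) on its own:
    // if the IOException wraps an OutOfMemoryError from a failed mmap, return
    // a clearer IOException; otherwise pass the original through unchanged.
    static IOException describeMapFailure(IOException ioe) {
        if (ioe.getCause() instanceof OutOfMemoryError) {
            return new IOException("Map failed: this may be caused by lack of "
                + "virtual address space; MMapDirectory should be used on "
                + "64-bit platforms", ioe.getCause());
        }
        return ioe; // unrelated I/O error: leave untouched
    }

    // Wrapper around FileChannel.map that applies the rewrite on failure.
    static MappedByteBuffer map(FileChannel fc, FileChannel.MapMode mode,
                                long position, long size) throws IOException {
        try {
            return fc.map(mode, position, size);
        } catch (IOException ioe) {
            throw describeMapFailure(ioe);
        }
    }
}
```

A caller seeing the rewritten exception can tell immediately that the failure was an address-space problem rather than an ordinary disk error.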
[jira] [Commented] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998546#comment-13998546 ]

Noble Paul commented on SOLR-5681:
--

DistributedQueue.peekTopN returning both a List and null is not required. Always return a non-null List; then there is no need to do null checks unnecessarily.

Make the OverseerCollectionProcessor multi-threaded
---
Key: SOLR-5681
URL: https://issues.apache.org/jira/browse/SOLR-5681
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
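The API contract Noble Paul suggests above (peekTopN always returns a possibly-empty list, never null) can be sketched with a toy in-memory queue. This is illustrative stdlib code, not Solr's ZooKeeper-backed DistributedQueue; the class name TopNQueue is invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class TopNQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();

    public void offer(T t) {
        items.addLast(t);
    }

    // Returns up to the first n elements without removing them.
    // Contract: never null -- an empty list when the queue is empty,
    // so callers never need a null check before iterating.
    public List<T> peekTopN(int n) {
        List<T> top = new ArrayList<>();
        Iterator<T> it = items.iterator();
        while (it.hasNext() && top.size() < n) {
            top.add(it.next());
        }
        return top;
    }
}
```

With this contract, the caller's loop body (`for (T task : queue.peekTopN(10)) ...`) works unchanged whether the queue holds ten tasks or none.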
[jira] [Commented] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).
[ https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999669#comment-13999669 ]

Dawid Weiss commented on LUCENE-5283:
--

I'll look into it again though. I'll see what's possible.

Fail the build if ant test didn't execute any tests (everything filtered out).
---
Key: LUCENE-5283
URL: https://issues.apache.org/jira/browse/LUCENE-5283
Project: Lucene - Core
Issue Type: Wish
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Trivial
Fix For: 4.6, 5.0
Attachments: LUCENE-5283-permgen.patch, LUCENE-5283.patch, LUCENE-5283.patch, LUCENE-5283.patch

This should be an optional setting that defaults to 'false' (the build proceeds).
[jira] [Commented] (LUCENE-5666) Add UninvertingReader
[ https://issues.apache.org/jira/browse/LUCENE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1357#comment-1357 ]

ASF subversion and git services commented on LUCENE-5666:
--

Commit 1595259 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1595259 ]

LUCENE-5666: Add UninvertingReader

Add UninvertingReader
---
Key: LUCENE-5666
URL: https://issues.apache.org/jira/browse/LUCENE-5666
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir
Fix For: 5.0
Attachments: LUCENE-5666.patch

Currently the fieldcache is not pluggable at all. It would be better if everything used the docvalues APIs. This would allow people to customize the implementation, extend the classes with custom subclasses with additional stuff, etc. FieldCache can be accessed via the docvalues APIs, using the FilterReader API.
[jira] [Commented] (SOLR-6083) Provide a way to list configurationsets in SolrCloud from the admin screen.
[ https://issues.apache.org/jira/browse/SOLR-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1355#comment-1355 ]

Shalin Shekhar Mangar commented on SOLR-6083:
--

I was just thinking about it the other day. Thanks Erick. Should we fold this into the clusterstatus API? I have already opened issues to add the roles and live nodes information to the clusterstatus API.

Provide a way to list configurationsets in SolrCloud from the admin screen.
---
Key: SOLR-6083
URL: https://issues.apache.org/jira/browse/SOLR-6083
Project: Solr
Issue Type: Improvement
Reporter: Erick Erickson

Subtask of SOLR-6082. Set up a cluster with no collections (i.e. don't use the bootstrap convention of the getting started guide). Push a configuration set up to Solr via the command line. It would be nice to show a list of the available configuration sets.
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anshum Gupta updated SOLR-5681:
---
Attachment: SOLR-5681-2.patch

Patch that addresses all of the things you had recommended. Here's a summary of the changes:
* Synchronized variables are fixed. They are explicitly handled now.
* Renamed the inner class to Runner.
* Failed task behavior switched back to what it should be like, i.e. the OCP retries.
* Stats handling fixed.
* Close is now handled gracefully. Also, there's a call to OCP.close in the finally block for the main OCP thread run().
* The threadpool is no longer created in the constructor but in the run method.

Make the OverseerCollectionProcessor multi-threaded
---
Key: SOLR-5681
URL: https://issues.apache.org/jira/browse/SOLR-5681
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch
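Two of the lifecycle points in the patch summary above (create the thread pool in run() rather than the constructor, and close gracefully from a finally block) can be sketched with plain java.util.concurrent. This is a minimal illustrative model, not Solr's OverseerCollectionProcessor; the class TaskProcessor and its fields are invented for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TaskProcessor implements Runnable {
    private ExecutorService pool;                 // created lazily in run()
    final AtomicInteger completed = new AtomicInteger();

    @Override
    public void run() {
        // Pool is created here, not in the constructor, so constructing the
        // processor never allocates threads that might leak if run() is
        // never invoked.
        pool = Executors.newFixedThreadPool(4);
        try {
            for (int i = 0; i < 10; i++) {
                pool.submit(completed::incrementAndGet);
            }
        } finally {
            close();  // always runs, even if submission throws
        }
    }

    void close() {
        pool.shutdown();  // graceful: lets in-flight tasks finish
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isClosed() {
        return pool != null && pool.isTerminated();
    }
}
```

Because close() sits in the finally block of run(), the pool is shut down on both the normal path and the failure path, mirroring the "close is now handled gracefully" item above.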
[jira] [Commented] (SOLR-6055) TestMiniSolrCloudCluster has data dir in test's CWD
[ https://issues.apache.org/jira/browse/SOLR-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999523#comment-13999523 ]

Ryan Ernst commented on SOLR-6055:
--

I successfully fixed this by adding a separate ulogDir to SolrCore, and then making sure this is absolute in the local filesystem (not using the directory factory). See LUCENE-5650. I can put it into a separate patch if you think it is important; it was just easier to verify on that other issue, and it also required a change to the base temp dir to be absolute.

TestMiniSolrCloudCluster has data dir in test's CWD
---
Key: SOLR-6055
URL: https://issues.apache.org/jira/browse/SOLR-6055
Project: Solr
Issue Type: Bug
Reporter: Ryan Ernst

While investigating one of the test failures created when tightening test permissions to restrict write access to CWD (see LUCENE-5650), I've found {{TestMiniSolrCloudCluster}} is attempting to write transaction logs to {{$CWD/data/tlog}}. I've traced this down to two things which are happening:
# The test uses {{RAMDirectoryFactory}}, which always returns true for {{isAbsolute}}. This causes the directory factory to *not* adjust the default relative path to bring it under the instance dir.
# The {{UpdateLog}} creates its tlog file with the relative data dir.
[jira] [Commented] (LUCENE-5667) Optimize common-prefix across all terms in a field
[ https://issues.apache.org/jira/browse/LUCENE-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998645#comment-13998645 ]

ASF subversion and git services commented on LUCENE-5667:
--

Commit 1594846 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1594846 ]

LUCENE-5667: add test case

Optimize common-prefix across all terms in a field
---
Key: LUCENE-5667
URL: https://issues.apache.org/jira/browse/LUCENE-5667
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.9, 5.0

I tested different UUID sources in Lucene (http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html) and I was surprised to see that Flake IDs were slower than UUID V1. They use the same raw sources of info (timestamp, node id, sequence counter), but Flake ID preserves total order by keeping the timestamp intact in the leading 64 bits. I think the reason might be that a Flake ID will typically have a longish common prefix for all docs, and I think we might be able to optimize this in block-tree by storing that common prefix outside of the FST, or maybe just pre-computing the common prefix on init and storing the effective start node for the FST.
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 1577 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1577/
Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 11034 lines...]
[junit4] JVM J0: stdout was not empty, see: /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140515_231315_651.sysout
[junit4] JVM J0: stdout (verbatim)
[junit4] #
[junit4] # A fatal error has been detected by the Java Runtime Environment:
[junit4] #
[junit4] # SIGSEGV (0xb) at pc=0x00010b37d250, pid=216, tid=104467
[junit4] #
[junit4] # JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13)
[junit4] # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode bsd-amd64 )
[junit4] # Problematic frame:
[junit4] # C [libjava.dylib+0x9250] JNU_NewStringPlatform+0x1c8
[junit4] #
[junit4] # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
[junit4] #
[junit4] # An error report file with more information is saved as:
[junit4] # /Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/J0/hs_err_pid216.log
[junit4] #
[junit4] # If you would like to submit a bug report, please visit:
[junit4] # http://bugreport.sun.com/bugreport/crash.jsp
[junit4] # The crash happened outside the Java Virtual Machine in native code.
[junit4] # See problematic frame for where to report the bug.
[junit4] #
[junit4] JVM J0: EOF
[...truncated 1 lines...]
[junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_55.jdk/Contents/Home/jre/bin/java -XX:-UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=EECA8FC0566C5F82 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.monster=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.leaveTemporary=false -Dtests.filterstacks=true -Dtests.disableHdfs=true -classpath
[jira] [Commented] (SOLR-6057) Duplicate background-color in #content #analysis #analysis-result .match (analysis.css)
[ https://issues.apache.org/jira/browse/SOLR-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1391#comment-1391 ]

Al Krinker commented on SOLR-6057:
--

[~steffkes] - Yeah, the big thing was the inability to see the highlights; the duplicate was me just trying to keep the code clean :) I personally like the design since it is simple. By the way, is it possible to add me to the contrib group so I can remove the duplicate and see if I can pitch in by taking care of some of the bugs in JIRA?

Duplicate background-color in #content #analysis #analysis-result .match (analysis.css)
---
Key: SOLR-6057
URL: https://issues.apache.org/jira/browse/SOLR-6057
Project: Solr
Issue Type: Bug
Reporter: Al Krinker
Priority: Trivial

Inside of solr/webapp/web/css/styles/analysis.css, you can find the #content #analysis #analysis-result .match element with the following content:

#content #analysis #analysis-result .match {
  background-color: #e9eff7;
  background-color: #f2f2ff;
}

background-color is listed twice. Also, it was very hard for me to see the highlight. Recommend changing it to background-color: #FF;
[jira] [Commented] (LUCENE-4370) Let Collector know when all docs have been collected
[ https://issues.apache.org/jira/browse/LUCENE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999732#comment-13999732 ]

Shikhar Bhushan commented on LUCENE-4370:
--

Been thinking about the semantics of these done callbacks not being invoked in case of exceptions, which was a concern raised by [~jpountz] in LUCENE-5527. This seems to be not very helpful when e.g. you have a TimeExceededException or EarlyTerminatingCollectorException thrown and you need to maybe merge some state into the parent collector in {{LeafCollector.leafDone()}}, or perhaps finalize results in {{Collector.done()}}. Maybe we need a special kind of exception, just like CollectionTerminatedException. The semantics for CollectionTerminatedException are currently that collection continues with the next leaf. So some new base class for the "rethrow me but invoke done callbacks" case? In case of any other kind of exception, like IOException, I don't think we should be invoking done() callbacks, because the collector's results should not be expected to be usable.

Let Collector know when all docs have been collected
---
Key: LUCENE-4370
URL: https://issues.apache.org/jira/browse/LUCENE-4370
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Affects Versions: 4.0-BETA
Reporter: Tomás Fernández Löbbe
Priority: Minor
Attachments: LUCENE-4370.patch, LUCENE-4370.patch

Collectors are a good point for extension/customization of Lucene/Solr; however, sometimes it's necessary to know when the last document has been collected (for example, for flushing cached data). It would be nice to have a method that gets called after the last doc has been collected.
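The exception semantics discussed in the comment above can be sketched with toy interfaces: an early-termination exception still triggers the done() callback (the collector's partial results remain usable), while any other exception propagates without invoking it. These are invented stand-ins, not Lucene's Collector/LeafCollector APIs.

```java
public class CollectorDemo {

    // Stand-in for the "rethrow me but invoke done callbacks" exception class
    // the comment proposes (analogous in spirit to CollectionTerminatedException).
    static class EarlyTerminationException extends RuntimeException {}

    interface Collector {
        void collect(int doc);
        void done();  // invoked once collection finishes (or terminates early)
    }

    // Toy search loop showing which exceptions still fire done():
    static void search(int[] docs, Collector c) {
        try {
            for (int d : docs) {
                c.collect(d);
            }
            c.done();  // normal completion
        } catch (EarlyTerminationException e) {
            c.done();  // early termination: partial results are still usable
        }
        // Any other exception (e.g. an I/O failure) propagates to the caller
        // WITHOUT invoking done(), since the collector's state may be unusable.
    }
}
```

A collector that terminates after three documents still gets its done() call; one that fails with an unexpected RuntimeException does not.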
[jira] [Commented] (SOLR-5966) Admin UI - menu is fixed, doesn't respect smaller viewports
[ https://issues.apache.org/jira/browse/SOLR-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1366#comment-1366 ]

Stefan Matheis (steffkes) commented on SOLR-5966:
--

so we're good to go? i'll commit this after the weekend then :)

Admin UI - menu is fixed, doesn't respect smaller viewports
---
Key: SOLR-5966
URL: https://issues.apache.org/jira/browse/SOLR-5966
Project: Solr
Issue Type: Bug
Components: web gui
Environment: Operating system: Windows 7 64-bit, hard disk - 320GB, Memory - 3GB
Reporter: Aman Tandon
Priority: Minor
Attachments: SOLR-5966.patch

I am a Windows 7 user and new to Solr. I downloaded the setup for Solr 4.7.1, and when I started the server and opened the admin interface using the URL http://localhost:8983/solr/#/collection1, I noticed that on selecting collection1 from the cores menu, I was unable to view the full list for collection1. Please find the Google Doc link https://drive.google.com/file/d/0B5GzwVkR3aDzNzJheHVmWFRFYzA/edit?usp=sharing containing the screenshot.
[jira] [Updated] (SOLR-5966) Admin UI - menu is fixed, doesn't respect smaller viewports
[ https://issues.apache.org/jira/browse/SOLR-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Matheis (steffkes) updated SOLR-5966:
---
Affects Version/s: 4.3, 4.4, 4.5, 4.6, 4.7, 4.8
Fix Version/s: 4.9, 5.0
Assignee: Stefan Matheis (steffkes)

Admin UI - menu is fixed, doesn't respect smaller viewports
---
Key: SOLR-5966
URL: https://issues.apache.org/jira/browse/SOLR-5966
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 4.3, 4.4, 4.5, 4.6, 4.7, 4.8
Environment: Operating system: windows 7 64-bit, hard disk - 320GB, Memory - 3GB
Reporter: Aman Tandon
Assignee: Stefan Matheis (steffkes)
Priority: Minor
Fix For: 4.9, 5.0
Attachments: SOLR-5966.patch
[jira] [Commented] (SOLR-6075) CoreAdminHandler should synchronize while adding a task to the tracking map
[ https://issues.apache.org/jira/browse/SOLR-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999464#comment-13999464 ]

Anshum Gupta commented on SOLR-6075:
--

The addTask(String map, TaskObject o, boolean limit) method reads in one statement and removes in the next. That was the reason I thought we should synchronize it. Collections.synchronizedMap ensures that individual calls like put() and remove() are thread-safe; there's no guarantee that a group of read/update operations taken together is thread-safe. Also, you're correct, I accidentally synchronized on the string. Thanks for noticing that!

CoreAdminHandler should synchronize while adding a task to the tracking map
---
Key: SOLR-6075
URL: https://issues.apache.org/jira/browse/SOLR-6075
Project: Solr
Issue Type: Improvement
Reporter: Anshum Gupta
Assignee: Anshum Gupta
Priority: Minor
Fix For: 4.9, 5.0
Attachments: SOLR-6075.patch, SOLR-6075.patch

CoreAdminHandler should synchronize on the tracker maps when adding a task. It's a rather nasty bug and we should get this in asap.
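The point in the comment above (individual synchronizedMap calls are atomic, but a read-then-remove-then-put sequence is a compound action needing one lock across all steps) can be illustrated with a small stdlib sketch. The names below (TaskTracker, addTask, MAX) are illustrative, not Solr's CoreAdminHandler code; note the lock is the map object itself, the same monitor Collections.synchronizedMap uses internally.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class TaskTracker {
    static final int MAX = 50;  // cap on tracked tasks (illustrative)

    // Individual put()/remove() calls on this map are thread-safe...
    private final Map<String, String> taskMap =
        Collections.synchronizedMap(new LinkedHashMap<>());

    void addTask(String id, String task) {
        // ...but the check-size / evict-eldest / put sequence is a compound
        // action, so hold one lock (the map's own monitor) across all of it.
        synchronized (taskMap) {
            if (taskMap.size() == MAX) {
                String eldest = taskMap.keySet().iterator().next();  // read...
                taskMap.remove(eldest);                              // ...then remove
            }
            taskMap.put(id, task);
        }
    }

    int size() {
        return taskMap.size();
    }
}
```

Without the outer synchronized block, two threads could both see size() == MAX and evict two entries, or both see MAX - 1 and overshoot the cap; synchronizing on a separate object (or on an interned String, as the comment notes was the accidental bug) would not exclude the map's own internal lock.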
[jira] [Commented] (SOLR-6087) SolrIndexSearcher makes no DelegatingCollector.finish() call when IndexSearcher throws an expected exception.
[ https://issues.apache.org/jira/browse/SOLR-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1398#comment-1398 ]

Christine Poerschke commented on SOLR-6087:
--

https://github.com/bloomberg/lucene-solr/pull/1 has the proposed change.

SolrIndexSearcher makes no DelegatingCollector.finish() call when IndexSearcher throws an expected exception.
---
Key: SOLR-6087
URL: https://issues.apache.org/jira/browse/SOLR-6087
Project: Solr
Issue Type: Bug
Reporter: Christine Poerschke
Priority: Minor

This seems like an omission. A GitHub pull request with the proposed change is to follow.
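The shape of the fix this issue describes (guarantee the finish() hook runs even when the searcher throws an expected early-termination exception mid-collection) can be sketched with toy types. These are invented stand-ins, not Solr's SolrIndexSearcher or DelegatingCollector; the point is the catch-expected-exception plus finally structure.

```java
public class FinishDemo {

    // Stand-in for an "expected" exception such as a time-limit being hit.
    static class TimeExceededException extends RuntimeException {}

    interface DelegatingCollector {
        void collect(int doc);
        void finish();  // post-collection hook, e.g. to flush buffered state
    }

    static void runCollection(int[] docs, DelegatingCollector c, int limit) {
        try {
            for (int i = 0; i < docs.length; i++) {
                if (i >= limit) throw new TimeExceededException();  // simulated cutoff
                c.collect(docs[i]);
            }
        } catch (TimeExceededException expected) {
            // Expected: partial results are acceptable for limited collection.
        } finally {
            c.finish();  // the call the issue says was being skipped
        }
    }
}
```

With finish() in the finally block, a collector that buffers state (e.g. for post-filtering) gets to flush it whether collection completed or was cut short.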
[jira] [Updated] (LUCENE-5650) createTempDir and associated functions no longer create java.io.tmpdir
[ https://issues.apache.org/jira/browse/LUCENE-5650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Ernst updated LUCENE-5650:
---
Attachment: LUCENE-5650.patch

New patch. All tests pass. I fixed SOLR-6055 in this by adding a separate update log dir to SolrCore, which is independently forced to be absolute (for the local filesystem, not the DirectoryFactory). I also made the javaTempDir in the new test rule always absolute so that the ulog isAbsolute check would work.

createTempDir and associated functions no longer create java.io.tmpdir
---
Key: LUCENE-5650
URL: https://issues.apache.org/jira/browse/LUCENE-5650
Project: Lucene - Core
Issue Type: Improvement
Components: general/test
Reporter: Ryan Ernst
Assignee: Dawid Weiss
Priority: Minor
Fix For: 4.9, 5.0
Attachments: LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch, LUCENE-5650.patch

The recent refactoring to all the create temp file/dir functions (which is great!) has a minor regression from what existed before. With the old {{LuceneTestCase.TEMP_DIR}}, the directory was created if it did not exist. So, if you set {{java.io.tmpdir}} to {{./temp}}, then it would create that dir within the per-JVM working dir. However, {{getBaseTempDirForClass()}} now asserts that the dir exists, is a dir, and is writeable. Lucene uses {{.}} as {{java.io.tmpdir}}. Then in the test security manager, the per-JVM cwd has read/write/execute permissions. However, this allows tests to write to their cwd, which I'm trying to protect against (by setting cwd to read/execute in my test security manager).
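The "forced to be absolute" idea in the comment above can be shown with a small java.nio.file sketch: a relative temp dir like "./temp" resolves against the JVM working directory, so normalizing it to an absolute path up front makes later isAbsolute checks behave predictably. The helper name (absoluteTempDir) is invented for illustration; this is not the patch's actual code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class TempDirDemo {

    // Return the configured temp dir as an absolute, normalized path.
    // Relative inputs (e.g. "./temp") are resolved against the current
    // working directory; already-absolute inputs pass through normalized.
    static Path absoluteTempDir(String configured) {
        Path p = Paths.get(configured);
        return p.isAbsolute() ? p.normalize() : p.toAbsolutePath().normalize();
    }
}
```

A downstream check like `dir.isAbsolute()` (the ulog check mentioned above) then succeeds regardless of how the property was originally configured.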
[jira] [Commented] (LUCENE-5673) MmapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999129#comment-13999129 ]

Robert Muir commented on LUCENE-5673:
--

Wait, we absolutely don't want the original text. This is the whole thing that causes confusion, the whole reason I opened this issue. The whole reason it's confusing is because it says "OutOfMemoryError: Map failed". Why can't it just start with "Map failed"? If you want the text to say OutOfMemoryError, then please unwrap the OutOfMemoryError from the IOE and throw that.

MmapDirectory shouldn't pass along OOM wrapped as IOException
---
Key: LUCENE-5673
URL: https://issues.apache.org/jira/browse/LUCENE-5673
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Attachments: LUCENE-5673.patch, LUCENE-5673.patch
[jira] [Updated] (SOLR-5681) Make the OverseerCollectionProcessor multi-threaded
[ https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5681: --- Attachment: SOLR-5681-2.patch Made 'stale' in OCP a local variable and documented the use of the variable in the code. Make the OverseerCollectionProcessor multi-threaded --- Key: SOLR-5681 URL: https://issues.apache.org/jira/browse/SOLR-5681 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Anshum Gupta Assignee: Anshum Gupta Attachments: SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681-2.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch, SOLR-5681.patch Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting anything long-running blocks the processing of other, mutually exclusive tasks. When OCP tasks become optionally async (SOLR-5477), it'd be good to have truly non-blocking behavior by multi-threading the OCP itself. For example, a ShardSplit call on Collection1 would block the thread and thereby prevent processing of a create-collection task (which would stay queued in ZK) even though the two tasks are mutually exclusive. Here are a few of the challenges: * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An easy way to handle that is to only let 1 task per collection run at a time. * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. The task from the workQueue is only removed on completion so that in case of a failure, the new Overseer can re-consume the same task and retry. A queue is not the right data structure in the first place to look ahead, i.e. 
get the 2nd task from the queue while the 1st one is in progress. Also, deleting tasks which are not at the head of a queue is not really an 'intuitive' thing. Proposed solutions for task management: * Task funnel and peekAfter(): The parent thread is responsible for getting the request and passing it to a new thread (or one from the pool). The parent method uses peekAfter(last element) instead of peek(); peekAfter returns the task after the 'last element'. Maintain this request information and use it for deleting/cleaning up the workQueue. * Another (almost duplicate) queue: While offering tasks to the workQueue, also offer them to a new queue (call it volatileWorkQueue?). The difference is that as soon as a task from this queue is picked up for processing by a thread, it is removed from the queue. At the end, the cleanup is done from the workQueue.
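The "one task per collection at a time" exclusivity rule described above can be sketched as a small tracker. The class and method names here are illustrative only and are not taken from the actual SOLR-5681 patch:

```java
import java.util.HashSet;
import java.util.Set;

public class CollectionTaskTracker {
    // Collections that currently have a task in flight.
    private final Set<String> running = new HashSet<>();

    // A parent thread would call this before handing a task to a worker
    // thread: true means no other task for this collection is running,
    // so the task is mutually exclusive with everything in flight.
    public synchronized boolean tryAcquire(String collection) {
        return running.add(collection);
    }

    // Called by the worker when the task completes (or fails), allowing
    // the next queued task for that collection to be picked up.
    public synchronized void release(String collection) {
        running.remove(collection);
    }
}
```

With this, a ShardSplit on Collection1 and a create-collection on Collection2 can proceed in parallel, while a second task on Collection1 waits.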
[jira] [Commented] (SOLR-6074) Field names with colons don't work on the query screen on the web UI
[ https://issues.apache.org/jira/browse/SOLR-6074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999809#comment-13999809 ] Jan Høydahl commented on SOLR-6074: --- There is still no well-defined list of characters formally supported in field names. Normally you're advised to stick to a-z, 0-9, underscore, and dash; many unusual characters will often work, but you may see some components fail while others work. Some known limitations: spaces are not allowed in function queries, and field names cannot start with - since that will be parsed as NOT in a query, etc. I'm pretty sure that : will cause trouble in far more places than the UI, so simply stay away from it. Perhaps this JIRA should result in updated documentation with a recommendation on supported field names? Field names with colons don't work on the query screen on the web UI -- Key: SOLR-6074 URL: https://issues.apache.org/jira/browse/SOLR-6074 Project: Solr Issue Type: Bug Components: web gui Affects Versions: 4.4 Environment: linux debian testing / LL(ighthttpd)MP stack with drupal as the web frontend Reporter: Ariel Barreiro Priority: Minor I was looking into a search not working from the frontend and I wanted to check how the data was inserted. I was unable to run any proper query that provides useful results other than *:*. Even worse, when I followed the links from the schema browser's Top Results for the field I was interested in querying, I was redirected to the query page and again got no results, although there obviously are some because they appear as top results on the search page. Speaking with steffkes on IRC, he pointed out that the field name in question had a colon in its name, like fieldname:name, so the search ended up being q=fieldname:name:DATA. He suggested escaping the colon as fieldname\:name, but that made no difference. 
In the end, there was a setting in the drupal plugin I am using (https://drupal.org/project/search_api_solr) that mentioned cleaning the field IDs to remove the colon in the field name. I tried it, and then I could properly use the web UI to query results. The new field name is fieldname$name. I quote a bit of the README from that module related to this: This will change the Solr field names used for all fields whose Search API identifiers contain a colon (i.e., all nested fields) to support some advanced functionality, like sorting by distance, for which Solr is buggy when using field names with colons.
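The workaround the search_api_solr module applies, as quoted above, amounts to a trivial character mapping. The helper name `sanitizeFieldName` below is hypothetical, chosen just for this sketch:

```java
public class FieldNames {
    // Sketch of the search_api_solr workaround: map colons in field
    // identifiers to '$' so that a query like q=fieldname:name:DATA is
    // never generated and the query parser sees exactly one ':' separating
    // an unambiguous field:value pair.
    static String sanitizeFieldName(String name) {
        return name.replace(':', '$');
    }
}
```

Any placeholder character not already special to the query parser would do; '$' is simply what that module chose.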
[jira] [Commented] (LUCENE-5673) MMapDirectory shouldn't pass along OOM wrapped as IOException
[ https://issues.apache.org/jira/browse/LUCENE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999819#comment-13999819 ] Uwe Schindler commented on LUCENE-5673: --- Oh, the vote was already called. If we respin we can add this, but for now I will only do 4.9 and 5.0. MMapDirectory shouldn't pass along OOM wrapped as IOException - Key: LUCENE-5673 URL: https://issues.apache.org/jira/browse/LUCENE-5673 Project: Lucene - Core Issue Type: Bug Components: core/store Affects Versions: 4.8 Reporter: Robert Muir Assignee: Uwe Schindler Fix For: 4.9, 5.0 Attachments: LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch, LUCENE-5673.patch The bug here is in Java (not MMapDir), but I think we should do something. Users get confused when they configure their JVM to trigger something on OOM, then see "OutOfMemoryError: Map failed" but their trigger doesn't fire. That's because in the JDK, when it maps files it catches OutOfMemoryError, asks for a garbage collection, sleeps for 100 milliseconds, then tries to map again. If it fails a second time it wraps the OOM in a generic IOException. I think we should add a try/catch to our FileChannel.map call.
[jira] [Commented] (SOLR-6073) CollectionAdminRequest has createCollection methods with hard-coded router=implicit
[ https://issues.apache.org/jira/browse/SOLR-6073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1451#comment-1451 ] Varun Thacker commented on SOLR-6073: - bq. Removing the N create methods is enough. Would this be done on both trunk and branch_4x? CollectionAdminRequest has createCollection methods with hard-coded router=implicit - Key: SOLR-6073 URL: https://issues.apache.org/jira/browse/SOLR-6073 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.8 Reporter: Shalin Shekhar Mangar Fix For: 4.9, 5.0 The CollectionAdminRequest has a createCollection() method which has the following hard-coded: {code} req.setRouterName("implicit"); {code} This is a bug and we should remove it.
[jira] [Commented] (SOLR-6057) Duplicate background-color in #content #analysis #analysis-result .match (analysis.css)
[ https://issues.apache.org/jira/browse/SOLR-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1481#comment-1481 ] Al Krinker commented on SOLR-6057: -- P.S. I could not even log into my original account 'krinker' and had to create 'al.krinker' instead... and just got my access back to 'krinker'. It was a mess. P.S. Could you add krinker to the list of contributors? Thanks! Duplicate background-color in #content #analysis #analysis-result .match (analysis.css) --- Key: SOLR-6057 URL: https://issues.apache.org/jira/browse/SOLR-6057 Project: Solr Issue Type: Bug Reporter: Al Krinker Priority: Trivial Inside solr/webapp/web/css/styles/analysis.css you can find the #content #analysis #analysis-result .match element with the following content: #content #analysis #analysis-result .match { background-color: #e9eff7; background-color: #f2f2ff; } The background-color property is listed twice. Also, it was very hard for me to see the highlight. I recommend changing it to background-color: #FF;