Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-fcs-b129) - Build # 9505 - Still Failing!
I committed a fix for this.

On Thu, Feb 27, 2014 at 2:39 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote:

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9505/
Java: 64bit/jdk1.8.0-fcs-b129 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 57285 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:471: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:410: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:87: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:187: Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jcl-over-slf4j-1.6.6.jar.sha1
* ./solr/licenses/jul-to-slf4j-1.6.6.jar.sha1
* ./solr/licenses/slf4j-api-1.6.6.jar.sha1
* ./solr/licenses/slf4j-log4j12-1.6.6.jar.sha1

Total time: 58 minutes 34 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.8.0-fcs-b129 -XX:-UseCompressedOops -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5762) SOLR-5658 broke backward compatibility of Javabin format
[ https://issues.apache.org/jira/browse/SOLR-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914197#comment-13914197 ]

Shawn Heisey commented on SOLR-5762:

Just adding a comment to indicate that the fix for this bug addresses problems with clients whose error message says "Unknown type 19". It's mentioned in SOLR-5658 by Yonik. This will help users who search the Jira project for this error message.

SOLR-5658 broke backward compatibility of Javabin format

Key: SOLR-5762
URL: https://issues.apache.org/jira/browse/SOLR-5762
Project: Solr
Issue Type: Bug
Affects Versions: 4.6.1, 4.7
Reporter: Noble Paul
Fix For: 4.7, 4.8, 5.0
Attachments: SOLR-5672.patch, SOLR-5762-test.patch, SOLR-5762.patch, updateReq_4_5.bin

In SOLR-5658 the docsMap entry was changed from a Map to ListMap; this broke back-compat of older clients with 4.6.1 and later:

{noformat}
ERROR - 2014-02-20 21:28:36.332; org.apache.solr.common.SolrException; java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to java.util.List
	at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:188)
	at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
	at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
	at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:368)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
	at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:744)
{noformat}

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
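The ClassCastException above comes from an unconditional cast to java.util.List on data that an older client still sends as a Map. As a sketch of the tolerant-decoding idea (this is illustrative only, not the actual SOLR-5762 patch or the real JavaBinUpdateRequestCodec API; the method and class names below are invented for the example), a decoder can branch on the runtime type instead of casting blindly:

```java
import java.util.*;

public class JavabinCompat {
    // Accept both wire shapes: the newer List form and the older Map form.
    // The unconditional cast `(List) entry` is what throws in the trace above.
    @SuppressWarnings("unchecked")
    static List<Object> docsAsList(Object entry) {
        if (entry instanceof List) {
            return (List<Object>) entry;                     // newer format
        }
        if (entry instanceof Map) {
            // older format: documents were the values of a LinkedHashMap
            return new ArrayList<>(((Map<String, Object>) entry).values());
        }
        throw new IllegalArgumentException("unexpected docs type: " + entry.getClass());
    }

    public static void main(String[] args) {
        System.out.println(docsAsList(Arrays.asList("doc1", "doc2")).size()); // 2
        Map<String, Object> legacy = new LinkedHashMap<>();
        legacy.put("doc1", Collections.singletonMap("id", "1"));
        System.out.println(docsAsList(legacy).size()); // 1
    }
}
```

The design point is that back-compat breaks of this kind are usually fixed on the receiving side, since old clients in the field cannot be upgraded.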
[jira] [Updated] (LUCENE-5460) Allow driving a query by sparse filters
[ https://issues.apache.org/jira/browse/LUCENE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Khludnev updated LUCENE-5460:

Attachment: TestSlowQuery.java

See TestSlowQuery.java attached. SampleSlowQuery verifies documents by checking a stored field in SlowQueryScorer.confirm(int). The key thing is to prohibit advance(), simply because it is inefficient per se:

{code}
SlowQueryScorer.advance(int) {
  throw new UnsupportedOperationException(this + " doesn't support advancing");
}
{code}

So far, nothing special. The tricky thing is to handle filtering. I propose to make FilteredQuery.rewrite() aware of such 'slow' queries; see SlowQuery.rewriteFilteredQuery(IndexReader, FilteredQuery):

FilteredQuery(SlowQuery(coreQuery)) = SlowQuery(FilteredQuery(coreQuery))

I suppose we can introduce this sort of 'slow' query in Lucene and make FilteredQuery.rewrite aware of them, as well as BooleanQuery.rewrite (I can provide the prototype, if you wish to look at it).

Allow driving a query by sparse filters

Key: LUCENE-5460
URL: https://issues.apache.org/jira/browse/LUCENE-5460
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Reporter: Shai Erera
Attachments: TestSlowQuery.java

Today if a filter is very sparse we execute the query in a sort of leap-frog manner between the query and the filter. If the query is very expensive to compute, and/or itself matches only a few docs, calling scorer.advance(doc) just to discover that the doc it landed on isn't accepted by the filter is a waste of time. Since the Filter is always the final ruler, I wonder whether, if we had something like {{boolean DISI.advanceExact(doc)}}, we could use it instead in some cases. There are many combinations in which I think we'd want to use or not use this API, and they depend on: the Filter's complexity, Filter.cost(), Scorer.cost(), query complexity (span-near, many clauses), etc. I opened this issue so we can discuss.

DISI.advanceExact(doc) is just a preliminary proposal, to get an API we can experiment with. The default implementation should be fairly easy and straightforward, and we could override it where we can offer a more optimized implementation.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
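The default-vs-override idea in the proposal can be sketched against a plain sorted-int "iterator". Everything here is illustrative: the class names and the exact advanceExact semantics are invented for the example, not the Lucene API under discussion. The default simply delegates to advance() and compares, while a specialized impl (e.g. a bitset-backed filter) could answer with a single probe instead.

```java
// Minimal stand-in for a DocIdSetIterator over a sorted doc-id array.
class SortedDocs {
    private final int[] docs;
    private int pos = -1;

    SortedDocs(int[] sortedDocs) { this.docs = sortedDocs; }

    // DISI-style advance: move to the first doc >= target
    int advance(int target) {
        while (++pos < docs.length) {
            if (docs[pos] >= target) return docs[pos];
        }
        return Integer.MAX_VALUE; // stands in for NO_MORE_DOCS
    }

    // proposed API, default form: report only whether target matches, so the
    // caller never needs to act on whichever doc advance() landed on
    boolean advanceExact(int target) { return advance(target) == target; }
}

public class AdvanceExactSketch {
    public static void main(String[] args) {
        SortedDocs it = new SortedDocs(new int[]{2, 5, 9});
        System.out.println(it.advanceExact(5)); // true: 5 is in the set
        System.out.println(it.advanceExact(7)); // false: iterator lands on 9
    }
}
```

The second call illustrates the waste the issue describes: with only advance(), the scorer does the full positioning work to land on 9 even though the filter only needed a yes/no answer about 7.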
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_51) - Build # 9613 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9613/
Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord

All tests passed

Build Log:
[...truncated 57614 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:404: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:187: Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jcl-over-slf4j-1.6.6.jar.sha1
* ./solr/licenses/jul-to-slf4j-1.6.6.jar.sha1
* ./solr/licenses/slf4j-api-1.6.6.jar.sha1
* ./solr/licenses/slf4j-log4j12-1.6.6.jar.sha1

Total time: 64 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5474) Add example for retrieving facet counts without retrieving documents
[ https://issues.apache.org/jira/browse/LUCENE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Audenaerde updated LUCENE-5474:

Attachment: SimpleFacetsExample.java

Yes, that prevents some duplicate stuff. Here is the modified file.

Add example for retrieving facet counts without retrieving documents

Key: LUCENE-5474
URL: https://issues.apache.org/jira/browse/LUCENE-5474
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Affects Versions: 4.7
Reporter: Rob Audenaerde
Attachments: FacetOnlyExample.java, SimpleFacetsExample.java

In the faceting examples, {{FacetsCollector.search()}} is used. There are use cases where you do not need the documents that match the search. It would be nice if there were an example showing this. Basically, it comes down to using {{searcher.search(query, null /* Filter */, facetCollector)}}

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914315#comment-13914315 ]

Noble Paul commented on SOLR-5781:

How do we plan to do this? On a per-call basis, or as a cluster-wide property?

Make the Collections API timeout configurable.

Key: SOLR-5781
URL: https://issues.apache.org/jira/browse/SOLR-5781
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Mark Miller
Fix For: 4.8, 5.0

This would also help with tests - nightlies can be quite intensive and need a very high timeout.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5474) Add example for retrieving facet counts without retrieving documents
[ https://issues.apache.org/jira/browse/LUCENE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914322#comment-13914322 ]

Shai Erera commented on LUCENE-5474:

Looks good. Could you please:
* Create a .patch (diff) file, as it's easier to note what you modified/added?
* Add a test to TestSimpleFacetsExample, along the lines of testSimple?

Add example for retrieving facet counts without retrieving documents

Key: LUCENE-5474
URL: https://issues.apache.org/jira/browse/LUCENE-5474

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5476) Facet sampling
Rob Audenaerde created LUCENE-5476:

Summary: Facet sampling
Key: LUCENE-5476
URL: https://issues.apache.org/jira/browse/LUCENE-5476
Project: Lucene - Core
Issue Type: Improvement
Reporter: Rob Audenaerde

With LUCENE-5339, facet sampling disappeared. When trying to display facet counts on large datasets (10M documents), counting facets is rather expensive, as all the hits are collected and processed. Sampling greatly reduced this and thus provided a nice speedup. Could it be brought back?

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE: Release apache-solr-ref-guide-4.7.pdf (RC1)
I regularized the preformatted blocks, so that none now have an extra blank line at the top - for some reason, all of the boxes on about half of the cwiki pages had this issue, but the other half didn't. (I had to edit the source format to achieve this, and even there I couldn't see a difference between preformatted boxes with an extra leading blank line and those without - must have been some invisible whitespace, not sure what.)

Once I'd done that, the content still wasn't vertically centered, so I edited the Solr space's export PDF stylesheet and got rid of the negative margin-top thing I'd put in place for a previous release, which didn't appear to be having any effect anymore, and instead overrode the default CSS to adjust the padding on preformatted blocks and their containing divs. Content in preformatted blocks now appears to be vertically centered, with no extra vertical space.

I also tried to apply "page-break-inside: avoid" in several places to see if it would help with the few poorly distributed multi-page preformatted boxes, but it didn't seem to help.

I noticed that a couple of "Topics covered in this section" boxes are too narrow to allow their content to be legible - the ones on pages 251 and 300 look really bad, and some others are only marginally legible. I don't know if these issues are worthy of a respin - I didn't address the latter one.

Steve

On Feb 26, 2014, at 11:59 AM, Cassandra Targett casstarg...@gmail.com wrote:

I generated a new release candidate for the Solr Reference Guide. This fixes the page numbering problem and a few other minor edits folks made yesterday after I generated RC0.

https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.7-RC1/

+1 from me.

Cassandra

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5476) Facet sampling
[ https://issues.apache.org/jira/browse/LUCENE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914415#comment-13914415 ]

Michael McCandless commented on LUCENE-5476:

+1 to bring it back. I think we could expose methods that take an FBS (FixedBitSet) and either sub-sample it in place, or return a new FBS?

Facet sampling

Key: LUCENE-5476
URL: https://issues.apache.org/jira/browse/LUCENE-5476

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
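The in-place sub-sampling idea can be sketched quite compactly. To keep the sketch self-contained it uses java.util.BitSet rather than Lucene's FixedBitSet, and the class and method names are invented for illustration; the real implementation would differ. Facet counts computed from the sampled set would then be scaled up by roughly 1/rate to approximate the full counts.

```java
import java.util.BitSet;
import java.util.Random;

public class SampleHits {
    // Keep each set bit with probability `rate`, clearing the rest in place.
    // A seeded Random makes the sample reproducible for a given query.
    static void subSample(BitSet hits, double rate, long seed) {
        Random rnd = new Random(seed);
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            if (rnd.nextDouble() >= rate) {
                hits.clear(doc); // drop this hit from the sample
            }
        }
    }

    public static void main(String[] args) {
        BitSet hits = new BitSet();
        hits.set(0, 1000000); // pretend 1M docs matched the query
        subSample(hits, 0.01, 42L);
        System.out.println(hits.cardinality()); // roughly 10,000 hits survive
    }
}
```

Facet counting then walks only the surviving ~1% of hits, which is where the speedup on 10M-document result sets would come from.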
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914417#comment-13914417 ]

Marcus Engene commented on SOLR-5733:

Hi,

Going from

$ java -version
java version 1.6.0_18
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)

...to...

$ java -version
java version 1.6.0_27
OpenJDK Runtime Environment (IcedTea6 1.12.6) (6b27-1.12.6-1~deb6u1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

...seems to kill off the problem.

Thanks,
Marcus

Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life

Key: SOLR-5733
URL: https://issues.apache.org/jira/browse/SOLR-5733
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 4.5.1, 4.6.1
Environment: Linux solrssd2 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux
java version 1.6.0_18
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Marcus Engene
Fix For: 4.6.1

tien@solrssd2:/solr461stem/example$ cat start.sh
#!/bin/sh
java -Xms9G -Xmx22G -Djetty.host=0.0.0.0 -Djetty.port=9993 -DhostPort=9993 -jar start.jar 2>/dev/null 1>/dev/null

Solr crashes spontaneously about every 2nd start within the first 10min of the process life.

tien@solrssd2:/solr461stem/example/solr/collection1$ du -ks data
5405556 data

Machine is not heavily used:

Tasks: 317 total, 1 running, 316 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264660644k total, 227656492k used, 37004152k free, 544848k buffers
Swap: 4000144k total, 102940k used, 3897204k free, 204332940k cached

PID  USER PR NI VIRT  RES  SHR  S %CPU %MEM TIME+    COMMAND
7700 tien 20 0  32.4g 3.3g 1.2g S 13   1.3  2:23.15  java
8208 tien 20 0  27.6g 3.9g 805m S 10   1.5  0:56.45  java
7785 tien 20 0  26.7g 5.6g 2.2g S 2    2.2  3:42.94  java
6102 tien 20 0  27.6g 9.9g 4.3g S 0    3.9  61:03.26 java
8337 tien 20 0  19204 1552 1016 R 0    0.0  0:00.02  top
1    root 20 0  8356  796  664  S 0    0.0  0:12.90  init
2    root 20 0  0     0    0    S 0    0.0  0:00.00  kthreadd
3    root RT 0  0     0    0    S 0    0.0  0:05.30  migration/0
4    root 20 0  0     0    0    S 0    0.0  0:13.17  ksoftirqd/0
5    root RT 0  0     0    0    S 0    0.0  0:00.00  watchdog/0

I'll try to attach the hs-dump.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Engene closed SOLR-5733.

Resolution: Done
Fix Version/s: 4.6.1

Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life

Key: SOLR-5733
URL: https://issues.apache.org/jira/browse/SOLR-5733
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 4.5.1, 4.6.1
Reporter: Marcus Engene
Fix For: 4.6.1
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
Michael McCandless created LUCENE-5477:

Summary: add near-real-time suggest building to AnalyzingInfixSuggester
Key: LUCENE-5477
URL: https://issues.apache.org/jira/browse/LUCENE-5477
Project: Lucene - Core
Issue Type: Improvement
Components: modules/spellchecker
Reporter: Michael McCandless
Fix For: 4.8, 5.0

Because this suggester impl. is just a Lucene index under the hood, it should be straightforward to enable near-real-time additions/removals of suggestions.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-5478:

Attachment: LUCENE-5478.patch

Here is a patch.

Allow CommonTermsQuery to create custom term queries

Key: LUCENE-5478
URL: https://issues.apache.org/jira/browse/LUCENE-5478
Project: Lucene - Core
Issue Type: Improvement
Components: modules/other
Affects Versions: 4.7
Reporter: Simon Willnauer
Fix For: 4.8, 5.0
Attachments: LUCENE-5478.patch

Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query, just like you can in the query parser.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914429#comment-13914429 ]

Uwe Schindler commented on SOLR-5733:

bq. If you do find that you're using one of the known bad Java versions, please come back and close this JIRA.

The list is here: http://wiki.apache.org/lucene-java/JavaBugs

In general, if you really want to use Java 6 (which is no longer supported by Oracle), update to at least 1.6.0 u45 (the latest available). In addition, OpenJDK 1.6 has major performance problems because of missing patches from the official JDK 6. If you want to use OpenJDK, use OpenJDK 7, which is identical to Oracle JDK 7 in patch level and in features for server applications.

Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life

Key: SOLR-5733
URL: https://issues.apache.org/jira/browse/SOLR-5733
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 4.5.1, 4.6.1
Reporter: Marcus Engene
Fix For: 4.6.1

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
Simon Willnauer created LUCENE-5478:

Summary: Allow CommonTermsQuery to create custom term queries
Key: LUCENE-5478
URL: https://issues.apache.org/jira/browse/LUCENE-5478
Project: Lucene - Core
Issue Type: Improvement
Components: modules/other
Affects Versions: 4.7
Reporter: Simon Willnauer
Fix For: 4.8, 5.0

Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query, just like you can in the query parser.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914438#comment-13914438 ]

Uwe Schindler commented on LUCENE-5478:

Cool, +1

Allow CommonTermsQuery to create custom term queries

Key: LUCENE-5478
URL: https://issues.apache.org/jira/browse/LUCENE-5478

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
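The extension point proposed in this thread is a classic protected factory method: route per-term query construction through an overridable method instead of calling the constructor inline. The sketch below uses simplified String-based stand-ins rather than Lucene's actual CommonTermsQuery/TermQuery types, and all names in it are invented for illustration.

```java
// Base class with the overridable factory method the patch suggests.
class CommonTermsQuerySketch {
    // subclasses override this to substitute a custom per-term query
    protected String newTermQuery(String field, String term) {
        return "TermQuery(" + field + ":" + term + ")";
    }

    // builds the composite query, delegating each term to the factory method
    String buildQuery(String field, String... terms) {
        StringBuilder sb = new StringBuilder("BooleanQuery[");
        for (String t : terms) {
            sb.append(newTermQuery(field, t)).append(' ');
        }
        return sb.append(']').toString();
    }
}

public class CustomTermQueryDemo {
    public static void main(String[] args) {
        // override the factory to swap in a different per-term query type
        CommonTermsQuerySketch custom = new CommonTermsQuerySketch() {
            @Override
            protected String newTermQuery(String field, String term) {
                return "PayloadTermQuery(" + field + ":" + term + ")";
            }
        };
        System.out.println(custom.buildQuery("body", "the", "quick"));
    }
}
```

The design appeal, as the issue notes, is that query parsers already expose term-query creation this way, so CommonTermsQuery would gain the same customization point without changing its default behavior.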
[jira] [Commented] (SOLR-5609) Don't let cores create slices/named replicas
[ https://issues.apache.org/jira/browse/SOLR-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914446#comment-13914446 ]

ASF subversion and git services commented on SOLR-5609:

Commit 1572530 from [~noble.paul] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572530 ]

SOLR-5609: use coreNodeName to compare replicas; CollectionsAPIDistributedZkTest.testCollectionsAPI() randomly switches to legacyCloud=false

Don't let cores create slices/named replicas

Key: SOLR-5609
URL: https://issues.apache.org/jira/browse/SOLR-5609
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
Fix For: 4.8, 5.0
Attachments: SOLR-5609.patch, SOLR-5609.patch, SOLR-5609_5130.patch, SOLR-5609_5130.patch, SOLR-5609_5130.patch, SOLR-5609_5130.patch

In SolrCloud, it is possible for a core to come up on any node and register itself with an arbitrary slice/coreNodeName. This is a legacy requirement, and we would like to make it possible only for the Overseer to initiate creation of slices/replicas. We plan to introduce cluster-level properties at the top level /cluster-props.json

{code:javascript}
{ noSliceOrReplicaByCores: true }
{code}

If this property is set to true, cores won't be able to send STATE commands with an unknown slice/coreNodeName; those commands will fail at the Overseer. This is useful for SOLR-5310 / SOLR-5311, where a core/replica is deleted by a command and then comes up later and tries to create a replica/slice.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5474) Add example for retrieving facet counts without retrieving documents
[ https://issues.apache.org/jira/browse/LUCENE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Audenaerde updated LUCENE-5474:

Attachment: LUCENE-5474.patch

Here is the patch.

Add example for retrieving facet counts without retrieving documents

Key: LUCENE-5474
URL: https://issues.apache.org/jira/browse/LUCENE-5474

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
Rob Audenaerde created LUCENE-5479:

Summary: Make default dimension config in FacetConfig adjustable
Key: LUCENE-5479
URL: https://issues.apache.org/jira/browse/LUCENE-5479
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Rob Audenaerde
Priority: Minor
Attachments: LUCENE-5479.patch

Now it is hardcoded to DEFAULT_DIM_CONFIG. This may be useful for most standard approaches. However, I use lots of facets. These facets can be multivalued; I do not know that beforehand. So what I would like to do is change the default config to {{multiValued = true}}. Currently I have a working, but rather ugly, workaround that subclasses FacetsConfig, like this:

{code:title=CustomFacetsConfig.java|borderStyle=solid}
public class CustomFacetsConfig extends FacetsConfig {
  public final static DimConfig DEFAULT_D2A_DIM_CONFIG = new DimConfig();
  static {
    DEFAULT_D2A_DIM_CONFIG.multiValued = true;
  }

  @Override
  public synchronized DimConfig getDimConfig(String dimName) {
    DimConfig ft = super.getDimConfig(dimName);
    if (DEFAULT_DIM_CONFIG.equals(ft)) {
      return DEFAULT_D2A_DIM_CONFIG;
    }
    return ft;
  }
}
{code}

I created a patch to illustrate what I would like to change. Also, maybe there are better ways to accomplish my goal (an easy default to multivalued?)

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Audenaerde updated LUCENE-5479: --- Attachment: LUCENE-5479.patch
[jira] [Updated] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Audenaerde updated LUCENE-5479: --- Description: Now it is hardcoded to DEFAULT_DIM_CONFIG. This may be useful for most standard approaches. However, I use lots of facets. These facets can be multivalued; I do not know that beforehand. So what I would like to do is change the default config to {{multiValued = true}}. Currently I have a working, but rather ugly workaround that subclasses FacetsConfig, like this:
{code:title=CustomFacetsConfig.java|borderStyle=solid}
public class CustomFacetsConfig extends FacetsConfig {
    public final static DimConfig DEFAULT_D2A_DIM_CONFIG = new DimConfig();
    static {
        DEFAULT_D2A_DIM_CONFIG.multiValued = true;
    }

    @Override
    public synchronized DimConfig getDimConfig(String dimName) {
        DimConfig ft = super.getDimConfig(dimName);
        if (DEFAULT_DIM_CONFIG.equals(ft)) {
            return DEFAULT_D2A_DIM_CONFIG;
        }
        return ft;
    }
}
{code}
I created a patch to illustrate what I would like to change. By making a protected method it is easier to create a custom subclass of FacetsConfig. Also, maybe there is a better way to accomplish my goal (an easy default to multivalued?)
[jira] [Commented] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914473#comment-13914473 ] Michael McCandless commented on LUCENE-5479: +1, makes sense!
[jira] [Commented] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914474#comment-13914474 ] Shai Erera commented on LUCENE-5479: +1. Can you please document the method?
[jira] [Created] (SOLR-5785) Turn down absurdly verbose test logging (megabytes)
Robert Muir created SOLR-5785: - Summary: Turn down absurdly verbose test logging (megabytes) Key: SOLR-5785 URL: https://issues.apache.org/jira/browse/SOLR-5785 Project: Solr Issue Type: Bug Reporter: Robert Muir I wanted to look at a Solr test failure to see if I could help fix it. Unfortunately, it dumped 26MB of useless logging to the console. This means I cannot even click on the Jenkins console (https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/524/consoleText) to look at some details about the failure without totally crashing my browser. This ridiculous amount of verbosity is preventing people from fixing tests, not helping.
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1115: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1115/ 4 tests failed. REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95696, name=qtp1254086174-95696, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95696, name=qtp1254086174-95696, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95704, name=qtp1254086174-95704, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95704, name=qtp1254086174-95704, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at 
java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95703, name=qtp1795561555-95703, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95703, name=qtp1795561555-95703, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) REGRESSION: 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95714, name=qtp473666576-95714, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95714, name=qtp473666576-95714, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at
[jira] [Created] (SOLR-5786) MapReduceIndexerTool --help text is missing large parts of the help text
wolfgang hoschek created SOLR-5786: -- Summary: MapReduceIndexerTool --help text is missing large parts of the help text Key: SOLR-5786 URL: https://issues.apache.org/jira/browse/SOLR-5786 Project: Solr Issue Type: Bug Components: contrib - MapReduce Affects Versions: 4.7 Reporter: wolfgang hoschek Assignee: Mark Miller Fix For: 4.8 As already mentioned repeatedly and at length, this is a regression introduced by the fix in https://issues.apache.org/jira/browse/SOLR-5605 Here is the diff of --help output before SOLR-5605 vs after SOLR-5605: {code} 130,235c130 lucene segments left in this index. Merging segments involves reading and rewriting all data in all these segment files, potentially multiple times, which is very I/O intensive and time consuming. However, an index with fewer segments can later be merged faster, and it can later be queried faster once deployed to a live Solr serving shard. Set maxSegments to 1 to optimize the index for low query latency. In a nutshell, a small maxSegments value trades indexing latency for subsequently improved query latency. This can be a reasonable trade-off for batch indexing systems. (default: 1) --fair-scheduler-pool STRING Optional tuning knob that indicates the name of the fair scheduler pool to submit jobs to. The Fair Scheduler is a pluggable MapReduce scheduler that provides a way to share large clusters. Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, tasks slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. It is also an easy way to share a cluster between multiple of users. 
Fair sharing can also work with job priorities - the priorities are used as weights to determine the fraction of total compute time that each job gets. --dry-run Run in local mode and print documents to stdout instead of loading them into Solr. This executes the morphline in the client process (without submitting a job to MR) for quicker turnaround during early trialdebug sessions. (default: false) --log4j FILE Relative or absolute path to a log4j.properties config file on the local file system. This file will be uploaded to each MR task. Example: /path/to/log4j.properties --verbose, -v Turn on verbose output. (default: false) --show-non-solr-cloud Also show options for Non-SolrCloud mode as part of --help. (default: false) Required arguments: --output-dir HDFS_URI HDFS directory to write Solr indexes to. Inside there one output directory per shard will be generated.Example: hdfs://c2202.mycompany. com/user/$USER/test --morphline-file FILE Relative or absolute path to a local config file that contains one or more morphlines. The file must be UTF-8 encoded. Example: /path/to/morphline.conf Cluster arguments: Arguments that provide information about your Solr cluster. --zk-host STRING The address of a ZooKeeper ensemble being used by a SolrCloud cluster. This ZooKeeper ensemble will be examined to determine the number of output
[jira] [Commented] (LUCENE-5475) add required attribute bugUrl to @BadApple
[ https://issues.apache.org/jira/browse/LUCENE-5475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914546#comment-13914546 ] Dawid Weiss commented on LUCENE-5475: - I've added dumping full annotation content (with attribute). Unfortunately there's no way to reference a snapshot build so you'll have to wait for a release (which I'll try to make in a day or two). add required attribute bugUrl to @BadApple -- Key: LUCENE-5475 URL: https://issues.apache.org/jira/browse/LUCENE-5475 Project: Lucene - Core Issue Type: Bug Components: general/test Reporter: Robert Muir Fix For: 4.8, 5.0 Attachments: LUCENE-5475.patch This makes it impossible to tag a test as a badapple without a pointer to a JIRA issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
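With the attribute made mandatory, tagging a flaky test would look roughly like this (a sketch assuming the randomizedtesting {{@BadApple}} annotation with the new required {{bugUrl}} attribute; the test class and method names are illustrative):

```java
import com.carrotsearch.randomizedtesting.annotations.BadApple;
import org.apache.lucene.util.LuceneTestCase;

public class TestSomething extends LuceneTestCase {

  // bugUrl is required, so a test can no longer be marked as a bad apple
  // without pointing at the JIRA issue that tracks the failure.
  @BadApple(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-5475")
  public void testOccasionallyFails() {
    // ... test body ...
  }
}
```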
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914549#comment-13914549 ] wolfgang hoschek commented on SOLR-5605: Correspondingly, I filed https://issues.apache.org/jira/browse/SOLR-5786 Look, as you know, I wrote almost all of the original solr-mapreduce contrib, and I know this code inside out. To be honest, this kind of repetitive ignorance is tiresome at best and completely turns me off. MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest --- Key: SOLR-5605 URL: https://issues.apache.org/jira/browse/SOLR-5605 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.7, 5.0 I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale. The problem sounded familiar, and a quick search verified that jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (that does the argument parsing) that will affect usage on some systems. If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is supported -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5787) Get spellcheck frequency relatively to current query
Hakim created SOLR-5787: --- Summary: Get spellcheck frequency relatively to current query Key: SOLR-5787 URL: https://issues.apache.org/jira/browse/SOLR-5787 Project: Solr Issue Type: Improvement Components: spellchecker Affects Versions: 4.6 Environment: Solr deployed on Jetty 9 Servlet container Reporter: Hakim Priority: Minor I guess that this functionality isn't implemented yet. I'll begin with an example to explain what I'm requesting: I have a Lucene query that gets articles satisfying a certain query. With this same command, I'm getting at the same time suggestions if this query doesn't return any article (so far, nothing unusual). The frequency (count) associated with these suggestions is relative to the whole index (it counts all occurrences of the suggestion in the entire index). What I want is for it to count only the suggestion occurrences satisfying the current Lucene query. P.S.: I'm using Solr's spellcheck component (solr.DirectSolrSpellChecker).
[jira] [Updated] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek updated SOLR-5786: --- Summary: MapReduceIndexerTool --help output is missing large parts of the help text (was: MapReduceIndexerTool --help text is missing large parts of the help text)
[jira] [Updated] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek updated SOLR-5786: --- Description: As already mentioned repeatedly and at length, this is a regression introduced by the fix in https://issues.apache.org/jira/browse/SOLR-5605 Here is the diff of --help output before SOLR-5605 vs after SOLR-5605 (the same {code} diff quoted in the issue description above).
[jira] [Commented] (SOLR-5787) Get spellcheck frequency relatively to current query
[ https://issues.apache.org/jira/browse/SOLR-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914571#comment-13914571 ] James Dyer commented on SOLR-5787: -- Can you explain why spellcheck.maxCollationTries and spellcheck.collateExtendedResults do not satisfy your needs? This will give you the number of results that the query returns if you take all of the suggestions provided in the collation.
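For reference, the collation parameters can be exercised with a plain HTTP request along these lines (a sketch: the collection name, field name, and misspelled query here are hypothetical, and spellchecking must already be configured on the request handler):

```shell
# Ask Solr for collations and, via collateExtendedResults, a per-collation
# "hits" count: how many documents the corrected query would actually match,
# i.e. a frequency relative to the current query rather than the whole index.
curl 'http://localhost:8983/solr/collection1/select' \
  --data-urlencode 'q=text:articel' \
  --data-urlencode 'wt=json' \
  --data-urlencode 'spellcheck=true' \
  --data-urlencode 'spellcheck.collate=true' \
  --data-urlencode 'spellcheck.maxCollationTries=5' \
  --data-urlencode 'spellcheck.collateExtendedResults=true'
```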
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously chashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914609#comment-13914609 ] Erick Erickson commented on SOLR-5733: -- Thanks Uwe! I finally bookmarked that page! Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously chashes within first 10min of their life --- Key: SOLR-5733 URL: https://issues.apache.org/jira/browse/SOLR-5733 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6.1 Environment: Linux solrssd2 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux java version 1.6.0_18 OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2) OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) Reporter: Marcus Engene Fix For: 4.6.1 tien@solrssd2:/solr461stem/example$ cat start.sh #!/bin/sh java -Xms9G -Xmx22G -Djetty.host=0.0.0.0 -Djetty.port=9993 -DhostPort=9993 -jar start.jar 2/dev/null 1/dev/null Solr crashes spontaneously about every 2nd start within the first 10min of the process life. tien@solrssd2:/solr461stem/example/solr/collection1$ du -ks data 5405556 data Machine is not heavily used Tasks: 317 total, 1 running, 316 sleeping, 0 stopped, 0 zombie Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 264660644k total, 227656492k used, 37004152k free, 544848k buffers Swap: 4000144k total, 102940k used, 3897204k free, 204332940k cached PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 7700 tien 20 0 32.4g 3.3g 1.2g S 13 1.3 2:23.15 java 8208 tien 20 0 27.6g 3.9g 805m S 10 1.5 0:56.45 java 7785 tien 20 0 26.7g 5.6g 2.2g S2 2.2 3:42.94 java 6102 tien 20 0 27.6g 9.9g 4.3g S0 3.9 61:03.26 java 8337 tien 20 0 19204 1552 1016 R0 0.0 0:00.02 top 1 root 20 0 8356 796 664 S0 0.0 0:12.90 init 2 root 20 0 000 S0 0.0 0:00.00 kthreadd 3 root RT 0 000 S0 0.0 0:05.30 migration/0 4 root 20 0 000 S0 0.0 0:13.17 ksoftirqd/0 5 root RT 0 000 S0 0.0 0:00.00 watchdog/0 I'll try to attach the hs-dump. 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5788) Document update in case of error doesn't return the error message correctly
[ https://issues.apache.org/jira/browse/SOLR-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yago Riveiro updated SOLR-5788: --- Summary: Document update in case of error doesn't return the error message correctly (was: Document update in case of error doesn't returns the error message correctly) Document update in case of error doesn't return the error message correctly --- Key: SOLR-5788 URL: https://issues.apache.org/jira/browse/SOLR-5788 Project: Solr Issue Type: Bug Affects Versions: 4.6.1 Reporter: Yago Riveiro I found an issue when updating a document. If for any reason the update can't be done (for example, the schema doesn't match the incoming doc), the error raised to the user is something like: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad Request\n\n\n\nrequest: http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}} {noformat} In case the update was done on the leader, the error message is (IMHO) the correct one, with valuable info: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: [doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input string: \"Direct\"","code":400}} {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5788) Document update in case of error doesn't returns the error message correctly
Yago Riveiro created SOLR-5788: -- Summary: Document update in case of error doesn't returns the error message correctly Key: SOLR-5788 URL: https://issues.apache.org/jira/browse/SOLR-5788 Project: Solr Issue Type: Bug Affects Versions: 4.6.1 Reporter: Yago Riveiro I found an issue when updating a document. If for any reason the update can't be done (for example, the schema doesn't match the incoming doc), the error raised to the user is something like: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad Request\n\n\n\nrequest: http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}} {noformat} In case the update was done on the leader, the error message is (IMHO) the correct one, with valuable info: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: [doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input string: \"Direct\"","code":400}} {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
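The two responses quoted in the report differ because of where the failure surfaces: the leader that validates the document has the detailed cause, while a node that merely forwarded the update keeps only the HTTP status line. A self-contained sketch of that propagation difference (class and method names are illustrative, not Solr's actual code):

```java
// Sketch of the error-propagation gap SOLR-5788 describes.
// Names here are illustrative; the real logic lives in Solr's
// distributed-update path.
public class ErrorPropagationSketch {
    // What the node that actually validated the doc can report.
    static String leaderError(String docId, String detail) {
        return "ERROR: [doc=" + docId + "] " + detail;
    }

    // What a forwarding node reports when it discards the remote
    // response body and keeps only the status line (the complaint).
    static String forwardedError(int status, String requestUrl) {
        return statusLine(status) + "\n\nrequest: " + requestUrl;
    }

    static String statusLine(int status) {
        return status == 400 ? "Bad Request" : "HTTP " + status;
    }

    public static void main(String[] args) {
        System.out.println(leaderError("01!12967564", "Error adding field 'source'='[Direct]'"));
        System.out.println(forwardedError(400, "http://localhost:8983/solr/update"));
    }
}
```

The fix would be for the forwarding node to propagate the remote error body rather than just the status line.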
[jira] [Commented] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914631#comment-13914631 ] ASF subversion and git services commented on LUCENE-5478: - Commit 1572613 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1572613 ] LUCENE-5478: CommonTermsQuery now allows to create custom term queries Allow CommonTermsQuery to create custom term queries Key: LUCENE-5478 URL: https://issues.apache.org/jira/browse/LUCENE-5478 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.7 Reporter: Simon Willnauer Fix For: 4.8, 5.0 Attachments: LUCENE-5478.patch Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query just like you can in the query parser. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
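The extension point this issue adds is the classic factory-method pattern: query construction is routed through a protected method that subclasses can override instead of a hard-wired `new TermQuery(..)`. A generic, self-contained sketch of the pattern (class and method names are illustrative, not Lucene's actual API):

```java
// Factory-method sketch of the LUCENE-5478 extension point: the builder
// never calls the concrete query constructor directly, so a subclass can
// substitute a custom query type. Names are illustrative only.
public class QueryFactorySketch {
    static class BaseQueryBuilder {
        // Default behavior: a plain term query.
        protected String newTermQuery(String term) {
            return "TermQuery(" + term + ")";
        }
        public String build(String term) {
            return newTermQuery(term); // creation goes through the factory
        }
    }

    // A subclass swaps in a custom query without touching build().
    static class BoostingQueryBuilder extends BaseQueryBuilder {
        @Override
        protected String newTermQuery(String term) {
            return "BoostedTermQuery(" + term + ", boost=2.0)";
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseQueryBuilder().build("lucene"));
        System.out.println(new BoostingQueryBuilder().build("lucene"));
    }
}
```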
[jira] [Commented] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914652#comment-13914652 ] ASF subversion and git services commented on LUCENE-5478: - Commit 1572624 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572624 ] LUCENE-5478: CommonTermsQuery now allows to create custom term queries Allow CommonTermsQuery to create custom term queries Key: LUCENE-5478 URL: https://issues.apache.org/jira/browse/LUCENE-5478 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.7 Reporter: Simon Willnauer Fix For: 4.8, 5.0 Attachments: LUCENE-5478.patch Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query just like you can in the query parser. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer closed LUCENE-5478. --- Resolution: Fixed Allow CommonTermsQuery to create custom term queries Key: LUCENE-5478 URL: https://issues.apache.org/jira/browse/LUCENE-5478 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.7 Reporter: Simon Willnauer Fix For: 4.8, 5.0 Attachments: LUCENE-5478.patch Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query just like you can in the query parser. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914680#comment-13914680 ] ASF subversion and git services commented on LUCENE-5376: - Commit 1572637 from [~mikemccand] in branch 'dev/branches/lucene5376' [ https://svn.apache.org/r1572637 ] LUCENE-5376: add factory for SuggestStopFilter; get PostingsHighlighter MTQ highlighting working with block join queries; fix 0.0 score from block join group parent; add explicit label faceting; fix analyzing infix suggester highlighting; allow drill-downs on range facets Add a demo search server Key: LUCENE-5376 URL: https://issues.apache.org/jira/browse/LUCENE-5376 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Attachments: lucene-demo-server.tgz I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard. I don't think it should have back compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the eating your own dog food search app for Lucene's/Solr's jira issues http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing/searching APIs via JSON, but it's very rough (lots of nocommits). 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914682#comment-13914682 ] Mark Miller commented on SOLR-5605: --- A few points: * Are you not a committer? At Apache, those who do, decide. * I did not realize Patrick's patch did not include the latest code updates from MapReduce. You were not clear that you had looked at the latest code or the latest build. You have not contributed any real effort to the upstream work, therefore I don't have a lot of trust in your knowledge of the upstream work. * I had and still have bigger concerns around the usability of this code in Solr than this issue. It is very, very far from easy for someone to get started with this contrib right now. Which is why the contrib is marked experimental, which is why none of these smaller issues concern me very much at this point. MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest --- Key: SOLR-5605 URL: https://issues.apache.org/jira/browse/SOLR-5605 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.7, 5.0 I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale. The problem sounded familiar, and a quick search verified that jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (that does the argument parsing) that will affect usage on some systems. If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is supported -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5786. --- Resolution: Duplicate MapReduceIndexerTool --help output is missing large parts of the help text -- Key: SOLR-5786 URL: https://issues.apache.org/jira/browse/SOLR-5786 Project: Solr Issue Type: Bug Components: contrib - MapReduce Affects Versions: 4.7 Reporter: wolfgang hoschek Assignee: Mark Miller Fix For: 4.8 As already mentioned repeatedly and at length, this is a regression introduced by the fix in https://issues.apache.org/jira/browse/SOLR-5605 Here is the diff of --help output before SOLR-5605 vs after SOLR-5605: {code} 130,235c130 lucene segments left in this index. Merging segments involves reading and rewriting all data in all these segment files, potentially multiple times, which is very I/O intensive and time consuming. However, an index with fewer segments can later be merged faster, and it can later be queried faster once deployed to a live Solr serving shard. Set maxSegments to 1 to optimize the index for low query latency. In a nutshell, a small maxSegments value trades indexing latency for subsequently improved query latency. This can be a reasonable trade-off for batch indexing systems. (default: 1) --fair-scheduler-pool STRING Optional tuning knob that indicates the name of the fair scheduler pool to submit jobs to. The Fair Scheduler is a pluggable MapReduce scheduler that provides a way to share large clusters. Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. 
It is also an easy way to share a cluster between multiple users. Fair sharing can also work with job priorities - the priorities are used as weights to determine the fraction of total compute time that each job gets. --dry-run Run in local mode and print documents to stdout instead of loading them into Solr. This executes the morphline in the client process (without submitting a job to MR) for quicker turnaround during early trial/debug sessions. (default: false) --log4j FILE Relative or absolute path to a log4j.properties config file on the local file system. This file will be uploaded to each MR task. Example: /path/to/log4j.properties --verbose, -v Turn on verbose output. (default: false) --show-non-solr-cloud Also show options for Non-SolrCloud mode as part of --help. (default: false) Required arguments: --output-dir HDFS_URI HDFS directory to write Solr indexes to. Inside there one output directory per shard will be generated. Example: hdfs://c2202.mycompany.com/user/$USER/test --morphline-file FILE Relative or absolute path to a local config file that contains one or more morphlines. The file must be UTF-8 encoded. Example: /path/to/morphline.conf Cluster arguments: Arguments that provide information about your
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914684#comment-13914684 ] Marcus Engene commented on SOLR-5733: - Thanks, I'll try Oracle's too. Sorry, I thought I did close the ticket? I waited until I had some conclusions after testing, which perhaps was bad. Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life --- Key: SOLR-5733 URL: https://issues.apache.org/jira/browse/SOLR-5733 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6.1 Environment: Linux solrssd2 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux, java version 1.6.0_18, OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2), OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) Reporter: Marcus Engene Fix For: 4.6.1

tien@solrssd2:/solr461stem/example$ cat start.sh
#!/bin/sh
java -Xms9G -Xmx22G -Djetty.host=0.0.0.0 -Djetty.port=9993 -DhostPort=9993 -jar start.jar 2>/dev/null 1>/dev/null

Solr crashes spontaneously about every 2nd start within the first 10min of the process life.

tien@solrssd2:/solr461stem/example/solr/collection1$ du -ks data
5405556 data

Machine is not heavily used:
Tasks: 317 total, 1 running, 316 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264660644k total, 227656492k used, 37004152k free, 544848k buffers
Swap: 4000144k total, 102940k used, 3897204k free, 204332940k cached

 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7700 tien 20 0 32.4g 3.3g 1.2g S 13 1.3 2:23.15 java
8208 tien 20 0 27.6g 3.9g 805m S 10 1.5 0:56.45 java
7785 tien 20 0 26.7g 5.6g 2.2g S 2 2.2 3:42.94 java
6102 tien 20 0 27.6g 9.9g 4.3g S 0 3.9 61:03.26 java
8337 tien 20 0 19204 1552 1016 R 0 0.0 0:00.02 top
   1 root 20 0 8356 796 664 S 0 0.0 0:12.90 init
   2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
   3 root RT 0 0 0 0 S 0 0.0 0:05.30 migration/0
   4 root 20 0 0 0 0 S 0 0.0 0:13.17 ksoftirqd/0
   5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0

I'll try to attach the hs-dump. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914691#comment-13914691 ] Mark Miller commented on SOLR-5781: --- I was initially thinking cluster wide. Make the Collections API timeout configurable. -- Key: SOLR-5781 URL: https://issues.apache.org/jira/browse/SOLR-5781 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Fix For: 4.8, 5.0 This would also help with tests - nightlies can be quite intensive and need a very high timeout. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5789) Add min/max modifiers to Atomic Updates
Nim Lhûg created SOLR-5789: -- Summary: Add min/max modifiers to Atomic Updates Key: SOLR-5789 URL: https://issues.apache.org/jira/browse/SOLR-5789 Project: Solr Issue Type: New Feature Reporter: Nim Lhûg The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Note: will add a link to the pull request in a minute. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
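The comparison a min/max modifier performs on the server can be sketched in a few lines. This is a minimal illustration of the proposed semantics, not the code from the linked pull request; method names are hypothetical:

```java
// Sketch of the min/max atomic-update semantics SOLR-5789 proposes:
// keep the stored value unless the incoming value is smaller (min)
// or larger (max), avoiding a client-side read-compare-write round trip.
public class MinMaxSketch {
    static long applyMin(long stored, long incoming) {
        return Math.min(stored, incoming);
    }

    static long applyMax(long stored, long incoming) {
        return Math.max(stored, incoming);
    }

    public static void main(String[] args) {
        // e.g. track the lowest price ever seen for a document
        System.out.println(applyMin(100, 42));  // incoming is smaller: field becomes 42
        System.out.println(applyMin(100, 250)); // incoming is larger: field stays 100
    }
}
```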
[GitHub] lucene-solr pull request: SOLR-5789 Add min/max modifiers to Atomi...
GitHub user codematters opened a pull request: https://github.com/apache/lucene-solr/pull/39 SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. Jira: https://issues.apache.org/jira/browse/SOLR-5789 You can merge this pull request into a Git repository by running: $ git pull https://github.com/INTIXnv/lucene-solr trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/39.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #39 commit 901a810c98eae381862f79f7cf4f2c10bffb8730 Author: Bram gitb...@codematters.be Date: 2014-02-27T16:17:45Z SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5789) Add min/max modifiers to Atomic Updates
[ https://issues.apache.org/jira/browse/SOLR-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914697#comment-13914697 ] ASF GitHub Bot commented on SOLR-5789: -- GitHub user codematters opened a pull request: https://github.com/apache/lucene-solr/pull/39 SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. Jira: https://issues.apache.org/jira/browse/SOLR-5789 You can merge this pull request into a Git repository by running: $ git pull https://github.com/INTIXnv/lucene-solr trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/39.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #39 commit 901a810c98eae381862f79f7cf4f2c10bffb8730 Author: Bram gitb...@codematters.be Date: 2014-02-27T16:17:45Z SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. Add min/max modifiers to Atomic Updates --- Key: SOLR-5789 URL: https://issues.apache.org/jira/browse/SOLR-5789 Project: Solr Issue Type: New Feature Reporter: Nim Lhûg The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Note: will add a link to the pull request in a minute. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914696#comment-13914696 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572643 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572643 ] LUCENE-5468: don't create unnecessary objects Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5789) Add min/max modifiers to Atomic Updates
[ https://issues.apache.org/jira/browse/SOLR-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nim Lhûg updated SOLR-5789: --- Description: The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Pull request: https://github.com/apache/lucene-solr/pull/39 was: The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Note: will add a link to the pull request in a minute. Add min/max modifiers to Atomic Updates --- Key: SOLR-5789 URL: https://issues.apache.org/jira/browse/SOLR-5789 Project: Solr Issue Type: New Feature Reporter: Nim Lhûg The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Pull request: https://github.com/apache/lucene-solr/pull/39 -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 1913 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/1913/ All tests passed Build Log: [...truncated 28417 lines...] check-licenses: [echo] License check under: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/jcl-over-slf4j-1.6.6.jar.sha1 [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/jul-to-slf4j-1.6.6.jar.sha1 [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/slf4j-api-1.6.6.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/slf4j-api-1.6.6.jar.sha1 [...truncated 3 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:471: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:64: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/build.xml:254: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/lucene/tools/custom-tasks.xml:62: License check failed. Check the logs. 
Total time: 108 minutes 19 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
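The license check above fails because each bundled jar must have a sibling `<jar>.sha1` file containing the jar's hex SHA-1 digest. A minimal sketch of computing that digest (the Lucene/Solr build has its own ant task for this; this is only an illustration):

```java
import java.security.MessageDigest;

// Minimal sketch of the digest a ".sha1" checksum file contains:
// the lowercase hex SHA-1 of the jar's bytes.
public class Sha1Util {
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b)); // two lowercase hex chars per byte
            }
            return sb.toString();
        } catch (Exception e) {
            throw new RuntimeException(e); // SHA-1 is always available on the JDK
        }
    }

    public static void main(String[] args) throws Exception {
        // In the real check the input would be the jar file's bytes.
        System.out.println(sha1Hex("abc".getBytes("UTF-8")));
    }
}
```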
Re: [GitHub] lucene-solr pull request: SOLR-5789 Add min/max modifiers to Atomi...
GitHub user codematters opened a pull request: https://github.com/apache/lucene-solr/pull/39 Discussion and tips for improvements are welcome! Thanks - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5790) SolrException: Unknown document router '{name=compositeId}'.
Günther Ruck created SOLR-5790: -- Summary: SolrException: Unknown document router '{name=compositeId}'. Key: SOLR-5790 URL: https://issues.apache.org/jira/browse/SOLR-5790 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: Windows 7 64 Bit Reporter: Günther Ruck Priority: Minor I tried to use the CloudSolrServer class of the SolrJ API. SolrJ and Solr server both in version 4.6.1. {{serverCloud = new CloudSolrServer(zkHost);}} My JUnit test starts with a deleteByQuery. In DocRouter.java:46 a SolrException is thrown because {{routerMap.get(routerSpec);}} finds no entry. _Hints:_ routerSpec is an instance of LinkedHashMap<K,V> with one entry (key: name, value: compositeId). routerMap is a HashMap<K,V> holding 4 entries; in particular, key compositeId has value org.apache.solr.common.cloud.CompositeIdRouter. Probably there is a type mismatch in the routerMap.get call. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
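The lookup mismatch the reporter describes can be reproduced in plain Java: a registry keyed by String can never match when queried with a whole spec map as the key, because `Map.get` accepts any `Object`. A self-contained sketch (names are illustrative, not Solr's code):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Reproduction sketch of the SOLR-5790 symptom: the router registry is keyed
// by String ("compositeId", ...), but the lookup uses the whole router-spec
// map ({name=compositeId}) as the key, so get() can never match.
public class RouterLookupSketch {
    static final Map<String, String> ROUTER_MAP = new HashMap<>();
    static {
        ROUTER_MAP.put("compositeId", "CompositeIdRouter");
    }

    // Buggy lookup: compiles (Map.get takes Object) but never matches.
    static String buggyLookup(Map<String, String> routerSpec) {
        return ROUTER_MAP.get(routerSpec);
    }

    // Fixed lookup: use the "name" entry of the spec as the key.
    static String fixedLookup(Map<String, String> routerSpec) {
        return ROUTER_MAP.get(routerSpec.get("name"));
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("name", "compositeId");
        System.out.println(buggyLookup(spec)); // null: wrong key type
        System.out.println(fixedLookup(spec)); // CompositeIdRouter
    }
}
```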
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914716#comment-13914716 ] Noble Paul commented on SOLR-5781: -- At least it should be overridable on a per-call basis Make the Collections API timeout configurable. -- Key: SOLR-5781 URL: https://issues.apache.org/jira/browse/SOLR-5781 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Fix For: 4.8, 5.0 This would also help with tests - nightlies can be quite intensive and need a very high timeout. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5438) add near-real-time replication
[ https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914721#comment-13914721 ] ASF subversion and git services commented on LUCENE-5438: - Commit 1572653 from [~mikemccand] in branch 'dev/branches/lucene5438' [ https://svn.apache.org/r1572653 ] LUCENE-5438: commit current [broken] state add near-real-time replication -- Key: LUCENE-5438 URL: https://issues.apache.org/jira/browse/LUCENE-5438 Project: Lucene - Core Issue Type: Improvement Components: modules/replicator Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.7, 5.0 Attachments: LUCENE-5438.patch, LUCENE-5438.patch Lucene's replication module makes it easy to incrementally sync index changes from a master index to any number of replicas, and it handles/abstracts all the underlying complexity of holding a time-expiring snapshot, finding which files need copying, syncing more than one index (e.g., taxo + index), etc. But today you must first commit on the master, and then again the replica's copied files are fsync'd, because the code operates on commit points. But this isn't technically necessary, and it mixes up durability and fast turnaround time. Long ago we added near-real-time readers to Lucene, for the same reason: you shouldn't have to commit just to see the new index changes. I think we should do the same for replication: allow the new segments to be copied out to replica(s), and new NRT readers to be opened, to fully decouple committing from visibility. This way apps can then separately choose when to replicate (for freshness), and when to commit (for durability). I think for some apps this could be a compelling alternative to the re-index all documents on each shard approach that Solr Cloud / ElasticSearch implement today, and it may also mean that the transaction log can remain external to / above the cluster. 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5438) add near-real-time replication
[ https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914724#comment-13914724 ] Michael McCandless commented on LUCENE-5438: I've committed my current [broken] state here, but I'm gonna moth ball this for now. I had made the test case more evil, by adding randomly shutting down a master and moving it to another node (promoting a replica to master). It turns out this is very hard to do properly, because in this case, file names can be re-used (Lucene is no longer write-once) and detecting that is tricky, unless we can rely on some external global reliable storage (e.g. something stored in Zookeeper maybe) to record the last segments gen / segment name that was written on any node ... add near-real-time replication -- Key: LUCENE-5438 URL: https://issues.apache.org/jira/browse/LUCENE-5438 Project: Lucene - Core Issue Type: Improvement Components: modules/replicator Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.7, 5.0 Attachments: LUCENE-5438.patch, LUCENE-5438.patch Lucene's replication module makes it easy to incrementally sync index changes from a master index to any number of replicas, and it handles/abstracts all the underlying complexity of holding a time-expiring snapshot, finding which files need copying, syncing more than one index (e.g., taxo + index), etc. But today you must first commit on the master, and then again the replica's copied files are fsync'd, because the code operates on commit points. But this isn't technically necessary, and it mixes up durability and fast turnaround time. Long ago we added near-real-time readers to Lucene, for the same reason: you shouldn't have to commit just to see the new index changes. I think we should do the same for replication: allow the new segments to be copied out to replica(s), and new NRT readers to be opened, to fully decouple committing from visibility. 
This way apps can then separately choose when to replicate (for freshness), and when to commit (for durability). I think for some apps this could be a compelling alternative to the re-index all documents on each shard approach that Solr Cloud / ElasticSearch implement today, and it may also mean that the transaction log can remain external to / above the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914727#comment-13914727 ] Mark Miller commented on SOLR-5781: --- That doesn't concern me much for this issue - my motivation is easy adjustment for tests - it's just kind of a side effect that it will also benefit users. If someone wants to make it available per call for users as well, that's fine with me. Though it's not likely they are going to know how they should set it depending on the call they are making. Someone might argue that it's almost just adding confusion to the API more than it helps really. I wouldn't argue though. Make the Collections API timeout configurable. -- Key: SOLR-5781 URL: https://issues.apache.org/jira/browse/SOLR-5781 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Fix For: 4.8, 5.0 This would also help with tests - nightlies can be quite intensive and need a very high timeout.
[jira] [Commented] (LUCENE-5471) Classloader issues when running Lucene under a java SecurityManager
[ https://issues.apache.org/jira/browse/LUCENE-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914731#comment-13914731 ] Rick Hillegas commented on LUCENE-5471: --- Thanks for the help and the discussion so far, Hoss and Uwe. Attaching a second rev of the SecureLucene test program. This version pares back the permissions in order to expose the minimal attack surface which I can configure by myself. Here are the minimal permissions which the test program grants in order to run successfully under a Java Security Manager: {noformat} // permissions granted to Lucene grant codeBase file:/Users/rh161140/derby/derby-590/trunk/tools/java/lucene-core-4.5.0.jar { // permissions for file access, write access only to sandbox: permission java.io.FilePermission ALL FILES, read; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest, read,write,delete; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest/-, read,write,delete; // Basic permissions needed for Lucene to work: permission java.util.PropertyPermission user.dir, read; permission java.util.PropertyPermission sun.arch.data.model, read; permission java.lang.reflect.ReflectPermission *; permission java.lang.RuntimePermission *; }; // permissions granted to the application grant codeBase file:/Users/rh161140/src/ { // permissions for file access, write access only to sandbox: permission java.io.FilePermission ALL FILES, read; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest, read,write; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest/-, read,write,delete; // Basic permissions needed for Lucene to work: permission java.util.PropertyPermission user.dir, read; permission java.util.PropertyPermission sun.arch.data.model, read; }; {noformat} I have some follow on comments and questions: 1) Is it really necessary to grant Lucene every RuntimePermission and the privilege to read every 
file in the file system? Maybe these grants can be tightened. 2) I don't understand why the calling, application code needs to be granted any permissions. Maybe some more privilege blocks could be added to the Lucene code? In particular, it seems a shame that the application has to be granted the privilege to read every file in the file system. 3) Most of the application permissions are self-revealing. That is, if I omit one of them, then I get an exception telling me that the permission needs to be granted. However, that is not the case for the first permission granted to the application... permission java.io.FilePermission ALL FILES, read; ...Without that permission, I get the original puzzling exception: Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec..., which doesn't really tell me what the problem is. Maybe the wording of that exception could be improved so that the user can be told that one of its root causes is a failure to grant the application and Lucene read access to every file in the file system. Thanks, -Rick Classloader issues when running Lucene under a java SecurityManager --- Key: LUCENE-5471 URL: https://issues.apache.org/jira/browse/LUCENE-5471 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.5 Reporter: Rick Hillegas Attachments: SecureLucene.java I see the following error when I run Lucene 4.5.0 under a java SecurityManager. I will attach a test program which shows this problem. The program works fine when a SecurityManager is not installed. But the program fails when I install a SecurityManager. Even more puzzling, the program works if I first run it without a SecurityManager, then install a SecurityManager, then re-run the program, all within the lifetime of a single JVM. 
I would appreciate advice about how to work around this problem: Exception in thread main java.lang.ExceptionInInitializerError at org.apache.lucene.index.LiveIndexWriterConfig.init(LiveIndexWriterConfig.java:122) at org.apache.lucene.index.IndexWriterConfig.init(IndexWriterConfig.java:165) at SecureLucene$1.run(SecureLucene.java:129) at SecureLucene$1.run(SecureLucene.java:122) at java.security.AccessController.doPrivileged(Native Method) at SecureLucene.getIndexWriter(SecureLucene.java:120) at SecureLucene.runTest(SecureLucene.java:72) at SecureLucene.main(SecureLucene.java:52) Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene45' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: []
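Rick's question 2 above (why the calling application needs grants at all) comes down to how the security manager walks the call stack. A minimal stdlib-only sketch, not Lucene code: if library code wraps a sensitive call in AccessController.doPrivileged, the permission check stops at the library's protection domain, so the application itself does not need the grant. The class and method names here are illustrative assumptions.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;

// Illustrative sketch only (not Lucene code): a doPrivileged block makes
// the library, rather than every caller on the stack, responsible for a
// permission. More such blocks inside Lucene could shrink the grants the
// application needs in its policy file.
public class PrivilegedReadSketch {

    // The "library" reads the system property under its own authority.
    static String readUserDir() {
        return AccessController.doPrivileged(
                (PrivilegedAction<String>) () -> System.getProperty("user.dir"));
    }

    public static void main(String[] args) {
        // Without a security manager installed this simply runs the action.
        System.out.println(readUserDir() != null);
    }
}
```

With a security manager installed, only the code source containing `readUserDir` would need `PropertyPermission user.dir, read` in the policy file.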
[jira] [Updated] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-5753: Component/s: documentation eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE: Release apache-solr-ref-guide-4.7.pdf (RC1)
On Thu, Feb 27, 2014 at 4:11 AM, Steve Rowe sar...@gmail.com wrote: I noticed that a couple of Topics covered in this section boxes are too narrow to allow their content to be legible - the ones on pages 251 and 300 look really bad, and some others are only marginally legible. I finally found a fix for this problem (which is a more egregious example of the same issue Hoss filed as SOLR-5753) on this Atlassian issue: https://jira.atlassian.com/browse/CONF-14758. However it states that it's for Confluence 5.3+, so I don't know if it will work with 5.0.3 which is the version CWIKI is on. Maybe worth a try? I've posted the possible solution to SOLR-5753 so maybe if you put it in the PDF CSS we can see how it works.
[jira] [Commented] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914737#comment-13914737 ] Cassandra Targett commented on SOLR-5753: - According to this issue: https://jira.atlassian.com/browse/CONF-14758 (in the description), a fix for this problem should be to add the below to the CSS for the PDF export. However, the same issue says it's for Confluence 5.3+, so may not work with the current CWIKI version, but is maybe worth a try for 4.7? {code} .sectionMacro .columnMacro { border: none; padding: 0; } .columnMacro { display: table-cell; vertical-align: top; } {code} eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914755#comment-13914755 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572660 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572660 ] LUCENE-5468: encode affix data as 8 bytes per affix, before cutting over to FST Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z
[jira] [Updated] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-5753: - Attachment: solr-ResultClustering-270214-1714-9084.pdf Cassandra, I added the CSS snippet you quoted above to the PDF export stylesheet, and the result is attached for the Result Clustering page - it definitely looks better to me. eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914758#comment-13914758 ] Steve Rowe commented on SOLR-5753: -- I noticed that the attached Result Clustering page says Topics covered on this page, which is confusing in the PDF - probably should change this (and others like it) to say section instead of page eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5616) Make grouping code use response builder needDocList
[ https://issues.apache.org/jira/browse/SOLR-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-5616: Assignee: Erick Erickson Make grouping code use response builder needDocList --- Key: SOLR-5616 URL: https://issues.apache.org/jira/browse/SOLR-5616 Project: Solr Issue Type: Bug Reporter: Steven Bower Assignee: Erick Erickson Attachments: SOLR-5616.patch Right now the grouping code does this to check if it needs to generate a docList for grouped results: {code} if (rb.doHighlights || rb.isDebug() || params.getBool(MoreLikeThisParams.MLT, false) ){ ... } {code} this is ugly because any new component that needs a docList from grouped results will need to modify QueryComponent to add a check to this if. Ideally this should just use the rb.isNeedDocList() flag... Incidentally, this boolean is never really used, as for non-grouped results the docList always gets generated.
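The flag-based alternative described in the issue can be sketched with plain JDK types. This is a hypothetical simplification, not the real Solr API: `ResponseBuilder` and `prepare` here are stand-ins, and the point is only that each component declares its own need instead of QueryComponent enumerating components in an if-statement.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the rb.isNeedDocList() approach: components that
// need a doc list set a single flag during prepare, so the query component
// checks one flag instead of hard-coding a test per component.
public class NeedDocListSketch {

    static class ResponseBuilder {
        boolean needDocList;           // stand-in for the real flag
    }

    interface Component {
        void prepare(ResponseBuilder rb);
    }

    public static void main(String[] args) {
        ResponseBuilder rb = new ResponseBuilder();
        List<Component> components = Arrays.asList(
                r -> r.needDocList = true,   // e.g. a highlighting component
                r -> { }                     // a component that doesn't need it
        );
        for (Component c : components) {
            c.prepare(rb);
        }
        // The grouping code would now only consult the flag.
        System.out.println(rb.needDocList);
    }
}
```

Adding a new component that needs a doc list then requires no change to the query component at all.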
[jira] [Updated] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-5753: - Attachment: solr-RequestHandlersandSearchComponentsinSolrConfig-270214-1724-9098.pdf Attaching the Request Handlers and Search Components in SolrConfig PDF export using the modified PDF export CSS - this one is way better, almost looks good! This is the one from page 300 in the Ref Guide RC1 PDF. eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-RequestHandlersandSearchComponentsinSolrConfig-270214-1724-9098.pdf, solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5758) need ref guide doc on building indexes with mapreduce (morphlines-cell contrib)
[ https://issues.apache.org/jira/browse/SOLR-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914760#comment-13914760 ] Mark Miller commented on SOLR-5758: --- That output is affected by SOLR-5782 - I'll make another dump shortly. need ref guide doc on building indexes with mapreduce (morphlines-cell contrib) --- Key: SOLR-5758 URL: https://issues.apache.org/jira/browse/SOLR-5758 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.8 This is marked experimental for 4.7, but we should have a section on it in the ref guide in 4.8
[jira] [Comment Edited] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914754#comment-13914754 ] Steve Rowe edited comment on SOLR-5753 at 2/27/14 5:20 PM: --- Cassandra, I added the CSS snippet you quoted above to the PDF export stylesheet, and the result is attached for the Result Clustering page (corresponding to page 251 in the Solr Ref Guide RC1 PDF) - it definitely looks better to me. was (Author: steve_rowe): Cassandra, I added the CSS snippet you quoted above to the PDF export stylesheet, and the result is attached for the Result Clustering page - it definitely looks better to me. eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5787) Get spellcheck frequency relatively to current query
[ https://issues.apache.org/jira/browse/SOLR-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914770#comment-13914770 ] Hakim commented on SOLR-5787: - Because the frequency returned with each suggested word counts the occurrences in the WHOLE index instead of counting only those in the documents satisfying the current Lucene query. Get spellcheck frequency relatively to current query Key: SOLR-5787 URL: https://issues.apache.org/jira/browse/SOLR-5787 Project: Solr Issue Type: Improvement Components: spellchecker Affects Versions: 4.6 Environment: Solr deployed on Jetty 9 Servlet container Reporter: Hakim Priority: Minor Labels: features, newbie I guess that this functionality isn't implemented yet. I'll begin with an example to explain what I'm requesting: I have a Lucene query that gets articles satisfying a certain query. With this same command, I'm getting at the same time suggestions if this query doesn't return any article (so far, nothing unusual). The frequency (count) associated with these suggestions is relative to the whole index (it counts all occurrences of the suggestion in the whole index). What I want is for it to count only suggestion occurrences satisfying the current Lucene query. P.S: I'm using Solr's spellcheck component (solr.DirectSolrSpellChecker).
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914795#comment-13914795 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572666 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572666 ] LUCENE-5468: convert affixes to FST Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914799#comment-13914799 ] Robert Muir commented on LUCENE-5468: - I am finished compressing for now. I think its pretty reasonable across all the languages. I will cleanup and try to add back the multiple dictionary/ignore case stuff and clean up some other things. ||dict||old RAM||new RAM|| |af_ZA.zip|18 MB|917.1 KB| |ak_GH.zip|1.5 MB|103.2 KB| |bg_BG.zip|FAIL|465.7 KB| |ca_ANY.zip|28.9 MB|675.4 KB| |ca_ES.zip|15.1 MB|639.8 KB| |cop_EG.zip|2.1 MB|144.5 KB| |cs_CZ.zip|50.4 MB|1.5 MB| |cy_GB.zip|FAIL|627.4 KB| |da_DK.zip|FAIL|669.8 KB| |de_AT.zip|1.3 MB|123.9 KB| |de_CH.zip|12.6 MB|725.4 KB| |de_DE.zip|12.6 MB|726 KB| |de_DE_comb.zip|102.2 MB|4.2 MB| |de_DE_frami.zip|20.9 MB|1023.5 KB| |de_DE_neu.zip|101.5 MB|4.2 MB| |el_GR.zip|74.3 MB|1 MB| |en_AU.zip|8.1 MB|521 KB| |en_CA.zip|9.8 MB|450.5 KB| |en_GB-oed.zip|8.2 MB|526.6 KB| |en_GB.zip|8.3 MB|527.3 KB| |en_NZ.zip|8.4 MB|532.4 KB| |eo.zip|4.9 MB|310.5 KB| |eo_EO.zip|4.9 MB|310.5 KB| |es_AR.zip|14.8 MB|734.9 KB| |es_BO.zip|14.8 MB|735 KB| |es_CL.zip|14.7 MB|734.9 KB| |es_CO.zip|14.3 MB|722.1 KB| |es_CR.zip|14.8 MB|733.9 KB| |es_CU.zip|14.7 MB|732.8 KB| |es_DO.zip|14.7 MB|731.9 KB| |es_EC.zip|14.8 MB|733.5 KB| |es_ES.zip|15.1 MB|743 KB| |es_GT.zip|14.8 MB|734.5 KB| |es_HN.zip|14.8 MB|735.2 KB| |es_MX.zip|14.3 MB|723.8 KB| |es_NEW.zip|15.5 MB|768.5 KB| |es_NI.zip|14.8 MB|734.5 KB| |es_PA.zip|14.8 MB|733.8 KB| |es_PE.zip|14.2 MB|721.3 KB| |es_PR.zip|14.7 MB|732.4 KB| |es_PY.zip|14.8 MB|734.1 KB| |es_SV.zip|14.8 MB|733.6 KB| |es_UY.zip|14.8 MB|736.9 KB| |es_VE.zip|14.3 MB|722.7 KB| |et_EE.zip|53.6 MB|473.6 KB| |fo_FO.zip|18.6 MB|517.9 KB| |fr_FR-1990_1-3-2.zip|14 MB|526.7 KB| |fr_FR-classique_1-3-2.zip|14 MB|539.2 KB| |fr_FR_1-3-2.zip|14.5 MB|550.4 KB| |fy_NL.zip|4.2 MB|265.6 KB| |ga_IE.zip|14 MB|460.6 KB| |gd_GB.zip|2.7 MB|143.1 KB| |gl_ES.zip|FAIL|479.4 KB| |gsc_FR.zip|FAIL|1.3 MB| 
|gu_IN.zip|20.3 MB|947 KB| |he_IL.zip|53.3 MB|539.2 KB| |hi_IN.zip|2.7 MB|169 KB| |hil_PH.zip|3.4 MB|197 KB| |hr_HR.zip|29.7 MB|573 KB| |hu_HU.zip|FAIL|1.2 MB| |hu_HU_comb.zip|FAIL|5.4 MB| |ia.zip|4.9 MB|222.9 KB| |id_ID.zip|3.9 MB|226.3 KB| |it_IT.zip|15.3 MB|612.9 KB| |ku_TR.zip|1.6 MB|118.7 KB| |la.zip|5.1 MB|199.3 KB| |lt_LT.zip|15 MB|682.5 KB| |lv_LV.zip|36.3 MB|763.9 KB| |mg_MG.zip|2.9 MB|163.8 KB| |mi_NZ.zip|FAIL|191.4 KB| |mk_MK.zip|FAIL|469.1 KB| |mos_BF.zip|13.3 MB|242.2 KB| |mr_IN.zip|FAIL|147.7 KB| |ms_MY.zip|4.1 MB|226.9 KB| |nb_NO.zip|22.9 MB|1.2 MB| |ne_NP.zip|5.5 MB|328.1 KB| |nl_NL.zip|22.9 MB|1.1 MB| |nl_med.zip|1.2 MB|92.3 KB| |nn_NO.zip|16.5 MB|914 KB| |nr_ZA.zip|3.1 MB|203.3 KB| |ns_ZA.zip|1.7 MB|118 KB| |ny_MW.zip|FAIL|101.8 KB| |oc_FR.zip|9.1 MB|401.5 KB| |pl_PL.zip|43.9 MB|1.7 MB| |pt_BR.zip|FAIL|2.1 MB| |pt_PT.zip|5.8 MB|379.4 KB| |ro_RO.zip|5.1 MB|256.3 KB| |ru_RU.zip|21.7 MB|882 KB| |ru_RU_ye.zip|43.7 MB|1.5 MB| |ru_RU_yo.zip|21.7 MB|897.3 KB| |rw_RW.zip|1.6 MB|102.3 KB| |sk_SK.zip|25.1 MB|1.2 MB| |sl_SI.zip|38.3 MB|604 KB|
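The scale of the savings in the table above can be sanity-checked with a quick back-of-the-envelope calculation. The values are copied from the af_ZA row of the table; this is purely illustrative and not part of the patch.

```java
// Rough ratio for the af_ZA row from the table above:
// old RAM 18 MB vs. new RAM 917.1 KB after the FST conversion.
public class RamRatio {
    public static void main(String[] args) {
        double oldKb = 18 * 1024.0; // 18 MB expressed in KB
        double newKb = 917.1;       // new footprint in KB
        System.out.println(Math.round(oldKb / newKb));
    }
}
```

That works out to roughly a 20x reduction for that dictionary, and most other rows in the table show reductions of the same order.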
Re: VOTE: Release apache-solr-ref-guide-4.7.pdf (RC1)
I posted some problematic pages exported to PDF using the revised CSS on SOLR-5753 - looks better to me. Nevertheless, I vote +1 for the RC1 Solr Reference Guide. The issues I brought up are cosmetic ones, and will be addressed in the next version. Let’s get it out the door! Steve On Feb 27, 2014, at 12:06 PM, Cassandra Targett casstarg...@gmail.com wrote: On Thu, Feb 27, 2014 at 4:11 AM, Steve Rowe sar...@gmail.com wrote: I noticed that a couple of “Topics covered in this section” boxes are too narrow to allow their content to be legible - the ones on pages 251 and 300 look really bad, and some others are only marginally legible. I finally found a fix for this problem (which is a more egregious example of the same issue Hoss filed as SOLR-5753) on this Atlassian issue: https://jira.atlassian.com/browse/CONF-14758. However it states that it's for Confluence 5.3+, so I don't know if it will work with 5.0.3 which is the version CWIKI is on. Maybe worth a try? I've posted the possible solution to SOLR-5753 so maybe if you put it in the PDF CSS we can see how it works. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5791) DistributedQueryElevationComponentTest routinely fails on J9
Hoss Man created SOLR-5791: -- Summary: DistributedQueryElevationComponentTest routinely fails on J9 Key: SOLR-5791 URL: https://issues.apache.org/jira/browse/SOLR-5791 Project: Solr Issue Type: Bug Reporter: Hoss Man Either there is a bug in how the params are handled that only manifests itself in J9, or the test needs to be fixed to not expect the params in a certain order {noformat} REGRESSION: org.apache.solr.handler.component.DistributedQueryElevationComponentTest.testDistribSearch Error Message: .responseHeader.params.fl!=version (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .responseHeader.params.fl!=version (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.DistributedQueryElevationComponentTest.doTest(DistributedQueryElevationComponentTest.java:81) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:870) {noformat}
[jira] [Created] (SOLR-5792) TermVectorComponentDistributedTest routinely fails on J9
Hoss Man created SOLR-5792: -- Summary: TermVectorComponentDistributedTest routinely fails on J9 Key: SOLR-5792 URL: https://issues.apache.org/jira/browse/SOLR-5792 Project: Solr Issue Type: Bug Reporter: Hoss Man Perhaps the code is using a Map when it should be using a NamedList? Or perhaps the test should be configured not to care about the order ... is the order meaningful in this part of the output? {noformat} REGRESSION: org.apache.solr.handler.component.TermVectorComponentDistributedTest.testDistribSearch Error Message: .termVectors.0.test_basictv!=test_postv (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .termVectors.0.test_basictv!=test_postv (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.TermVectorComponentDistributedTest.doTest(TermVectorComponentDistributedTest.java:164) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:876) {noformat}
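The "unordered or missing" failures in SOLR-5791/SOLR-5792 both come down to comparing sections of the response as ordered sequences when the order may not be meaningful. A small stdlib-only sketch (plain JDK types, not the Solr test classes; keys borrowed from the failure message for flavor) shows why an order-insensitive comparison avoids this class of failure:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: the same entries inserted in a different order compare unequal
// as ordered lists, but equal under Map.equals, which ignores iteration
// order. A JVM (like J9) that changes iteration order only breaks the
// ordered comparison.
public class UnorderedCompareSketch {
    public static void main(String[] args) {
        Map<String, String> a = new LinkedHashMap<>();
        a.put("test_basictv", "x");
        a.put("test_postv", "y");

        Map<String, String> b = new LinkedHashMap<>();
        b.put("test_postv", "y");
        b.put("test_basictv", "x");

        // Ordered key views differ...
        System.out.println(new ArrayList<>(a.keySet()).equals(new ArrayList<>(b.keySet())));
        // ...but map equality is order-insensitive.
        System.out.println(a.equals(b));
    }
}
```

If the order in these response sections is not meaningful, comparing them this way (or normalizing before comparison) would make the tests JVM-independent.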
[jira] [Updated] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5477: --- Attachment: LUCENE-5477.patch Initial patch, with new add/update/refresh methods added to AnalyzingInfixSuggester. I added a testBasicNRT and it seems to pass, but I still need to add a randomized test. I think the approach will work well: I'm just using SortingMergePolicy and EarlyTerminatingSortingCollector, and I switched to SearcherManager to pull the current searcher. add near-real-time suggest building to AnalyzingInfixSuggester -- Key: LUCENE-5477 URL: https://issues.apache.org/jira/browse/LUCENE-5477 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5477.patch Because this suggester impl. is just a Lucene index under-the-hood, it should be straightforward to enable near-real-time additions/removals of suggestions. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5793) SignatureUpdateProcessorFactoryTest routinely fails on J9
Hoss Man created SOLR-5793: -- Summary: SignatureUpdateProcessorFactoryTest routinely fails on J9 Key: SOLR-5793 URL: https://issues.apache.org/jira/browse/SOLR-5793 Project: Solr Issue Type: Bug Reporter: Hoss Man Two very similar looking failures pop up frequently, but not always together... {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([791041A112471F1D:18859B41FA9615EB]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded(SignatureUpdateProcessorFactoryTest.java:222) {noformat} {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([16A8922439B48E61:4D9869EC3AF32D1D]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection(SignatureUpdateProcessorFactoryTest.java:119) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: 
dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
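The duplicate-count failures above ("expected:1 but was:2") come down to whether two identical documents collapse to one signature. A minimal, hypothetical sketch of signature-based de-duplication in the spirit of SignatureUpdateProcessorFactory (class and method names are illustrative, and fields are hashed by simple concatenation, a simplification of the real processor):

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SignatureDedupSketch {
    // Signature -> document; adding a doc with an existing signature
    // overwrites it, so duplicates never increase the doc count.
    private final Map<String, String> index = new LinkedHashMap<>();

    static String signature(String... fields) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (String f : fields) {
                md5.update(f.getBytes(StandardCharsets.UTF_8));
            }
            return new BigInteger(1, md5.digest()).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // MD5 is always available
        }
    }

    void add(String... fields) {
        index.put(signature(fields), String.join("|", fields));
    }

    int numDocs() {
        return index.size();
    }

    public static void main(String[] args) {
        SignatureDedupSketch idx = new SignatureDedupSketch();
        idx.add("Hello", "world");
        idx.add("Hello", "world");   // exact duplicate, collapses
        idx.add("Goodbye", "world"); // distinct content
        System.out.println(idx.numDocs()); // prints 2
    }
}
```

Under this model the test's checkNumDocs expectation holds only if every add of a duplicate lands on the same signature; a race between two threads adding concurrently (as in testMultiThreaded) is where a non-thread-safe implementation could let both copies through.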
[jira] [Commented] (SOLR-5791) DistributedQueryElevationComponentTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914917#comment-13914917 ] ASF subversion and git services commented on SOLR-5791: --- Commit 1572706 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1572706 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM DistributedQueryElevationComponentTest routinely fails on J9 Key: SOLR-5791 URL: https://issues.apache.org/jira/browse/SOLR-5791 Project: Solr Issue Type: Bug Reporter: Hoss Man Either there is a bug in how the params are handled that only manifests itself in J9, or the test needs to be fixed to not expect the params in a certain order {noformat} REGRESSION: org.apache.solr.handler.component.DistributedQueryElevationComponentTest.testDistribSearch Error Message: .responseHeader.params.fl!=version (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .responseHeader.params.fl!=version (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.DistributedQueryElevationComponentTest.doTest(DistributedQueryElevationComponentTest.java:81) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:870) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail:
dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5793) SignatureUpdateProcessorFactoryTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914915#comment-13914915 ] ASF subversion and git services commented on SOLR-5793: --- Commit 1572706 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1572706 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM SignatureUpdateProcessorFactoryTest routinely fails on J9 - Key: SOLR-5793 URL: https://issues.apache.org/jira/browse/SOLR-5793 Project: Solr Issue Type: Bug Reporter: Hoss Man Two very similar looking failures pop up frequently, but not always together... {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([791041A112471F1D:18859B41FA9615EB]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded(SignatureUpdateProcessorFactoryTest.java:222) {noformat} {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([16A8922439B48E61:4D9869EC3AF32D1D]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at 
org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection(SignatureUpdateProcessorFactoryTest.java:119) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5792) TermVectorComponentDistributedTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914916#comment-13914916 ] ASF subversion and git services commented on SOLR-5792: --- Commit 1572706 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1572706 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM TermVectorComponentDistributedTest routinely fails on J9 Key: SOLR-5792 URL: https://issues.apache.org/jira/browse/SOLR-5792 Project: Solr Issue Type: Bug Reporter: Hoss Man Perhaps the code is using a Map when it should be using a NamedList? Or perhaps the test should be configured not to care about the order... Is the order meaningful in this part of the output? {noformat} REGRESSION: org.apache.solr.handler.component.TermVectorComponentDistributedTest.testDistribSearch Error Message: .termVectors.0.test_basictv!=test_postv (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .termVectors.0.test_basictv!=test_postv (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.TermVectorComponentDistributedTest.doTest(TermVectorComponentDistributedTest.java:164) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:876) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To 
unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5791) DistributedQueryElevationComponentTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914924#comment-13914924 ] ASF subversion and git services commented on SOLR-5791: --- Commit 1572709 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572709 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM (merge r1572706) DistributedQueryElevationComponentTest routinely fails on J9 Key: SOLR-5791 URL: https://issues.apache.org/jira/browse/SOLR-5791 Project: Solr Issue Type: Bug Reporter: Hoss Man Either there is a bug in how the params are handled that only manifests itself in J9, or the test needs to be fixed to not expect the params in a certain order {noformat} REGRESSION: org.apache.solr.handler.component.DistributedQueryElevationComponentTest.testDistribSearch Error Message: .responseHeader.params.fl!=version (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .responseHeader.params.fl!=version (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.DistributedQueryElevationComponentTest.doTest(DistributedQueryElevationComponentTest.java:81) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:870) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, 
e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5792) TermVectorComponentDistributedTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914923#comment-13914923 ] ASF subversion and git services commented on SOLR-5792: --- Commit 1572709 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572709 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM (merge r1572706) TermVectorComponentDistributedTest routinely fails on J9 Key: SOLR-5792 URL: https://issues.apache.org/jira/browse/SOLR-5792 Project: Solr Issue Type: Bug Reporter: Hoss Man Perhaps the code is using a Map when it should be using a NamedList? Or perhaps the test should be configured not to care about the order... Is the order meaningful in this part of the output? {noformat} REGRESSION: org.apache.solr.handler.component.TermVectorComponentDistributedTest.testDistribSearch Error Message: .termVectors.0.test_basictv!=test_postv (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .termVectors.0.test_basictv!=test_postv (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.TermVectorComponentDistributedTest.doTest(TermVectorComponentDistributedTest.java:164) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:876) {noformat} -- This message was sent by Atlassian JIRA 
(v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5793) SignatureUpdateProcessorFactoryTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914922#comment-13914922 ] ASF subversion and git services commented on SOLR-5793: --- Commit 1572709 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572709 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM (merge r1572706) SignatureUpdateProcessorFactoryTest routinely fails on J9 - Key: SOLR-5793 URL: https://issues.apache.org/jira/browse/SOLR-5793 Project: Solr Issue Type: Bug Reporter: Hoss Man Two very similar looking failures pop up frequently, but not always together... {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([791041A112471F1D:18859B41FA9615EB]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded(SignatureUpdateProcessorFactoryTest.java:222) {noformat} {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([16A8922439B48E61:4D9869EC3AF32D1D]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at 
org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection(SignatureUpdateProcessorFactoryTest.java:119) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914934#comment-13914934 ] Robert Muir commented on LUCENE-5477: - this looks great! add near-real-time suggest building to AnalyzingInfixSuggester -- Key: LUCENE-5477 URL: https://issues.apache.org/jira/browse/LUCENE-5477 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5477.patch Because this suggester impl. is just a Lucene index under-the-hood, it should be straightforward to enable near-real-time additions/removals of suggestions. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5779) REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded
[ https://issues.apache.org/jira/browse/SOLR-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5779. --- Resolution: Duplicate REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded --- Key: SOLR-5779 URL: https://issues.apache.org/jira/browse/SOLR-5779 Project: Solr Issue Type: Bug Reporter: Mark Miller On the face of it, this failure, which started not too long ago, suggests that this is no longer thread-safe. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914954#comment-13914954 ] Chris Male commented on LUCENE-5468: Those are some pretty amazing reductions, well done! Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914960#comment-13914960 ] Robert Muir commented on LUCENE-5468: - I have the previous options added back locally too, so I will fix up tests and so on and just copy over the old filter and make a patch. Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914978#comment-13914978 ] Areek Zillur commented on LUCENE-5477: -- Wow, that looks awesome! Thanks for getting rid of the redundant casting of InputIterator too. add near-real-time suggest building to AnalyzingInfixSuggester -- Key: LUCENE-5477 URL: https://issues.apache.org/jira/browse/LUCENE-5477 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5477.patch Because this suggester impl. is just a Lucene index under-the-hood, it should be straightforward to enable near-real-time additions/removals of suggestions. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915004#comment-13915004 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572718 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572718 ] LUCENE-5468: hunspell2 - hunspell (with previous options and tests) Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct
[ https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915008#comment-13915008 ] Steven Bower commented on SOLR-5428: Does this work on multi-valued fields? new statistics results to StatsComponent - distinctValues and countDistinct --- Key: SOLR-5428 URL: https://issues.apache.org/jira/browse/SOLR-5428 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Assignee: Shalin Shekhar Mangar Fix For: 4.7, 5.0 Attachments: SOLR-5428.patch, SOLR-5428.patch I thought it would be very useful to display the distinct values (and the count) of a field among other statistics. Attached a patch implementing this in StatsComponent. Added results: distinctValues - list of all distinct values; countDistinct - distinct values count. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
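The multi-valued question above is worth pinning down with a small sketch of what distinctValues and countDistinct would plausibly compute (illustrative names only, not the patch's actual API): for a multi-valued field, every value of every document contributes to the distinct set.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctStatsSketch {
    // Collect the distinct values of a field across documents; each inner
    // list is one document's (possibly multi-valued) field values.
    static Set<String> distinctValues(List<List<String>> docs) {
        Set<String> values = new LinkedHashSet<>();
        for (List<String> fieldValues : docs) {
            values.addAll(fieldValues); // every value of a multi-valued field counts
        }
        return values;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("red", "blue"),   // multi-valued document
            Arrays.asList("blue"),
            Arrays.asList("green", "red"));
        Set<String> distinct = distinctValues(docs);
        System.out.println(distinct);        // [red, blue, green]
        System.out.println(distinct.size()); // countDistinct = 3
    }
}
```

Whether the attached patch iterates per-document or per-value for multi-valued fields is exactly what the comment asks; this sketch only shows the per-value interpretation.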
[jira] [Updated] (SOLR-5183) Add block support for JSONLoader
[ https://issues.apache.org/jira/browse/SOLR-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated SOLR-5183: Attachment: SOLR-5183.patch New patch which takes into account changes made on SOLR-5777 Add block support for JSONLoader Key: SOLR-5183 URL: https://issues.apache.org/jira/browse/SOLR-5183 Project: Solr Issue Type: Sub-task Reporter: Varun Thacker Fix For: 4.7 Attachments: SOLR-5183.patch, SOLR-5183.patch, SOLR-5183.patch, SOLR-5183.patch, SOLR-5183.patch We should be able to index block documents in JSON format -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915025#comment-13915025 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572724 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572724 ] LUCENE-5468: fix precommit+test Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915028#comment-13915028 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572727 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572727 ] LUCENE-5468: add additional change Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5468: Attachment: LUCENE-5468.patch I think the change is ready. There are other improvements that can be done (for example, maybe an option for the factory to cache these things in case you use the same ones across multiple fields, and more efficient affix handling against the FST, and so on), but it would be better on different issues I think? Here is a patch (from diff-sources), sorry it's not so useful, as I renamed some things. I tried making one from svn diff after reintegration, but it was equally useless. If you want, you can also review my commits on this issue to the branch, too. Here is the CHANGES entry: API Changes: * LUCENE-5468: Move offline Sort (from suggest module) to OfflineSort. (Robert Muir) Optimizations: * LUCENE-5468: HunspellStemFilter uses 10 to 100x less RAM. It also loads all known openoffice dictionaries without error, and supports an additional longestOnly option for a less aggressive approach. (Robert Muir) Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: LUCENE-5468.patch, patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). 
Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915037#comment-13915037 ] wolfgang hoschek commented on SOLR-5605: bq. Are you not a committer? At Apache, those who do decide. Yes, but you've clearly been assigned to upstream this stuff and I have plenty of other things to attend to these days. bq. I did not realize Patrick's patch did not include the latest code updates from MapReduce. Might be good to pay more attention, also to CDH-14804? bq. I had and still have bigger concerns around the usability of this code in Solr than this issue. It is very, very far from easy for someone to get started with this contrib right now. The usability is fine downstream where Maven automatically builds a job jar that includes the necessary dependency jars inside of the lib dir of the MR job jar. Hence no startup script or extra steps are required downstream, just one (fat) jar. If it's not usable upstream it may be because no corresponding packaging system has been used upstream, for reasons that escape me. bq. which is why none of these smaller issues concern me very much at this point. I'm afraid ignorance never helps. MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest --- Key: SOLR-5605 URL: https://issues.apache.org/jira/browse/SOLR-5605 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.7, 5.0 I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale. The problem sounded familiar, and a quick search verified that Jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (that does the argument parsing) that will affect usage on some systems. 
If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is supported -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
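The fail-cleanly behavior suggested in the issue could be sketched as below. This is only an illustration of the idea, not the actual MapReduceIndexerTool code: the class name, the whitelist contents, and the suggested JVM flags are all assumptions.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical guard (illustrative, not the real MapReduceIndexerTool code):
// refuse to start when the locale is not one the argument-parsing library is
// known to handle, instead of failing obscurely later.
public class LocaleGuard {
    // Illustrative whitelist; a real fix would list the locales actually verified.
    private static final Set<String> SUPPORTED_LANGUAGES = Set.of("en", "de", "fr");

    public static void checkLocaleOrFail(Locale locale) {
        if (!SUPPORTED_LANGUAGES.contains(locale.getLanguage())) {
            // Fail fast with an actionable message rather than a confusing
            // downstream parse error.
            throw new IllegalStateException(
                "Unsupported locale '" + locale + "'; re-run the JVM with "
                + "-Duser.language=en -Duser.country=US");
        }
    }

    public static void main(String[] args) {
        checkLocaleOrFail(Locale.getDefault());
        System.out.println("Locale OK: " + Locale.getDefault());
    }
}
```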
[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915041#comment-13915041 ]

Adrien Grand commented on LUCENE-5432:

Thanks Paul, the fix looks good!

EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
--

Key: LUCENE-5432
URL: https://issues.apache.org/jira/browse/LUCENE-5432
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Reporter: Paul Elschot
Priority: Minor
Fix For: 5.0
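The off-by-one in the issue title can be illustrated with a small sketch. This is a reconstruction of the kind of bug described, not the actual EliasFanoEncoder code: representing any value in [0, max] needs floor(log2(max)) + 1 bits, while a formula based on ceil(log2(max)) agrees everywhere except when max is an exact power of two, where it is one bit short.

```java
public class BitsDemo {
    // Correct: bits needed to represent any value in [0, maxEntry] is
    // floor(log2(maxEntry)) + 1, i.e. the position of the highest set bit plus one.
    static int bitsNeeded(long maxEntry) {
        return maxEntry == 0 ? 1 : 64 - Long.numberOfLeadingZeros(maxEntry);
    }

    // Hypothetical buggy variant: ceil(log2(maxEntry)). It matches bitsNeeded
    // for all non-powers of two, but undercounts by one when maxEntry is an
    // exact power of two (e.g. 8 needs 4 bits, but ceil(log2(8)) is 3).
    static int ceilLog2Bits(long maxEntry) {
        return maxEntry <= 1 ? 1 : 64 - Long.numberOfLeadingZeros(maxEntry - 1);
    }

    public static void main(String[] args) {
        for (long max : new long[] {7, 8, 9, 15, 16, 17}) {
            System.out.println("max=" + max
                + " bitsNeeded=" + bitsNeeded(max)
                + " ceilLog2=" + ceilLog2Bits(max));
        }
    }
}
```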
[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915042#comment-13915042 ]

ASF subversion and git services commented on LUCENE-5432:

Commit 1572728 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1572728 ]

LUCENE-5432: Fix EliasFanoEncoder's number of bits per index entry.

Close #28
[GitHub] lucene-solr pull request: Correct number of bits for an index entr...
Github user asfgit closed the pull request at:

    https://github.com/apache/lucene-solr/pull/28

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915043#comment-13915043 ]

ASF subversion and git services commented on LUCENE-5432:

Commit 1572729 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1572729 ]

LUCENE-5432: Fix EliasFanoEncoder's number of bits per index entry.
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915045#comment-13915045 ]

Chris Male commented on LUCENE-5468:

Is the longestOnly option a standard Hunspell thing? (More a question of general interest.)

Hunspell very high memory use when loading dictionary
--

Key: LUCENE-5468
URL: https://issues.apache.org/jira/browse/LUCENE-5468
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 3.5
Reporter: Maciej Lisiewski
Priority: Minor
Attachments: LUCENE-5468.patch, patch.txt

The Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example, loading a 4.5 MB Polish dictionary (with an empty index!) will cause the whole core to crash with various out-of-memory errors unless you set the max heap size close to 2 GB or more. By comparison, Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well).

Sample error log entries:
http://pastebin.com/fSrdd5W1
http://pastebin.com/Lmi0re7Z
[jira] [Resolved] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-5432.
--
Resolution: Fixed
Fix Version/s: 4.8
Assignee: Adrien Grand

EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
--

Key: LUCENE-5432
URL: https://issues.apache.org/jira/browse/LUCENE-5432
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Reporter: Paul Elschot
Assignee: Adrien Grand
Priority: Minor
Fix For: 4.8, 5.0
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915047#comment-13915047 ]

Chris Male commented on LUCENE-5376:

Hey Mike,

What's the endzone here? Any thoughts on it coming back into trunk?

Add a demo search server
--

Key: LUCENE-5376
URL: https://issues.apache.org/jira/browse/LUCENE-5376
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: lucene-demo-server.tgz

I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard. I don't think it should have back-compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the "eating your own dog food" search app for Lucene's and Solr's Jira issues, http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing and searching APIs via JSON, but it's very rough (lots of nocommits).