[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286033#comment-14286033 ] Hoss Man commented on SOLR-6991: TIKA-93 introduced the TesseractOCRParser, and TIKA-1476 enabled it as a default parser. that combination means that the first time Tika is used in Solr, the TesseractOCRParser will be checked to see if the system has Tesseract installed to know if that parser should be consulted -- and when that happens, ExternalParser.check is used which calls Runtime.exec and blows up in turkish locale. possible resolutions i can think of: * change how we init Tika to prevent this parser from ever being used (override the list of autodetected parsers?) * change how we include tika jars/defaults to prevent this parser from ever being used (override the default tesseract properties file in the jar somehow maybe?) * rollback to tika 1.6 * punt and advise turkish users to run their jvm in en_US ? Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
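As a rough illustration of the first option (overriding the list of auto-detected parsers), something along these lines could filter the TesseractOCRParser out before it is ever asked for its supported types. This is a sketch only, not Solr's actual ExtractingRequestHandler wiring, and the getAllComponentParsers() / AutoDetectParser(Parser...) calls should be double-checked against the Tika 1.7 API:

{noformat}
import java.util.ArrayList;
import java.util.List;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.DefaultParser;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.ocr.TesseractOCRParser;

public class NoOcrParserFactory {
  // Builds an AutoDetectParser from the service-loaded default parsers, minus the
  // TesseractOCRParser, so its hasTesseract() check (Runtime.exec) never runs.
  static Parser createParserWithoutTesseract() {
    List<Parser> parsers = new ArrayList<>();
    for (Parser p : new DefaultParser().getAllComponentParsers()) {
      if (!(p instanceof TesseractOCRParser)) {
        parsers.add(p);
      }
    }
    return new AutoDetectParser(parsers.toArray(new Parser[parsers.size()]));
  }
}
{noformat}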
Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!
: It's our old friend SOLR-6387 ! : : This time manifesting itself via calls down into Tika. new comments posted in SOLR-6991 where Tika was upgraded -- definitely Tika 1.7 that introduced this new parser that causes this problem. One followup clarification... : way we never tickled it before ... but more perplexing is why i can't : reproduce any similar errors on trunk (or 5x) using ant test : -Dtests.slow=true -Dtests.locale=tr_TR in solr/contrib/extraction ? ...i'm getting senile, and forgot the most anoying part of SOLR-6387: the posix_spawn problem doesn't manifest on linux, because the JDK code only tries VFORK and FORK, not POSIX_SPAWN : : Date: Wed, 21 Jan 2015 06:47:13 + (UTC) : : From: Policeman Jenkins Server jenk...@thetaphi.de : : Reply-To: dev@lucene.apache.org : : To: rm...@apache.org, ans...@apache.org, sha...@apache.org, no...@apache.org, : : gcha...@apache.org, ehatc...@apache.org, tflo...@apache.org, : : sar...@gmail.com, dev@lucene.apache.org : : Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - : : Failure! : : : : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/ : : Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC : : : : 1 tests failed. : : FAILED: org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath : : : : Error Message: : : posix_spawn is not a supported process launch mechanism on this platform. : : : : Stack Trace: : : java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. : : at __randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0) : : at java.lang.UNIXProcess$1.run(UNIXProcess.java:105) : : at java.lang.UNIXProcess$1.run(UNIXProcess.java:94) : : at java.security.AccessController.doPrivileged(Native Method) : : at java.lang.UNIXProcess.clinit(UNIXProcess.java:92) : : at java.lang.ProcessImpl.start(ProcessImpl.java:130) : : at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) : : at java.lang.Runtime.exec(Runtime.java:620) : : at java.lang.Runtime.exec(Runtime.java:485) : : at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344) : : at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117) : : at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90) : : at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) : : at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95) : : at org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229) : : at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) : : at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209) : : at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244) : : at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) : : at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219) : : at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) : : at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144) : : at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006) : : at org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353) : : at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703) : : at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710) : : at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474) : : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) : : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) : : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) : : at java.lang.reflect.Method.invoke(Method.java:483) : : at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) : : at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) : : at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) : : at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) : : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) : : at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) : : at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) : : at
[jira] [Commented] (LUCENE-6191) Spatial 2D faceting (heatmaps)
[ https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285893#comment-14285893 ] Nicholas Knize commented on LUCENE-6191: The error factor would certainly be a function of spatial resolution but not the same since you're dealing with expected vs. observed counts (an RMS over the result set vs. %spatialErr). It'd be worth exploring for a later enhancement (maybe an option to include descriptive stats as part of the faceting operation for spatial analysis use-cases) but not critical for the initial capability. I'm just not a fan of creating analysis results without communicating some kind of accuracy. It leads to data misrepresentation. I do like what you have going on here. I'll experiment with it when I get some time and see if I can't help get some low-overhead accuracy results. Spatial 2D faceting (heatmaps) -- Key: LUCENE-6191 URL: https://issues.apache.org/jira/browse/LUCENE-6191 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.1 Attachments: LUCENE-6191__Spatial_heatmap.patch Lucene spatial's PrefixTree (grid) based strategies index data in a way highly amenable to faceting on grid cells to compute a so-called _heatmap_. The underlying code in this patch uses the PrefixTreeFacetCounter utility class which was recently refactored out of faceting for NumberRangePrefixTree (LUCENE-5735). At a low level, the terms (== grid cells) are navigated per-segment, forward only with TermsEnum.seek, so it's pretty quick and furthermore requires no extra caches nor docvalues. Ideally you should use QuadPrefixTree (or Flex once it comes out) to maximize the number of grid levels, which in turn maximizes the fidelity of choices when you ask for a grid covering a region. Conveniently, the provided capability returns the data in a 2-D grid of counts, so the caller needn't know a thing about how the data is encoded in the prefix tree. Well almost... at this point they need to provide a grid level, but I'll soon provide a means of deriving the grid level based on a min/max cell count. I recommend QuadPrefixTree with geo=false so that you can provide a square world-bounds (360x360 degrees), which means square grid cells which are more desirable to display than rectangular cells. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
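For readers following along, here is roughly what invoking the heatmap faceting could look like once the patch is in. This is a sketch only: the class and method names (HeatmapFacetCounter.calcFacets, Heatmap.getCount) are assumed from the attached patch and should be verified against whatever version is actually committed, and it assumes the index was built with the same strategy/field:

{noformat}
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.spatial.prefix.HeatmapFacetCounter;
import org.apache.lucene.spatial.prefix.PrefixTreeStrategy;
import org.apache.lucene.spatial.prefix.RecursivePrefixTreeStrategy;
import org.apache.lucene.spatial.prefix.tree.QuadPrefixTree;
import com.spatial4j.core.context.SpatialContext;
import com.spatial4j.core.context.SpatialContextFactory;
import com.spatial4j.core.shape.Shape;
import com.spatial4j.core.shape.impl.RectangleImpl;

public class HeatmapSketch {
  // Counts grid cells at one level over the given region; the result is a plain
  // columns x rows grid, so the caller needs no knowledge of the prefix-tree encoding.
  static void printHeatmap(IndexReader reader, Shape region, int facetLevel) throws IOException {
    SpatialContextFactory factory = new SpatialContextFactory();
    factory.geo = false;                                                  // geo=false as recommended...
    factory.worldBounds = new RectangleImpl(-180, 180, -180, 180, null);  // ...with square world bounds
    SpatialContext ctx = factory.newSpatialContext();
    PrefixTreeStrategy strategy =
        new RecursivePrefixTreeStrategy(new QuadPrefixTree(ctx, 26), "geo");
    HeatmapFacetCounter.Heatmap heatmap = HeatmapFacetCounter.calcFacets(
        strategy, reader.getContext(), null /* no doc filter */, region, facetLevel, 100000);
    for (int y = 0; y < heatmap.rows; y++) {
      StringBuilder row = new StringBuilder();
      for (int x = 0; x < heatmap.columns; x++) {
        row.append(heatmap.getCount(x, y)).append(' ');
      }
      System.out.println(row);
    }
  }
}
{noformat}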
[jira] [Commented] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading
[ https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285925#comment-14285925 ] Jessica Cheng Mallet commented on SOLR-6521: The patch is locking the entire cache for all loading, which might not be an ideal solution for a cluster with many, many collections. Guava's implementation of LocalCache would only lock and wait on Segments, which increases the concurrency level (which is tunable). CloudSolrServer should synchronize cache cluster state loading -- Key: SOLR-6521 URL: https://issues.apache.org/jira/browse/SOLR-6521 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Priority: Critical Labels: SolrCloud Fix For: 5.0, Trunk Attachments: SOLR-6521.patch Under heavy load-testing with the new solrj client that caches the cluster state instead of setting a watcher, I started seeing lots of zk connection loss on the client-side when refreshing the CloudSolrServer collectionStateCache, and this was causing crazy client-side 99.9% latency (~15 sec). I swapped the cache out with guava's LoadingCache (which does locking to ensure only one thread loads the content under one key while the other threads that want the same key wait) and the connection loss went away and the 99.9% latency also went down to just about 1 sec. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
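For reference, the Guava pattern being described looks roughly like the following minimal sketch keyed by collection name. It is not the actual CloudSolrServer code: DocCollection merely stands in for the cached state, and fetchClusterStateFromZk is a hypothetical loader.

{noformat}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.TimeUnit;
import org.apache.solr.common.cloud.DocCollection;

public class CollectionStateCache {
  // Per-key loading: the first thread asking for a collection loads it while other
  // threads asking for the *same* key wait; different keys only contend at the
  // segment level, which is controlled by concurrencyLevel.
  private final LoadingCache<String, DocCollection> cache = CacheBuilder.newBuilder()
      .concurrencyLevel(16)                          // tunable number of lock segments
      .expireAfterAccess(60, TimeUnit.SECONDS)
      .build(new CacheLoader<String, DocCollection>() {
        @Override
        public DocCollection load(String collection) {
          return fetchClusterStateFromZk(collection);  // hypothetical ZooKeeper read
        }
      });

  DocCollection getState(String collection) {
    return cache.getUnchecked(collection);
  }

  private DocCollection fetchClusterStateFromZk(String collection) {
    // placeholder for the actual ZooKeeper fetch
    throw new UnsupportedOperationException();
  }
}
{noformat}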
[jira] [Assigned] (SOLR-6928) solr.cmd stop works only in english
[ https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-6928: Assignee: Timothy Potter solr.cmd stop works only in english --- Key: SOLR-6928 URL: https://issues.apache.org/jira/browse/SOLR-6928 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: german windows 7 Reporter: john.work Assignee: Timothy Potter Priority: Minor in solr.cmd the stop doesn't work while executing 'netstat -nao ^| find /i "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found. e.g. in german cmd.exe the netstat -nao prints the following output: {noformat} Proto  Lokale Adresse  Remoteadresse  Status  PID  TCP  0.0.0.0:80  0.0.0.0:0  ABHÖREN  4 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2521 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2521/ 4 tests failed. FAILED: org.apache.solr.cloud.HttpPartitionTest.testDistribSearch Error Message: org.apache.http.NoHttpResponseException: The target server failed to respond Stack Trace: org.apache.solr.client.solrj.SolrServerException: org.apache.http.NoHttpResponseException: The target server failed to respond at __randomizedtesting.SeedInfo.seed([77AD558AFE563DA6:F64BDB9289095D9A]:0) at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:871) at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:736) at org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:480) at org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:201) at org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
Re: [JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 11491 - Failure!
this looks like problems noted in SOLR-6915 wih the kerberose stuff and the IBM JDK. miller? gregory? ... were you guys going to disable these tests on IBM JDKs or not? It's one thing to say a certain feature only works with Oracle JVMs, but it's going to suck if 5.0 goes out and we know that the Solr tests will reliable fail 100% of the time on IBM JDKs : Date: Wed, 21 Jan 2015 05:30:55 + (UTC) : From: Policeman Jenkins Server jenk...@thetaphi.de : Reply-To: dev@lucene.apache.org : To: sar...@gmail.com, dev@lucene.apache.org : Subject: [JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 11491 - : Failure! : : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11491/ : Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} : : 1 tests failed. : FAILED: org.apache.solr.cloud.SaslZkACLProviderTest.testSaslZkACLProvider : : Error Message: : Could not get the port for ZooKeeper server : : Stack Trace: : java.lang.RuntimeException: Could not get the port for ZooKeeper server : at org.apache.solr.cloud.ZkTestServer.run(ZkTestServer.java:482) : at org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:206) : at org.apache.solr.cloud.SaslZkACLProviderTest.setUp(SaslZkACLProviderTest.java:74) : at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source) : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) : at java.lang.reflect.Method.invoke(Method.java:619) : at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) : at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:861) : at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) : at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) : at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) : at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) : at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) : at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) : at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) : at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) : at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) : at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) : at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) : at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) : at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) : at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) : at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) : at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) : at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) : at
Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!
It's our old friend SOLR-6387 ! This time manifesting itself via calls down into Tika. My best guess is that something changed in the recently upgraded version of Tika in Solr so that we now tickle this ExternalParser code path in a way we never tickled it before ... but more perplexing is why i can't reproduce any similar errors on trunk (or 5x) using ant test -Dtests.slow=true -Dtests.locale=tr_TR in solr/contrib/extraction ? Does Tika already do some special platform/locale detection internally that is bypassing this error for most people, and only manifesting on MacOSX? can a mac user try to reproduce this? : Date: Wed, 21 Jan 2015 06:47:13 + (UTC) : From: Policeman Jenkins Server jenk...@thetaphi.de : Reply-To: dev@lucene.apache.org : To: rm...@apache.org, ans...@apache.org, sha...@apache.org, no...@apache.org, : gcha...@apache.org, ehatc...@apache.org, tflo...@apache.org, : sar...@gmail.com, dev@lucene.apache.org : Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - : Failure! : : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/ : Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC : : 1 tests failed. : FAILED: org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath : : Error Message: : posix_spawn is not a supported process launch mechanism on this platform. : : Stack Trace: : java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. : at __randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0) : at java.lang.UNIXProcess$1.run(UNIXProcess.java:105) : at java.lang.UNIXProcess$1.run(UNIXProcess.java:94) : at java.security.AccessController.doPrivileged(Native Method) : at java.lang.UNIXProcess.clinit(UNIXProcess.java:92) : at java.lang.ProcessImpl.start(ProcessImpl.java:130) : at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) : at java.lang.Runtime.exec(Runtime.java:620) : at java.lang.Runtime.exec(Runtime.java:485) : at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344) : at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117) : at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90) : at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) : at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95) : at org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229) : at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) : at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209) : at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244) : at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) : at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219) : at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) : at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144) : at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006) : at org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353) : at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703) : at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710) : at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474) : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) : at java.lang.reflect.Method.invoke(Method.java:483) : at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) : at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) : at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) : at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) : at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) : at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) : at
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286059#comment-14286059 ] Uwe Schindler commented on SOLR-6991: - This is in fact the problem with spawning external processes. This is not new, also TIKA 1.6 had parsers that spawned processes. I just think we never hit this because this one is different: The parser spawns a process while initializing (to inspect the system). The other spawning parsers are only executed as needed. ExternalParser has existed for a long time in TIKA. I would not roll back to TIKA 1.5 because the new TIKA is much better than this one (regarding bugs). In fact we should maybe disable these tests with the well-known assume (trunk, 5.x, 5.0). In fact, I would suggest adding a note to the ref guide, so people know what this means. This is unfortunately a bug in the JVM, so this is not really our or TIKA's fault. In fact, as written in my blog post about locale issues: Most Turkish system administrators don't run servers with the Turkish locale :-) It's just too broken with lots of software. Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
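For anyone wondering, the "well-known assume" is the locale guard already used by the map-reduce/morphlines tests -- roughly the following, placed in the test's setUp or @BeforeClass (the exact message and placement vary per test):

{noformat}
import java.util.Locale;
import org.junit.Assume;

public class TurkishLocaleAssume {
  // Skips (rather than fails) the test when the default locale is Turkish,
  // where the JVM's fork/exec path is broken (see SOLR-6387).
  static void assumeNotTurkishLocale() {
    Assume.assumeFalse(
        "This test fails with Turkish default locale (see SOLR-6387)",
        new Locale("tr").getLanguage().equals(Locale.getDefault().getLanguage()));
  }
}
{noformat}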
[jira] [Updated] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading
[ https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6521: - Attachment: SOLR-6521.patch CloudSolrServer should synchronize cache cluster state loading -- Key: SOLR-6521 URL: https://issues.apache.org/jira/browse/SOLR-6521 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Priority: Critical Labels: SolrCloud Fix For: 5.0, Trunk Attachments: SOLR-6521.patch, SOLR-6521.patch Under heavy load-testing with the new solrj client that caches the cluster state instead of setting a watcher, I started seeing lots of zk connection loss on the client-side when refreshing the CloudSolrServer collectionStateCache, and this was causing crazy client-side 99.9% latency (~15 sec). I swapped the cache out with guava's LoadingCache (which does locking to ensure only one thread loads the content under one key while the other threads that want the same key wait) and the connection loss went away and the 99.9% latency also went down to just about 1 sec. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading
[ https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286062#comment-14286062 ] Jessica Cheng Mallet commented on SOLR-6521: bq. I agree that the concurrency can be dramatically improved. Using Guava may not be an option because it is not yet a dependency of SolrJ. The other option would be to make the cache pluggable through an API. So, if you have Guava or something else in your package you can plug it in through an API. That'd be awesome! CloudSolrServer should synchronize cache cluster state loading -- Key: SOLR-6521 URL: https://issues.apache.org/jira/browse/SOLR-6521 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Priority: Critical Labels: SolrCloud Fix For: 5.0, Trunk Attachments: SOLR-6521.patch Under heavy load-testing with the new solrj client that caches the cluster state instead of setting a watcher, I started seeing lots of zk connection loss on the client-side when refreshing the CloudSolrServer collectionStateCache, and this was causing crazy client-side 99.9% latency (~15 sec). I swapped the cache out with guava's LoadingCache (which does locking to ensure only one thread loads the content under one key while the other threads that want the same key wait) and the connection loss went away and the 99.9% latency also went down to just about 1 sec. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286065#comment-14286065 ] Uwe Schindler commented on SOLR-6991: - In fact you can select parsers using a config file / Set<String>. But this makes updating horrible, because we have to revisit the list on each TIKA update... Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading
[ https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6521: - Attachment: (was: SOLR-6521.patch) CloudSolrServer should synchronize cache cluster state loading -- Key: SOLR-6521 URL: https://issues.apache.org/jira/browse/SOLR-6521 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Priority: Critical Labels: SolrCloud Fix For: 5.0, Trunk Attachments: SOLR-6521.patch, SOLR-6521.patch Under heavy load-testing with the new solrj client that caches the cluster state instead of setting a watcher, I started seeing lots of zk connection loss on the client-side when refreshing the CloudSolrServer collectionStateCache, and this was causing crazy client-side 99.9% latency (~15 sec). I swapped the cache out with guava's LoadingCache (which does locking to ensure only one thread loads the content under one key while the other threads that want the same key wait) and the connection loss went away and the 99.9% latency also went down to just about 1 sec. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading
[ https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6521: - Attachment: SOLR-6521.patch hi [~mewmewball] This patch increases the parallelism and makes it tunable. CloudSolrServer should synchronize cache cluster state loading -- Key: SOLR-6521 URL: https://issues.apache.org/jira/browse/SOLR-6521 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Priority: Critical Labels: SolrCloud Fix For: 5.0, Trunk Attachments: SOLR-6521.patch, SOLR-6521.patch Under heavy load-testing with the new solrj client that caches the cluster state instead of setting a watcher, I started seeing lots of zk connection loss on the client-side when refreshing the CloudSolrServer collectionStateCache, and this was causing crazy client-side 99.9% latency (~15 sec). I swapped the cache out with guava's LoadingCache (which does locking to ensure only one thread loads the content under one key while the other threads that want the same key wait) and the connection loss went away and the 99.9% latency also went down to just about 1 sec. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6991: Attachment: SOLR-6991-forkfix.patch This disables the test... Just copypasted from map-reduce/morphlines/ In fact this is not TIKA's issue and not new, a lot of stuff around Hadoop in Solr fails with Turkish! Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285904#comment-14285904 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653606 from [~mikemccand] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653606 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 21 |tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long - int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
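To make the failure mode concrete: the problem class Rob spotted is a long value being narrowed to int, which silently wraps negative once a single term's position data passes ~2.1 GB in the .pos file. A standalone illustration of the arithmetic (not the actual LuceneXXSkipWriter code):

{noformat}
public class SkipPointerOverflow {
  public static void main(String[] args) {
    long posFilePointer = 2_300_000_000L;       // > Integer.MAX_VALUE (~2.1 GB) into the .pos file
    long lastPointer    = 1_000_000L;

    // Buggy pattern: computing/storing the delta through an int silently wraps.
    int truncatedDelta = (int) (posFilePointer - lastPointer);
    long correctDelta  = posFilePointer - lastPointer;

    System.out.println("as int:  " + truncatedDelta);   // negative garbage -> corrupt skip entry
    System.out.println("as long: " + correctDelta);     // what should have been written
  }
}
{noformat}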
Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!
Reproduces for me on OS X 10.10, Oracle JDK 1.8.0_20: = [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=ExtractingRequestHandlerTest -Dtests.method=testExtraction -Dtests.seed=98ABBA97C7FD5F1C -Dtests.slow=true -Dtests.locale=tr_TR -Dtests.timezone=America/Nome -Dtests.asserts=true -Dtests.file.encoding=UTF-8 [junit4] ERROR 0.49s | ExtractingRequestHandlerTest.testExtraction [junit4] Throwable #1: java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. [junit4]at __randomizedtesting.SeedInfo.seed([98ABBA97C7FD5F1C:21D8CEE9BBD58FE9]:0) [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:105) [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:94) [junit4]at java.security.AccessController.doPrivileged(Native Method) [junit4]at java.lang.UNIXProcess.clinit(UNIXProcess.java:92) [junit4]at java.lang.ProcessImpl.start(ProcessImpl.java:130) [junit4]at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) [junit4]at java.lang.Runtime.exec(Runtime.java:620) [junit4]at java.lang.Runtime.exec(Runtime.java:485) [junit4]at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344) [junit4]at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117) [junit4]at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90) [junit4]at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) [junit4]at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95) [junit4]at org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229) [junit4]at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) [junit4]at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209) [junit4]at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244) [junit4]at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) [junit4]at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219) [junit4]at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) [junit4]at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144) [junit4]at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006) [junit4]at org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353) [junit4]at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703) [junit4]at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710) [junit4]at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testExtraction(ExtractingRequestHandlerTest.java:59) [junit4]at java.lang.Thread.run(Thread.java:745) = I’ll try reverting to just before the Tika upgrade and see if it still happens. Steve On Jan 21, 2015, at 1:03 PM, Chris Hostetter hossman_luc...@fucit.org wrote: It's our old friend SOLR-6387 ! This time manifesting itself via calls down into Tika. My best guess is that something changed in the recently upgraded version of Tika in Solr so that we now tickle this ExternalParser code path in a way we never tickled it before ... but more perplexing is why i can't reproduce any similar errors on trunk (or 5x) using ant test -Dtests.slow=true -Dtests.locale=tr_TR in solr/contrib/extraction ? 
Does Tika already do some special platform/locale detection internally that is bypassing this error for most people, and only manifesting on MacOSX? can a mac user try to reproduce this? : Date: Wed, 21 Jan 2015 06:47:13 + (UTC) : From: Policeman Jenkins Server jenk...@thetaphi.de : Reply-To: dev@lucene.apache.org : To: rm...@apache.org, ans...@apache.org, sha...@apache.org, no...@apache.org, : gcha...@apache.org, ehatc...@apache.org, tflo...@apache.org, : sar...@gmail.com, dev@lucene.apache.org : Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - : Failure! : : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/ : Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC : : 1 tests failed. : FAILED: org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath : : Error Message: : posix_spawn is not a supported process launch mechanism on this platform. : : Stack Trace: : java.lang.Error:
[jira] [Commented] (SOLR-6993) install_solr_service.sh won't install on RHEL / CentOS
[ https://issues.apache.org/jira/browse/SOLR-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285886#comment-14285886 ] ASF subversion and git services commented on SOLR-6993: --- Commit 1653601 from [~thelabdude] in branch 'dev/trunk' [ https://svn.apache.org/r1653601 ] SOLR-6993: install_solr_service.sh won't install on RHEL / CentOS install_solr_service.sh won't install on RHEL / CentOS -- Key: SOLR-6993 URL: https://issues.apache.org/jira/browse/SOLR-6993 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 5.0, Trunk Environment: RHEL 6.5 / CentOS 6.5 Reporter: David Anderson Assignee: Timothy Potter Priority: Blocker Fix For: 5.0, Trunk Attachments: SOLR-6993.patch There's a bug that will prevent install_solr_service.sh from working on RHEL / CentOS 6.5. It works on Ubuntu 14. Appears to be some obscure difference in bash expression evaluation behavior. Lines 87 and 89: SOLR_DIR=${SOLR_INSTALL_FILE:0:-4} blows up with this error: ./install_solr_service.sh: line 87: -4: substring expression < 0 This results in the archive not being extracted, and the rest of the script won't work. I tested a simple change: SOLR_DIR=${SOLR_INSTALL_FILE%.tgz} and verified it works on both RHEL 6.5 and Ubuntu 14. Patch is attached. I set this to Major thinking that not being able to install on CentOS is worth fixing prior to release. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6993) install_solr_service.sh won't install on RHEL / CentOS
[ https://issues.apache.org/jira/browse/SOLR-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6993. -- Resolution: Fixed install_solr_service.sh won't install on RHEL / CentOS -- Key: SOLR-6993 URL: https://issues.apache.org/jira/browse/SOLR-6993 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 5.0, Trunk Environment: RHEL 6.5 / CentOS 6.5 Reporter: David Anderson Assignee: Timothy Potter Priority: Blocker Fix For: 5.0, Trunk Attachments: SOLR-6993.patch There's a bug that will prevent install_solr_service.sh from working on RHEL / CentOS 6.5. It works on Ubuntu 14. Appears to be some obscure difference in bash expression evaluation behavior. line 87 and 89:SOLR_DIR=${SOLR_INSTALL_FILE:0:-4} blows up with this error: ./install_solr_service.sh: line 87: -4: substring expression 0 this results in the archive not being extracted and rest of the script won't work. I tested a simple change: SOLR_DIR=${SOLR_INSTALL_FILE%.tgz} and verified it works on both RHEL 6.5 and Ubuntu 14 Patch is attached. I set this to Major thinking that not being able to install on CentOS is worth fixing prior to release. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe updated SOLR-6845: Fix Version/s: 5.1 Trunk Add buildOnStartup option for suggesters Key: SOLR-6845 URL: https://issues.apache.org/jira/browse/SOLR-6845 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Tomás Fernández Löbbe Fix For: Trunk, 5.1 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch SOLR-6679 was filed to track the investigation into the following problem... {panel} The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index. ... This is what I did: 1) indexed 10M very small docs (only takes a few minutes). 2) shut down Solr 3) start up Solr and watch it be unresponsive for over 4 minutes! I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler. {panel} ..but ultimately focused on removing/disabling the suggester from the sample configs. Opening this new issue to focus on actually trying to identify the root problem fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6134) fix typos
[ https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285950#comment-14285950 ] ASF subversion and git services commented on LUCENE-6134: - Commit 1653615 from [~sar...@syr.edu] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1653615 ] LUCENE-6134: fix typos in lucene/CHANGES.txt and solr/CHANGES.txt (merged trunk r1653612) fix typos - Key: LUCENE-6134 URL: https://issues.apache.org/jira/browse/LUCENE-6134 Project: Lucene - Core Issue Type: Task Reporter: Steve Rowe Assignee: Steve Rowe Priority: Trivial Attachments: LUCENE-6134-CHANGES.txt-s.patch, LUCENE-6134-its-its.patch, LUCENE-6134-necessary-whether-initializ-specified.patch I found a bunch of typos, will fix under this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286101#comment-14286101 ] Uwe Schindler commented on SOLR-6991: - FYI: SolrCellMorphlineTest is already disabled by the same assume, so this is the only broken one. Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6134) fix typos
[ https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285939#comment-14285939 ] ASF subversion and git services commented on LUCENE-6134: - Commit 1653612 from [~sar...@syr.edu] in branch 'dev/trunk' [ https://svn.apache.org/r1653612 ] LUCENE-6134: fix typos in lucene/CHANGES.txt and solr/CHANGES.txt fix typos - Key: LUCENE-6134 URL: https://issues.apache.org/jira/browse/LUCENE-6134 Project: Lucene - Core Issue Type: Task Reporter: Steve Rowe Assignee: Steve Rowe Priority: Trivial Attachments: LUCENE-6134-CHANGES.txt-s.patch, LUCENE-6134-its-its.patch, LUCENE-6134-necessary-whether-initializ-specified.patch I found a bunch of typos, will fix under this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-5.x-MacOSX (64bit/jdk1.8.0) - Build # 1902 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/1902/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseSerialGC 2 tests failed. FAILED: org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch Error Message: There were too many update fails - we expect it can happen, but shouldn't easily Stack Trace: java.lang.AssertionError: There were too many update fails - we expect it can happen, but shouldn't easily at __randomizedtesting.SeedInfo.seed([9FCAAA6FE82229C2:1E2C24779F7D49FE]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertFalse(Assert.java:68) at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:224) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878) at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286020#comment-14286020 ] Steve Rowe commented on SOLR-6991: -- I can reproduce this on OS X 10.10 using Oracle JDK 1.8.0_20. When I revert back to r1652741 (just before the first commit under this issue), all solr-cell tests pass using the following (same thing that fails 100% for me with current trunk): {noformat} ant clean cd solr/contrib/extraction ant test -Dtests.slow=true -Dtests.locale=tr_TR {noformat} Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Set path to JRE / JDK in code
Hi, It is primarily my wrapped library that this applies to, so that it can be easily installed. Yes, on Windows the JRE is found using the PATH. I haven't been able to locate where this is done, as I would prefer to use a direct variable or a separate environment variable for this to keep it boxed off from the system variables. I have however found a way now that works fine, based on updating the PATH as the first thing in the __init__.py file of the wrapped library by prepending: import os; os.environ["PATH"] = r"mypath\jre\bin\server" + os.pathsep + os.environ["PATH"]; the installer takes care of updating the path to the right place later on. Many thanks /Petrus On Tue, Jan 20, 2015 at 7:14 PM, Andi Vajda va...@apache.org wrote: On Tue, 20 Jan 2015, Petrus Hyvönen wrote: Hi, I'm trying to package a wrapped library together with a non-system-wide Java JDK so that it can be easily installed. Can I somehow direct which JDK to use besides using JCC_JDK and putting the JRE in the PATH (I'm currently under Windows)? The JCC_JDK could be patched in setup.py, but as for the PATH JRE that is accessed while running the wrapped library, I don't understand where it is accessed, or how to patch this? So you're asking how to control where to pick up the JRE DLLs (on Windows) at runtime? If I remember correctly, on Windows you just set the Path environment variable, no? For example it would be good to have this in the config.py file if possible? If you're sure config.py is run _before_ any JRE DLL is loaded, you might be able to change the Path from there too. Andi.. Any thoughts or someone who's done this already? Regards /Petrus -- _ Petrus Hyvönen, Uppsala, Sweden Mobile Phone/SMS:+46 73 803 19 00 -- _ Petrus Hyvönen, Uppsala, Sweden Mobile Phone/SMS:+46 73 803 19 00
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286111#comment-14286111 ] Hoss Man commented on SOLR-6991: bq. In fact this is not TIKA's issue and not new, a lot of stuff around Hadoop in Solr fails with Turkish! ...my point is: it's new to Solr. in all other cases where POSIX_SPAWN impacts Solr, we either: * deal with it in the solr code, so we give a meaningful error to the user explaining the problem (ie: SystemInfoHandler) * it's in an optional feature that *NEVER* worked with turkish -- ie: the hadoop / morphlines contribs, which from the first version they were available in Solr would not work with the turkish locale ...in this case, we're talking about an _existing_ solr feature, that has previously worked fine if you run older Solr with turkish, and now when upgrading to 5.0 you're going to get a weird error message. if there's nothing better we can do to keep the ExtractingRequestHandler working for users who upgrade (even if they run with turkish) then i'm fine with assumes in the tests and notes in the docs ... i was just hoping you'd have a better idea. in particular: I'm still wondering if we can leverage the classpath in a way to override the default TesseractOCRConfig.properties file in the tika-parsers jar with our own version that prevents tesseract from being used. (i agree it's not worth switching to explicitly whitelisting the parsers in Solr code, but is there an easy way to blacklist this parser and/or other parsers we know are problematic?) Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
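[Editorial note] As a rough illustration of the "override the list of auto-detected parsers" option floated in the comment above (a sketch only, not the committed Solr fix; which parser classes Solr Cell would actually need is an assumption here), Tika's AutoDetectParser can be constructed from an explicit parser list so that TesseractOCRParser is never consulted and its ExternalParser.check()/Runtime.exec() call never runs:
{code}
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.parser.txt.TXTParser;

public class NoOcrParserFactory {
  /** Builds an AutoDetectParser whose parser list simply omits TesseractOCRParser. */
  public static AutoDetectParser create() {
    Parser[] parsers = new Parser[] {
        new HtmlParser(),
        new PDFParser(),
        new TXTParser()
        // ... whatever else Solr Cell needs, but no TesseractOCRParser
    };
    return new AutoDetectParser(parsers);
  }
}
{code}
The downside, as the comment notes, is that an explicit whitelist has to be kept in sync with Tika's defaults, which is why a classpath override of TesseractOCRConfig.properties may be the less invasive route.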
[jira] [Commented] (SOLR-6960) Config reporting handler is missing initParams defaults
[ https://issues.apache.org/jira/browse/SOLR-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285895#comment-14285895 ] Alexandre Rafalovitch commented on SOLR-6960: - This should really be a blocker for 5.0 as it affects the default example collections. Without this, we cannot claim to actually have Config Report Handler as a new feature. Config reporting handler is missing initParams defaults --- Key: SOLR-6960 URL: https://issues.apache.org/jira/browse/SOLR-6960 Project: Solr Issue Type: Bug Affects Versions: 5.0 Reporter: Alexandre Rafalovitch Fix For: 5.0 *curl http://localhost:8983/solr/techproducts/config/requestHandler* produces (fragments): {quote} /update:{ name:/update, class:org.apache.solr.handler.UpdateRequestHandler, defaults:{}}, /update/json/docs:{ name:/update/json/docs, class:org.apache.solr.handler.UpdateRequestHandler, defaults:{ update.contentType:application/json, json.command:false}}, {quote} Where are the defaults from initParams? {quote} <initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse"> <lst name="defaults"> <str name="df">text</str> </lst> </initParams> <initParams path="/update/json/docs"> <lst name="defaults"> <str name="srcField">_src_</str> <str name="mapUniqueKeyOnly">true</str> </lst> </initParams> {quote} Obviously, a test is missing as well to catch this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!
Is this a Tika 1.7 upgrade related failure? On Tue, Jan 20, 2015 at 10:47 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath Error Message: posix_spawn is not a supported process launch mechanism on this platform. Stack Trace: java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. at __randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0) at java.lang.UNIXProcess$1.run(UNIXProcess.java:105) at java.lang.UNIXProcess$1.run(UNIXProcess.java:94) at java.security.AccessController.doPrivileged(Native Method) at java.lang.UNIXProcess.clinit(UNIXProcess.java:92) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) at java.lang.Runtime.exec(Runtime.java:620) at java.lang.Runtime.exec(Runtime.java:485) at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344) at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117) at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90) at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95) at org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229) at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006) at org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353) at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703) at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710) at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at
[JENKINS] Lucene-Solr-5.x-Linux (32bit/jdk1.7.0_72) - Build # 11495 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11495/ Java: 32bit/jdk1.7.0_72 -server -XX:+UseParallelGC 1 tests failed. FAILED: org.apache.solr.cloud.ReplicationFactorTest.testDistribSearch Error Message: Error from server at http://127.0.0.1:57852/sus/za/repfacttest_c8n_2x2: The target server failed to respond Stack Trace: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:57852/sus/za/repfacttest_c8n_2x2: The target server failed to respond at __randomizedtesting.SeedInfo.seed([C9F1855E170E9BAE:48170B466051FB92]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:558) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:214) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:210) at org.apache.solr.cloud.ReplicationFactorTest.sendNonDirectUpdateRequestReplica(ReplicationFactorTest.java:195) at org.apache.solr.cloud.ReplicationFactorTest.testRf2NotUsingDirectUpdates(ReplicationFactorTest.java:165) at org.apache.solr.cloud.ReplicationFactorTest.doTest(ReplicationFactorTest.java:129) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
[jira] [Resolved] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe resolved SOLR-6845. - Resolution: Fixed Add buildOnStartup option for suggesters Key: SOLR-6845 URL: https://issues.apache.org/jira/browse/SOLR-6845 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Tomás Fernández Löbbe Fix For: Trunk, 5.1 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch SOLR-6679 was filed to track the investigation into the following problem... {panel} The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index. ... This is what I did: 1) indexed 10M very small docs (only takes a few minutes). 2) shut down Solr 3) start up Solr and watch it be unresponsive for over 4 minutes! I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler. {panel} ...but ultimately focused on removing/disabling the suggester from the sample configs. Opening this new issue to focus on actually trying to identify the root problem and fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6993) install_solr_service.sh won't install on RHEL / CentOS
[ https://issues.apache.org/jira/browse/SOLR-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285890#comment-14285890 ] ASF subversion and git services commented on SOLR-6993: --- Commit 1653603 from [~thelabdude] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653603 ] SOLR-6993: install_solr_service.sh won't install on RHEL / CentOS install_solr_service.sh won't install on RHEL / CentOS -- Key: SOLR-6993 URL: https://issues.apache.org/jira/browse/SOLR-6993 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 5.0, Trunk Environment: RHEL 6.5 / CentOS 6.5 Reporter: David Anderson Assignee: Timothy Potter Priority: Blocker Fix For: 5.0, Trunk Attachments: SOLR-6993.patch There's a bug that will prevent install_solr_service.sh from working on RHEL / CentOS 6.5. It works on Ubuntu 14. Appears to be some obscure difference in bash expression evaluation behavior. Lines 87 and 89: SOLR_DIR=${SOLR_INSTALL_FILE:0:-4} blows up with this error: ./install_solr_service.sh: line 87: -4: substring expression < 0 This results in the archive not being extracted, and the rest of the script won't work. I tested a simple change: SOLR_DIR=${SOLR_INSTALL_FILE%.tgz} and verified it works on both RHEL 6.5 and Ubuntu 14. Patch is attached. I set this to Major thinking that not being able to install on CentOS is worth fixing prior to release. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6134) fix typos
[ https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285958#comment-14285958 ] ASF subversion and git services commented on LUCENE-6134: - Commit 1653616 from [~sar...@syr.edu] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653616 ] LUCENE-6134: fix typos in lucene/CHANGES.txt and solr/CHANGES.txt (merged branch_5x r1653615) fix typos - Key: LUCENE-6134 URL: https://issues.apache.org/jira/browse/LUCENE-6134 Project: Lucene - Core Issue Type: Task Reporter: Steve Rowe Assignee: Steve Rowe Priority: Trivial Attachments: LUCENE-6134-CHANGES.txt-s.patch, LUCENE-6134-its-its.patch, LUCENE-6134-necessary-whether-initializ-specified.patch I found a bunch of typos, will fix under this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading
[ https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285985#comment-14285985 ] Noble Paul commented on SOLR-6521: -- bq. The patch is locking the entire cache for all loading, which might not be an ideal solution for a cluster with many collections I understand that. In reality, different collections expire at different times, so everyone waiting on the lock would be a rare thing. The common use case is that one collection expires and every thread tries to refresh it simultaneously. I agree that the concurrency can be dramatically improved. Using Guava may not be an option because it is not yet a dependency of SolrJ. The other option would be to make the cache pluggable through an API. So, if you have Guava or something else in your package, you can plug it in through that API. CloudSolrServer should synchronize cache cluster state loading -- Key: SOLR-6521 URL: https://issues.apache.org/jira/browse/SOLR-6521 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Jessica Cheng Mallet Assignee: Noble Paul Priority: Critical Labels: SolrCloud Fix For: 5.0, Trunk Attachments: SOLR-6521.patch Under heavy load-testing with the new solrj client that caches the cluster state instead of setting a watcher, I started seeing lots of zk connection loss on the client-side when refreshing the CloudSolrServer collectionStateCache, and this was causing crazy client-side 99.9% latency (~15 sec). I swapped the cache out with guava's LoadingCache (which does locking to ensure only one thread loads the content under one key while the other threads that want the same key wait) and the connection loss went away and the 99.9% latency also went down to just about 1 sec. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
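[Editorial note] For illustration only, here is a minimal per-key loading sketch along the lines discussed above, assuming no Guava dependency. The class and method names (CollectionStateCache, invalidate) are hypothetical and are not SolrJ API; ConcurrentHashMap.computeIfAbsent gives the same "only one thread loads a given key while the others wait" behavior the Guava LoadingCache provided in the load test. Note computeIfAbsent needs Java 8; on Java 7 the same effect requires explicit per-key locks.
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a pluggable, per-key-locking state cache; not SolrJ code.
public class CollectionStateCache<K, V> {
  private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
  private final Function<K, V> loader; // e.g. collection name -> state fetched from ZooKeeper

  public CollectionStateCache(Function<K, V> loader) {
    this.loader = loader;
  }

  public V get(K key) {
    // Only one thread per key executes the loader; other callers for the same
    // key block on that computation instead of all hammering ZooKeeper.
    return cache.computeIfAbsent(key, loader);
  }

  public void invalidate(K key) {
    cache.remove(key); // expiry removes the entry, so the next get() reloads it
  }
}
{code}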
[jira] [Updated] (SOLR-7005) facet.heatmap for spatial heatmap faceting on RPT
[ https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-7005: --- Attachment: heatmap_64x32.png heatmap_512x256.png There are some performance #'s on LUCENE-6191. I experimented with generating a PNG to carry the data in a compressed manner, since this data can get large. I'm abusing the image to carry the same detail in the counts, and that means 4 bytes per pixel. Counts > 16M touch the high byte of a 4-byte int, which is where the alpha channel is, which will progressively lighten the image. _The image is not at all optimized for human viewing that is pleasant on the eyes_, except for the bit flip of the high (alpha channel) byte; otherwise you would see nothing until the counts exceed this figure. That said, it's crude and you can get a sense of it. _If people have input on how to cheaply and easily tweak the value to look nicer, I'm interested._ Since a client app may consume this PNG if it wants this compressed format and render it the way it wants to, there should be a straight-forward algorithm to derive the count from the ARGB (alpha, red, green, blue) int. The attached PNG is 512x256 (131,072 cells mind you!) of the 8.5M geonames data set. On a 16-segment index with no search filters, it took 882ms to compute the underlying heatmap, and 218ms to build the PNG and write it to disk. The write-to-disk hack is temporary to easily view the image by opening it from the file system. You can expect there will be more time in consuming this image from Solr's javabin/XML/JSON + base64 wrapper (whatever you choose). Now a 512x256 image is so detailed that it arguably isn't a heatmap but another way to go about rendering individual points. A more coarse, say, 64x32 image would be more true to the heatmap label, and obviously much faster to generate -- like 100ms + only ~2ms to generate the PNG. facet.heatmap for spatial heatmap faceting on RPT - Key: SOLR-7005 URL: https://issues.apache.org/jira/browse/SOLR-7005 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.1 Attachments: heatmap_512x256.png, heatmap_64x32.png This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell counter in Lucene spatial LUCENE-6191. This is a form of faceting, and as-such I think it should live in the facet parameter namespace. Here's what the parameters are: * facet=true * facet.heatmap=fieldname * facet.heatmap.bbox=\[-180 -90 TO 180 90] * facet.heatmap.gridLevel=6 * facet.heatmap.distErrPct=0.10 Like other faceting features, the fieldName can have local-params to exclude filter queries or specify an output key. The bbox is optional; you get the whole world or you can specify a box or actually any shape that WKT supports (you get the bounding box of whatever you put). Ultimately, this feature needs to know the grid level, which together with the input shape will yield a certain number of cells. You can specify gridLevel exactly, or don't and instead provide distErrPct which is computed like it is for the RPT field type as seen in the schema. 0.10 yielded ~4k cells but it'll vary. There's also a facet.heatmap.maxCells safety net defaulting to 100k. Exceed this and you get an error. The output is (JSON): {noformat} {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0, 0, 2, 1, ],[1, 1, 3, 2, ...],...]} {noformat} counts is null if all would be 0. Perhaps individual row arrays should likewise be null... 
I welcome feedback. I'm toying with an output format option in which you can specify a base-64'ed grayscale PNG. Obviously this should support sharded / distributed environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
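[Editorial note] For a client that wants to consume such a PNG, here is a hedged sketch of the decode side. It assumes the encoding described in the comment above, namely that the raw count sits in the pixel's 32-bit ARGB value with the whole high (alpha) byte flipped; the exact transform may differ in whatever finally gets committed, so treat this as an assumption rather than the final format.
{code}
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class HeatmapPngReader {
  /** Reads the per-cell counts back out of a heatmap PNG (encoding assumed, see above). */
  public static int[][] readCounts(File png) throws IOException {
    BufferedImage img = ImageIO.read(png);
    int[][] counts = new int[img.getHeight()][img.getWidth()];
    for (int y = 0; y < img.getHeight(); y++) {
      for (int x = 0; x < img.getWidth(); x++) {
        int argb = img.getRGB(x, y);      // 4 bytes: alpha, red, green, blue
        counts[y][x] = argb ^ 0xFF000000; // undo the assumed flip of the alpha byte
      }
    }
    return counts;
  }
}
{code}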
[jira] [Commented] (SOLR-6928) solr.cmd stop works only in english
[ https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285945#comment-14285945 ] Timothy Potter commented on SOLR-6928: -- Awesome suggestion, Jan! Testing your idea now and will get it committed for 5. solr.cmd stop works only in english --- Key: SOLR-6928 URL: https://issues.apache.org/jira/browse/SOLR-6928 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: german windows 7 Reporter: john.work Assignee: Timothy Potter Priority: Minor in solr.cmd the stop doesn't work while executing 'netstat -nao ^| find /i "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found. e.g. in German cmd.exe, netstat -nao prints the following output: {noformat} Proto  Lokale Adresse  Remoteadresse  Status  PID  TCP  0.0.0.0:80  0.0.0.0:0  ABHÖREN  4 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-6134) fix typos
[ https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-6134: --- Attachment: LUCENE-6134-CHANGES.txt-s.patch Patch fixing typos in {{lucene/CHANGES.txt}} and {{solr/CHANGES.txt}}. Committing shortly. fix typos - Key: LUCENE-6134 URL: https://issues.apache.org/jira/browse/LUCENE-6134 Project: Lucene - Core Issue Type: Task Reporter: Steve Rowe Assignee: Steve Rowe Priority: Trivial Attachments: LUCENE-6134-CHANGES.txt-s.patch, LUCENE-6134-its-its.patch, LUCENE-6134-necessary-whether-initializ-specified.patch I found a bunch of typos, will fix under this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7012) add an ant target to package a plugin into a jar
Noble Paul created SOLR-7012: Summary: add an ant target to package a plugin into a jar Key: SOLR-7012 URL: https://issues.apache.org/jira/browse/SOLR-7012 Project: Solr Issue Type: Sub-task Reporter: Noble Paul Assignee: Noble Paul Now it is extremely hard to create a plugin because the user does not know about the exact dependencies and their poms. We will add a target to solr/build.xml called plugin-jar; invoke it as follows: {code} ant -Dplugin.package=my.package -Djar.location=/tmp/my.jar plugin-jar {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomás Fernández Löbbe updated SOLR-6845: Assignee: Tomás Fernández Löbbe Summary: Add buildOnStartup option for suggesters (was: figure out why suggester causes slow startup - even when not used) Changed the summary to reflect the actual change done. Add buildOnStartup option for suggesters Key: SOLR-6845 URL: https://issues.apache.org/jira/browse/SOLR-6845 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Tomás Fernández Löbbe Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch SOLR-6679 was filed to track the investigation into the following problem... {panel} The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index. ... This is what I did: 1) indexed 10M very small docs (only takes a few minutes). 2) shut down Solr 3) start up Solr and watch it be unresponsive for over 4 minutes! I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler. {panel} ...but ultimately focused on removing/disabling the suggester from the sample configs. Opening this new issue to focus on actually trying to identify the root problem and fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6928) solr.cmd stop works only in english
[ https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-6928: - Attachment: SOLR-6928.patch Here's a patch that builds upon Jan's, but required checking for PID==0 because it was still finding something that wasn't listening: Proto  Local Address  Foreign Address  State  PID  TCP  127.0.0.1:49204  127.0.0.1:8983  TIME_WAIT  0 According to the docs, PID 0 is for a pseudo-idle process, so the script could ignore those and keep looping to find the actual listening process. This patch works well on English Windows ... I don't have access to a German Windows box, can someone test please? solr.cmd stop works only in english --- Key: SOLR-6928 URL: https://issues.apache.org/jira/browse/SOLR-6928 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: german windows 7 Reporter: john.work Assignee: Timothy Potter Priority: Minor Attachments: SOLR-6928.patch in solr.cmd the stop doesn't work while executing 'netstat -nao ^| find /i "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found. e.g. in German cmd.exe, netstat -nao prints the following output: {noformat} Proto  Lokale Adresse  Remoteadresse  Status  PID  TCP  0.0.0.0:80  0.0.0.0:0  ABHÖREN  4 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe reopened SOLR-6991: -- Reopening to address this Mac OS X failure in solr-cell: {noformat} Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/ Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath ... [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=ExtractingRequestHandlerTest -Dtests.method=testXPath -Dtests.seed=58A6FBEB77E81527 -Dtests.slow=true -Dtests.locale=tr_TR -Dtests.timezone=Etc/GMT+3 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] ERROR 2.57s | ExtractingRequestHandlerTest.testXPath [junit4] Throwable #1: java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. [junit4] at __randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0) [junit4] at java.lang.UNIXProcess$1.run(UNIXProcess.java:105) [junit4] at java.lang.UNIXProcess$1.run(UNIXProcess.java:94) [junit4] at java.security.AccessController.doPrivileged(Native Method) [junit4] at java.lang.UNIXProcess.clinit(UNIXProcess.java:92) [junit4] at java.lang.ProcessImpl.start(ProcessImpl.java:130) [junit4] at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) [junit4] at java.lang.Runtime.exec(Runtime.java:620) [junit4] at java.lang.Runtime.exec(Runtime.java:485) [junit4] at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344) [junit4] at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117) [junit4] at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90) [junit4] at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) [junit4] at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95) [junit4] at org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229) [junit4] at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) [junit4] at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209) [junit4] at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244) [junit4] at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) [junit4] at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219) [junit4] at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) [junit4] at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144) [junit4] at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006) [junit4] at org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353) [junit4] at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703) [junit4] at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710) [junit4] at org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474) [junit4] at java.lang.Thread.run(Thread.java:745) {noformat} Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 
Attachments: SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-6192. Resolution: Fixed Resolving ... Tom can you post back here the results of testing with this fix? Thanks. Hopefully this is the bug you were hitting! Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 2>&1 | tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long -> int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required > 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
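[Editorial note] To make the failure mode concrete (this is purely an illustration, not Lucene code): a positions-file pointer delta for one enormous term can exceed Integer.MAX_VALUE, and casting that long to an int before writing it as a VInt silently turns it into garbage, which is exactly the corruption CheckIndex then trips over. The fix replaces the int-sized reads and writes of these file-pointer deltas with vLongs (the reader side is visible in the svn diff later in this digest).
{code}
public class SkipPointerOverflowDemo {
  public static void main(String[] args) {
    long posPointerDelta = 2_200_000_000L; // ~2.1 GB of positions written for a single term
    int written = (int) posPointerDelta;   // what a writeVInt((int) delta) call would record
    System.out.println(written);           // prints -2094967296: the skip pointer is now bogus
  }
}
{code}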
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285835#comment-14285835 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653585 from [~mikemccand] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653585 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 2>&1 | tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long -> int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required > 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6191) Spatial 2D faceting (heatmaps)
[ https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285863#comment-14285863 ] David Smiley commented on LUCENE-6191: -- delta stats: * *Segments: 16 (no deleted)* * docs: 8,526,175 (slightly less than before, not sure why). * QuadTree precision: 22 (better than 10m) bounds: -180 to 180, -180 to 180 (360x360 square) * Disk index size: 2.35GB * heatmap input range: -180 to 180, -89.999 to 89.999 (slightly inset so heatmap doesn't include a row just 90 and just 90) 512x256 (131,072 cells) heatmap : 882ms 64x32 (2048 cells) heatmap: 120ms Spatial 2D faceting (heatmaps) -- Key: LUCENE-6191 URL: https://issues.apache.org/jira/browse/LUCENE-6191 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.1 Attachments: LUCENE-6191__Spatial_heatmap.patch Lucene spatial's PrefixTree (grid) based strategies index data in a way highly amenable to faceting on grids cells to compute a so-called _heatmap_. The underlying code in this patch uses the PrefixTreeFacetCounter utility class which was recently refactored out of faceting for NumberRangePrefixTree LUCENE-5735. At a low level, the terms (== grid cells) are navigated per-segment, forward only with TermsEnum.seek, so it's pretty quick and furthermore requires no extra caches no docvalues. Ideally you should use QuadPrefixTree (or Flex once it comes out) to maximize the number grid levels which in turn maximizes the fidelity of choices when you ask for a grid covering a region. Conveniently, the provided capability returns the data in a 2-D grid of counts, so the caller needn't know a thing about how the data is encoded in the prefix tree. Well almost... at this point they need to provide a grid level, but I'll soon provide a means of deriving the grid level based on a min/max cell count. I recommend QuadPrefixTree with geo=false so that you can provide a square world-bounds (360x360 degrees), which means square grid cells which are more desirable to display than rectangular cells. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
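[Editorial note] For readers wondering what calling this looks like, here is a hypothetical usage sketch pieced together from the issue description; the class, method, and field names (HeatmapFacetCounter.calcFacets, Heatmap.counts, and so on) are assumptions about the attached patch and may not match what finally lands in 5.1.
{code}
import java.io.IOException;
import org.apache.lucene.index.IndexReaderContext;
import org.apache.lucene.spatial.prefix.HeatmapFacetCounter;
import org.apache.lucene.spatial.prefix.PrefixTreeStrategy;
import com.spatial4j.core.shape.Shape;

public class HeatmapSketch {
  static int[] gridCounts(PrefixTreeStrategy strategy, IndexReaderContext ctx,
                          Shape region, int gridLevel) throws IOException {
    // null = no extra filter over the index; 100_000 = safety cap on the number of cells
    HeatmapFacetCounter.Heatmap heatmap =
        HeatmapFacetCounter.calcFacets(strategy, ctx, null, region, gridLevel, 100_000);
    // a columns x rows grid of counts, row-major; the caller needn't know the prefix-tree encoding
    return heatmap.counts;
  }
}
{code}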
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285862#comment-14285862 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653594 from [~mikemccand] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1653594 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 2>&1 | tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long -> int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required > 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 740 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/740/ 4 tests failed. FAILED: org.apache.solr.cloud.HttpPartitionTest.testDistribSearch Error Message: org.apache.http.NoHttpResponseException: The target server failed to respond Stack Trace: org.apache.solr.client.solrj.SolrServerException: org.apache.http.NoHttpResponseException: The target server failed to respond at __randomizedtesting.SeedInfo.seed([A40ECF0BF65E3F19:25E8411381015F25]:0) at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:871) at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:736) at org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:480) at org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:201) at org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at
[jira] [Commented] (LUCENE-6191) Spatial 2D faceting (heatmaps)
[ https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285809#comment-14285809 ] David Smiley commented on LUCENE-6191: -- I have some performance numbers taken while working on SOLR-7005. I took a geonames data set of 8,552,952 docs and I indexed the latitude longitude into a quad prefixTree with maximum resolution of a meter and with geo=false and -180 to 180, -90 to 90 world bounds of standard geodetic degree boundaries. That's a screw-up on my part; I forgot to use 360x360 to get square grid boxes instead of rectangular ones. But that's not pertinent. The index size is 2.6GB which is kind of large. Increasing the maximum resolution to above a meter will decrease the index size a lot. This reminds me of how beneficial the forthcoming flex prefixTree will be, but I digress. This data is all points. Base stats: * Machine: my SSD based recent MacBook Pro, Java 8 * Lucene/Solr: trunk as of last night * Docs: 8,552,952 * Segments: 1 * Disk index size: 2.6GB * QuadTree: ** precision: 26 (better than a meter) 512x512 heatmap, (_note: this is a whopping 262,144 cells_): 248ms (PNG to be attached to SOLR-7005 soon). Now filtered with an additional query down to 165 docs: 105ms (I figure this fast number is due to a particular optimization in the prefix tree facet counter for highly discriminating filters). 64x64 heatmap (4,096 cells): 105ms Filtered to 165 docs: 21ms I took one measurement when the index was un-optimized at 38 segments, including 10K deleted docs (512x512 query all): 1800ms roughly. I should try this again after I re-index with the square grid cells I want. Spatial 2D faceting (heatmaps) -- Key: LUCENE-6191 URL: https://issues.apache.org/jira/browse/LUCENE-6191 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.1 Attachments: LUCENE-6191__Spatial_heatmap.patch Lucene spatial's PrefixTree (grid) based strategies index data in a way highly amenable to faceting on grids cells to compute a so-called _heatmap_. The underlying code in this patch uses the PrefixTreeFacetCounter utility class which was recently refactored out of faceting for NumberRangePrefixTree LUCENE-5735. At a low level, the terms (== grid cells) are navigated per-segment, forward only with TermsEnum.seek, so it's pretty quick and furthermore requires no extra caches no docvalues. Ideally you should use QuadPrefixTree (or Flex once it comes out) to maximize the number grid levels which in turn maximizes the fidelity of choices when you ask for a grid covering a region. Conveniently, the provided capability returns the data in a 2-D grid of counts, so the caller needn't know a thing about how the data is encoded in the prefix tree. Well almost... at this point they need to provide a grid level, but I'll soon provide a means of deriving the grid level based on a min/max cell count. I recommend QuadPrefixTree with geo=false so that you can provide a square world-bounds (360x360 degrees), which means square grid cells which are more desirable to display than rectangular cells. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1653580 - in /lucene/dev/branches/branch_5x: ./ lucene/ lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/ lucene/core/ lucene/core/src/java/org/apache/lucene/codecs/l
I think we should also fix Lucene41SkipWriter (src/test) ? On Wed, Jan 21, 2015 at 11:24 AM, mikemcc...@apache.org wrote: Author: mikemccand Date: Wed Jan 21 16:24:08 2015 New Revision: 1653580 URL: http://svn.apache.org/r1653580 Log: LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Modified: lucene/dev/branches/branch_5x/ (props changed) lucene/dev/branches/branch_5x/lucene/ (props changed) lucene/dev/branches/branch_5x/lucene/CHANGES.txt (props changed) lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java lucene/dev/branches/branch_5x/lucene/core/ (props changed) lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java Modified: lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java?rev=1653580r1=1653579r2=1653580view=diff == --- lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java (original) +++ lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java Wed Jan 21 16:24:08 2015 @@ -173,13 +173,13 @@ final class Lucene41SkipReader extends M // if (DEBUG) { // System.out.println( delta= + delta); // } -docPointer[level] += skipStream.readVInt(); +docPointer[level] += skipStream.readVLong(); // if (DEBUG) { // System.out.println( docFP= + docPointer[level]); // } if (posPointer != null) { - posPointer[level] += skipStream.readVInt(); + posPointer[level] += skipStream.readVLong(); // if (DEBUG) { // System.out.println( posFP= + posPointer[level]); // } @@ -193,7 +193,7 @@ final class Lucene41SkipReader extends M } if (payPointer != null) { -payPointer[level] += skipStream.readVInt(); +payPointer[level] += skipStream.readVLong(); } } return delta; Modified: lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java?rev=1653580r1=1653579r2=1653580view=diff == --- lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java (original) +++ lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java Wed Jan 21 16:24:08 2015 @@ -179,10 +179,10 @@ final class Lucene50SkipReader extends M @Override protected int readSkipData(int level, IndexInput skipStream) throws IOException { int delta = skipStream.readVInt(); -docPointer[level] += skipStream.readVInt(); +docPointer[level] += skipStream.readVLong(); if (posPointer != null) { - posPointer[level] += skipStream.readVInt(); + posPointer[level] += skipStream.readVLong(); posBufferUpto[level] = skipStream.readVInt(); if (payloadByteUpto != null) { @@ -190,7 +190,7 @@ final class Lucene50SkipReader extends M } if (payPointer != null) { -payPointer[level] += skipStream.readVInt(); +payPointer[level] += skipStream.readVLong(); } } return delta; Modified: 
lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java URL: http://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java?rev=1653580r1=1653579r2=1653580view=diff == --- lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java (original) +++ lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java Wed Jan 21 16:24:08 2015 @@ -147,12 +147,12 @@ final class Lucene50SkipWriter extends M skipBuffer.writeVInt(delta); lastSkipDoc[level] = curDoc; -skipBuffer.writeVInt((int) (curDocPointer - lastSkipDocPointer[level])); +skipBuffer.writeVLong(curDocPointer - lastSkipDocPointer[level]); lastSkipDocPointer[level] = curDocPointer; if (fieldHasPositions) {
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285842#comment-14285842 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653588 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1653588 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 21 |tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long - int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
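Rob's observation about the long-to-int casts is easy to see in isolation: once a single term's cumulative file-pointer delta passes Integer.MAX_VALUE, the cast wraps negative and the value handed to writeVInt no longer matches the real delta. A minimal stand-alone illustration (not Lucene code):
{code}
// Why an (int) cast on a long file-pointer delta corrupts skip entries once a
// term's postings data grows past ~2.1 GB: the cast silently wraps negative.
static void overflowDemo() {
  long lastSkipPointer = 0L;
  long curPointer = 2_500_000_000L;                   // > Integer.MAX_VALUE (2,147,483,647)
  int wrapped = (int) (curPointer - lastSkipPointer); // what a writeVInt call would be handed
  System.out.println(wrapped);                        // -1794967296: a garbage delta
  System.out.println(curPointer - lastSkipPointer);   // 2500000000: what writeVLong preserves
}
{code}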
[jira] [Commented] (SOLR-6976) Remove all methods and classes deprecated in 4.x from trunk and 5.x
[ https://issues.apache.org/jira/browse/SOLR-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285797#comment-14285797 ] ASF subversion and git services commented on SOLR-6976: --- Commit 1653566 from [~romseygeek] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653566 ] SOLR-6976: Remove methods and classes deprecated in 4.x Remove all methods and classes deprecated in 4.x from trunk and 5.x --- Key: SOLR-6976 URL: https://issues.apache.org/jira/browse/SOLR-6976 Project: Solr Issue Type: Task Reporter: Alan Woodward Priority: Blocker Fix For: 5.0, Trunk Attachments: SOLR-6976.patch, SOLR-6976.patch, SOLR-6976.patch, SOLR-sharkeys.patch We have a bunch of methods, classes, enums, etc which are marked as deprecated in Solr code in the 4.x branch. Some of them have been marked as such since the 1.4 release. Before we get 5.0 out, these should all be removed -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6976) Remove all methods and classes deprecated in 4.x from trunk and 5.x
[ https://issues.apache.org/jira/browse/SOLR-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward resolved SOLR-6976. - Resolution: Fixed Assignee: Alan Woodward Remove all methods and classes deprecated in 4.x from trunk and 5.x --- Key: SOLR-6976 URL: https://issues.apache.org/jira/browse/SOLR-6976 Project: Solr Issue Type: Task Reporter: Alan Woodward Assignee: Alan Woodward Priority: Blocker Fix For: 5.0, Trunk Attachments: SOLR-6976.patch, SOLR-6976.patch, SOLR-6976.patch, SOLR-sharkeys.patch We have a bunch of methods, classes, enums, etc which are marked as deprecated in Solr code in the 4.x branch. Some of them have been marked as such since the 1.4 release. Before we get 5.0 out, these should all be removed -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285818#comment-14285818 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653577 from [~mikemccand] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1653577 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 21 |tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long - int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285857#comment-14285857 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653593 from [~mikemccand] in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1653593 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 21 |tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long - int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data
[ https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285828#comment-14285828 ] ASF subversion and git services commented on LUCENE-6192: - Commit 1653580 from [~mikemccand] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1653580 ] LUCENE-6192: don't overflow int when writing skip data for high freq terms in extremely large indices Long overflow in LuceneXXSkipWriter can corrupt skip data - Key: LUCENE-6192 URL: https://issues.apache.org/jira/browse/LUCENE-6192 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk, 4.x Attachments: LUCENE-6192.patch I've been iterating with Tom on this corruption that CheckIndex detects in his rather large index (720 GB in a single segment): {noformat} java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index -verbose 21 |tee -a shard4_reoptimizedNewJava Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index Segments file=segments_e numSegments=1 version=4.10.2 format= userData={commitTimeMSec=1421479358825} 1 of 1: name=_8m8 docCount=1130856 version=4.10.2 codec=Lucene410 compound=false numFiles=10 size (MB)=719,967.32 diagnostics = {timestamp=1421437320935, os=Linux, os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_71, java.vendor=Oracle Corporation} no deletions test: open reader.OK test: check integrity.OK test: check live docs.OK test: fields..OK [80 fields] test: field norms.OK [23 fields] test: terms, freq, prox...ERROR: java.lang.AssertionError: -96 java.lang.AssertionError: -96 at org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925) at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955) at org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100) at org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357) at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) test: stored fields...OK [67472796 total field count; avg 59.665 fields per doc] test: term vectorsOK [0 total vector count; avg 0 term/freq vector fields per doc] test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET] FAILED WARNING: fixIndex() would remove reference to this segment; full exception: java.lang.RuntimeException: Term Index test failed at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096) WARNING: 1 broken segments (containing 1130856 documents) detected WARNING: would write new segments file, and 1130856 documents would be lost, if -fix were specified {noformat} And Rob spotted long - int casts in our skip list writers that look like they could cause such corruption if a single high-freq term with many positions required 2.1 GB to write its positions into .pos. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7013) Unclear error message with solr script when lacking jar executable
Derek Wood created SOLR-7013: Summary: Unclear error message with solr script when lacking jar executable Key: SOLR-7013 URL: https://issues.apache.org/jira/browse/SOLR-7013 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: Fedora 21 Reporter: Derek Wood Fedora 21 doesn't ship the jar executable with the default jdk package, so the attempt to extract webapp/solr.war in the solr script can fail without a clear error message. The attached patch adds this error message and includes support for the unzip utility. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6581) Efficient DocValues support and numeric collapse field implementations for Collapse and Expand
[ https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286240#comment-14286240 ] Joel Bernstein commented on SOLR-6581: -- The hint in the code is still upper case TOP_FC. This was meant to be lower case. I'll open another issue for this and have it accept both cases. 5.0 will go out with the upper case syntax though so I'll update the documentation. Efficient DocValues support and numeric collapse field implementations for Collapse and Expand -- Key: SOLR-6581 URL: https://issues.apache.org/jira/browse/SOLR-6581 Project: Solr Issue Type: Bug Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 5.0, Trunk Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, renames.diff The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent are optimized to work with a top level FieldCache. Top level FieldCaches have a very fast docID to top-level ordinal lookup. Fast access to the top-level ordinals allows for very high performance field collapsing on high cardinality fields. LUCENE-5666 unified the DocValues and FieldCache api's so that the top level FieldCache is no longer in regular use. Instead all top level caches are accessed through MultiDocValues. This ticket does the following: 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the default approach when collapsing on String fields 2) Provides an option to use a top level FieldCache if the performance of MultiDocValues is a blocker. The mechanism for switching to the FieldCache is a new hint parameter. If the hint parameter is set to top_fc then the top-level FieldCache would be used for both Collapse and Expand. Example syntax: {code} fq={!collapse field=x hint=TOP_FC} {code} 3) Adds numeric collapse field implementations. 4) Resolves issue SOLR-6066 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
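As a sketch of what accepting both cases could look like, a hypothetical snippet (not the actual CollapsingQParserPlugin code; the parameter name comes from the hint local-param described in this issue):
{code}
import org.apache.solr.common.params.SolrParams;

// Hypothetical: treat the hint local-param case-insensitively so that both
// hint=TOP_FC and hint=top_fc select the top-level FieldCache code path.
boolean useTopLevelFieldCache(SolrParams localParams) {
  return "top_fc".equalsIgnoreCase(localParams.get("hint"));
}
{code}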
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286314#comment-14286314 ] Uwe Schindler commented on SOLR-6991: - The last comment was just an idea, but doesn't work. The problem here is that initialization of the parser fails, so it will always call TesseractOCRParser.getSupportedTypes()... Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7013) Unclear error message with solr script when lacking jar executable
[ https://issues.apache.org/jira/browse/SOLR-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-7013: --- Priority: Blocker (was: Major) Fix Version/s: 5.0 Unclear error message with solr script when lacking jar executable -- Key: SOLR-7013 URL: https://issues.apache.org/jira/browse/SOLR-7013 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: Fedora 21 Reporter: Derek Wood Priority: Blocker Fix For: 5.0 Attachments: solr.patch Fedora 21 doesn't ship the jar executable with the default jdk package, so the attempt to extract webapp/solr.war in the solr script can fail without a clear error message. The attached patch adds this error message and includes support for the unzip utility. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7005) facet.heatmap for spatial heatmap faceting on RPT
[ https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-7005: --- Attachment: SOLR-7005_heatmap.patch Thanks for the encouragement Shalin, and Erik on #lucene-dev, and others via email who have gotten wind of this. Here's the first-draft patch. It is still based on being its own SearchComponent, and it doesn't yet support distributed-search -- those issues should be addressed next. I added support for the distErr parameter to facilitate computing the grid level in the same fashion as used by Lucene spatial to ultimately derive a grid level for a given shape (a rect/box in this case). In fact it re-uses utility methods in Lucene spatial to compute the grid level given the world boundary, distErr (if provided) and distErrPct (if provided). The units of distErr are the same as the distanceUnits attribute on the field type (a new Solr 5 thing). So if the unit is a kilometer and distErr is 100, then the grid cells returned are at least as precise as 100 kilometers (which BTW is a little less than a spherical degree for Earth, which is 111.2km). The 512x256 heatmap I uploaded was generated by specifying distErr=111.2. A client could compute a distErr if they instead know how many minimum cells they want in the heatmap. I may bake that formula in and provide a minCells param. For distributed-search, I'm thinking the internal shard requests will use PNG since it's compressed, and then the user can get whatever format they asked for. I only want to write the aggregation logic once, not per-format :-) As a part of this work I found it useful to add SpatialUtils.parseRectangle, which parses the {{[lowerLeftPoint TO upperRightPoint]}} format. In another issue I want to re-use this to provide a more Solr-friendly way of indexing a rectangle (e.g. for BBoxField or RPT) or for specifying worldBounds on the field type. Even though I don't have distributed-search implemented yet, the test extends BaseDistributedSearchTestCase anyway. I dislike the idea of writing two tests that test the same thing (one distributed, one not) when the infrastructure should make it indifferent, since it's transparent to the input and output I'm testing. Unfortunately, assertQ and friends are hard-coded to use TestHarness, which is in turn hard-coded to use an embedded Solr instance. And unfortunately, BaseDistributedSearchTestCase doesn't let me test 0 shards (hey, I haven't implemented that feature yet!). The patch tweaks BaseDistributedSearchTestCase slightly to let me do this. facet.heatmap for spatial heatmap faceting on RPT - Key: SOLR-7005 URL: https://issues.apache.org/jira/browse/SOLR-7005 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.1 Attachments: SOLR-7005_heatmap.patch, heatmap_512x256.png, heatmap_64x32.png This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell counter in Lucene spatial LUCENE-6191. This is a form of faceting, and as such I think it should live in the facet parameter namespace. Here's what the parameters are: * facet=true * facet.heatmap=fieldname * facet.heatmap.bbox=\[-180 -90 TO 180 90] * facet.heatmap.gridLevel=6 * facet.heatmap.distErrPct=0.10 Like other faceting features, the fieldName can have local-params to exclude filter queries or specify an output key. The bbox is optional; you get the whole world or you can specify a box or actually any shape that WKT supports (you get the bounding box of whatever you put).
Ultimately, this feature needs to know the grid level, which together with the input shape will yield a certain number of cells. You can specify gridLevel exactly, or don't and instead provide distErrPct which is computed like it is for the RPT field type as seen in the schema. 0.10 yielded ~4k cells but it'll vary. There's also a facet.heatmap.maxCells safety net defaulting to 100k. Exceed this and you get an error. The output is (JSON): {noformat} {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0, 0, 2, 1, ],[1, 1, 3, 2, ...],...]} {noformat} counts is null if all would be 0. Perhaps individual row arrays should likewise be null... I welcome feedback. I'm toying with an output format option in which you can specify a base-64'ed grayscale PNG. Obviously this should support sharded / distributed environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
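For readers wondering what a request looks like from SolrJ, a hedged sketch using only the parameters listed above; "geo" is a hypothetical RPT field name, and the facet response is assumed to carry the gridLevel/columns/rows/counts structure shown in the description.
{code}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

// Sketch only: builds the facet.heatmap request described in this issue.
QueryResponse queryHeatmap(SolrClient client) throws Exception {
  SolrQuery q = new SolrQuery("*:*");
  q.setFacet(true);
  q.set("facet.heatmap", "geo");                       // hypothetical spatial RPT field
  q.set("facet.heatmap.bbox", "[-180 -90 TO 180 90]");
  q.set("facet.heatmap.distErrPct", "0.10");           // or set facet.heatmap.gridLevel directly
  return client.query(q);                              // counts come back as the 2-D grid above
}
{code}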
[jira] [Updated] (SOLR-6581) Efficient DocValues support and numeric collapse field implementations for Collapse and Expand
[ https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-6581: - Description: The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent are optimized to work with a top level FieldCache. Top level FieldCaches have a very fast docID to top-level ordinal lookup. Fast access to the top-level ordinals allows for very high performance field collapsing on high cardinality fields. LUCENE-5666 unified the DocValues and FieldCache api's so that the top level FieldCache is no longer in regular use. Instead all top level caches are accessed through MultiDocValues. This ticket does the following: 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the default approach when collapsing on String fields 2) Provides an option to use a top level FieldCache if the performance of MultiDocValues is a blocker. The mechanism for switching to the FieldCache is a new hint parameter. If the hint parameter is set to top_fc then the top-level FieldCache would be used for both Collapse and Expand. Example syntax: {code} fq={!collapse field=x hint=TOP_FC} {code} 3) Adds numeric collapse field implementations. 4) Resolves issue SOLR-6066 was: The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent are optimized to work with a top level FieldCache. Top level FieldCaches have a very fast docID to top-level ordinal lookup. Fast access to the top-level ordinals allows for very high performance field collapsing on high cardinality fields. LUCENE-5666 unified the DocValues and FieldCache api's so that the top level FieldCache is no longer in regular use. Instead all top level caches are accessed through MultiDocValues. This ticket does the following: 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the default approach when collapsing on String fields 2) Provides an option to use a top level FieldCache if the performance of MultiDocValues is a blocker. The mechanism for switching to the FieldCache is a new hint parameter. If the hint parameter is set to top_fc then the top-level FieldCache would be used for both Collapse and Expand. Example syntax: {code} fq={!collapse field=x hint=top_fc} {code} 3) Adds numeric collapse field implementations. 4) Resolves issue SOLR-6066 Efficient DocValues support and numeric collapse field implementations for Collapse and Expand -- Key: SOLR-6581 URL: https://issues.apache.org/jira/browse/SOLR-6581 Project: Solr Issue Type: Bug Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 5.0, Trunk Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, renames.diff The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent are optimized to work with a top level FieldCache. Top level FieldCaches have a very fast docID to top-level ordinal lookup. Fast access to the top-level ordinals allows for very high performance field collapsing on high cardinality fields. LUCENE-5666 unified the DocValues and FieldCache api's so that the top level FieldCache is no longer in regular use. Instead all top level caches are accessed through MultiDocValues. 
This ticket does the following: 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the default approach when collapsing on String fields 2) Provides an option to use a top level FieldCache if the performance of MultiDocValues is a blocker. The mechanism for switching to the FieldCache is a new hint parameter. If the hint parameter is set to top_fc then the top-level FieldCache would be used for both Collapse and Expand. Example syntax: {code} fq={!collapse field=x hint=TOP_FC} {code} 3) Adds numeric collapse field implementations. 4) Resolves issue SOLR-6066 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2522 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2522/ 4 tests failed. FAILED: org.apache.solr.cloud.HttpPartitionTest.testDistribSearch Error Message: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:12771/c8n_1x2_shard1_replica2 Stack Trace: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:12771/c8n_1x2_shard1_replica2 at __randomizedtesting.SeedInfo.seed([F5B2AF58A025A0EF:74542140D77AC0D3]:0) at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:581) at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:890) at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:793) at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:736) at org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:480) at org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:201) at org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (SOLR-7005) facet.heatmap for spatial heatmap faceting on RPT
[ https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286232#comment-14286232 ] David Smiley commented on SOLR-7005: Oh, facet.heatmap.format=png (or ints, ints being the default) facet.heatmap for spatial heatmap faceting on RPT - Key: SOLR-7005 URL: https://issues.apache.org/jira/browse/SOLR-7005 Project: Solr Issue Type: New Feature Components: spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.1 Attachments: SOLR-7005_heatmap.patch, heatmap_512x256.png, heatmap_64x32.png This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell counter in Lucene spatial LUCENE-6191. This is a form of faceting, and as-such I think it should live in the facet parameter namespace. Here's what the parameters are: * facet=true * facet.heatmap=fieldname * facet.heatmap.bbox=\[-180 -90 TO 180 90] * facet.heatmap.gridLevel=6 * facet.heatmap.distErrPct=0.10 Like other faceting features, the fieldName can have local-params to exclude filter queries or specify an output key. The bbox is optional; you get the whole world or you can specify a box or actually any shape that WKT supports (you get the bounding box of whatever you put). Ultimately, this feature needs to know the grid level, which together with the input shape will yield a certain number of cells. You can specify gridLevel exactly, or don't and instead provide distErrPct which is computed like it is for the RPT field type as seen in the schema. 0.10 yielded ~4k cells but it'll vary. There's also a facet.heatmap.maxCells safety net defaulting to 100k. Exceed this and you get an error. The output is (JSON): {noformat} {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0, 0, 2, 1, ],[1, 1, 3, 2, ...],...]} {noformat} counts is null if all would be 0. Perhaps individual row arrays should likewise be null... I welcome feedback. I'm toying with an output format option in which you can specify a base-64'ed grayscale PNG. Obviously this should support sharded / distributed environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py
[ https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286282#comment-14286282 ] Erick Erickson commented on LUCENE-6188: Thanks! Back from 2 days onsite so I can pay some attention now. Remove HTML verification from checkJavaDocs.py -- Key: LUCENE-6188 URL: https://issues.apache.org/jira/browse/LUCENE-6188 Project: Lucene - Core Issue Type: Improvement Components: general/javadocs Reporter: Ramkumar Aiyengar Assignee: Erick Erickson Priority: Minor Attachments: LUCENE-6188.patch, LUCENE-6188.patch Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in some cases (see SOLR-6902). On looking further to fix it with the {{html.parser}} package instead, noticed that there is broken HTML verification already present (using {{html.parser}}!)in {{checkJavadocLinks.py}} anyway which takes care of validation, and probably {{jTidy}} does it as well, going by the output (haven't verified it). Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any further value, so here's a patch to just nuke it instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-5.0-Linux (32bit/ibm-j9-jdk7) - Build # 26 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.0-Linux/26/ Java: 32bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} 45 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicZkTest Error Message: Could not get the port for ZooKeeper server Stack Trace: java.lang.RuntimeException: Could not get the port for ZooKeeper server at __randomizedtesting.SeedInfo.seed([89F35FDD4F4E4D35]:0) at org.apache.solr.cloud.ZkTestServer.run(ZkTestServer.java:482) at org.apache.solr.cloud.AbstractZkTestCase.azt_beforeClass(AbstractZkTestCase.java:68) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) at java.lang.reflect.Method.invoke(Method.java:619) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:767) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:853) FAILED: junit.framework.TestSuite.org.apache.solr.cloud.BasicZkTest Error Message: Stack Trace: java.lang.NullPointerException at __randomizedtesting.SeedInfo.seed([89F35FDD4F4E4D35]:0) at org.apache.zookeeper.server.NIOServerCnxnFactory.getLocalPort(NIOServerCnxnFactory.java:134) at org.apache.solr.cloud.ZkTestServer$ZKServerMain.shutdown(ZkTestServer.java:334) at org.apache.solr.cloud.ZkTestServer.shutdown(ZkTestServer.java:492) at org.apache.solr.cloud.AbstractZkTestCase.azt_afterClass(AbstractZkTestCase.java:158) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) at java.lang.reflect.Method.invoke(Method.java:619) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:790) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[jira] [Commented] (LUCENE-6161) Applying deletes is sometimes dog slow
[ https://issues.apache.org/jira/browse/LUCENE-6161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286421#comment-14286421 ] Robert Muir commented on LUCENE-6161: - Just a few minor thoughts: Some of the iteration is more awkward now, it might be nice to open a followup to clean this up. delGen is awkward to see being held in PrefixCodedTerms, and we have an iterator api that ... is neither termsenum or iterable but another one instead. I wonder if we could have the same logic, but using a more natural one. if it would just make the code even more awkward, then screw it :) We should fix the issue though for now I think. Applying deletes is sometimes dog slow -- Key: LUCENE-6161 URL: https://issues.apache.org/jira/browse/LUCENE-6161 Project: Lucene - Core Issue Type: Bug Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk Attachments: LUCENE-6161.patch, LUCENE-6161.patch, LUCENE-6161.patch, LUCENE-6161.patch, LUCENE-6161.patch I hit this while testing various use cases for LUCENE-6119 (adding auto-throttle to ConcurrentMergeScheduler). When I tested always call updateDocument (each add buffers a delete term), with many indexing threads, opening an NRT reader once per second (forcing all deleted terms to be applied), I see that BufferedUpdatesStream.applyDeletes sometimes seems to take a lng time, e.g.: {noformat} BD 0 [2015-01-04 09:31:12.597; Lucene Merge Thread #69]: applyDeletes took 339 msec for 10 segments, 117 deleted docs, 607333 visited terms BD 0 [2015-01-04 09:31:18.148; Thread-4]: applyDeletes took 5533 msec for 62 segments, 10989 deleted docs, 8517225 visited terms BD 0 [2015-01-04 09:31:21.463; Lucene Merge Thread #71]: applyDeletes took 1065 msec for 10 segments, 470 deleted docs, 1825649 visited terms BD 0 [2015-01-04 09:31:26.301; Thread-5]: applyDeletes took 4835 msec for 61 segments, 14676 deleted docs, 9649860 visited terms BD 0 [2015-01-04 09:31:35.572; Thread-11]: applyDeletes took 6073 msec for 72 segments, 13835 deleted docs, 11865319 visited terms BD 0 [2015-01-04 09:31:37.604; Lucene Merge Thread #75]: applyDeletes took 251 msec for 10 segments, 58 deleted docs, 240721 visited terms BD 0 [2015-01-04 09:31:44.641; Thread-11]: applyDeletes took 5956 msec for 64 segments, 15109 deleted docs, 10599034 visited terms BD 0 [2015-01-04 09:31:47.814; Lucene Merge Thread #77]: applyDeletes took 396 msec for 10 segments, 137 deleted docs, 719914 visit {noformat} What this means is even though I want an NRT reader every second, often I don't get one for up to ~7 or more seconds. This is on an SSD, machine has 48 GB RAM, heap size is only 2 GB. 12 indexing threads. As hideously complex as this code is, I think there are some inefficiencies, but fixing them could be hard / make code even hairier ... Also, this code is mega-locked: holds IW's lock, holds BD's lock. It blocks things like merges kicking off or finishing... E.g., we pull the MergedIterator many times on the same set of sub-iterators. Maybe we can create the sorted terms up front and reuse that? Maybe we should go term stride (one term visits all N segments) not segment stride (visit each segment, iterating all deleted terms for it). Just iterating the terms to be deleted takes a sizable part of the time, and we now do that once for every segment in the index. 
Also, the isUnique bit in LUCENE-6005 should help here, since if we know the field is unique, we can stop seekExact once we found a segment that has the deleted term, we can maybe pass false for removeDuplicates to MergedIterator... -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
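To make the stride discussion concrete, an illustrative sketch of the two loop orders being contrasted; names like SegmentState, sortedDeleteTerms and applyDelete are stand-ins, not the real BufferedUpdatesStream code.
{code}
// Segment stride (roughly today's shape): every segment re-walks all buffered
// delete terms, so the sorted-terms iteration cost is paid once per segment.
for (SegmentState segment : segments) {
  for (Term term : sortedDeleteTerms) {
    applyDelete(segment, term);
  }
}

// Term stride (the alternative floated above): sort the delete terms once, then
// let each term visit all N segments, paying the term-iteration cost only once.
for (Term term : sortedDeleteTerms) {
  for (SegmentState segment : segments) {
    applyDelete(segment, term);
  }
}
{code}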
[jira] [Updated] (SOLR-7013) Unclear error message with solr script when lacking jar executable
[ https://issues.apache.org/jira/browse/SOLR-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Derek Wood updated SOLR-7013: - Attachment: solr.patch Unclear error message with solr script when lacking jar executable -- Key: SOLR-7013 URL: https://issues.apache.org/jira/browse/SOLR-7013 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: Fedora 21 Reporter: Derek Wood Attachments: solr.patch Fedora 21 doesn't ship the jar executable with the default jdk package, so the attempt to extract webapp/solr.war in the solr script can fail without a clear error message. The attached patch adds this error message and includes support for the unzip utility. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-6845: Attachment: tests-failures.txt
I just saw a local failure on trunk on org.apache.solr.handler.component.SuggestComponentTest.testDefaultBuildOnStartupStoredDict. The logs are attached and the stack trace is:
{code}
2> 786070 T7047 oas.SolrTestCaseJ4.assertQ ERROR REQUEST FAILED: xpath=//lst[@name='suggest']/lst[@name='suggest_doc_default_startup']/lst[@name='example']/int[@name='numFound'][.='0']
2> xml response was: <?xml version="1.0" encoding="UTF-8"?>
2> <response>
2> <lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst><lst name="suggest"><lst name="suggest_doc_default_startup"><lst name="example"><int name="numFound">2</int><arr name="suggestions"><lst><str name="term">example inputdata</str><long name="weight">45</long><str name="payload"/></lst><lst><str name="term">example data</str><long name="weight">40</long><str name="payload"/></lst></arr></lst></lst></lst>
2> </response>
2>
2> request was:qt=/suggest&suggest.q=example&suggest.count=2&suggest.dictionary=suggest_doc_default_startup&wt=xml
2> 786071 T7047 oasc.SolrException.log ERROR REQUEST FAILED: qt=/suggest&suggest.q=example&suggest.count=2&suggest.dictionary=suggest_doc_default_startup&wt=xml:java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='suggest']/lst[@name='suggest_doc_default_startup']/lst[@name='example']/int[@name='numFound'][.='0']
2> xml response was: <?xml version="1.0" encoding="UTF-8"?>
2> <response>
2> <lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst><lst name="suggest"><lst name="suggest_doc_default_startup"><lst name="example"><int name="numFound">2</int><arr name="suggestions"><lst><str name="term">example inputdata</str><long name="weight">45</long><str name="payload"/></lst><lst><str name="term">example data</str><long name="weight">40</long><str name="payload"/></lst></arr></lst></lst></lst>
2> </response>
2>
2> request was:qt=/suggest&suggest.q=example&suggest.count=2&suggest.dictionary=suggest_doc_default_startup&wt=xml
2> at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:741)
2> at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:715)
2> at org.apache.solr.handler.component.SuggestComponentTest.testDefaultBuildOnStartupStoredDict(SuggestComponentTest.java:257)
2> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
{code}
ant test -Dtestcase=SuggestComponentTest -Dtests.method=testDefaultBuildOnStartupStoredDict -Dtests.seed=1AE9946D9D16B26E -Dtests.slow=true -Dtests.locale=en -Dtests.timezone=Asia/Istanbul -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
{code}
I tried a few times but couldn't reproduce it.
Add buildOnStartup option for suggesters Key: SOLR-6845 URL: https://issues.apache.org/jira/browse/SOLR-6845 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Tomás Fernández Löbbe Fix For: Trunk, 5.1 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, tests-failures.txt SOLR-6679 was filed to track the investigation into the following problem... {panel} The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index. ... This is what I did: 1) indexed 10M very small docs (only takes a few minutes). 2) shut down Solr 3) start up Solr and watch it be unresponsive for over 4 minutes! I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler. {panel} ..but ultimately focused on removing/disabling the suggester from the sample configs.
Opening this new issue to focus on actually trying to identify the root problem & fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py
[ https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286294#comment-14286294 ] Robert Muir commented on LUCENE-6188: - {quote} If it's not adding value anymore (e.g. we recently turned on faster javadocs checking via javac's doclint options), I agree we should remove it: it's slow and hackity and un-understandable. {quote} The doclint stuff added (TRUNK ONLY) is blazing fast and nice, but there is a good amount of work before its checking html, i see these steps: * actually turn on html verification in doclint. this can't be done until a lot of problems are fixed. When they are fixed we can enable html: {noformat}-Xdoclint:all/protected -Xdoclint:-html -Xdoclint:-missing{noformat} * figure out how to check overview.html and package.html. I suspect they are currently not being checked (but maybe im wrong). Maybe we can ask the openjdk developers about it. Then jtidy could be removed completely. python linting is still needed until we can properly enable missing and cutover build logic to that. Then i think its check-missing could be removed. As far as the python broken links checker, im not sure if there is a replacement. Ideally we are just using doclint for all checks in the future. Remove HTML verification from checkJavaDocs.py -- Key: LUCENE-6188 URL: https://issues.apache.org/jira/browse/LUCENE-6188 Project: Lucene - Core Issue Type: Improvement Components: general/javadocs Reporter: Ramkumar Aiyengar Assignee: Erick Erickson Priority: Minor Attachments: LUCENE-6188.patch, LUCENE-6188.patch Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in some cases (see SOLR-6902). On looking further to fix it with the {{html.parser}} package instead, noticed that there is broken HTML verification already present (using {{html.parser}}!)in {{checkJavadocLinks.py}} anyway which takes care of validation, and probably {{jTidy}} does it as well, going by the output (haven't verified it). Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any further value, so here's a patch to just nuke it instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6954) Considering changing SolrClient#shutdown to SolrClient#close.
[ https://issues.apache.org/jira/browse/SOLR-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward updated SOLR-6954: Attachment: SOLR-6954.patch Patch making SolrClient implement Closeable, and making shutdown() a deprecated concrete method that delegates to close(). Also cuts over all tests to use close() (and try-with-resources where possible). Considering changing SolrClient#shutdown to SolrClient#close. - Key: SOLR-6954 URL: https://issues.apache.org/jira/browse/SOLR-6954 Project: Solr Issue Type: Improvement Reporter: Mark Miller Fix For: 5.0, Trunk Attachments: SOLR-6954.patch SolrClient#shutdown is not as odd as SolrServer#shutdown, but as we want users to release these objects, close is more standard and if we implement Closeable, tools help point out leaks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
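In outline, the change described in the patch looks something like the following; this is a minimal sketch assuming the existing SolrClient class, not the patch itself, and the method bodies and exception handling are simplified:
{code}
import java.io.Closeable;
import java.io.IOException;

public abstract class SolrClient implements Closeable {

  /** @deprecated use {@link #close()} instead. */
  @Deprecated
  public void shutdown() {
    try {
      close();                        // old entry point delegates to the new one
    } catch (IOException e) {
      throw new RuntimeException(e);  // shutdown() historically declared no checked exception
    }
  }

  @Override
  public abstract void close() throws IOException;  // concrete clients release their resources here
}
{code}
Callers can then release clients with try-with-resources, e.g. {{try (SolrClient client = ...) { ... }}}, which is what the tests were cut over to, and Closeable-aware tooling can flag leaks.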
[jira] [Updated] (SOLR-7014) Collapse identical catch branches in try-catch statements
[ https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-7014: Attachment: SOLR-7014.patch This takes care of all solr classes. I'll attach another one which does the same for lucene. Collapse identical catch branches in try-catch statements - Key: SOLR-7014 URL: https://issues.apache.org/jira/browse/SOLR-7014 Project: Solr Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 Attachments: SOLR-7014.patch We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
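As a generic before/after illustration of the kind of change this issue makes (not code from the patch; the method and exception names here are placeholders):
{code}
import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;

class MultiCatchExample {
  void before() {
    try {
      doWork();
    } catch (IOException e) {          // two branches with...
      handle(e);
    } catch (SolrServerException e) {  // ...identical bodies
      handle(e);
    }
  }

  void after() {
    try {
      doWork();
    } catch (IOException | SolrServerException e) {  // collapsed into one Java 7 multi-catch branch
      handle(e);
    }
  }

  void doWork() throws IOException, SolrServerException { }
  void handle(Exception e) { }
}
{code}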
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286272#comment-14286272 ] Uwe Schindler commented on SOLR-6991: - Hi, I checked the code. The problem is: You cannot disable by config (because it always tries to execute the command thats part of the default config file). If the config file is not there, then it runs TESSERACT without any path. The only way to work around is: - Disable the whole parser (f*ck, because then we need to maintain our own parser list internally). There is no way to tell TIKA to exclude some parsers (something like AutodetectParser#disableParser(name/class/whatever) - Use a hack with reflection to make TesseractOCRParser#TESSERACT_PRESENT return false for any path... Just replace the static map by one that returns false for any key (LOL) and ignores any put() Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
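The reflection hack sketched in the comment would look roughly like this; the field name comes from the comment above, but its exact type, modifiers and the parser's lookup logic are assumptions, and this is not the fix that was committed:
{code}
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import org.apache.tika.parser.ocr.TesseractOCRParser;

final class DisableTesseractHack {

  /** Swap the parser's static cache for a map that reports "no tesseract" for every path. */
  static void install() throws ReflectiveOperationException {
    Map<String, Boolean> alwaysAbsent = new HashMap<String, Boolean>() {
      @Override public Boolean get(Object key) { return Boolean.FALSE; }                 // every lookup says "not present"
      @Override public boolean containsKey(Object key) { return true; }                  // so the parser never probes the binary
      @Override public Boolean put(String key, Boolean value) { return Boolean.FALSE; }  // ignore any put()
    };
    Field cache = TesseractOCRParser.class.getDeclaredField("TESSERACT_PRESENT");
    cache.setAccessible(true);
    cache.set(null, alwaysAbsent);  // static field; this fails if the field is final or named differently
  }
}
{code}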
[jira] [Created] (SOLR-7014) Collapse identical catch branches in try-catch statements
Shalin Shekhar Mangar created SOLR-7014: --- Summary: Collapse identical catch branches in try-catch statements Key: SOLR-7014 URL: https://issues.apache.org/jira/browse/SOLR-7014 Project: Solr Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286307#comment-14286307 ] Uwe Schindler commented on SOLR-6991: - One trick could work: TIKA prefers always external parsers loaded by SPI. The trick here would be to add a /META-INF/services/... file that lists a subclass of the Tesseract parser that just always returns no supported media types. TIKA would use our subclass in preference to the one shipped. By that we could disable the parser. I have not checked this, but this would be another hack (that I don't like, too). Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
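To make the SPI idea concrete, such a subclass might look like the sketch below; as the comment says, whether Tika actually prefers the SPI-registered class over its stock TesseractOCRParser has not been verified:
{code}
import java.util.Collections;
import java.util.Set;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.ocr.TesseractOCRParser;

/**
 * Would be listed in META-INF/services/org.apache.tika.parser.Parser so that
 * Tika's SPI loading picks it up. Advertising no media types means the parser
 * is never consulted, so ExternalParser.check()/Runtime.exec() is never reached.
 */
public class NoTypesTesseractOCRParser extends TesseractOCRParser {
  @Override
  public Set<MediaType> getSupportedTypes(ParseContext context) {
    return Collections.emptySet();
  }
}
{code}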
[jira] [Commented] (SOLR-7014) Collapse identical catch branches in try-catch statements
[ https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286324#comment-14286324 ] ASF subversion and git services commented on SOLR-7014: --- Commit 1653665 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1653665 ] SOLR-7014: Collapse identical catch branches in try-catch statements Collapse identical catch branches in try-catch statements - Key: SOLR-7014 URL: https://issues.apache.org/jira/browse/SOLR-7014 Project: Solr Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 Attachments: SOLR-7014.patch We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6845) Add buildOnStartup option for suggesters
[ https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286323#comment-14286323 ] Tomás Fernández Löbbe commented on SOLR-6845: - I'll take a look Add buildOnStartup option for suggesters Key: SOLR-6845 URL: https://issues.apache.org/jira/browse/SOLR-6845 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Tomás Fernández Löbbe Fix For: Trunk, 5.1 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, tests-failures.txt SOLR-6679 was filed to track the investigation into the following problem... {panel} The stock solrconfig provides a bad experience with a large index... start up Solr and it will spin at 100% CPU for minutes, unresponsive, while it apparently builds a suggester index. ... This is what I did: 1) indexed 10M very small docs (only takes a few minutes). 2) shut down Solr 3) start up Solr and watch it be unresponsive for over 4 minutes! I didn't even use any of the fields specified in the suggester config and I never called the suggest request handler. {panel} ..but ultimately focused on removing/disabling the suggester from the sample configs. Opening this new issue to focus on actually trying to identify the root problem fix it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 11493 - Failure!
J9 bug. Mike McCandless http://blog.mikemccandless.com On Wed, Jan 21, 2015 at 5:53 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11493/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} 1 tests failed. FAILED: org.apache.lucene.codecs.lucene49.TestLucene49NormsFormat.testByteRange Error Message: Stack Trace: java.lang.NullPointerException at __randomizedtesting.SeedInfo.seed([1EFEBBCD258C8490:D78182FF451AC405]:0) at org.apache.lucene.codecs.lucene49.Lucene49NormsConsumer$NormMap.add(Lucene49NormsConsumer.java:206) at org.apache.lucene.codecs.lucene49.Lucene49NormsConsumer.addNormsField(Lucene49NormsConsumer.java:95) at org.apache.lucene.index.NormValuesWriter.flush(NormValuesWriter.java:72) at org.apache.lucene.index.DefaultIndexingChain.writeNorms(DefaultIndexingChain.java:204) at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:92) at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:419) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:503) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:615) at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2733) at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2888) at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2855) at org.apache.lucene.index.RandomIndexWriter.commit(RandomIndexWriter.java:257) at org.apache.lucene.index.BaseNormsFormatTestCase.doTestNormsVersusStoredFields(BaseNormsFormatTestCase.java:261) at org.apache.lucene.index.BaseNormsFormatTestCase.testByteRange(BaseNormsFormatTestCase.java:54) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) at java.lang.reflect.Method.invoke(Method.java:619) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286477#comment-14286477 ] Steve Rowe commented on SOLR-6991: -- bq. don't we need similar assumes in dataimporthandler-extras tests that use TikaEntityProcessor? (i'm not sure why those wouldn't fail with turkish now as well) I ran {{ant test -Dtests.slow=true -Dtests.locale=tr_TR}} in {{solr/contrib/dataimporthandler-extras/}}, and got the following failure: {noformat} [junit4] Suite: org.apache.solr.handler.dataimport.TestTikaEntityProcessor [junit4] 2 Creating dataDir: /Users/sarowe/svn/lucene/dev/trunk2/solr/build/contrib/solr-dataimporthandler-extras/test/J0/temp/solr.handler.dataimport.TestTikaEntityProcessor 9123B7DE098A1C98-001/init-core-data-001 [junit4] 2 log4j:WARN No appenders could be found for logger (org.apache.solr.SolrTestCaseJ4). [junit4] 2 log4j:WARN Please initialize the log4j system properly. [junit4] 2 log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestTikaEntityProcessor -Dtests.method=testTikaHTMLMapperIdentity -Dtests.seed=9123B7DE098A1C98 -Dtests.slow=true -Dtests.locale=tr_TR -Dtests.timezone=America/Toronto -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] ERROR 0.93s J0 | TestTikaEntityProcessor.testTikaHTMLMapperIdentity [junit4] Throwable #1: java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. [junit4]at __randomizedtesting.SeedInfo.seed([9123B7DE098A1C98:C15C334FC0BEE965]:0) [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:105) [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:94) [junit4]at java.security.AccessController.doPrivileged(Native Method) [junit4]at java.lang.UNIXProcess.clinit(UNIXProcess.java:92) [junit4]at java.lang.ProcessImpl.start(ProcessImpl.java:130) [junit4]at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) [junit4]at java.lang.Runtime.exec(Runtime.java:620) [junit4]at java.lang.Runtime.exec(Runtime.java:485) [junit4]at org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344) [junit4]at org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117) [junit4]at org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90) [junit4]at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) [junit4]at org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95) [junit4]at org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229) [junit4]at org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81) [junit4]at org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209) [junit4]at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244) [junit4]at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120) [junit4]at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:141) [junit4]at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243) [junit4]at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476) [junit4]at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415) [junit4]at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330) [junit4]at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232) 
[junit4]at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416) [junit4]at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480) [junit4]at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:189) [junit4]at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144) [junit4]at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006) [junit4]at org.apache.solr.util.TestHarness.query(TestHarness.java:331) [junit4]at org.apache.solr.handler.dataimport.AbstractDataImportHandlerTestCase.runFullImport(AbstractDataImportHandlerTestCase.java:86) [junit4]at org.apache.solr.handler.dataimport.TestTikaEntityProcessor.testTikaHTMLMapperIdentity(TestTikaEntityProcessor.java:99) [junit4]
[jira] [Commented] (SOLR-6969) Just like we have to retry when the NameNode is in safemode on Solr startup, we also need to retry when opening a transaction log file for append when we get a RecoveryI
[ https://issues.apache.org/jira/browse/SOLR-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286485#comment-14286485 ] Mike Drob commented on SOLR-6969: - Is retrying always going to be safe? That works fine after we've lost a server and started a new one (albeit too quickly) but what about the case where two servers both think they are responsible for that tlog? This can happen if the original server partially dies, but still has some threads that are doing work and haven't been cleaned up. Looking at how other projects handle similar issues - HBase moves the entire directory[1] to break any existing leases and ensure any other processes gets kicked out. Maybe a retry is a good stop-gap, but is it going to be a full solution? [1]: https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java#L310 Just like we have to retry when the NameNode is in safemode on Solr startup, we also need to retry when opening a transaction log file for append when we get a RecoveryInProgressException. Key: SOLR-6969 URL: https://issues.apache.org/jira/browse/SOLR-6969 Project: Solr Issue Type: Bug Components: hdfs Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 5.0, Trunk This can happen after a hard crash and restart. The current workaround is to stop and wait it out and start again. We should retry and wait a given amount of time as we do when we detect safe mode though. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
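A bounded retry along the lines the issue asks for might look like the sketch below; the helper that recognizes the HDFS lease-recovery case and the timeout values are assumptions, and Mike's question about whether retrying is always safe applies to this sketch as much as to any real implementation:
{code}
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class TlogAppendRetry {

  /** Opens an existing tlog for append, retrying while HDFS lease recovery is still in progress. */
  static FSDataOutputStream openForAppend(FileSystem fs, Path tlog, long timeoutMs)
      throws IOException, InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (true) {
      try {
        return fs.append(tlog);
      } catch (IOException e) {
        if (!isRecoveryInProgress(e) || System.nanoTime() > deadline) {
          throw e;             // not the lease-recovery case, or we waited long enough
        }
        Thread.sleep(500);     // give the NameNode time to finish lease recovery, then retry
      }
    }
  }

  // Hypothetical helper: a real version would unwrap RemoteException and check the exception class.
  private static boolean isRecoveryInProgress(IOException e) {
    String msg = String.valueOf(e.getMessage());
    return e.getClass().getSimpleName().equals("RecoveryInProgressException")
        || msg.contains("RecoveryInProgressException");
  }
}
{code}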
[jira] [Updated] (SOLR-6928) solr.cmd stop works only in english
[ https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-6928: -- Attachment: SOLR-6928.patch Slightly improved patch
* No need for case insensitive find
* Require a space after port number to avoid false match
solr.cmd stop works only in english --- Key: SOLR-6928 URL: https://issues.apache.org/jira/browse/SOLR-6928 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: german windows 7 Reporter: john.work Assignee: Timothy Potter Priority: Minor Attachments: SOLR-6928.patch, SOLR-6928.patch in solr.cmd the stop doesn't work while executing 'netstat -nao ^| find /i "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found. e.g. in german cmd.exe the netstat -nao prints the following output:
{noformat}
Proto  Lokale Adresse    Remoteadresse    Status    PID
TCP    0.0.0.0:80        0.0.0.0:0        ABHÖREN   4
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286502#comment-14286502 ] Uwe Schindler commented on SOLR-6991: - [~steve_rowe]: Can you commit to all 3 branches, I wanted to go sleeping? Thanks. Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6928) solr.cmd stop works only in english
[ https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286540#comment-14286540 ] ASF subversion and git services commented on SOLR-6928: --- Commit 1653700 from [~thelabdude] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1653700 ] SOLR-6928: solr.cmd stop works only in english solr.cmd stop works only in english --- Key: SOLR-6928 URL: https://issues.apache.org/jira/browse/SOLR-6928 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: german windows 7 Reporter: john.work Assignee: Timothy Potter Priority: Minor Attachments: SOLR-6928.patch, SOLR-6928.patch in solr.cmd the stop doesnt work while executing 'netstat -nao ^| find /i listening ^| find :%SOLR_PORT%' so listening is not found. e.g. in german cmd.exe the netstat -nao prints the following output: {noformat} Proto Lokale Adresse Remoteadresse Status PID TCP0.0.0.0:80 0.0.0.0:0 ABHÖREN 4 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286558#comment-14286558 ] Steve Rowe commented on SOLR-6991: -- bq. Steve Rowe: Can you commit to all 3 branches, I wanted to go sleeping? Thanks. Will do. Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286562#comment-14286562 ] Steve Rowe commented on SOLR-6991: -- bq. I'm running all Solr tests now with this patch and -Dtests.slow=true -Dtests.locale=tr_TR. All Solr tests passed with the patch. Committing now. Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py
[ https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286584#comment-14286584 ] Ramkumar Aiyengar commented on LUCENE-6188: --- Rob, the logic I have nuked is actually not as a duplicate of doclint (i just didnt check that, and as you mention there might be differences) but the checkJavadocLinks.py script which is run prior to this script in the documentation-lint. That does the exact same check in Python, except it uses a real parser rather than regex hacks.. Remove HTML verification from checkJavaDocs.py -- Key: LUCENE-6188 URL: https://issues.apache.org/jira/browse/LUCENE-6188 Project: Lucene - Core Issue Type: Improvement Components: general/javadocs Reporter: Ramkumar Aiyengar Assignee: Erick Erickson Priority: Minor Attachments: LUCENE-6188.patch, LUCENE-6188.patch Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in some cases (see SOLR-6902). On looking further to fix it with the {{html.parser}} package instead, noticed that there is broken HTML verification already present (using {{html.parser}}!)in {{checkJavadocLinks.py}} anyway which takes care of validation, and probably {{jTidy}} does it as well, going by the output (haven't verified it). Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any further value, so here's a patch to just nuke it instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-6991. -- Resolution: Fixed Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7011) fix OverseerCollectionProcessor.deleteCollection removal-done check
[ https://issues.apache.org/jira/browse/SOLR-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286609#comment-14286609 ] ASF subversion and git services commented on SOLR-7011: --- Commit 1653716 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1653716 ] SOLR-7011: Delete collection returns before collection is actually removed fix OverseerCollectionProcessor.deleteCollection removal-done check --- Key: SOLR-7011 URL: https://issues.apache.org/jira/browse/SOLR-7011 Project: Solr Issue Type: Bug Affects Versions: 4.10.3, 5.0, Trunk Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Priority: Minor {{OverseerCollectionProcessor.java}} line 1184 has a {{.hasCollection(message.getStr(collection))}} call which should be either {{.hasCollection(message.getStr(name))}} or {{.hasCollection(collection)}} instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
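For readers who want to see the shape of the fix, a paraphrased sketch of the removal-done wait is below; the loop and timeout are illustrative, not the actual OverseerCollectionProcessor code:
{code}
import org.apache.solr.common.cloud.ZkNodeProps;
import org.apache.solr.common.cloud.ZkStateReader;

class DeleteCollectionWait {

  /** Polls cluster state until the deleted collection is really gone (or the timeout expires). */
  static void waitForRemoval(ZkStateReader zkStateReader, ZkNodeProps message, long timeoutMs)
      throws InterruptedException {
    // The delete message carries the collection name under "name"; reading "collection"
    // here (which is null for this message) was the bug this issue fixes.
    String collection = message.getStr("name");
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (zkStateReader.getClusterState().hasCollection(collection)
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(100);
    }
  }
}
{code}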
[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py
[ https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286498#comment-14286498 ] Erick Erickson commented on LUCENE-6188: Hmmm, thanks for pointing this out, but it makes things... complicated. Problem is that until this is done, SOLR-6902 is blocked as that patch fails precommit. For no good reason I can find. If I'm reading things right, doclint is only in Java 8, so is simply not an option for 5x even if the problems you point out are fixed up. If I'm reading this right, Ramkumar's claim is that the html checking in this patch that is being removed is unnecessary anyway, so removing it doesn't lose us anything. And it's incorrectly failing this doc for some reason. I checked the generated doc file and it looks fine, I think I even ran it through an XML validator. I could always have missed something of course. That said, the proposed changes in this JIRA take a lot of code out of checkJavaDocs.py, and I'll very much admit I haven't gone through the changes in much detail, but they do appear to just be doing HTML validation. I can treat this somewhat as a black box and do something like apply this patch locally and:
1. create some invalid JavaDoc links and ensure that they're flagged if this patch is applied (any suggestions for a candidate list)? If that works (or, more accurately, fails the invalid javadocs), commit this patch to trunk and 5x and then commit SOLR-6902, or
2. just remove the javadocs from SOLR-6902 or possibly munge them until that code succeeds precommit, or
3. try to figure out what the false failure is here and fix checkJavaDocs.py
I think 1 is my first choice, and 3 is a very distant third. Spending time debugging code that it sounds like we're going to remove on trunk seems like a waste. I may do 2 anyway: remove the javaDocs and put them back if one of the other approaches works. SOLR-6902 is hard to keep up to date since it touches so much; Alan's checkin is already going to be a headache to reconcile. So keeping it out of the code line just because of a bad (and possibly redundant) bit of non-standard HTML checking seems like a poor tradeoff. This last can be argued of course. Anyway, I'll do some poking around and report back before committing anything.
Remove HTML verification from checkJavaDocs.py -- Key: LUCENE-6188 URL: https://issues.apache.org/jira/browse/LUCENE-6188 Project: Lucene - Core Issue Type: Improvement Components: general/javadocs Reporter: Ramkumar Aiyengar Assignee: Erick Erickson Priority: Minor Attachments: LUCENE-6188.patch, LUCENE-6188.patch Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in some cases (see SOLR-6902). On looking further to fix it with the {{html.parser}} package instead, noticed that there is broken HTML verification already present (using {{html.parser}}!) in {{checkJavadocLinks.py}} anyway which takes care of validation, and probably {{jTidy}} does it as well, going by the output (haven't verified it). Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any further value, so here's a patch to just nuke it instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-6193) Collapse identical catch branches in try-catch statements
Shalin Shekhar Mangar created LUCENE-6193: - Summary: Collapse identical catch branches in try-catch statements Key: LUCENE-6193 URL: https://issues.apache.org/jira/browse/LUCENE-6193 Project: Lucene - Core Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. We did the same for solr in SOLR-7014. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7014) Collapse identical catch branches in try-catch statements
[ https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286532#comment-14286532 ] ASF subversion and git services commented on SOLR-7014: --- Commit 1653698 from sha...@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1653698 ] SOLR-7014: Collapse identical catch branches in try-catch statements in morphlines-core Collapse identical catch branches in try-catch statements - Key: SOLR-7014 URL: https://issues.apache.org/jira/browse/SOLR-7014 Project: Solr Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 Attachments: SOLR-7014-more.patch, SOLR-7014.patch We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286623#comment-14286623 ] Anshum Gupta commented on SOLR-6991: Thanks for fixing this everyone! Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2523 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2523/ 5 tests failed. REGRESSION: org.apache.solr.handler.component.SuggestComponentTest.testBuildOnStartupWithNewCores Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([D0016418FD804417:4AC564B67B9CDB93]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:748) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:715) at org.apache.solr.handler.component.SuggestComponentTest.doTestBuildOnStartup(SuggestComponentTest.java:395) at org.apache.solr.handler.component.SuggestComponentTest.testBuildOnStartupWithNewCores(SuggestComponentTest.java:374) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286500#comment-14286500 ] Uwe Schindler commented on SOLR-6991: - Ah you already posted a patch. Thanks for testing. I have only Windows ready to use on my laptop :-) Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-6193) Collapse identical catch branches in try-catch statements
[ https://issues.apache.org/jira/browse/LUCENE-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated LUCENE-6193: -- Attachment: LUCENE-6193.patch The only places where I did not make these changes are where the catch blocks have different comments or the code wasn't ASL. The following were excluded: # org.apache.lucene.analysis.core.TestFactories # org.apache.lucene.index.TestReaderClosed # org.apache.lucene.queryparser.flexible.messages.NLS (one instance) # org.egothor.stemmer.Diff (license different from ASL) # org.tartarus.snowball.SnowballProgram (license different from ASL) Collapse identical catch branches in try-catch statements - Key: LUCENE-6193 URL: https://issues.apache.org/jira/browse/LUCENE-6193 Project: Lucene - Core Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 Attachments: LUCENE-6193.patch We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. We did the same for solr in SOLR-7014. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6928) solr.cmd stop works only in english
[ https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286538#comment-14286538 ] ASF subversion and git services commented on SOLR-6928: --- Commit 1653699 from [~thelabdude] in branch 'dev/trunk' [ https://svn.apache.org/r1653699 ] SOLR-6928: solr.cmd stop works only in english solr.cmd stop works only in english --- Key: SOLR-6928 URL: https://issues.apache.org/jira/browse/SOLR-6928 Project: Solr Issue Type: Bug Components: scripts and tools Affects Versions: 4.10.3 Environment: german windows 7 Reporter: john.work Assignee: Timothy Potter Priority: Minor Attachments: SOLR-6928.patch, SOLR-6928.patch in solr.cmd the stop doesnt work while executing 'netstat -nao ^| find /i listening ^| find :%SOLR_PORT%' so listening is not found. e.g. in german cmd.exe the netstat -nao prints the following output: {noformat} Proto Lokale Adresse Remoteadresse Status PID TCP0.0.0.0:80 0.0.0.0:0 ABHÖREN 4 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-7014) Collapse identical catch branches in try-catch statements
[ https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-7014. - Resolution: Fixed I opened LUCENE-6193 for the lucene changes. Collapse identical catch branches in try-catch statements - Key: SOLR-7014 URL: https://issues.apache.org/jira/browse/SOLR-7014 Project: Solr Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 Attachments: SOLR-7014-more.patch, SOLR-7014.patch We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6193) Collapse identical catch branches in try-catch statements
[ https://issues.apache.org/jira/browse/LUCENE-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286570#comment-14286570 ] ASF subversion and git services commented on LUCENE-6193: - Commit 1653707 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1653707 ] LUCENE-6193: Collapse identical catch branches in try-catch statements Collapse identical catch branches in try-catch statements - Key: LUCENE-6193 URL: https://issues.apache.org/jira/browse/LUCENE-6193 Project: Lucene - Core Issue Type: Task Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Trivial Fix For: Trunk, 5.1 Attachments: LUCENE-6193.patch We are on Java 7+ so we can reduce verbosity by collapsing identical catch statements into one. We did the same for solr in SOLR-7014. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286572#comment-14286572 ] ASF subversion and git services commented on SOLR-6991: --- Commit 1653708 from [~sar...@syr.edu] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653708 ] SOLR-6991,SOLR-6387: Under Turkish locale, don't run solr-cell and dataimporthandler-extras tests that use Tika (merged trunk r1653704) Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6387) Solr specific work around for JDK bug #8047340: posix_spawn error with turkish locale
[ https://issues.apache.org/jira/browse/SOLR-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286573#comment-14286573 ] ASF subversion and git services commented on SOLR-6387: --- Commit 1653708 from [~sar...@syr.edu] in branch 'dev/branches/lucene_solr_5_0' [ https://svn.apache.org/r1653708 ] SOLR-6991,SOLR-6387: Under Turkish locale, don't run solr-cell and dataimporthandler-extras tests that use Tika (merged trunk r1653704) Solr specific work around for JDK bug #8047340: posix_spawn error with turkish locale - Key: SOLR-6387 URL: https://issues.apache.org/jira/browse/SOLR-6387 Project: Solr Issue Type: Bug Environment: Linux, MacOSX, POSIX in general Reporter: Hoss Man Assignee: Uwe Schindler Priority: Minor Labels: Java7, Java8 Fix For: 4.10, Trunk Attachments: SOLR-6387.patch, SOLR-6387.patch Various versions of the Sun/Oracle/OpenJDK JVM have issues executing new processes if the default langauge of the JVM is Turkish. The root bug reports of this affecting Runtime.exec() are here... * https://bugs.openjdk.java.net/browse/JDK-8047340 * https://bugs.openjdk.java.net/browse/JDK-8055301 On systems runining the affected JVMs, with a default langauge of Turkish, this problem has historically manifested itself in Solr in a few ways: * SystemInfoHandler would throw nasty exceptions on these systems due to an attempt at conditionally executing some native process to check system stats * RunExecutableListener would fail cryptically * some solr tests involving either the SystemInfoHandler or the Hadoop MapReduce code would fail if the test framework randomly selected a turkish language based locale. Starting with Solr 4.10, We have worked around this jvm bug in Solr in 3 ways: * RunExecutableListener makes it more clear in the logs why it can't be used * SystemInfoHandler traps and ignores any Error related to posix_span in the same way it traps and ignores other errors related to it's conditional attempts at exec'ing (ie: permission problems, executable not found ,etc...) * our map reduce based tests that depend on exec'ing external processes now skip themselves automatically if a turkish local is randomly selected. Users affected by this issue who, for whatever reasons, can not upgrade to Solr 4.10, may wish to consider setting the jdk.lang.Process.launchMechanism system property explicitly (see below) {panel:title=original issue report} Jenkin's tests occasionally fail with the following cryptic error... {noformat} java.lang.Error: posix_spawn is not a supported process launch mechanism on this platform. at __randomizedtesting.SeedInfo.seed([9219CAA3BCAA7365:7F07719937A772E1]:0) at java.lang.UNIXProcess$1.run(UNIXProcess.java:104) at java.lang.UNIXProcess$1.run(UNIXProcess.java:93) at java.security.AccessController.doPrivileged(Native Method) at java.lang.UNIXProcess.clinit(UNIXProcess.java:91) at java.lang.ProcessImpl.start(ProcessImpl.java:130) at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028) at java.lang.Runtime.exec(Runtime.java:617) {noformat} A commonality of most of these failures is that the turkish locale has been randomly selected, and apparently the Runtime.exec is busted whtn you use turkish... 
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8047340 http://java.thedizzyheights.com/2014/07/java-error-posix_spawn-is-not-a-supported-process-launch-mechanism-on-this-platform-when-trying-to-spawn-a-process/ We should consider hardcoding the jdk.lang.Process.launchMechanism sys property mentioned as a workaround in the jdk bug report {panel} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
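For users who do need the system-property workaround mentioned in the report, it has to take effect before the JVM launches its first child process. The usual route is the command-line flag -Djdk.lang.Process.launchMechanism=FORK; the in-code placement and the FORK value in this sketch are assumptions, not something prescribed by the issue:
{code}
public class LaunchMechanismWorkaround {
  public static void main(String[] args) throws Exception {
    // java.lang.UNIXProcess reads this property in its static initializer, which runs the
    // first time a process is launched, so it must be set before any Runtime.exec() call.
    System.setProperty("jdk.lang.Process.launchMechanism", "FORK");
    Process p = Runtime.getRuntime().exec(new String[] {"/bin/true"});
    System.out.println("exit=" + p.waitFor());
  }
}
{code}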
[jira] [Commented] (SOLR-7011) fix OverseerCollectionProcessor.deleteCollection removal-done check
[ https://issues.apache.org/jira/browse/SOLR-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286611#comment-14286611 ] ASF subversion and git services commented on SOLR-7011: --- Commit 1653718 from sha...@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1653718 ] SOLR-7011: Delete collection returns before collection is actually removed fix OverseerCollectionProcessor.deleteCollection removal-done check --- Key: SOLR-7011 URL: https://issues.apache.org/jira/browse/SOLR-7011 Project: Solr Issue Type: Bug Affects Versions: 4.10.3, 5.0, Trunk Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Priority: Minor {{OverseerCollectionProcessor.java}} line 1184 has a {{.hasCollection(message.getStr(collection))}} call which should be either {{.hasCollection(message.getStr(name))}} or {{.hasCollection(collection)}} instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7
[ https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286470#comment-14286470 ] Hoss Man commented on SOLR-6991: bq. The last comment was just an idea, but doesn't work. ... you fought a good fight uwe, but alas... +1 to your SOLR-6991-forkfix.patch for 5.0 .. but don't we need similar assumes in dataimporthandler-extras tests that use TikaEntityProcessor? (i'm not sure why those wouldn't fail with turkish now as well) Update to Apache TIKA 1.7 - Key: SOLR-6991 URL: https://issues.apache.org/jira/browse/SOLR-6991 Project: Solr Issue Type: Improvement Components: contrib - Solr Cell (Tika extraction) Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 5.0, Trunk, 5.1 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch Apache TIKA 1.7 was released: [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt] This is more or less a dependency update, so replacements. Not sure if we should do this for 5.0. In 5.0 we currently have the previous version, which was not yet released with Solr. If we now bring this into 5.0, we wouldn't have a new release 2 times. I can change the stuff this evening and let it bake in 5.x, so maybe we backport this. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org