[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286033#comment-14286033
 ] 

Hoss Man commented on SOLR-6991:


TIKA-93 introduced the TesseractOCRParser, and TIKA-1476 enabled it as a 
default parser.

that combination means that the first time Tika is used in Solr, the 
TesseractOCRParser will be checked to see if the system has Tesseract 
installed to know if that parser should be consulted -- and when that happens, 
ExternalParser.check is used, which calls Runtime.exec and blows up in the 
Turkish locale.
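
For context, the failure mode (as I understand SOLR-6387) comes down to the 
JDK's locale-sensitive case conversion -- a minimal sketch, not the actual 
JDK code:

{noformat}
import java.util.Locale;

public class TurkishCaseDemo {
  public static void main(String[] args) {
    // In Turkish, 'i' uppercases to the dotted capital 'İ', so the launch
    // mechanism name no longer matches the POSIX_SPAWN enum constant.
    System.out.println("posix_spawn".toUpperCase(new Locale("tr", "TR"))); // POSİX_SPAWN
    System.out.println("posix_spawn".toUpperCase(Locale.ROOT));            // POSIX_SPAWN
  }
}
{noformat}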



possible resolutions i can think of:
* change how we init Tika to prevent this parser from ever being used (override 
the list of autodetected parsers? -- see the sketch below)
* change how we include tika jars/defaults to prevent this parser from ever 
being used (override the default tesseract properties file in the jar somehow 
maybe?)
* rollback to tika 1.6
* punt and advise Turkish users to run their JVM in en_US?
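
To illustrate the first option, a minimal sketch (assuming Tika's 
AutoDetectParser(Parser...) constructor; the particular parsers chosen are 
illustrative) that builds Tika from an explicit parser list so the 
service-loaded TesseractOCRParser is never consulted:

{noformat}
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.html.HtmlParser;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ExplicitParsersSketch {
  public static void main(String[] args) throws Exception {
    // Only these parsers are ever asked for their supported types, so
    // TesseractOCRParser.hasTesseract() (and Runtime.exec) never runs.
    Parser parser = new AutoDetectParser(new PDFParser(), new HtmlParser());
    try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
      BodyContentHandler handler = new BodyContentHandler();
      parser.parse(in, handler, new Metadata(), new ParseContext());
      System.out.println(handler.toString());
    }
  }
}
{noformat}

the obvious tradeoff: any explicit whitelist has to be revisited on every 
Tika upgrade.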


 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so mostly just replacing JAR files. 
 Not sure if we should do this for 5.0. In 5.0 we currently have the previous 
 version, which was not yet released with Solr. If we bring this into 5.0 now, 
 we won't end up shipping a new Tika version twice. I can change the stuff 
 this evening and let it bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!

2015-01-21 Thread Chris Hostetter

: It's our old friend SOLR-6387 !
: 
: This time manifesting itself via calls down into Tika.

new comments posted in SOLR-6991 where Tika was upgraded -- definitely 
Tika 1.7 that introduced this new parser that causes this problem.

One followup clarification...

: way we never tickled it before ... but more perplexing is why i can't 
: reproduce any similar errors on trunk (or 5x) using ant test 
: -Dtests.slow=true -Dtests.locale=tr_TR in solr/contrib/extraction ?

...i'm getting senile, and forgot the most annoying part of SOLR-6387: the 
posix_spawn problem doesn't manifest on linux, because the JDK code only 
tries VFORK and FORK, not POSIX_SPAWN.


: : Date: Wed, 21 Jan 2015 06:47:13 + (UTC)
: : From: Policeman Jenkins Server jenk...@thetaphi.de
: : Reply-To: dev@lucene.apache.org
: : To: rm...@apache.org, ans...@apache.org, sha...@apache.org, 
no...@apache.org,
: : gcha...@apache.org, ehatc...@apache.org, tflo...@apache.org,
: : sar...@gmail.com, dev@lucene.apache.org
: : Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 
-
: : Failure!
: : 
: : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/
: : Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC
: : 
: : 1 tests failed.
: : FAILED:  
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath
: : 
: : Error Message:
: : posix_spawn is not a supported process launch mechanism on this platform.
: : 
: : Stack Trace:
: : java.lang.Error: posix_spawn is not a supported process launch mechanism on 
this platform.
: : at 
__randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0)
: : at java.lang.UNIXProcess$1.run(UNIXProcess.java:105)
: : at java.lang.UNIXProcess$1.run(UNIXProcess.java:94)
: : at java.security.AccessController.doPrivileged(Native Method)
: : at java.lang.UNIXProcess.clinit(UNIXProcess.java:92)
: : at java.lang.ProcessImpl.start(ProcessImpl.java:130)
: : at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
: : at java.lang.Runtime.exec(Runtime.java:620)
: : at java.lang.Runtime.exec(Runtime.java:485)
: : at 
org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
: : at 
org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
: : at 
org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
: : at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
: : at 
org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
: : at 
org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
: : at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
: : at 
org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
: : at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
: : at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
: : at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
: : at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
: : at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
: : at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
: : at 
org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353)
: : at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703)
: : at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710)
: : at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474)
: : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
: : at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
: : at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
: : at java.lang.reflect.Method.invoke(Method.java:483)
: : at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
: : at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
: : at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
: : at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
: : at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
: : at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
: : at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
: : at 

[jira] [Commented] (LUCENE-6191) Spatial 2D faceting (heatmaps)

2015-01-21 Thread Nicholas Knize (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285893#comment-14285893
 ] 

Nicholas Knize commented on LUCENE-6191:


The error factor would certainly be a function of spatial resolution but not 
the same since you're dealing with expected vs. observed counts (an RMS over 
the result set vs. %spatialErr).  It'd be worth exploring for a later 
enhancement (maybe an option to include descriptive stats as part of the 
faceting operation for spatial analysis use-cases) but not critical for the 
initial capability.

I'm just not a fan of creating analysis results without communicating some kind 
of accuracy.  It leads to data misrepresentation.

I do like what you have going on here. I'll experiment with it when I get some 
time and see if I can't help get some low-overhead accuracy results.
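
For concreteness, the kind of low-overhead measure I have in mind (a minimal 
sketch, not from the patch): RMS between expected and observed cell counts:

{noformat}
/** Root-mean-square error between expected and observed grid-cell counts. */
static double rmsError(int[] expected, int[] observed) {
  double sumSq = 0;
  for (int i = 0; i < expected.length; i++) {
    double d = expected[i] - observed[i];
    sumSq += d * d;
  }
  return Math.sqrt(sumSq / expected.length);
}
{noformat}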

 Spatial 2D faceting (heatmaps)
 --

 Key: LUCENE-6191
 URL: https://issues.apache.org/jira/browse/LUCENE-6191
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.1

 Attachments: LUCENE-6191__Spatial_heatmap.patch


 Lucene spatial's PrefixTree (grid) based strategies index data in a way 
 highly amenable to faceting on grid cells to compute a so-called _heatmap_. 
 The underlying code in this patch uses the PrefixTreeFacetCounter utility 
 class, which was recently refactored out of the faceting for 
 NumberRangePrefixTree (LUCENE-5735).  At a low level, the terms (== grid 
 cells) are navigated per-segment, forward only with TermsEnum.seek, so it's 
 pretty quick and furthermore requires no extra caches & no docvalues.  
 Ideally you should use QuadPrefixTree (or Flex once it comes out) to maximize 
 the number of grid levels, which in turn maximizes the fidelity of choices 
 when you ask for a grid covering a region.  Conveniently, the provided 
 capability returns the data in a 2-D grid of counts, so the caller needn't 
 know a thing about how the data is encoded in the prefix tree.  Well, 
 almost... at this point they need to provide a grid level, but I'll soon 
 provide a means of deriving the grid level based on a min/max cell count.
 I recommend QuadPrefixTree with geo=false so that you can provide square 
 world-bounds (360x360 degrees), which means square grid cells, which are more 
 desirable to display than rectangular cells.
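
 To make the last point concrete, a caller-side sketch (illustrative, not the 
 patch's API) that consumes such a 2-D counts grid by normalizing it into 
 display intensities:
 {noformat}
 /** Normalize a 2-D grid of facet counts into [0,1] heat intensities. */
 static double[][] toHeat(int[][] counts) {
   int max = 1;
   for (int[] row : counts)
     for (int c : row) max = Math.max(max, c);
   double[][] heat = new double[counts.length][];
   for (int r = 0; r < counts.length; r++) {
     heat[r] = new double[counts[r].length];
     for (int col = 0; col < counts[r].length; col++)
       heat[r][col] = counts[r][col] / (double) max;
   }
   return heat;
 }
 {noformat}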



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading

2015-01-21 Thread Jessica Cheng Mallet (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285925#comment-14285925
 ] 

Jessica Cheng Mallet commented on SOLR-6521:


The patch is locking the entire cache for all loading, which might not be an 
ideal solution for a cluster with many, many collections. Guava's 
implementation of LocalCache would only lock and wait on individual Segments, 
which increases the concurrency level (and that level is tunable).
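
For reference, a minimal sketch of the Guava behavior described (the loader 
here is hypothetical, not the SolrJ code):

{noformat}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

class ClusterStateCacheSketch {
  // Loads for the same key are serialized (one thread loads, the rest wait);
  // distinct keys can load in parallel across segments, and the segment count
  // is what concurrencyLevel tunes.
  private final LoadingCache<String, Object> states = CacheBuilder.newBuilder()
      .concurrencyLevel(64)      // tunable number of internal segments
      .maximumSize(10_000)
      .build(new CacheLoader<String, Object>() {
        @Override
        public Object load(String collection) {
          return fetchStateFromZk(collection);  // hypothetical loader
        }
      });

  private Object fetchStateFromZk(String collection) {
    return new Object();  // placeholder for a ZooKeeper read
  }
}
{noformat}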

 CloudSolrServer should synchronize cache cluster state loading
 --

 Key: SOLR-6521
 URL: https://issues.apache.org/jira/browse/SOLR-6521
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Jessica Cheng Mallet
Assignee: Noble Paul
Priority: Critical
  Labels: SolrCloud
 Fix For: 5.0, Trunk

 Attachments: SOLR-6521.patch


 Under heavy load-testing with the new solrj client that caches the cluster 
 state instead of setting a watcher, I started seeing lots of zk connection 
 loss on the client-side when refreshing the CloudSolrServer 
 collectionStateCache, and this was causing crazy client-side 99.9% latency 
 (~15 sec). I swapped the cache out with guava's LoadingCache (which does 
 locking to ensure only one thread loads the content under one key while the 
 other threads that want the same key wait) and the connection loss went away 
 and the 99.9% latency also went down to just about 1 sec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-6928) solr.cmd stop works only in english

2015-01-21 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter reassigned SOLR-6928:


Assignee: Timothy Potter

 solr.cmd stop works only in english
 ---

 Key: SOLR-6928
 URL: https://issues.apache.org/jira/browse/SOLR-6928
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: german windows 7
Reporter: john.work
Assignee: Timothy Potter
Priority: Minor

 in solr.cmd, stop doesn't work when executing 'netstat -nao ^| find /i 
 "listening" ^| find ":%SOLR_PORT%"', because "listening" is never found.
 e.g. on a German cmd.exe, netstat -nao prints the following output:
 {noformat}
   Proto  Lokale Adresse Remoteadresse  Status   PID
   TCP0.0.0.0:80 0.0.0.0:0  ABHÖREN 4
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2521 - Still Failing

2015-01-21 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2521/

4 tests failed.
FAILED:  org.apache.solr.cloud.HttpPartitionTest.testDistribSearch

Error Message:
org.apache.http.NoHttpResponseException: The target server failed to respond

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
org.apache.http.NoHttpResponseException: The target server failed to respond
at 
__randomizedtesting.SeedInfo.seed([77AD558AFE563DA6:F64BDB9289095D9A]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:871)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:736)
at 
org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:480)
at 
org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:201)
at 
org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 

Re: [JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 11491 - Failure!

2015-01-21 Thread Chris Hostetter


this looks like the problems noted in SOLR-6915 with the Kerberos stuff and 
the IBM JDK.

miller? gregory? ... were you guys going to disable these tests on IBM 
JDKs or not?

It's one thing to say a certain feature only works with Oracle JVMs, but 
it's going to suck if 5.0 goes out and we know that the Solr tests will 
reliably fail 100% of the time on IBM JDKs.



: Date: Wed, 21 Jan 2015 05:30:55 + (UTC)
: From: Policeman Jenkins Server jenk...@thetaphi.de
: Reply-To: dev@lucene.apache.org
: To: sar...@gmail.com, dev@lucene.apache.org
: Subject: [JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 11491 -
:  Failure!
: 
: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11491/
: Java: 64bit/ibm-j9-jdk7 
-Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;}
: 
: 1 tests failed.
: FAILED:  org.apache.solr.cloud.SaslZkACLProviderTest.testSaslZkACLProvider
: 
: Error Message:
: Could not get the port for ZooKeeper server
: 
: Stack Trace:
: java.lang.RuntimeException: Could not get the port for ZooKeeper server
:   at org.apache.solr.cloud.ZkTestServer.run(ZkTestServer.java:482)
:   at 
org.apache.solr.cloud.SaslZkACLProviderTest$SaslZkTestServer.run(SaslZkACLProviderTest.java:206)
:   at 
org.apache.solr.cloud.SaslZkACLProviderTest.setUp(SaslZkACLProviderTest.java:74)
:   at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
:   at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
:   at java.lang.reflect.Method.invoke(Method.java:619)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:861)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
:   at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
:   at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
:   at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
:   at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
:   at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
:   at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
:   at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
:   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
:   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
:   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
:   at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
:   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
:   at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
:   at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
:   at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
:   at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
:   at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
:   at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
:   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
:   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
:   at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
:   at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
:   at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
:   at 

Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!

2015-01-21 Thread Chris Hostetter

It's our old friend SOLR-6387 !

This time manifesting itself via calls down into Tika.

My best guess is that something changed in the recently upgraded version 
of Tika in Solr so that we now tickle this ExternalParser code path in a 
way we never tickled it before ... but more perplexing is why i can't 
reproduce any similar errors on trunk (or 5x) using ant test 
-Dtests.slow=true -Dtests.locale=tr_TR in solr/contrib/extraction ?

Does Tika already do some special platform/locale detection internally 
that is bypassing this error for most people, and only manifesting on 
MacOSX?

can a mac user try to reproduce this?


: Date: Wed, 21 Jan 2015 06:47:13 + (UTC)
: From: Policeman Jenkins Server jenk...@thetaphi.de
: Reply-To: dev@lucene.apache.org
: To: rm...@apache.org, ans...@apache.org, sha...@apache.org, no...@apache.org,
: gcha...@apache.org, ehatc...@apache.org, tflo...@apache.org,
: sar...@gmail.com, dev@lucene.apache.org
: Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 -
: Failure!
: 
: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/
: Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC
: 
: 1 tests failed.
: FAILED:  
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath
: 
: Error Message:
: posix_spawn is not a supported process launch mechanism on this platform.
: 
: Stack Trace:
: java.lang.Error: posix_spawn is not a supported process launch mechanism on 
this platform.
:   at 
__randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0)
:   at java.lang.UNIXProcess$1.run(UNIXProcess.java:105)
:   at java.lang.UNIXProcess$1.run(UNIXProcess.java:94)
:   at java.security.AccessController.doPrivileged(Native Method)
:   at java.lang.UNIXProcess.clinit(UNIXProcess.java:92)
:   at java.lang.ProcessImpl.start(ProcessImpl.java:130)
:   at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
:   at java.lang.Runtime.exec(Runtime.java:620)
:   at java.lang.Runtime.exec(Runtime.java:485)
:   at 
org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
:   at 
org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
:   at 
org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
:   at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
:   at 
org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
:   at 
org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
:   at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
:   at 
org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
:   at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
:   at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
:   at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
:   at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
:   at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
:   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
:   at 
org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353)
:   at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703)
:   at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710)
:   at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474)
:   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
:   at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
:   at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
:   at java.lang.reflect.Method.invoke(Method.java:483)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
:   at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
:   at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
:   at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
:   at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
:   at 

[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286059#comment-14286059
 ] 

Uwe Schindler commented on SOLR-6991:
-

This is in fact the problem with spawning external processes. This is not new; 
TIKA 1.6 also had parsers that spawned processes. I just think we never hit 
this before because this one is different: this parser spawns a process while 
initializing (to inspect the system), whereas the other spawning parsers are 
only executed as needed. ExternalParser has existed in TIKA for a long time.

I would not roll back to TIKA 1.5, because the new TIKA is much better than 
that one (regarding bugs). We should maybe disable these tests with the 
well-known assume (trunk, 5.x, 5.0). I would also suggest adding a note to 
the ref guide, so people know what this means. This is unfortunately a bug in 
the JVM, so it is not really our fault or TIKA's.

In fact, as written in my blog post about locale issues: most Turkish system 
administrators don't run servers with the Turkish locale :-) It's just too 
broken with lots of software.
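
For reference, the assume pattern looks roughly like this (a sketch; the 
exact guard used in the Solr tests may differ):

{noformat}
import java.util.Locale;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import org.junit.BeforeClass;

public class SomeExternalProcessTest extends RandomizedTest {
  @BeforeClass
  public static void skipOnTurkishLocale() {
    // Skip (rather than fail) when the default locale breaks Runtime.exec.
    assumeFalse("Spawning processes is broken under the Turkish locale (SOLR-6387)",
        "tr".equals(Locale.getDefault().getLanguage()));
  }
}
{noformat}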

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so mostly just replacing JAR files. 
 Not sure if we should do this for 5.0. In 5.0 we currently have the previous 
 version, which was not yet released with Solr. If we bring this into 5.0 now, 
 we won't end up shipping a new Tika version twice. I can change the stuff 
 this evening and let it bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading

2015-01-21 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-6521:
-
Attachment: SOLR-6521.patch

 CloudSolrServer should synchronize cache cluster state loading
 --

 Key: SOLR-6521
 URL: https://issues.apache.org/jira/browse/SOLR-6521
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Jessica Cheng Mallet
Assignee: Noble Paul
Priority: Critical
  Labels: SolrCloud
 Fix For: 5.0, Trunk

 Attachments: SOLR-6521.patch, SOLR-6521.patch


 Under heavy load-testing with the new solrj client that caches the cluster 
 state instead of setting a watcher, I started seeing lots of zk connection 
 loss on the client-side when refreshing the CloudSolrServer 
 collectionStateCache, and this was causing crazy client-side 99.9% latency 
 (~15 sec). I swapped the cache out with guava's LoadingCache (which does 
 locking to ensure only one thread loads the content under one key while the 
 other threads that want the same key wait) and the connection loss went away 
 and the 99.9% latency also went down to just about 1 sec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading

2015-01-21 Thread Jessica Cheng Mallet (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286062#comment-14286062
 ] 

Jessica Cheng Mallet commented on SOLR-6521:


bq. I agree that the concurrency can be dramatically improved. Using Guava may 
not be an option because it is not yet a dependency of SolrJ. The other option 
would be to make the cache pluggable through an API. So, if you have Guava or 
something else in your package, you can plug it in through an API

That'd be awesome!
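
For concreteness, one hypothetical shape for such a pluggable cache (the 
interface and names are illustrative, not SolrJ API):

{noformat}
import java.util.concurrent.Callable;

/** Hypothetical SPI: a Guava-backed impl could delegate to Cache.get(key, loader). */
public interface StateCache<K, V> {
  /** Return the cached value, loading at most once per key concurrently. */
  V get(K key, Callable<V> loader) throws Exception;

  /** Drop a stale entry so the next get() reloads it. */
  void invalidate(K key);
}
{noformat}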

 CloudSolrServer should synchronize cache cluster state loading
 --

 Key: SOLR-6521
 URL: https://issues.apache.org/jira/browse/SOLR-6521
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Jessica Cheng Mallet
Assignee: Noble Paul
Priority: Critical
  Labels: SolrCloud
 Fix For: 5.0, Trunk

 Attachments: SOLR-6521.patch


 Under heavy load-testing with the new solrj client that caches the cluster 
 state instead of setting a watcher, I started seeing lots of zk connection 
 loss on the client-side when refreshing the CloudSolrServer 
 collectionStateCache, and this was causing crazy client-side 99.9% latency 
 (~15 sec). I swapped the cache out with guava's LoadingCache (which does 
 locking to ensure only one thread loads the content under one key while the 
 other threads that want the same key wait) and the connection loss went away 
 and the 99.9% latency also went down to just about 1 sec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286065#comment-14286065
 ] 

Uwe Schindler commented on SOLR-6991:
-

In fact you can select parsers using a config file / Set<String>. But this 
makes updating horrible, because we would have to revisit the list on each 
TIKA update...
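
A minimal sketch of the Set<String> approach (illustrative; note the caveat 
in the comment):

{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.DefaultParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;

public class ParserWhitelist {
  // Caveat: enumerating DefaultParser's parsers calls getSupportedTypes() on
  // each one, which is exactly what trips TesseractOCRParser -- a real fix
  // would have to filter before the parsers are ever consulted.
  public static Parser fromWhitelist(Set<String> allowedClassNames) {
    List<Parser> kept = new ArrayList<>();
    for (Parser p : new DefaultParser().getParsers(new ParseContext()).values()) {
      if (allowedClassNames.contains(p.getClass().getName())) {
        kept.add(p);
      }
    }
    return new AutoDetectParser(kept.toArray(new Parser[kept.size()]));
  }
}
{noformat}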

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so mostly just replacing JAR files. 
 Not sure if we should do this for 5.0. In 5.0 we currently have the previous 
 version, which was not yet released with Solr. If we bring this into 5.0 now, 
 we won't end up shipping a new Tika version twice. I can change the stuff 
 this evening and let it bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading

2015-01-21 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-6521:
-
Attachment: (was: SOLR-6521.patch)

 CloudSolrServer should synchronize cache cluster state loading
 --

 Key: SOLR-6521
 URL: https://issues.apache.org/jira/browse/SOLR-6521
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Jessica Cheng Mallet
Assignee: Noble Paul
Priority: Critical
  Labels: SolrCloud
 Fix For: 5.0, Trunk

 Attachments: SOLR-6521.patch, SOLR-6521.patch


 Under heavy load-testing with the new solrj client that caches the cluster 
 state instead of setting a watcher, I started seeing lots of zk connection 
 loss on the client-side when refreshing the CloudSolrServer 
 collectionStateCache, and this was causing crazy client-side 99.9% latency 
 (~15 sec). I swapped the cache out with guava's LoadingCache (which does 
 locking to ensure only one thread loads the content under one key while the 
 other threads that want the same key wait) and the connection loss went away 
 and the 99.9% latency also went down to just about 1 sec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading

2015-01-21 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-6521:
-
Attachment: SOLR-6521.patch

hi [~mewmewball]

This patch increases the parallelism and makes it tunable. 

 CloudSolrServer should synchronize cache cluster state loading
 --

 Key: SOLR-6521
 URL: https://issues.apache.org/jira/browse/SOLR-6521
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Jessica Cheng Mallet
Assignee: Noble Paul
Priority: Critical
  Labels: SolrCloud
 Fix For: 5.0, Trunk

 Attachments: SOLR-6521.patch, SOLR-6521.patch


 Under heavy load-testing with the new solrj client that caches the cluster 
 state instead of setting a watcher, I started seeing lots of zk connection 
 loss on the client-side when refreshing the CloudSolrServer 
 collectionStateCache, and this was causing crazy client-side 99.9% latency 
 (~15 sec). I swapped the cache out with guava's LoadingCache (which does 
 locking to ensure only one thread loads the content under one key while the 
 other threads that want the same key wait) and the connection loss went away 
 and the 99.9% latency also went down to just about 1 sec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6991:

Attachment: SOLR-6991-forkfix.patch

This disables the test... just copy-pasted from map-reduce/morphlines/.

In fact this is not TIKA's issue and it is not new; a lot of stuff around 
Hadoop in Solr fails with the Turkish locale!

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so mostly just replacing JAR files. 
 Not sure if we should do this for 5.0. In 5.0 we currently have the previous 
 version, which was not yet released with Solr. If we bring this into 5.0 now, 
 we won't end up shipping a new Tika version twice. I can change the stuff 
 this evening and let it bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285904#comment-14285904
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653606 from [~mikemccand] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653606 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex /htsolr/lss-reindex/shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.
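
 To illustrate the cast hazard (a sketch, not the actual skip-writer code):
 {noformat}
 public class SkipPointerOverflowDemo {
   public static void main(String[] args) {
     long lastPosPointer = 0L;
     long curPosPointer = 2_500_000_000L;  // > 2.1 GB written to .pos
     // Narrowing the delta to int wraps past Integer.MAX_VALUE (2^31-1, ~2.1 GB):
     int delta = (int) (curPosPointer - lastPosPointer);
     System.out.println(delta);  // -1794967296 -- a corrupt, negative skip delta
   }
 }
 {noformat}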



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!

2015-01-21 Thread Steve Rowe
Reproduces for me on OS X 10.10, Oracle JDK 1.8.0_20:

=
   [junit4]   2 NOTE: reproduce with: ant test  
-Dtestcase=ExtractingRequestHandlerTest -Dtests.method=testExtraction 
-Dtests.seed=98ABBA97C7FD5F1C -Dtests.slow=true -Dtests.locale=tr_TR 
-Dtests.timezone=America/Nome -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.49s | ExtractingRequestHandlerTest.testExtraction 
   [junit4] Throwable #1: java.lang.Error: posix_spawn is not a supported 
process launch mechanism on this platform.
   [junit4]at 
__randomizedtesting.SeedInfo.seed([98ABBA97C7FD5F1C:21D8CEE9BBD58FE9]:0)
   [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:105)
   [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:94)
   [junit4]at java.security.AccessController.doPrivileged(Native 
Method)
   [junit4]at java.lang.UNIXProcess.clinit(UNIXProcess.java:92)
   [junit4]at java.lang.ProcessImpl.start(ProcessImpl.java:130)
   [junit4]at 
java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
   [junit4]at java.lang.Runtime.exec(Runtime.java:620)
   [junit4]at java.lang.Runtime.exec(Runtime.java:485)
   [junit4]at 
org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
   [junit4]at 
org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
   [junit4]at 
org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
   [junit4]at 
org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
   [junit4]at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
   [junit4]at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
   [junit4]at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
   [junit4]at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
   [junit4]at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
   [junit4]at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
   [junit4]at 
org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353)
   [junit4]at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703)
   [junit4]at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710)
   [junit4]at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testExtraction(ExtractingRequestHandlerTest.java:59)
   [junit4]at java.lang.Thread.run(Thread.java:745)
=

I’ll try reverting to just before the Tika upgrade and see if it still happens.

Steve


 On Jan 21, 2015, at 1:03 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
 
 
 It's our old friend SOLR-6387 !
 
 This time manifesting itself via calls down into Tika.
 
 My best guess is that something changed in the recently upgraded version 
 of Tika in Solr so that we now tickle this ExternalParser code path in a 
 way we never tickled it before ... but more perplexing is why i can't 
 reproduce any similar errors on trunk (or 5x) using ant test 
 -Dtests.slow=true -Dtests.locale=tr_TR in solr/contrib/extraction ?
 
 Does Tika already do some special platform/locale detection internally 
 that is bypassing this error for most people, and only manifesting on 
 MacOSX?
 
 can a mac user try to reproduce this?
 
 
 : Date: Wed, 21 Jan 2015 06:47:13 + (UTC)
 : From: Policeman Jenkins Server jenk...@thetaphi.de
 : Reply-To: dev@lucene.apache.org
 : To: rm...@apache.org, ans...@apache.org, sha...@apache.org, 
 no...@apache.org,
 : gcha...@apache.org, ehatc...@apache.org, tflo...@apache.org,
 : sar...@gmail.com, dev@lucene.apache.org
 : Subject: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 
 -
 : Failure!
 : 
 : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/
 : Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC
 : 
 : 1 tests failed.
 : FAILED:  
 org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath
 : 
 : Error Message:
 : posix_spawn is not a supported process launch mechanism on this platform.
 : 
 : Stack Trace:
 : java.lang.Error: 

[jira] [Commented] (SOLR-6993) install_solr_service.sh won't install on RHEL / CentOS

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285886#comment-14285886
 ] 

ASF subversion and git services commented on SOLR-6993:
---

Commit 1653601 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1653601 ]

SOLR-6993: install_solr_service.sh won't install on RHEL / CentOS

 install_solr_service.sh won't install on RHEL / CentOS
 --

 Key: SOLR-6993
 URL: https://issues.apache.org/jira/browse/SOLR-6993
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 5.0, Trunk
 Environment: RHEL 6.5 / CentOS 6.5
Reporter: David Anderson
Assignee: Timothy Potter
Priority: Blocker
 Fix For: 5.0, Trunk

 Attachments: SOLR-6993.patch


 There's a bug that will prevent install_solr_service.sh from working on RHEL 
 / CentOS 6.5.  It works on Ubuntu 14.  Appears to be some obscure difference 
 in bash expression evaluation behavior.
 lines 87 and 89: SOLR_DIR=${SOLR_INSTALL_FILE:0:-4}
 blows up with this error:
 ./install_solr_service.sh: line 87: -4: substring expression < 0
 This results in the archive not being extracted, so the rest of the script 
 won't work.
 I tested a simple change:
   SOLR_DIR=${SOLR_INSTALL_FILE%.tgz}
 and verified it works on both RHEL 6.5 and Ubuntu 14.
 Patch is attached.  I set this to Major thinking that not being able to 
 install on CentOS is worth fixing prior to release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-6993) install_solr_service.sh won't install on RHEL / CentOS

2015-01-21 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-6993.
--
Resolution: Fixed

 install_solr_service.sh won't install on RHEL / CentOS
 --

 Key: SOLR-6993
 URL: https://issues.apache.org/jira/browse/SOLR-6993
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 5.0, Trunk
 Environment: RHEL 6.5 / CentOS 6.5
Reporter: David Anderson
Assignee: Timothy Potter
Priority: Blocker
 Fix For: 5.0, Trunk

 Attachments: SOLR-6993.patch


 There's a bug that will prevent install_solr_service.sh from working on RHEL 
 / CentOS 6.5.  It works on Ubuntu 14.  Appears to be some obscure difference 
 in bash expression evaluation behavior.
 lines 87 and 89: SOLR_DIR=${SOLR_INSTALL_FILE:0:-4}
 blows up with this error:
 ./install_solr_service.sh: line 87: -4: substring expression < 0
 This results in the archive not being extracted, so the rest of the script 
 won't work.
 I tested a simple change:
   SOLR_DIR=${SOLR_INSTALL_FILE%.tgz}
 and verified it works on both RHEL 6.5 and Ubuntu 14.
 Patch is attached.  I set this to Major thinking that not being able to 
 install on CentOS is worth fixing prior to release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6845) Add buildOnStartup option for suggesters

2015-01-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-6845:

Fix Version/s: 5.1
   Trunk

 Add buildOnStartup option for suggesters
 

 Key: SOLR-6845
 URL: https://issues.apache.org/jira/browse/SOLR-6845
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Tomás Fernández Löbbe
 Fix For: Trunk, 5.1

 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch


 SOLR-6679 was filed to track the investigation into the following problem...
 {panel}
 The stock solrconfig provides a bad experience with a large index... start up 
 Solr and it will spin at 100% CPU for minutes, unresponsive, while it 
 apparently builds a suggester index.
 ...
 This is what I did:
 1) indexed 10M very small docs (only takes a few minutes).
 2) shut down Solr
 3) start up Solr and watch it be unresponsive for over 4 minutes!
 I didn't even use any of the fields specified in the suggester config and I 
 never called the suggest request handler.
 {panel}
 ...but ultimately focused on removing/disabling the suggester from the sample 
 configs.
 Opening this new issue to focus on actually trying to identify the root 
 problem & fix it.
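
 For reference, the option named in the title is configured per-suggester in 
 solrconfig.xml, roughly like this (a sketch based on the ref guide; fields 
 and placement may differ):
 {noformat}
 <searchComponent name="suggest" class="solr.SuggestComponent">
   <lst name="suggester">
     <str name="name">mySuggester</str>
     <str name="lookupImpl">FuzzyLookupFactory</str>
     <str name="field">title</str>
     <!-- avoid the minutes-long rebuild on every startup -->
     <str name="buildOnStartup">false</str>
   </lst>
 </searchComponent>
 {noformat}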



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6134) fix typos

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285950#comment-14285950
 ] 

ASF subversion and git services commented on LUCENE-6134:
-

Commit 1653615 from [~sar...@syr.edu] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1653615 ]

LUCENE-6134: fix typos in lucene/CHANGES.txt and solr/CHANGES.txt (merged trunk 
r1653612)

 fix typos
 -

 Key: LUCENE-6134
 URL: https://issues.apache.org/jira/browse/LUCENE-6134
 Project: Lucene - Core
  Issue Type: Task
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Trivial
 Attachments: LUCENE-6134-CHANGES.txt-s.patch, 
 LUCENE-6134-its-its.patch, 
 LUCENE-6134-necessary-whether-initializ-specified.patch


 I found a bunch of typos, will fix under this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286101#comment-14286101
 ] 

Uwe Schindler commented on SOLR-6991:
-

FYI: SolrCellMorphlineTest is already disabled by the same assume, so this is 
the only broken one.

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6134) fix typos

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285939#comment-14285939
 ] 

ASF subversion and git services commented on LUCENE-6134:
-

Commit 1653612 from [~sar...@syr.edu] in branch 'dev/trunk'
[ https://svn.apache.org/r1653612 ]

LUCENE-6134: fix typos in lucene/CHANGES.txt and solr/CHANGES.txt

 fix typos
 -

 Key: LUCENE-6134
 URL: https://issues.apache.org/jira/browse/LUCENE-6134
 Project: Lucene - Core
  Issue Type: Task
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Trivial
 Attachments: LUCENE-6134-CHANGES.txt-s.patch, 
 LUCENE-6134-its-its.patch, 
 LUCENE-6134-necessary-whether-initializ-specified.patch


 I found a bunch of typos, will fix under this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-5.x-MacOSX (64bit/jdk1.8.0) - Build # 1902 - Still Failing!

2015-01-21 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/1902/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseSerialGC

2 tests failed.
FAILED:  org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch

Error Message:
There were too many update fails - we expect it can happen, but shouldn't easily

Stack Trace:
java.lang.AssertionError: There were too many update fails - we expect it can 
happen, but shouldn't easily
at 
__randomizedtesting.SeedInfo.seed([9FCAAA6FE82229C2:1E2C24779F7D49FE]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertFalse(Assert.java:68)
at 
org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:224)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286020#comment-14286020
 ] 

Steve Rowe commented on SOLR-6991:
--

I can reproduce this on OS X 10.10 using Oracle JDK 1.8.0_20.

When I revert back to r1652741 (just before the first commit under this issue), 
all solr-cell tests pass using the following (same thing that fails 100% for me 
with current trunk):

{noformat}
ant clean
cd solr/contrib/extraction
ant test -Dtests.slow=true -Dtests.locale=tr_TR
{noformat}

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Set path to JRE / JDK in code

2015-01-21 Thread Petrus Hyvönen
Hi,

It is primarily to my wrapped library that this applies, so that it can be
easily installed.

Yes, on Windows the JRE is found using the PATH. I haven't been able to
locate where this is done; I would prefer to use a direct variable or a
separate environment variable for this, to keep it boxed off from the system
variables.

I have, however, now found a way that works fine, based on updating the PATH
as the first thing in the __init__.py file of the wrapped library by
prepending:

import os
os.environ["PATH"] = r"mypath\jre\bin\server" + os.pathsep + os.environ["PATH"]

the installer takes care of updating the path to the right place later on.

Many thanks
/Petrus



On Tue, Jan 20, 2015 at 7:14 PM, Andi Vajda va...@apache.org wrote:


 On Tue, 20 Jan 2015, Petrus Hyvönen wrote:

  Hi,

 I'm trying to package a wrapped library together with a non system-wide
 java JDK so that it can be easily installed.

 Can I somehow direct which JDK to use besides using JCC_JDK and putting
 the
 JRE in the PATH (I'm currently under windows)? The JCC_JDK could be
 patched
 in the setup.py but the PATH JRE that is accessed during running the
 wrapped library I don't understand where it is accessed, or how to patch
 this?


 So you're asking how to control where to pick up the JRE DLLs (on Windows)
 at runtime? If I remember correctly, on Windows you just set the Path
 environment variable, no?

  For example it would be good to have this in the config.py file if
 possible?


 If you're sure config.py is run _before_ any JRE DLL is loaded, you might
 be able to change the Path from there too.

 Andi..



 Any thoughts or someone who's done this already?

 Regards
 /Petrus


 --
 _
 Petrus Hyvönen, Uppsala, Sweden
 Mobile Phone/SMS:+46 73 803 19 00




-- 
_
Petrus Hyvönen, Uppsala, Sweden
Mobile Phone/SMS:+46 73 803 19 00


[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286111#comment-14286111
 ] 

Hoss Man commented on SOLR-6991:


bq. In fact this is not TIKA's issue and not new, a lot of stuff around Hadoop 
in Solr fails with Turkish!

...my point is: it's new to Solr.

in all other cases where POSIX_SPAWN impacts Solr, we either:
* deal with it in the solr code, so we give a meaningful error to the user 
explaining the problem (ie: SystemInfoHandler)
* it's in an optional feature that *NEVER* worked with turkish -- ie: the 
hadoop / morephlines contribs, from the first version it was available in Solr, 
would not work with turkish locale

...in this case, we're talking about an _existing_ solr feature, that has 
previously worked fine if you run older Solr with turkish, and now when 
upgrading to 5.0 you're going to get a weird error message.

if there's nothing better we can do to keep the ExtractingRequestHandler working 
for users who upgrade (even if they run with turkish) then i'm fine with assumes 
in the tests and notes in the docs ... i was just hoping you'd have a better 
idea.

in particular: I'm still wondering if we can leverage the classpath in a way to 
override the default TesseractOCRConfig.properties file in the tika-parsers 
jar with our own version that prevents tesseract from being used.  (i agree 
it's not worth switching to explicitly whitelisting the parsers in Solr code, 
but is there an easy way to blacklist this parser and/or other parsers we know 
are problematic?)
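
For illustration, a minimal sketch of what such a blacklist could look like on 
the Solr side, using only Tika's stock API (DefaultParser / AutoDetectParser); 
the helper class is hypothetical, not a committed fix:

{code}
import java.util.ArrayList;
import java.util.List;

import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.DefaultParser;
import org.apache.tika.parser.Parser;

public class BlacklistingParserFactory {
  /** Build an AutoDetectParser from Tika's default parsers minus a blacklist. */
  public static AutoDetectParser create() {
    List<Parser> kept = new ArrayList<Parser>();
    for (Parser p : new DefaultParser().getAllComponentParsers()) {
      // compare by name so Solr needs no compile-time dependency on the parser
      if (!"org.apache.tika.parser.ocr.TesseractOCRParser"
          .equals(p.getClass().getName())) {
        kept.add(p);
      }
    }
    return new AutoDetectParser(kept.toArray(new Parser[kept.size()]));
  }
}
{code}

Because the parser is dropped before anything calls its getSupportedTypes(), 
ExternalParser.check -- and the Runtime.exec that blows up in the turkish 
locale -- never runs.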


 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6960) Config reporting handler is missing initParams defaults

2015-01-21 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285895#comment-14285895
 ] 

Alexandre Rafalovitch commented on SOLR-6960:
-

This should really be a blocker for 5.0 as it affects the default example 
collections. Without this, we cannot claim to actually have Config Report 
Handler as a new feature.

 Config reporting handler is missing initParams defaults
 ---

 Key: SOLR-6960
 URL: https://issues.apache.org/jira/browse/SOLR-6960
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.0
Reporter: Alexandre Rafalovitch
 Fix For: 5.0


 *curl http://localhost:8983/solr/techproducts/config/requestHandler* produces 
 (fragments):
 {quote}
   "/update":{
 "name":"/update",
 "class":"org.apache.solr.handler.UpdateRequestHandler",
 "defaults":{}},
   "/update/json/docs":{
 "name":"/update/json/docs",
 "class":"org.apache.solr.handler.UpdateRequestHandler",
 "defaults":{
   "update.contentType":"application/json",
   "json.command":false}},
 {quote}
 Where are the defaults from initParams?
 {quote}
 <initParams path="/update/**,/query,/select,/tvrh,/elevate,/spell,/browse">
   <lst name="defaults">
     <str name="df">text</str>
   </lst>
 </initParams>
 <initParams path="/update/json/docs">
   <lst name="defaults">
     <str name="srcField">_src_</str>
     <str name="mapUniqueKeyOnly">true</str>
   </lst>
 </initParams>
 {quote}
 Obviously, a test is missing as well to catch this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1943 - Failure!

2015-01-21 Thread Anshum Gupta
Is this a Tika 1.7 upgrade related failure?

On Tue, Jan 20, 2015 at 10:47 PM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/
 Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

 1 tests failed.
 FAILED:
 org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath

 Error Message:
 posix_spawn is not a supported process launch mechanism on this platform.

 Stack Trace:
 java.lang.Error: posix_spawn is not a supported process launch mechanism
 on this platform.
 at
 __randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0)
 at java.lang.UNIXProcess$1.run(UNIXProcess.java:105)
 at java.lang.UNIXProcess$1.run(UNIXProcess.java:94)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.lang.UNIXProcess.<clinit>(UNIXProcess.java:92)
 at java.lang.ProcessImpl.start(ProcessImpl.java:130)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
 at java.lang.Runtime.exec(Runtime.java:620)
 at java.lang.Runtime.exec(Runtime.java:485)
 at
 org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
 at
 org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
 at
 org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
 at
 org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
 at
 org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
 at
 org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
 at
 org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
 at
 org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
 at
 org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
 at
 org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
 at
 org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
 at
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
 at
 org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353)
 at
 org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703)
 at
 org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710)
 at
 org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
 at
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
 at
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
 at
 

[JENKINS] Lucene-Solr-5.x-Linux (32bit/jdk1.7.0_72) - Build # 11495 - Failure!

2015-01-21 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11495/
Java: 32bit/jdk1.7.0_72 -server -XX:+UseParallelGC

1 tests failed.
FAILED:  org.apache.solr.cloud.ReplicationFactorTest.testDistribSearch

Error Message:
Error from server at http://127.0.0.1:57852/sus/za/repfacttest_c8n_2x2: The 
target server failed to respond

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://127.0.0.1:57852/sus/za/repfacttest_c8n_2x2: The target 
server failed to respond
at 
__randomizedtesting.SeedInfo.seed([C9F1855E170E9BAE:48170B466051FB92]:0)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:558)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:214)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:210)
at 
org.apache.solr.cloud.ReplicationFactorTest.sendNonDirectUpdateRequestReplica(ReplicationFactorTest.java:195)
at 
org.apache.solr.cloud.ReplicationFactorTest.testRf2NotUsingDirectUpdates(ReplicationFactorTest.java:165)
at 
org.apache.solr.cloud.ReplicationFactorTest.doTest(ReplicationFactorTest.java:129)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)

[jira] [Resolved] (SOLR-6845) Add buildOnStartup option for suggesters

2015-01-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe resolved SOLR-6845.
-
Resolution: Fixed

 Add buildOnStartup option for suggesters
 

 Key: SOLR-6845
 URL: https://issues.apache.org/jira/browse/SOLR-6845
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Tomás Fernández Löbbe
 Fix For: Trunk, 5.1

 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch


 SOLR-6679 was filed to track the investigation into the following problem...
 {panel}
 The stock solrconfig provides a bad experience with a large index... start up 
 Solr and it will spin at 100% CPU for minutes, unresponsive, while it 
 apparently builds a suggester index.
 ...
 This is what I did:
 1) indexed 10M very small docs (only takes a few minutes).
 2) shut down Solr
 3) start up Solr and watch it be unresponsive for over 4 minutes!
 I didn't even use any of the fields specified in the suggester config and I 
 never called the suggest request handler.
 {panel}
 ..but ultimately focused on removing/disabling the suggester from the sample 
 configs.
 Opening this new issue to focus on actually trying to identify the root 
 problem & fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6993) install_solr_service.sh won't install on RHEL / CentOS

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285890#comment-14285890
 ] 

ASF subversion and git services commented on SOLR-6993:
---

Commit 1653603 from [~thelabdude] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653603 ]

SOLR-6993: install_solr_service.sh won't install on RHEL / CentOS

 install_solr_service.sh won't install on RHEL / CentOS
 --

 Key: SOLR-6993
 URL: https://issues.apache.org/jira/browse/SOLR-6993
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 5.0, Trunk
 Environment: RHEL 6.5 / CentOS 6.5
Reporter: David Anderson
Assignee: Timothy Potter
Priority: Blocker
 Fix For: 5.0, Trunk

 Attachments: SOLR-6993.patch


 There's a bug that will prevent install_solr_service.sh from working on RHEL 
 / CentOS 6.5.  It works on Ubuntu 14.  Appears to be some obscure difference 
 in bash expression evaluation behavior.
 lines 87 and 89: SOLR_DIR=${SOLR_INSTALL_FILE:0:-4}
 blows up with this error:
 ./install_solr_service.sh: line 87: -4: substring expression < 0
 this results in the archive not being extracted and rest of the script won't 
 work.
 I tested a simple change:
   SOLR_DIR=${SOLR_INSTALL_FILE%.tgz}
 and verified it works on both RHEL 6.5 and Ubuntu 14
 Patch is attached.  I set this to Major thinking that not being able to 
 install on CentOS is worth fixing prior to release.
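
For reference, a small sketch of the incompatibility described above, assuming 
the usual explanation that negative substring lengths in ${var:offset:length} 
need bash >= 4.2 while RHEL/CentOS 6.5 ships an older bash (the file name below 
is a stand-in):

{code}
#!/bin/bash
# Negative substring lengths need bash >= 4.2; on RHEL/CentOS 6.5's older bash
# the expression errors out, while it works on Ubuntu 14.
SOLR_INSTALL_FILE=solr-5.0.0.tgz

# bash >= 4.2 only -- fails with "substring expression < 0" on older bash:
# SOLR_DIR=${SOLR_INSTALL_FILE:0:-4}

# portable suffix removal (the fix in the attached patch):
SOLR_DIR=${SOLR_INSTALL_FILE%.tgz}
echo "$SOLR_DIR"   # prints: solr-5.0.0
{code}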



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6134) fix typos

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285958#comment-14285958
 ] 

ASF subversion and git services commented on LUCENE-6134:
-

Commit 1653616 from [~sar...@syr.edu] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653616 ]

LUCENE-6134: fix typos in lucene/CHANGES.txt and solr/CHANGES.txt (merged 
branch_5x r1653615)

 fix typos
 -

 Key: LUCENE-6134
 URL: https://issues.apache.org/jira/browse/LUCENE-6134
 Project: Lucene - Core
  Issue Type: Task
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Trivial
 Attachments: LUCENE-6134-CHANGES.txt-s.patch, 
 LUCENE-6134-its-its.patch, 
 LUCENE-6134-necessary-whether-initializ-specified.patch


 I found a bunch of typos, will fix under this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6521) CloudSolrServer should synchronize cache cluster state loading

2015-01-21 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285985#comment-14285985
 ] 

Noble Paul commented on SOLR-6521:
--

bq. The patch is locking the entire cache for all loading, which might not be an 
ideal solution for a cluster with many

I understand that. In reality, different collections expire at different times, 
so everyone waiting on the lock would be a rare thing. The common use case is 
one collection expiring while every thread tries to refresh it simultaneously.

I agree that the concurrency can be dramatically improved. Using Guava may not 
be an option because it is not yet a dependency of SolrJ. The other option 
would be to make the cache pluggable through an API, so if you have Guava or 
something else in your package you can plug it in.
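
For illustration, a sketch of the Guava-based cache described in this issue, 
with hypothetical names (fetchStateFromZk stands in for however the client 
actually reads a collection's state):

{code}
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import org.apache.solr.common.cloud.DocCollection;

public class CollectionStateCache {
  private final LoadingCache<String, DocCollection> cache =
      CacheBuilder.newBuilder()
          .expireAfterWrite(60, TimeUnit.SECONDS)   // per-entry TTL, as today
          .build(new CacheLoader<String, DocCollection>() {
            @Override
            public DocCollection load(String collection) {
              // only one thread loads a given key; others block on the result
              return fetchStateFromZk(collection);
            }
          });

  public DocCollection getState(String collection) throws ExecutionException {
    return cache.get(collection);
  }

  private DocCollection fetchStateFromZk(String collection) {
    throw new UnsupportedOperationException("stand-in for the real ZK read");
  }
}
{code}

The point of LoadingCache here is that get() on an expired or missing key lets 
exactly one thread run the loader while other callers of the same key wait for 
its result, which is what removes the thundering-herd refresh.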

 CloudSolrServer should synchronize cache cluster state loading
 --

 Key: SOLR-6521
 URL: https://issues.apache.org/jira/browse/SOLR-6521
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Jessica Cheng Mallet
Assignee: Noble Paul
Priority: Critical
  Labels: SolrCloud
 Fix For: 5.0, Trunk

 Attachments: SOLR-6521.patch


 Under heavy load-testing with the new solrj client that caches the cluster 
 state instead of setting a watcher, I started seeing lots of zk connection 
 loss on the client-side when refreshing the CloudSolrServer 
 collectionStateCache, and this was causing crazy client-side 99.9% latency 
 (~15 sec). I swapped the cache out with guava's LoadingCache (which does 
 locking to ensure only one thread loads the content under one key while the 
 other threads that want the same key wait) and the connection loss went away 
 and the 99.9% latency also went down to just about 1 sec.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7005) facet.heatmap for spatial heatmap faceting on RPT

2015-01-21 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-7005:
---
Attachment: heatmap_64x32.png
heatmap_512x256.png

There are some performance #'s on LUCENE-6191.

I experimented with generating a PNG to carry the data in a compressed manner, 
since this data can get large.  I'm abusing the image to carry the same detail 
in the counts, and that means 4 bytes per pixel.  Counts > 16M touch the high 
byte of a 4-byte int, which is where the alpha channel is, which will 
progressively lighten the image.  _The image is not at all optimized for human 
viewing that is pleasant on the eyes_, except for the bit flip of the high 
(alpha channel) byte; otherwise you would see nothing until the counts exceed 
this figure.  That said, it's crude and you can get a sense of it.  _If people 
have input on how to cheaply and easily tweak the value to look nicer, I'm 
interested._  Since a client app may consume this PNG if it wants this 
compressed format and render it the way it wants to, there should be a 
straightforward algorithm to derive the count from the ARGB (alpha, red, 
green, blue) int.

The attached PNG is 512x256 (131,072 cells mind you!) of the 8.5M geonames data 
set.  On a 16 segment index with no search filters, it took 882ms to compute 
the underlying heatmap, and 218ms to build the PNG and write it to disk.  The 
write-to-disk hack is temporary to easily view the image by opening it from the 
file system.  You can expect there will be more time in consuming this image 
from Solr's javabin/XML/JSON + base64 wrapper (whatever you choose).

Now a 512x256 image is so detailed that it arguably isn't a heatmap but another 
way to go about rendering individual points.  A coarser, say, 64x32 image 
would be more true to the heatmap label, and obviously much faster to generate 
-- like 100ms + only ~2ms to generate the PNG.
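
(For illustration, a toy sketch of one such encode/decode scheme -- the count 
stored verbatim in the pixel with the top (alpha) bit flipped so small counts 
are not fully transparent; this is an assumption about the approach, not the 
patch's actual code:)

{code}
import java.awt.image.BufferedImage;

public class HeatmapPngSketch {
  // Toy scheme: store the count verbatim, flipping the high (alpha-channel) bit
  // so pixels with small counts are not fully transparent. Decode flips it back.
  static int encode(int count) { return count ^ 0x80000000; }
  static int decode(int argb)  { return argb ^ 0x80000000; }

  public static void main(String[] args) {
    BufferedImage img = new BufferedImage(512, 256, BufferedImage.TYPE_INT_ARGB);
    img.setRGB(10, 20, encode(12345));               // one grid cell's count
    System.out.println(decode(img.getRGB(10, 20)));  // prints: 12345
  }
}
{code}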

 facet.heatmap for spatial heatmap faceting on RPT
 -

 Key: SOLR-7005
 URL: https://issues.apache.org/jira/browse/SOLR-7005
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.1

 Attachments: heatmap_512x256.png, heatmap_64x32.png


 This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell 
 counter in Lucene spatial LUCENE-6191.  This is a form of faceting, and 
 as-such I think it should live in the facet parameter namespace.  Here's 
 what the parameters are:
 * facet=true
 * facet.heatmap=fieldname
 * facet.heatmap.bbox=\[-180 -90 TO 180 90]
 * facet.heatmap.gridLevel=6
 * facet.heatmap.distErrPct=0.10
 Like other faceting features, the fieldName can have local-params to exclude 
 filter queries or specify an output key.
 The bbox is optional; you get the whole world or you can specify a box or 
 actually any shape that WKT supports (you get the bounding box of whatever 
 you put).
 Ultimately, this feature needs to know the grid level, which together with 
 the input shape will yield a certain number of cells.  You can specify 
 gridLevel exactly, or don't and instead provide distErrPct which is computed 
 like it is for the RPT field type as seen in the schema.  0.10 yielded ~4k 
 cells but it'll vary.  There's also a facet.heatmap.maxCells safety net 
 defaulting to 100k.  Exceed this and you get an error.
 The output is (JSON):
 {noformat}
 {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0,
  0, 2, 1, ],[1, 1, 3, 2, ...],...]}
 {noformat}
 counts is null if all would be 0.  Perhaps individual row arrays should 
 likewise be null... I welcome feedback.
 I'm toying with an output format option in which you can specify a base-64'ed 
 grayscale PNG.
 Obviously this should support sharded / distributed environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6928) solr.cmd stop works only in english

2015-01-21 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285945#comment-14285945
 ] 

Timothy Potter commented on SOLR-6928:
--

awesome suggestion Jan! testing your idea now and will get committed for 5

 solr.cmd stop works only in english
 ---

 Key: SOLR-6928
 URL: https://issues.apache.org/jira/browse/SOLR-6928
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: german windows 7
Reporter: john.work
Assignee: Timothy Potter
Priority: Minor

 in solr.cmd the stop doesn't work while executing 'netstat -nao ^| find /i 
 "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found.
 e.g. in a German cmd.exe, netstat -nao prints the following output:
 {noformat}
   Proto  Lokale Adresse     Remoteadresse  Status    PID
   TCP    0.0.0.0:80         0.0.0.0:0      ABHÖREN   4
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6134) fix typos

2015-01-21 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-6134:
---
Attachment: LUCENE-6134-CHANGES.txt-s.patch

Patch fixing typos in {{lucene/CHANGES.txt}} and {{solr/CHANGES.txt}}.  

Committing shortly.

 fix typos
 -

 Key: LUCENE-6134
 URL: https://issues.apache.org/jira/browse/LUCENE-6134
 Project: Lucene - Core
  Issue Type: Task
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Trivial
 Attachments: LUCENE-6134-CHANGES.txt-s.patch, 
 LUCENE-6134-its-its.patch, 
 LUCENE-6134-necessary-whether-initializ-specified.patch


 I found a bunch of typos, will fix under this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7012) add an ant target to package a plugin into a jar

2015-01-21 Thread Noble Paul (JIRA)
Noble Paul created SOLR-7012:


 Summary: add an ant target to package a plugin into a jar
 Key: SOLR-7012
 URL: https://issues.apache.org/jira/browse/SOLR-7012
 Project: Solr
  Issue Type: Sub-task
Reporter: Noble Paul
Assignee: Noble Paul


It is currently extremely hard to create a plugin, because users do not know 
the exact dependencies and their poms.
We will add a target to solr/build.xml called plugin-jar.
Invoke it as follows:

{code}
ant -Dplugin.package=my.package -Djar.location=/tmp/my.jar plugin-jar
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6845) Add buildOnStartup option for suggesters

2015-01-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-6845:

Assignee: Tomás Fernández Löbbe
 Summary: Add buildOnStartup option for suggesters  (was: figure out why 
suggester causes slow startup - even when not used)

Changed summary to reflect the actual change done
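
For reference, a sketch of how the new option might be used in a suggester 
entry in solrconfig.xml; the surrounding parameters are the stock example 
values, and the exact placement is per the attached patch, so treat this as an 
assumption:

{code}
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">cat</str>
    <!-- assumed usage: don't build the suggester index during core startup -->
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
{code}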

 Add buildOnStartup option for suggesters
 

 Key: SOLR-6845
 URL: https://issues.apache.org/jira/browse/SOLR-6845
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Tomás Fernández Löbbe
 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch


 SOLR-6679 was filed to track the investigation into the following problem...
 {panel}
 The stock solrconfig provides a bad experience with a large index... start up 
 Solr and it will spin at 100% CPU for minutes, unresponsive, while it 
 apparently builds a suggester index.
 ...
 This is what I did:
 1) indexed 10M very small docs (only takes a few minutes).
 2) shut down Solr
 3) start up Solr and watch it be unresponsive for over 4 minutes!
 I didn't even use any of the fields specified in the suggester config and I 
 never called the suggest request handler.
 {panel}
 ..but ultimately focused on removing/disabling the suggester from the sample 
 configs.
 Opening this new issue to focus on actually trying to identify the root 
 problem & fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6928) solr.cmd stop works only in english

2015-01-21 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-6928:
-
Attachment: SOLR-6928.patch

Here's a patch that builds upon Jan's, but required checking for PID==0 because 
it was still finding something that wasn't listening:

  Proto  Local Address    Foreign Address   State       PID
  TCP    127.0.0.1:49204  127.0.0.1:8983    TIME_WAIT   0

According to the docs, PID 0 is a pseudo-idle process, so the script can 
ignore those and keep looping to find the actual listening process.

This patch works well on English Windows ... I don't have access to a German 
Windows box, can someone test please?
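
For anyone following along, a rough sketch of the approach (approximating the 
idea, not the patch's exact code): match on the port instead of the 
locale-dependent LISTENING keyword, and skip the pseudo-idle PID 0:

{code}
@echo off
rem Sketch only. Column 5 of netstat output is the PID; PID 0 is the
rem pseudo-idle process, so it is skipped rather than killed.
set found_it=0
for /f "tokens=5" %%j in ('netstat -nao ^| find ":%SOLR_PORT%"') do (
  if NOT "%%j"=="0" (
    set found_it=1
    taskkill /f /PID %%j
  )
)
{code}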

 solr.cmd stop works only in english
 ---

 Key: SOLR-6928
 URL: https://issues.apache.org/jira/browse/SOLR-6928
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: german windows 7
Reporter: john.work
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-6928.patch


 in solr.cmd the stop doesn't work while executing 'netstat -nao ^| find /i 
 "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found.
 e.g. in a German cmd.exe, netstat -nao prints the following output:
 {noformat}
   Proto  Lokale Adresse     Remoteadresse  Status    PID
   TCP    0.0.0.0:80         0.0.0.0:0      ABHÖREN   4
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Reopened] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe reopened SOLR-6991:
--

Reopening to address this Mac OS X failure in solr-cell:

{noformat}
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1943/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath

...

  [junit4]   2 NOTE: reproduce with: ant test  
-Dtestcase=ExtractingRequestHandlerTest -Dtests.method=testXPath 
-Dtests.seed=58A6FBEB77E81527 -Dtests.slow=true -Dtests.locale=tr_TR 
-Dtests.timezone=Etc/GMT+3 -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
  [junit4] ERROR   2.57s | ExtractingRequestHandlerTest.testXPath 
  [junit4] Throwable #1: java.lang.Error: posix_spawn is not a supported 
process launch mechanism on this platform.
  [junit4] at 
__randomizedtesting.SeedInfo.seed([58A6FBEB77E81527:26F735786F5F7761]:0)
  [junit4] at java.lang.UNIXProcess$1.run(UNIXProcess.java:105)
  [junit4] at java.lang.UNIXProcess$1.run(UNIXProcess.java:94)
  [junit4] at java.security.AccessController.doPrivileged(Native 
Method)
  [junit4] at java.lang.UNIXProcess.<clinit>(UNIXProcess.java:92)
  [junit4] at java.lang.ProcessImpl.start(ProcessImpl.java:130)
  [junit4] at 
java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
  [junit4] at java.lang.Runtime.exec(Runtime.java:620)
  [junit4] at java.lang.Runtime.exec(Runtime.java:485)
  [junit4] at 
org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
  [junit4] at 
org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
  [junit4] at 
org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
  [junit4] at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
  [junit4] at 
org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
  [junit4] at 
org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
  [junit4] at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
  [junit4] at 
org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
  [junit4] at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
  [junit4] at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
  [junit4] at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
  [junit4] at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
  [junit4] at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
  [junit4] at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
  [junit4] at 
org.apache.solr.util.TestHarness.queryAndResponse(TestHarness.java:353)
  [junit4] at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocalFromHandler(ExtractingRequestHandlerTest.java:703)
  [junit4] at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.loadLocal(ExtractingRequestHandlerTest.java:710)
  [junit4] at 
org.apache.solr.handler.extraction.ExtractingRequestHandlerTest.testXPath(ExtractingRequestHandlerTest.java:474)
  [junit4] at java.lang.Thread.run(Thread.java:745)
{noformat}

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-6192.

Resolution: Fixed

Resolving ... Tom can you post back here the results of testing with this fix?  
 Thanks.  Hopefully this is the bug you were hitting!

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.
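
(For illustration, a toy sketch of the overflow pattern described above -- not 
the actual skip-writer code:)

{code}
public class SkipDeltaOverflow {
  public static void main(String[] args) {
    long lastPosFP = 0L;
    long curPosFP = 3L * 1024 * 1024 * 1024;   // ~3 GB written into .pos

    int broken = (int) (curPosFP - lastPosFP); // narrowing cast wraps past 2^31-1
    long fixed = curPosFP - lastPosFP;         // keeping the delta in a long is safe

    System.out.println(broken); // -1073741824 -- a corrupt skip datum
    System.out.println(fixed);  // 3221225472
  }
}
{code}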



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285835#comment-14285835
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653585 from [~mikemccand] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653585 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6191) Spatial 2D faceting (heatmaps)

2015-01-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285863#comment-14285863
 ] 

David Smiley commented on LUCENE-6191:
--

delta stats:
 * *Segments: 16 (no deletions)*
 * docs: 8,526,175  (slightly less than before, not sure why).
 * QuadTree
 precision: 22 (better than 10m)
 bounds: -180 to 180, -180 to 180 (360x360 square)
 * Disk index size: 2.35GB
 * heatmap input range: -180 to 180, -89.999 to 89.999 (slightly inset so the 
heatmap doesn't include a row just < -90 and just > 90)

512x256 (131,072 cells) heatmap : 882ms
64x32 (2048 cells) heatmap: 120ms

 Spatial 2D faceting (heatmaps)
 --

 Key: LUCENE-6191
 URL: https://issues.apache.org/jira/browse/LUCENE-6191
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.1

 Attachments: LUCENE-6191__Spatial_heatmap.patch


 Lucene spatial's PrefixTree (grid) based strategies index data in a way 
 highly amenable to faceting on grids cells to compute a so-called _heatmap_. 
 The underlying code in this patch uses the PrefixTreeFacetCounter utility 
 class which was recently refactored out of faceting for NumberRangePrefixTree 
 LUCENE-5735.  At a low level, the terms (== grid cells) are navigated 
 per-segment, forward only with TermsEnum.seek, so it's pretty quick and 
 furthermore requires no extra caches & no docvalues.  Ideally you should use 
 QuadPrefixTree (or Flex once it comes out) to maximize the number of grid levels, 
 which in turn maximizes the fidelity of choices when you ask for a grid 
 covering a region.  Conveniently, the provided capability returns the data in 
 a 2-D grid of counts, so the caller needn't know a thing about how the data 
 is encoded in the prefix tree.  Well almost... at this point they need to 
 provide a grid level, but I'll soon provide a means of deriving the grid 
 level based on a min/max cell count.
 I recommend QuadPrefixTree with geo=false so that you can provide a square 
 world-bounds (360x360 degrees), which means square grid cells which are more 
 desirable to display than rectangular cells.
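
 (For illustration, a sketch of the intended call pattern, with class names and 
 signatures assumed from the attached patch and therefore subject to change:)
 {code}
 import java.io.IOException;

 import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.spatial.prefix.HeatmapFacetCounter;
 import org.apache.lucene.spatial.prefix.PrefixTreeStrategy;

 import com.spatial4j.core.shape.Rectangle;

 public class HeatmapSketch {
   public static int[] counts(PrefixTreeStrategy strategy, IndexReader reader,
                              Rectangle region, int gridLevel) throws IOException {
     HeatmapFacetCounter.Heatmap heatmap = HeatmapFacetCounter.calcFacets(
         strategy, reader.getContext(), null /* no filter */, region,
         gridLevel, 100000 /* maxCells safety net */);
     return heatmap.counts; // row-major grid, heatmap.columns x heatmap.rows
   }
 }
 {code}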



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14285862#comment-14285862
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653594 from [~mikemccand] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1653594 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-5.x - Build # 740 - Still Failing

2015-01-21 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-5.x/740/

4 tests failed.
FAILED:  org.apache.solr.cloud.HttpPartitionTest.testDistribSearch

Error Message:
org.apache.http.NoHttpResponseException: The target server failed to respond

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
org.apache.http.NoHttpResponseException: The target server failed to respond
at 
__randomizedtesting.SeedInfo.seed([A40ECF0BF65E3F19:25E8411381015F25]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:871)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:736)
at 
org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:480)
at 
org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:201)
at 
org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 

[jira] [Commented] (LUCENE-6191) Spatial 2D faceting (heatmaps)

2015-01-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285809#comment-14285809
 ] 

David Smiley commented on LUCENE-6191:
--

I have some performance numbers taken while working on SOLR-7005.  I took a 
geonames data set of 8,552,952 docs and I indexed the latitude & longitude into 
a quad prefixTree with maximum resolution of a meter and with geo=false and 
-180 to 180, -90 to 90 world bounds of standard geodetic degree boundaries.  
That's a screw-up on my part; I forgot to use 360x360 to get square grid boxes 
instead of rectangular ones.  But that's not pertinent.  The index size is 
2.6GB which is kind of large.  Increasing the maximum resolution to above a 
meter will decrease the index size a lot.  This reminds me of how beneficial 
the forthcoming flex prefixTree will be, but I digress.  This data is all 
points.

Base stats:
*  Machine: my SSD based recent MacBook Pro, Java 8
*  Lucene/Solr: trunk as of last night
*  Docs: 8,552,952
*  Segments: 1
*  Disk index size: 2.6GB
*  QuadTree:
** precision: 26 (better than a meter)


512x512 heatmap (_note: this is a whopping 262,144 cells_): 248ms   (PNG to be 
attached to SOLR-7005 soon).
Now filtered with an additional query down to 165 docs: 105ms  (I figure this 
fast number is due to a particular optimization in the prefix tree facet 
counter for highly discriminating filters).

64x64 heatmap (4,096 cells):  105ms
Filtered to 165 docs: 21ms

I took one measurement when the index was un-optimized at 38 segments, 
including 10K deleted docs (512x512 query all): 1800ms roughly. I should try 
this again after I re-index with the square grid cells I want.

 Spatial 2D faceting (heatmaps)
 --

 Key: LUCENE-6191
 URL: https://issues.apache.org/jira/browse/LUCENE-6191
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.1

 Attachments: LUCENE-6191__Spatial_heatmap.patch


 Lucene spatial's PrefixTree (grid) based strategies index data in a way 
 highly amenable to faceting on grid cells to compute a so-called _heatmap_. 
 The underlying code in this patch uses the PrefixTreeFacetCounter utility 
 class which was recently refactored out of faceting for NumberRangePrefixTree 
 LUCENE-5735.  At a low level, the terms (== grid cells) are navigated 
 per-segment, forward only with TermsEnum.seek, so it's pretty quick and 
 furthermore requires no extra caches & no docvalues.  Ideally you should use 
 QuadPrefixTree (or Flex once it comes out) to maximize the number of grid levels 
 which in turn maximizes the fidelity of choices when you ask for a grid 
 covering a region.  Conveniently, the provided capability returns the data in 
 a 2-D grid of counts, so the caller needn't know a thing about how the data 
 is encoded in the prefix tree.  Well almost... at this point they need to 
 provide a grid level, but I'll soon provide a means of deriving the grid 
 level based on a min/max cell count.
 I recommend QuadPrefixTree with geo=false so that you can provide a square 
 world-bounds (360x360 degrees), which means square grid cells which are more 
 desirable to display than rectangular cells.
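
 To make the API shape concrete, here is a rough usage sketch against the 
 attached patch (method and field names may shift before commit; the strategy, 
 reader, and grid level are passed in as assumptions):
{code}
// Sketch only, per the attached patch: count grid cells covering a region.
import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.spatial.prefix.HeatmapFacetCounter;
import org.apache.lucene.spatial.prefix.PrefixTreeStrategy;
import com.spatial4j.core.context.SpatialContext;
import com.spatial4j.core.shape.Rectangle;

class HeatmapSketch {
  static int[] countWholeWorld(PrefixTreeStrategy strategy, IndexReader reader,
                               int facetLevel) throws IOException {
    SpatialContext ctx = strategy.getSpatialContext();
    Rectangle region = ctx.makeRectangle(-180, 180, -90, 90); // whole world
    HeatmapFacetCounter.Heatmap heatmap = HeatmapFacetCounter.calcFacets(
        strategy,             // prefix-tree strategy of the indexed field
        reader.getContext(),  // top-level IndexReaderContext
        null,                 // optional Bits filter; null = all live docs
        region,               // input shape; its bounding box is used
        facetLevel,           // grid level: higher = finer cells
        100_000);             // maxCells safety cutoff
    return heatmap.counts;    // flat 2-D grid, heatmap.columns x heatmap.rows
  }
}
{code}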



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1653580 - in /lucene/dev/branches/branch_5x: ./ lucene/ lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/ lucene/core/ lucene/core/src/java/org/apache/lucene/codecs/l

2015-01-21 Thread Robert Muir
I think we should also fix Lucene41SkipWriter (src/test) ?

On Wed, Jan 21, 2015 at 11:24 AM,  mikemcc...@apache.org wrote:
 Author: mikemccand
 Date: Wed Jan 21 16:24:08 2015
 New Revision: 1653580

 URL: http://svn.apache.org/r1653580
 Log:
 LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
 extremely large indices

 Modified:
 lucene/dev/branches/branch_5x/   (props changed)
 lucene/dev/branches/branch_5x/lucene/   (props changed)
 lucene/dev/branches/branch_5x/lucene/CHANGES.txt   (props changed)
 
 lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java
 lucene/dev/branches/branch_5x/lucene/core/   (props changed)
 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java
 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java

 Modified: 
 lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java?rev=1653580&r1=1653579&r2=1653580&view=diff
 ==
 --- 
 lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java
  (original)
 +++ 
 lucene/dev/branches/branch_5x/lucene/backward-codecs/src/java/org/apache/lucene/codecs/lucene41/Lucene41SkipReader.java
  Wed Jan 21 16:24:08 2015
 @@ -173,13 +173,13 @@ final class Lucene41SkipReader extends M
  // if (DEBUG) {
  //   System.out.println("  delta=" + delta);
  // }
 -docPointer[level] += skipStream.readVInt();
 +docPointer[level] += skipStream.readVLong();
  // if (DEBUG) {
  //   System.out.println("  docFP=" + docPointer[level]);
  // }

  if (posPointer != null) {
 -  posPointer[level] += skipStream.readVInt();
 +  posPointer[level] += skipStream.readVLong();
// if (DEBUG) {
//   System.out.println("  posFP=" + posPointer[level]);
// }
 @@ -193,7 +193,7 @@ final class Lucene41SkipReader extends M
}

if (payPointer != null) {
 -payPointer[level] += skipStream.readVInt();
 +payPointer[level] += skipStream.readVLong();
}
  }
  return delta;

 Modified: 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java?rev=1653580&r1=1653579&r2=1653580&view=diff
 ==
 --- 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java
  (original)
 +++ 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipReader.java
  Wed Jan 21 16:24:08 2015
 @@ -179,10 +179,10 @@ final class Lucene50SkipReader extends M
@Override
protected int readSkipData(int level, IndexInput skipStream) throws 
 IOException {
  int delta = skipStream.readVInt();
 -docPointer[level] += skipStream.readVInt();
 +docPointer[level] += skipStream.readVLong();

  if (posPointer != null) {
 -  posPointer[level] += skipStream.readVInt();
 +  posPointer[level] += skipStream.readVLong();
posBufferUpto[level] = skipStream.readVInt();

if (payloadByteUpto != null) {
 @@ -190,7 +190,7 @@ final class Lucene50SkipReader extends M
}

if (payPointer != null) {
 -payPointer[level] += skipStream.readVInt();
 +payPointer[level] += skipStream.readVLong();
}
  }
  return delta;

 Modified: 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java?rev=1653580&r1=1653579&r2=1653580&view=diff
 ==
 --- 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java
  (original)
 +++ 
 lucene/dev/branches/branch_5x/lucene/core/src/java/org/apache/lucene/codecs/lucene50/Lucene50SkipWriter.java
  Wed Jan 21 16:24:08 2015
 @@ -147,12 +147,12 @@ final class Lucene50SkipWriter extends M
  skipBuffer.writeVInt(delta);
  lastSkipDoc[level] = curDoc;

 -skipBuffer.writeVInt((int) (curDocPointer - lastSkipDocPointer[level]));
 +skipBuffer.writeVLong(curDocPointer - lastSkipDocPointer[level]);
  lastSkipDocPointer[level] = curDocPointer;

  if (fieldHasPositions) {

 

[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285842#comment-14285842
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653588 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1653588 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6976) Remove all methods and classes deprecated in 4.x from trunk and 5.x

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285797#comment-14285797
 ] 

ASF subversion and git services commented on SOLR-6976:
---

Commit 1653566 from [~romseygeek] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653566 ]

SOLR-6976: Remove methods and classes deprecated in 4.x

 Remove all methods and classes deprecated in 4.x from trunk and 5.x
 ---

 Key: SOLR-6976
 URL: https://issues.apache.org/jira/browse/SOLR-6976
 Project: Solr
  Issue Type: Task
Reporter: Alan Woodward
Priority: Blocker
 Fix For: 5.0, Trunk

 Attachments: SOLR-6976.patch, SOLR-6976.patch, SOLR-6976.patch, 
 SOLR-sharkeys.patch


 We have a bunch of methods, classes, enums, etc which are marked as 
 deprecated in Solr code in the 4.x branch.  Some of them have been marked as 
 such since the 1.4 release.  Before we get 5.0 out, these should all be 
 removed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-6976) Remove all methods and classes deprecated in 4.x from trunk and 5.x

2015-01-21 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward resolved SOLR-6976.
-
Resolution: Fixed
  Assignee: Alan Woodward

 Remove all methods and classes deprecated in 4.x from trunk and 5.x
 ---

 Key: SOLR-6976
 URL: https://issues.apache.org/jira/browse/SOLR-6976
 Project: Solr
  Issue Type: Task
Reporter: Alan Woodward
Assignee: Alan Woodward
Priority: Blocker
 Fix For: 5.0, Trunk

 Attachments: SOLR-6976.patch, SOLR-6976.patch, SOLR-6976.patch, 
 SOLR-sharkeys.patch


 We have a bunch of methods, classes, enums, etc which are marked as 
 deprecated in Solr code in the 4.x branch.  Some of them have been marked as 
 such since the 1.4 release.  Before we get 5.0 out, these should all be 
 removed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285818#comment-14285818
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653577 from [~mikemccand] in branch 'dev/branches/lucene_solr_4_10'
[ https://svn.apache.org/r1653577 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285857#comment-14285857
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653593 from [~mikemccand] in branch 'dev/branches/lucene_solr_4_10'
[ https://svn.apache.org/r1653593 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6192) Long overflow in LuceneXXSkipWriter can corrupt skip data

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285828#comment-14285828
 ] 

ASF subversion and git services commented on LUCENE-6192:
-

Commit 1653580 from [~mikemccand] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1653580 ]

LUCENE-6192: don't overflow int when writing skip data for high freq terms in 
extremely large indices

 Long overflow in LuceneXXSkipWriter can corrupt skip data
 -

 Key: LUCENE-6192
 URL: https://issues.apache.org/jira/browse/LUCENE-6192
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk, 4.x

 Attachments: LUCENE-6192.patch


 I've been iterating with Tom on this corruption that CheckIndex detects in 
 his rather large index (720 GB in a single segment):
 {noformat}
  java -Xmx16G -Xms16G -cp $JAR -ea:org.apache.lucene... 
 org.apache.lucene.index.CheckIndex //shards/4/core-1/data/test_index 
 -verbose 2>&1 | tee -a shard4_reoptimizedNewJava
 Opening index @ /htsolr/lss-reindex/shards/4/core-1/data/test_index
 Segments file=segments_e numSegments=1 version=4.10.2 format= 
 userData={commitTimeMSec=1421479358825}
   1 of 1: name=_8m8 docCount=1130856
 version=4.10.2
 codec=Lucene410
 compound=false
 numFiles=10
 size (MB)=719,967.32
 diagnostics = {timestamp=1421437320935, os=Linux, 
 os.version=2.6.18-400.1.1.el5, mergeFactor=2, source=merge, 
 lucene.version=4.10.2, os.arch=amd64, mergeMaxNumSegments=1, 
 java.version=1.7.0_71, java.vendor=Oracle Corporation}
 no deletions
 test: open reader.OK
 test: check integrity.OK
 test: check live docs.OK
 test: fields..OK [80 fields]
 test: field norms.OK [23 fields]
 test: terms, freq, prox...ERROR: java.lang.AssertionError: -96
 java.lang.AssertionError: -96
 at 
 org.apache.lucene.codecs.lucene41.ForUtil.skipBlock(ForUtil.java:228)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.skipPositions(Lucene41PostingsReader.java:925)
 at 
 org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.nextPosition(Lucene41PostingsReader.java:955)
 at 
 org.apache.lucene.index.CheckIndex.checkFields(CheckIndex.java:1100)
 at 
 org.apache.lucene.index.CheckIndex.testPostings(CheckIndex.java:1357)
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:655)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 test: stored fields...OK [67472796 total field count; avg 59.665 
 fields per doc]
 test: term vectorsOK [0 total vector count; avg 0 term/freq 
 vector fields per doc]
 test: docvalues...OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 
 SORTED; 0 SORTED_NUMERIC; 0 SORTED_SET]
 FAILED
 WARNING: fixIndex() would remove reference to this segment; full 
 exception:
 java.lang.RuntimeException: Term Index test failed
 at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:670)
 at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:2096)
 WARNING: 1 broken segments (containing 1130856 documents) detected
 WARNING: would write new segments file, and 1130856 documents would be lost, 
 if -fix were specified
 {noformat}
 And Rob spotted long -> int casts in our skip list writers that look like 
 they could cause such corruption if a single high-freq term with many 
 positions required > 2.1 GB to write its positions into .pos.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7013) Unclear error message with solr script when lacking jar executable

2015-01-21 Thread Derek Wood (JIRA)
Derek Wood created SOLR-7013:


 Summary: Unclear error message with solr script when lacking jar 
executable
 Key: SOLR-7013
 URL: https://issues.apache.org/jira/browse/SOLR-7013
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: Fedora 21
Reporter: Derek Wood


Fedora 21 doesn't ship the jar executable with the default jdk package, so 
the attempt to extract webapp/solr.war in the solr script can fail without a 
clear error message. The attached patch adds this error message and includes 
support for the unzip utility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6581) Efficient DocValues support and numeric collapse field implementations for Collapse and Expand

2015-01-21 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286240#comment-14286240
 ] 

Joel Bernstein commented on SOLR-6581:
--

The hint in the code is still upper case TOP_FC. This was meant to be lower 
case. I'll open another issue for this and have it accept both cases. 5.0 will 
go out with the upper case syntax though so I'll update the documentation.

 Efficient DocValues support and numeric collapse field implementations for 
 Collapse and Expand
 --

 Key: SOLR-6581
 URL: https://issues.apache.org/jira/browse/SOLR-6581
 Project: Solr
  Issue Type: Bug
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 5.0, Trunk

 Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 renames.diff


 The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent 
 are optimized to work with a top level FieldCache. Top level FieldCaches have 
 a very fast docID to top-level ordinal lookup. Fast access to the top-level 
 ordinals allows for very high performance field collapsing on high 
 cardinality fields. 
 LUCENE-5666 unified the DocValues and FieldCache api's so that the top level 
 FieldCache is no longer in regular use. Instead all top level caches are 
 accessed through MultiDocValues. 
 This ticket does the following:
 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the 
 default approach when collapsing on String fields
 2) Provides an option to use a top level FieldCache if the performance of 
 MultiDocValues is a blocker. The mechanism for switching to the FieldCache is 
 a new hint parameter. If the hint parameter is set to "top_fc" then the 
 top-level FieldCache would be used for both Collapse and Expand.
 Example syntax:
 {code}
 fq={!collapse field=x hint=TOP_FC}
 {code}
 3)  Adds numeric collapse field implementations.
 4) Resolves issue SOLR-6066
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286314#comment-14286314
 ] 

Uwe Schindler commented on SOLR-6991:
-

The last comment was just an idea, but it doesn't work. The problem here is that 
initialization of the parser fails, so it will always call 
TesseractOCRParser.getSupportedTypes()...

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7013) Unclear error message with solr script when lacking jar executable

2015-01-21 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-7013:
---
 Priority: Blocker  (was: Major)
Fix Version/s: 5.0

 Unclear error message with solr script when lacking jar executable
 --

 Key: SOLR-7013
 URL: https://issues.apache.org/jira/browse/SOLR-7013
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: Fedora 21
Reporter: Derek Wood
Priority: Blocker
 Fix For: 5.0

 Attachments: solr.patch


 Fedora 21 doesn't ship the jar executable with the default jdk package, so 
 the attempt to extract webapp/solr.war in the solr script can fail without 
 a clear error message. The attached patch adds this error message and 
 includes support for the unzip utility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7005) facet.heatmap for spatial heatmap faceting on RPT

2015-01-21 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-7005:
---
Attachment: SOLR-7005_heatmap.patch

Thanks for the encouragement Shalin, and Erik on #lucene-dev, and others via 
email who have gotten wind of this.

Here's the first-draft patch.  It is still based on being its own 
SearchComponent, and it doesn't yet support distributed-search -- those issues 
should be addressed next.

I added support for the distErr parameter to facilitate computing the grid 
level in the same fashion as used by Lucene spatial to ultimately derive a grid 
level for a given shape (a rect/box in this case).  In fact it re-uses utility 
methods in Lucene spatial to compute the grid level given the world boundary, 
distErr (if provided) and distErrPct (if provided).  The units of distErr is 
the same as distanceUnits attribute on the field type (a new Solr 5 thing).  So 
if units is a kilometer and distErr is 100 then the grid cells returns are at 
least as precise as 100 kilometers (which BTW is a little less than a spherical 
degree for Earth, which is 111.2km).  The 512x256 heatmap I uploaded was 
generated by specifying distErr=111.2.  A client could compute a distErr if 
they instead know how many minimum cells they want in the heatmap.  I may bake 
that formula in and provide a minCells param.

For distributed-search, I'm thinking the internal shard requests will use PNG 
since it's compressed, and then the user can get whatever format they asked 
for.  I only want to write the aggregation logic once, not per-format :-)

As a part of this work I found it useful to add SpatialUtils.parseRectangle 
which parses the {{[lowerLeftPoint TO upperRightPoint]}} format.  In another 
issue I want to re-use this to provide a more Solr-friendly way of indexing a 
rectangle (for e.g. BBoxField or RPT) or for specifying worldBounds on the 
field type.

Even though I don't have distributed-search implemented yet, the test extends 
BaseDistributedSearchTestCase any way.  I dislike the idea of writing two tests 
that test the same thing (one distributed, one not) when the infrastructure 
should make it indifferent since it's transparent to the input & output I'm 
testing.  Unfortunately, assertQ & friends are hard-coded to use TestHarness 
which is in turn hard-coded to use an embedded Solr instance.  And 
unfortunately, BaseDistributedSearchTestCase doesn't let me test 0 shards (hey, 
I haven't implemented that feature yet!).  The patch tweaks 
BaseDistributedSearchTestCase slightly to let me do this. 
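
To make the parameter set concrete, a request against this patch might look 
like the following (hypothetical core and field names, wrapped for readability):
{noformat}
http://localhost:8983/solr/collection1/select?q=*:*&rows=0&facet=true
  &facet.heatmap=geo_rpt
  &facet.heatmap.bbox=[-180 -90 TO 180 90]
  &facet.heatmap.distErr=111.2
{noformat}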

 facet.heatmap for spatial heatmap faceting on RPT
 -

 Key: SOLR-7005
 URL: https://issues.apache.org/jira/browse/SOLR-7005
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.1

 Attachments: SOLR-7005_heatmap.patch, heatmap_512x256.png, 
 heatmap_64x32.png


 This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell 
 counter in Lucene spatial LUCENE-6191.  This is a form of faceting, and 
 as-such I think it should live in the facet parameter namespace.  Here's 
 what the parameters are:
 * facet=true
 * facet.heatmap=fieldname
 * facet.heatmap.bbox=\[-180 -90 TO 180 90]
 * facet.heatmap.gridLevel=6
 * facet.heatmap.distErrPct=0.10
 Like other faceting features, the fieldName can have local-params to exclude 
 filter queries or specify an output key.
 The bbox is optional; you get the whole world or you can specify a box or 
 actually any shape that WKT supports (you get the bounding box of whatever 
 you put).
 Ultimately, this feature needs to know the grid level, which together with 
 the input shape will yield a certain number of cells.  You can specify 
 gridLevel exactly, or don't and instead provide distErrPct which is computed 
 like it is for the RPT field type as seen in the schema.  0.10 yielded ~4k 
 cells but it'll vary.  There's also a facet.heatmap.maxCells safety net 
 defaulting to 100k.  Exceed this and you get an error.
 The output is (JSON):
 {noformat}
 {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0,
  0, 2, 1, ],[1, 1, 3, 2, ...],...]}
 {noformat}
 counts is null if all would be 0.  Perhaps individual row arrays should 
 likewise be null... I welcome feedback.
 I'm toying with an output format option in which you can specify a base-64'ed 
 grayscale PNG.
 Obviously this should support sharded / distributed environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6581) Efficient DocValues support and numeric collapse field implementations for Collapse and Expand

2015-01-21 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-6581:
-
Description: 
The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent 
are optimized to work with a top level FieldCache. Top level FieldCaches have a 
very fast docID to top-level ordinal lookup. Fast access to the top-level 
ordinals allows for very high performance field collapsing on high cardinality 
fields. 

LUCENE-5666 unified the DocValues and FieldCache api's so that the top level 
FieldCache is no longer in regular use. Instead all top level caches are 
accessed through MultiDocValues. 

This ticket does the following:

1) Optimizes Collapse and Expand to use MultiDocValues and makes this the 
default approach when collapsing on String fields

2) Provides an option to use a top level FieldCache if the performance of 
MultiDocValues is a blocker. The mechanism for switching to the FieldCache is a 
new hint parameter. If the hint parameter is set to "top_fc" then the 
top-level FieldCache would be used for both Collapse and Expand.

Example syntax:
{code}
fq={!collapse field=x hint=TOP_FC}
{code}

3)  Adds numeric collapse field implementations.

4) Resolves issue SOLR-6066







 






  was:
The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent 
are optimized to work with a top level FieldCache. Top level FieldCaches have a 
very fast docID to top-level ordinal lookup. Fast access to the top-level 
ordinals allows for very high performance field collapsing on high cardinality 
fields. 

LUCENE-5666 unified the DocValues and FieldCache api's so that the top level 
FieldCache is no longer in regular use. Instead all top level caches are 
accessed through MultiDocValues. 

This ticket does the following:

1) Optimizes Collapse and Expand to use MultiDocValues and makes this the 
default approach when collapsing on String fields

2) Provides an option to use a top level FieldCache if the performance of 
MultiDocValues is a blocker. The mechanism for switching to the FieldCache is a 
new hint parameter. If the hint parameter is set to "top_fc" then the 
top-level FieldCache would be used for both Collapse and Expand.

Example syntax:
{code}
fq={!collapse field=x hint=top_fc}
{code}

3)  Adds numeric collapse field implementations.

4) Resolves issue SOLR-6066







 







 Efficient DocValues support and numeric collapse field implementations for 
 Collapse and Expand
 --

 Key: SOLR-6581
 URL: https://issues.apache.org/jira/browse/SOLR-6581
 Project: Solr
  Issue Type: Bug
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 5.0, Trunk

 Attachments: SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, SOLR-6581.patch, 
 renames.diff


 The 4x implementation of the CollapsingQParserPlugin and the ExpandComponent 
 are optimized to work with a top level FieldCache. Top level FieldCaches have 
 a very fast docID to top-level ordinal lookup. Fast access to the top-level 
 ordinals allows for very high performance field collapsing on high 
 cardinality fields. 
 LUCENE-5666 unified the DocValues and FieldCache api's so that the top level 
 FieldCache is no longer in regular use. Instead all top level caches are 
 accessed through MultiDocValues. 
 This ticket does the following:
 1) Optimizes Collapse and Expand to use MultiDocValues and makes this the 
 default approach when collapsing on String fields
 2) Provides an option to use a top level FieldCache if the performance of 
 MultiDocValues is a blocker. The mechanism for switching to the FieldCache is 
 a new hint parameter. If the hint parameter is set to "top_fc" then the 
 top-level FieldCache would be used for both Collapse and Expand.
 Example syntax:
 {code}
 fq={!collapse field=x hint=TOP_FC}
 {code}
 3)  Adds numeric collapse field implementations.
 4) Resolves issue SOLR-6066
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2522 - Still Failing

2015-01-21 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2522/

4 tests failed.
FAILED:  org.apache.solr.cloud.HttpPartitionTest.testDistribSearch

Error Message:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://127.0.0.1:12771/c8n_1x2_shard1_replica2

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: 
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://127.0.0.1:12771/c8n_1x2_shard1_replica2
at 
__randomizedtesting.SeedInfo.seed([F5B2AF58A025A0EF:74542140D77AC0D3]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:581)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:890)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:793)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:736)
at 
org.apache.solr.cloud.HttpPartitionTest.sendDoc(HttpPartitionTest.java:480)
at 
org.apache.solr.cloud.HttpPartitionTest.testRf2(HttpPartitionTest.java:201)
at 
org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:114)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:878)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-7005) facet.heatmap for spatial heatmap faceting on RPT

2015-01-21 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286232#comment-14286232
 ] 

David Smiley commented on SOLR-7005:


Oh, facet.heatmap.format=png (or "ints", "ints" being the default)

 facet.heatmap for spatial heatmap faceting on RPT
 -

 Key: SOLR-7005
 URL: https://issues.apache.org/jira/browse/SOLR-7005
 Project: Solr
  Issue Type: New Feature
  Components: spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 5.1

 Attachments: SOLR-7005_heatmap.patch, heatmap_512x256.png, 
 heatmap_64x32.png


 This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell 
 counter in Lucene spatial LUCENE-6191.  This is a form of faceting, and 
 as-such I think it should live in the facet parameter namespace.  Here's 
 what the parameters are:
 * facet=true
 * facet.heatmap=fieldname
 * facet.heatmap.bbox=\[-180 -90 TO 180 90]
 * facet.heatmap.gridLevel=6
 * facet.heatmap.distErrPct=0.10
 Like other faceting features, the fieldName can have local-params to exclude 
 filter queries or specify an output key.
 The bbox is optional; you get the whole world or you can specify a box or 
 actually any shape that WKT supports (you get the bounding box of whatever 
 you put).
 Ultimately, this feature needs to know the grid level, which together with 
 the input shape will yield a certain number of cells.  You can specify 
 gridLevel exactly, or don't and instead provide distErrPct which is computed 
 like it is for the RPT field type as seen in the schema.  0.10 yielded ~4k 
 cells but it'll vary.  There's also a facet.heatmap.maxCells safety net 
 defaulting to 100k.  Exceed this and you get an error.
 The output is (JSON):
 {noformat}
 {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0,
  0, 2, 1, ],[1, 1, 3, 2, ...],...]}
 {noformat}
 counts is null if all would be 0.  Perhaps individual row arrays should 
 likewise be null... I welcome feedback.
 I'm toying with an output format option in which you can specify a base-64'ed 
 grayscale PNG.
 Obviously this should support sharded / distributed environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py

2015-01-21 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286282#comment-14286282
 ] 

Erick Erickson commented on LUCENE-6188:


Thanks! Back from 2 days onsite so I can pay some attention now.

 Remove HTML verification from checkJavaDocs.py
 --

 Key: LUCENE-6188
 URL: https://issues.apache.org/jira/browse/LUCENE-6188
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/javadocs
Reporter: Ramkumar Aiyengar
Assignee: Erick Erickson
Priority: Minor
 Attachments: LUCENE-6188.patch, LUCENE-6188.patch


 Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in 
 some cases (see SOLR-6902).
 On looking further at fixing it with the {{html.parser}} package instead, I 
 noticed that broken HTML verification is already present (using 
 {{html.parser}}!) in {{checkJavadocLinks.py}}, which takes care of 
 validation, and {{jTidy}} probably does it as well, going by the output 
 (I haven't verified this).
 Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any 
 further value, so here's a patch to just nuke it instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-5.0-Linux (32bit/ibm-j9-jdk7) - Build # 26 - Failure!

2015-01-21 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.0-Linux/26/
Java: 32bit/ibm-j9-jdk7 
-Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;}

45 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.BasicZkTest

Error Message:
Could not get the port for ZooKeeper server

Stack Trace:
java.lang.RuntimeException: Could not get the port for ZooKeeper server
at __randomizedtesting.SeedInfo.seed([89F35FDD4F4E4D35]:0)
at org.apache.solr.cloud.ZkTestServer.run(ZkTestServer.java:482)
at 
org.apache.solr.cloud.AbstractZkTestCase.azt_beforeClass(AbstractZkTestCase.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:619)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:767)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at java.lang.Thread.run(Thread.java:853)


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.BasicZkTest

Error Message:


Stack Trace:
java.lang.NullPointerException
at __randomizedtesting.SeedInfo.seed([89F35FDD4F4E4D35]:0)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.getLocalPort(NIOServerCnxnFactory.java:134)
at 
org.apache.solr.cloud.ZkTestServer$ZKServerMain.shutdown(ZkTestServer.java:334)
at org.apache.solr.cloud.ZkTestServer.shutdown(ZkTestServer.java:492)
at 
org.apache.solr.cloud.AbstractZkTestCase.azt_afterClass(AbstractZkTestCase.java:158)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:619)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:790)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 

[jira] [Commented] (LUCENE-6161) Applying deletes is sometimes dog slow

2015-01-21 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286421#comment-14286421
 ] 

Robert Muir commented on LUCENE-6161:
-

Just a few minor thoughts:

Some of the iteration is more awkward now; it might be nice to open a followup 
to clean this up.
delGen is awkward to see being held in PrefixCodedTerms, and we have an 
iterator API that ... is neither a TermsEnum nor an Iterable but another one 
instead. I wonder if we could have the same logic, but using a more natural 
one. If it would just make the code even more awkward, then screw it :)

We should fix the issue though for now I think.

 Applying deletes is sometimes dog slow
 --

 Key: LUCENE-6161
 URL: https://issues.apache.org/jira/browse/LUCENE-6161
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, Trunk

 Attachments: LUCENE-6161.patch, LUCENE-6161.patch, LUCENE-6161.patch, 
 LUCENE-6161.patch, LUCENE-6161.patch


 I hit this while testing various use cases for LUCENE-6119 (adding 
 auto-throttle to ConcurrentMergeScheduler).
 When I tested always call updateDocument (each add buffers a delete term), 
 with many indexing threads, opening an NRT reader once per second (forcing 
 all deleted terms to be applied), I see that 
 BufferedUpdatesStream.applyDeletes sometimes seems to take a long time, 
 e.g.:
 {noformat}
 BD 0 [2015-01-04 09:31:12.597; Lucene Merge Thread #69]: applyDeletes took 
 339 msec for 10 segments, 117 deleted docs, 607333 visited terms
 BD 0 [2015-01-04 09:31:18.148; Thread-4]: applyDeletes took 5533 msec for 62 
 segments, 10989 deleted docs, 8517225 visited terms
 BD 0 [2015-01-04 09:31:21.463; Lucene Merge Thread #71]: applyDeletes took 
 1065 msec for 10 segments, 470 deleted docs, 1825649 visited terms
 BD 0 [2015-01-04 09:31:26.301; Thread-5]: applyDeletes took 4835 msec for 61 
 segments, 14676 deleted docs, 9649860 visited terms
 BD 0 [2015-01-04 09:31:35.572; Thread-11]: applyDeletes took 6073 msec for 72 
 segments, 13835 deleted docs, 11865319 visited terms
 BD 0 [2015-01-04 09:31:37.604; Lucene Merge Thread #75]: applyDeletes took 
 251 msec for 10 segments, 58 deleted docs, 240721 visited terms
 BD 0 [2015-01-04 09:31:44.641; Thread-11]: applyDeletes took 5956 msec for 64 
 segments, 15109 deleted docs, 10599034 visited terms
 BD 0 [2015-01-04 09:31:47.814; Lucene Merge Thread #77]: applyDeletes took 
 396 msec for 10 segments, 137 deleted docs, 719914 visit
 {noformat}
 What this means is even though I want an NRT reader every second, often I 
 don't get one for up to ~7 or more seconds.
 This is on an SSD, machine has 48 GB RAM, heap size is only 2 GB.  12 
 indexing threads.
 As hideously complex as this code is, I think there are some inefficiencies, 
 but fixing them could be hard / make code even hairier ...
 Also, this code is mega-locked: holds IW's lock, holds BD's lock.  It blocks 
 things like merges kicking off or finishing...
 E.g., we pull the MergedIterator many times on the same set of sub-iterators. 
  Maybe we can create the sorted terms up front and reuse that?
 Maybe we should go "term stride" (one term visits all N segments) not 
 "segment stride" (visit each segment, iterating all deleted terms for it).  
 Just iterating the terms to be deleted takes a sizable part of the time, and 
 we now do that once for every segment in the index.
 Also, the "isUnique" bit in LUCENE-6005 should help here: if we know 
 the field is unique, we can stop seekExact once we have found a segment that 
 has the deleted term, and we can maybe pass false for removeDuplicates to 
 MergedIterator...
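 For illustration, the "term stride" ordering suggested above could look 
 roughly like this sketch (SegmentTerms and its two methods are hypothetical 
 stand-ins, not real Lucene APIs):
 {code}
 // Hedged sketch: one pass over the sorted deleted terms, probing every
 // segment per term ("term stride"), instead of replaying the full term
 // list once per segment ("segment stride").
 import java.util.List;

 interface SegmentTerms {
   boolean seekExact(String term);       // does this segment contain the term?
   void markDeletedDocsFor(String term); // record deletes for matching docs
 }

 final class TermStrideSketch {
   static void apply(List<String> sortedDeleteTerms,
                     List<SegmentTerms> segments,
                     boolean fieldIsUnique) {
     for (String term : sortedDeleteTerms) { // terms iterated exactly once
       for (SegmentTerms seg : segments) {
         if (seg.seekExact(term)) {
           seg.markDeletedDocsFor(term);
           if (fieldIsUnique) {
             break; // "isUnique": a unique term lives in at most one segment
           }
         }
       }
     }
   }
 }
 {code}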



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7013) Unclear error message with solr script when lacking jar executable

2015-01-21 Thread Derek Wood (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Derek Wood updated SOLR-7013:
-
Attachment: solr.patch

 Unclear error message with solr script when lacking jar executable
 --

 Key: SOLR-7013
 URL: https://issues.apache.org/jira/browse/SOLR-7013
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: Fedora 21
Reporter: Derek Wood
 Attachments: solr.patch


 Fedora 21 doesn't ship the jar executable with the default jdk package, so 
 the attempt to extract webapp/solr.war in the solr script can fail without 
 a clear error message. The attached patch adds this error message and 
 includes support for the unzip utility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6845) Add buildOnStartup option for suggesters

2015-01-21 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-6845:

Attachment: tests-failures.txt

I just saw a local failure on trunk on 
org.apache.solr.handler.component.SuggestComponentTest.testDefaultBuildOnStartupStoredDict.
 The logs are attached and the stack trace is:
{code}
  2> 786070 T7047 oas.SolrTestCaseJ4.assertQ ERROR REQUEST FAILED: 
xpath=//lst[@name='suggest']/lst[@name='suggest_doc_default_startup']/lst[@name='example']/int[@name='numFound'][.='0']
  2> xml response was: <?xml version="1.0" encoding="UTF-8"?>
  2> <response>
  2> <lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst><lst name="suggest"><lst name="suggest_doc_default_startup"><lst name="example"><int name="numFound">2</int><arr name="suggestions"><lst><str name="term">example inputdata</str><long name="weight">45</long><str name="payload"/></lst><lst><str name="term">example data</str><long name="weight">40</long><str name="payload"/></lst></arr></lst></lst></lst>
  2> </response>
  2> 
  2> request was: qt=/suggest&suggest.q=example&suggest.count=2&suggest.dictionary=suggest_doc_default_startup&wt=xml
  2> 786071 T7047 oasc.SolrException.log ERROR REQUEST FAILED: qt=/suggest&suggest.q=example&suggest.count=2&suggest.dictionary=suggest_doc_default_startup&wt=xml:java.lang.RuntimeException: REQUEST FAILED: xpath=//lst[@name='suggest']/lst[@name='suggest_doc_default_startup']/lst[@name='example']/int[@name='numFound'][.='0']
  2> xml response was: <?xml version="1.0" encoding="UTF-8"?>
  2> <response>
  2> <lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst><lst name="suggest"><lst name="suggest_doc_default_startup"><lst name="example"><int name="numFound">2</int><arr name="suggestions"><lst><str name="term">example inputdata</str><long name="weight">45</long><str name="payload"/></lst><lst><str name="term">example data</str><long name="weight">40</long><str name="payload"/></lst></arr></lst></lst></lst>
  2> </response>
  2> 
  2> request was: qt=/suggest&suggest.q=example&suggest.count=2&suggest.dictionary=suggest_doc_default_startup&wt=xml
  2>     at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:741)
  2>     at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:715)
  2>     at org.apache.solr.handler.component.SuggestComponentTest.testDefaultBuildOnStartupStoredDict(SuggestComponentTest.java:257)
  2>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}

{code}
ant test  -Dtestcase=SuggestComponentTest 
-Dtests.method=testDefaultBuildOnStartupStoredDict 
-Dtests.seed=1AE9946D9D16B26E -Dtests.slow=true -Dtests.locale=en 
-Dtests.timezone=Asia/Istanbul -Dtests.asserts=true 
-Dtests.file.encoding=ISO-8859-1
{code}

I tried a few times but couldn't reproduce it. 

 Add buildOnStartup option for suggesters
 

 Key: SOLR-6845
 URL: https://issues.apache.org/jira/browse/SOLR-6845
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Tomás Fernández Löbbe
 Fix For: Trunk, 5.1

 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, 
 tests-failures.txt


 SOLR-6679 was filed to track the investigation into the following problem...
 {panel}
 The stock solrconfig provides a bad experience with a large index... start up 
 Solr and it will spin at 100% CPU for minutes, unresponsive, while it 
 apparently builds a suggester index.
 ...
 This is what I did:
 1) indexed 10M very small docs (only takes a few minutes).
 2) shut down Solr
 3) start up Solr and watch it be unresponsive for over 4 minutes!
 I didn't even use any of the fields specified in the suggester config and I 
 never called the suggest request handler.
 {panel}
 ...but ultimately focused on removing/disabling the suggester from the sample 
 configs.
 Opening this new issue to focus on actually trying to identify the root 
 problem & fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py

2015-01-21 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286294#comment-14286294
 ] 

Robert Muir commented on LUCENE-6188:
-

{quote}
If it's not adding value anymore (e.g. we recently turned on faster javadocs 
checking via javac's doclint options), I agree we should remove it: it's slow 
and hackity and un-understandable.
{quote}

The doclint stuff added (TRUNK ONLY) is blazing fast and nice, but there is a 
good amount of work before it's checking html. I see these steps:
* Actually turn on html verification in doclint. This can't be done until a lot 
of problems are fixed. When they are fixed we can enable html:
  {noformat}-Xdoclint:all/protected -Xdoclint:-html -Xdoclint:-missing{noformat}
* Figure out how to check overview.html and package.html. I suspect they are 
currently not being checked (but maybe I'm wrong). Maybe we can ask the openjdk 
developers about it. 

Then jtidy could be removed completely. Python linting is still needed until we 
can properly enable "missing" and cut the build logic over to that; then I 
think its check-missing could be removed. As for the python broken-links 
checker, I'm not sure if there is a replacement. Ideally we are just using 
doclint for all checks in the future.
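
As a hypothetical illustration of what the html group would catch once 
enabled, Javadoc like the following fails doclint because the <b> tag is never 
closed:

{code}
// Hypothetical example, not from the codebase: -Xdoclint:html rejects
// malformed Javadoc HTML such as this unclosed <b> element.
public class DoclintExample {
  /**
   * Returns the <b>first element of the array.
   */
  public int first(int[] a) {
    return a[0];
  }
}
{code}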

 Remove HTML verification from checkJavaDocs.py
 --

 Key: LUCENE-6188
 URL: https://issues.apache.org/jira/browse/LUCENE-6188
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/javadocs
Reporter: Ramkumar Aiyengar
Assignee: Erick Erickson
Priority: Minor
 Attachments: LUCENE-6188.patch, LUCENE-6188.patch


 Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in 
 some cases (see SOLR-6902).
 On looking further to fix it with the {{html.parser}} package instead, 
 noticed that there is broken HTML verification already present (using 
 {{html.parser}}!) in {{checkJavadocLinks.py}} anyway, which takes care of 
 validation, and probably {{jTidy}} does it as well, going by the output 
 (haven't verified it).
 Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any 
 further value, so here's a patch to just nuke it instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6954) Considering changing SolrClient#shutdown to SolrClient#close.

2015-01-21 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward updated SOLR-6954:

Attachment: SOLR-6954.patch

Patch making SolrClient implement Closeable, and making shutdown() a deprecated 
concrete method that delegates to close().  Also cuts over all tests to use 
close() (and try-with-resources where possible).
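
As a rough sketch (not the actual patch), the shape described is:

{code}
// Rough sketch of the change described above; not the actual patch.
import java.io.Closeable;
import java.io.IOException;

public abstract class SolrClient implements Closeable {

  /** @deprecated use {@link #close()} instead. */
  @Deprecated
  public void shutdown() {
    try {
      close(); // deprecated concrete method delegating to close()
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  @Override
  public abstract void close() throws IOException;
}
{code}

With Closeable in place, tests can open clients in try-with-resources blocks 
so close() runs automatically, and leak-detection tools recognize the type.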

 Considering changing SolrClient#shutdown to SolrClient#close.
 -

 Key: SOLR-6954
 URL: https://issues.apache.org/jira/browse/SOLR-6954
 Project: Solr
  Issue Type: Improvement
Reporter: Mark Miller
 Fix For: 5.0, Trunk

 Attachments: SOLR-6954.patch


 SolrClient#shutdown is not as odd as SolrServer#shutdown, but as we want 
 users to release these objects, close is more standard and if we implement 
 Closeable, tools help point out leaks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7014) Collapse identical catch branches in try-catch statements

2015-01-21 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-7014:

Attachment: SOLR-7014.patch

This takes care of all Solr classes. I'll attach another one which does the 
same for Lucene.
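
For readers following along, this is the kind of mechanical rewrite involved 
(an illustrative example, not a hunk from the patch):

{code}
import java.io.IOException;
import java.text.ParseException;

class MultiCatchExample {
  static void before() {
    try {
      doWork();
    } catch (IOException e) {       // two branches with identical bodies
      handle(e);
    } catch (ParseException e) {
      handle(e);
    }
  }

  static void after() {
    try {
      doWork();
    } catch (IOException | ParseException e) { // Java 7 multi-catch: one branch
      handle(e);
    }
  }

  static void doWork() throws IOException, ParseException { /* no-op */ }

  static void handle(Exception e) { e.printStackTrace(); }
}
{code}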

 Collapse identical catch branches in try-catch statements
 -

 Key: SOLR-7014
 URL: https://issues.apache.org/jira/browse/SOLR-7014
 Project: Solr
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: SOLR-7014.patch


 We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
 statements into one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286272#comment-14286272
 ] 

Uwe Schindler commented on SOLR-6991:
-

Hi,
I checked the code. The problem is: you cannot disable it by config (because it 
always tries to execute the command that's part of the default config file). If 
the config file is not there, then it runs TESSERACT without any path.

The only ways to work around it are: 
- Disable the whole parser (f*ck, because then we need to maintain our own 
parser list internally). There is no way to tell TIKA to exclude some parsers 
(something like AutodetectParser#disableParser(name/class/whatever)).
- Use a hack with reflection to make TesseractOCRParser#TESSERACT_PRESENT 
return false for any path... Just replace the static map by one that returns 
false for any key (LOL) and ignores any put() (see the sketch below).
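
A rough, untested sketch of that reflection hack, assuming TESSERACT_PRESENT is 
a non-final static Map<String,Boolean> (only the field name comes from the 
comment above; everything else is assumption):

{code}
// Untested sketch: swap the static "is tesseract installed?" cache for a
// map that answers "already checked, not installed" for every path and
// ignores writes. Assumes the field is a non-final static Map.
import java.lang.reflect.Field;
import java.util.HashMap;
import org.apache.tika.parser.ocr.TesseractOCRParser;

public final class TesseractKillSwitch {
  @SuppressWarnings("serial")
  public static void disable() throws ReflectiveOperationException {
    Field cache = TesseractOCRParser.class.getDeclaredField("TESSERACT_PRESENT");
    cache.setAccessible(true);
    cache.set(null, new HashMap<String, Boolean>() {
      @Override public boolean containsKey(Object key) { return true; }  // every path "already probed"
      @Override public Boolean get(Object key) { return Boolean.FALSE; } // ...and never installed
      @Override public Boolean put(String key, Boolean value) { return Boolean.FALSE; } // ignore writes
    });
  }
}
{code}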

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7014) Collapse identical catch branches in try-catch statements

2015-01-21 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-7014:
---

 Summary: Collapse identical catch branches in try-catch statements
 Key: SOLR-7014
 URL: https://issues.apache.org/jira/browse/SOLR-7014
 Project: Solr
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1


We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
statements into one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286307#comment-14286307
 ] 

Uwe Schindler commented on SOLR-6991:
-

One trick could work:
TIKA always prefers external parsers loaded by SPI. The trick here would be 
to add a /META-INF/services/... file that lists a subclass of the Tesseract 
parser that just always returns no supported media types. TIKA would use our 
subclass in preference to the shipped one, and by that we could disable the 
parser. I have not checked this, but it would be another hack (that I don't 
like either). See the sketch below.
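
Unverified, the subclass itself would be tiny; the part to verify is the SPI 
registration, i.e. a META-INF/services/org.apache.tika.parser.Parser entry 
naming the class:

{code}
// Unverified sketch: a subclass that advertises no supported media types,
// so Tika never selects it; registered via META-INF/services as above.
import java.util.Collections;
import java.util.Set;
import org.apache.tika.mime.MediaType;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.ocr.TesseractOCRParser;

public class NoOpTesseractOCRParser extends TesseractOCRParser {
  @Override
  public Set<MediaType> getSupportedTypes(ParseContext context) {
    return Collections.emptySet(); // matches nothing, so it is never consulted
  }
}
{code}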

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7014) Collapse identical catch branches in try-catch statements

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286324#comment-14286324
 ] 

ASF subversion and git services commented on SOLR-7014:
---

Commit 1653665 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1653665 ]

SOLR-7014: Collapse identical catch branches in try-catch statements

 Collapse identical catch branches in try-catch statements
 -

 Key: SOLR-7014
 URL: https://issues.apache.org/jira/browse/SOLR-7014
 Project: Solr
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: SOLR-7014.patch


 We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
 statements into one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6845) Add buildOnStartup option for suggesters

2015-01-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286323#comment-14286323
 ] 

Tomás Fernández Löbbe commented on SOLR-6845:
-

I'll take a look

 Add buildOnStartup option for suggesters
 

 Key: SOLR-6845
 URL: https://issues.apache.org/jira/browse/SOLR-6845
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Tomás Fernández Löbbe
 Fix For: Trunk, 5.1

 Attachments: SOLR-6845.patch, SOLR-6845.patch, SOLR-6845.patch, 
 tests-failures.txt


 SOLR-6679 was filed to track the investigation into the following problem...
 {panel}
 The stock solrconfig provides a bad experience with a large index... start up 
 Solr and it will spin at 100% CPU for minutes, unresponsive, while it 
 apparently builds a suggester index.
 ...
 This is what I did:
 1) indexed 10M very small docs (only takes a few minutes).
 2) shut down Solr
 3) start up Solr and watch it be unresponsive for over 4 minutes!
 I didn't even use any of the fields specified in the suggester config and I 
 never called the suggest request handler.
 {panel}
 ...but ultimately focused on removing/disabling the suggester from the sample 
 configs.
 Opening this new issue to focus on actually trying to identify the root 
 problem & fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 11493 - Failure!

2015-01-21 Thread Michael McCandless
J9 bug.

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jan 21, 2015 at 5:53 AM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11493/
 Java: 64bit/ibm-j9-jdk7 
 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;}

 1 tests failed.
 FAILED:  
 org.apache.lucene.codecs.lucene49.TestLucene49NormsFormat.testByteRange

 Error Message:


 Stack Trace:
 java.lang.NullPointerException
 at 
 __randomizedtesting.SeedInfo.seed([1EFEBBCD258C8490:D78182FF451AC405]:0)
 at 
 org.apache.lucene.codecs.lucene49.Lucene49NormsConsumer$NormMap.add(Lucene49NormsConsumer.java:206)
 at 
 org.apache.lucene.codecs.lucene49.Lucene49NormsConsumer.addNormsField(Lucene49NormsConsumer.java:95)
 at 
 org.apache.lucene.index.NormValuesWriter.flush(NormValuesWriter.java:72)
 at 
 org.apache.lucene.index.DefaultIndexingChain.writeNorms(DefaultIndexingChain.java:204)
 at 
 org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:92)
 at 
 org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:419)
 at 
 org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:503)
 at 
 org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:615)
 at 
 org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2733)
 at 
 org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2888)
 at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2855)
 at 
 org.apache.lucene.index.RandomIndexWriter.commit(RandomIndexWriter.java:257)
 at 
 org.apache.lucene.index.BaseNormsFormatTestCase.doTestNormsVersusStoredFields(BaseNormsFormatTestCase.java:261)
 at 
 org.apache.lucene.index.BaseNormsFormatTestCase.testByteRange(BaseNormsFormatTestCase.java:54)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
 at java.lang.reflect.Method.invoke(Method.java:619)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
   

[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286477#comment-14286477
 ] 

Steve Rowe commented on SOLR-6991:
--

bq. don't we need similar assumes in dataimporthandler-extras tests that use 
TikaEntityProcessor? (I'm not sure why those wouldn't fail with Turkish now as 
well)

I ran {{ant test -Dtests.slow=true -Dtests.locale=tr_TR}} in 
{{solr/contrib/dataimporthandler-extras/}}, and got the following failure:

{noformat}
   [junit4] Suite: org.apache.solr.handler.dataimport.TestTikaEntityProcessor
   [junit4]   2 Creating dataDir: 
/Users/sarowe/svn/lucene/dev/trunk2/solr/build/contrib/solr-dataimporthandler-extras/test/J0/temp/solr.handler.dataimport.TestTikaEntityProcessor
 9123B7DE098A1C98-001/init-core-data-001
   [junit4]   2 log4j:WARN No appenders could be found for logger 
(org.apache.solr.SolrTestCaseJ4).
   [junit4]   2 log4j:WARN Please initialize the log4j system properly.
   [junit4]   2 log4j:WARN See 
http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
   [junit4]   2 NOTE: reproduce with: ant test  
-Dtestcase=TestTikaEntityProcessor -Dtests.method=testTikaHTMLMapperIdentity 
-Dtests.seed=9123B7DE098A1C98 -Dtests.slow=true -Dtests.locale=tr_TR 
-Dtests.timezone=America/Toronto -Dtests.asserts=true 
-Dtests.file.encoding=US-ASCII
   [junit4] ERROR   0.93s J0 | 
TestTikaEntityProcessor.testTikaHTMLMapperIdentity 
   [junit4] Throwable #1: java.lang.Error: posix_spawn is not a supported 
process launch mechanism on this platform.
   [junit4]at 
__randomizedtesting.SeedInfo.seed([9123B7DE098A1C98:C15C334FC0BEE965]:0)
   [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:105)
   [junit4]at java.lang.UNIXProcess$1.run(UNIXProcess.java:94)
   [junit4]at java.security.AccessController.doPrivileged(Native 
Method)
   [junit4]at java.lang.UNIXProcess.clinit(UNIXProcess.java:92)
   [junit4]at java.lang.ProcessImpl.start(ProcessImpl.java:130)
   [junit4]at 
java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
   [junit4]at java.lang.Runtime.exec(Runtime.java:620)
   [junit4]at java.lang.Runtime.exec(Runtime.java:485)
   [junit4]at 
org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
   [junit4]at 
org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
   [junit4]at 
org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
   [junit4]at 
org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
   [junit4]at 
org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
   [junit4]at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
   [junit4]at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
   [junit4]at 
org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:141)
   [junit4]at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
   [junit4]at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
   [junit4]at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
   [junit4]at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
   [junit4]at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
   [junit4]at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
   [junit4]at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
   [junit4]at 
org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:189)
   [junit4]at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:144)
   [junit4]at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:2006)
   [junit4]at 
org.apache.solr.util.TestHarness.query(TestHarness.java:331)
   [junit4]at 
org.apache.solr.handler.dataimport.AbstractDataImportHandlerTestCase.runFullImport(AbstractDataImportHandlerTestCase.java:86)
   [junit4]at 
org.apache.solr.handler.dataimport.TestTikaEntityProcessor.testTikaHTMLMapperIdentity(TestTikaEntityProcessor.java:99)
   [junit4]

[jira] [Commented] (SOLR-6969) Just like we have to retry when the NameNode is in safemode on Solr startup, we also need to retry when opening a transaction log file for append when we get a RecoveryI

2015-01-21 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286485#comment-14286485
 ] 

Mike Drob commented on SOLR-6969:
-

Is retrying always going to be safe? That works fine after we've lost a server 
and started a new one (albeit too quickly) but what about the case where two 
servers both think they are responsible for that tlog? This can happen if the 
original server partially dies, but still has some threads that are doing work 
and haven't been cleaned up.

Looking at how other projects handle similar issues - HBase moves the entire 
directory[1] to break any existing leases and ensure any other process gets 
kicked out. Maybe a retry is a good stop-gap, but is it going to be a full 
solution?

[1]: 
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java#L310
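
For reference, the retry named in the issue title might look roughly like this 
(a sketch, assuming the RecoveryInProgressException reaches the caller directly 
rather than wrapped in a RemoteException):

{code}
// Sketch of retry-until-lease-recovery-finishes, mirroring the safe-mode wait.
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.protocol.RecoveryInProgressException;

final class TlogAppendRetry {
  static FSDataOutputStream openForAppend(FileSystem fs, Path tlog, long timeoutMs)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      try {
        return fs.append(tlog); // fails while the old lease is being recovered
      } catch (RecoveryInProgressException e) {
        if (System.currentTimeMillis() >= deadline) {
          throw e; // waited long enough; give up
        }
        Thread.sleep(1000L); // recovery still running; retry shortly
      }
    }
  }
}
{code}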

 Just like we have to retry when the NameNode is in safemode on Solr startup, 
 we also need to retry when opening a transaction log file for append when we 
 get a RecoveryInProgressException.
 

 Key: SOLR-6969
 URL: https://issues.apache.org/jira/browse/SOLR-6969
 Project: Solr
  Issue Type: Bug
  Components: hdfs
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical
 Fix For: 5.0, Trunk


 This can happen after a hard crash and restart. The current workaround is to 
 stop and wait it out and start again. We should retry and wait a given amount 
 of time as we do when we detect safe mode though.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6928) solr.cmd stop works only in english

2015-01-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-6928:
--
Attachment: SOLR-6928.patch

Slightly improved patch:
* No need for case-insensitive find
* Require a space after the port number to avoid false matches

 solr.cmd stop works only in english
 ---

 Key: SOLR-6928
 URL: https://issues.apache.org/jira/browse/SOLR-6928
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: german windows 7
Reporter: john.work
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-6928.patch, SOLR-6928.patch


 in solr.cmd the stop command doesn't work while executing 'netstat -nao ^| find /i 
 "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found.
 E.g. in a German cmd.exe, netstat -nao prints the following output:
 {noformat}
   Proto  Lokale Adresse Remoteadresse  Status   PID
   TCP0.0.0.0:80 0.0.0.0:0  ABHÖREN 4
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286502#comment-14286502
 ] 

Uwe Schindler commented on SOLR-6991:
-

[~steve_rowe]: Can you commit to all 3 branches? I wanted to go to sleep. 
Thanks.

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6928) solr.cmd stop works only in english

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286540#comment-14286540
 ] 

ASF subversion and git services commented on SOLR-6928:
---

Commit 1653700 from [~thelabdude] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1653700 ]

SOLR-6928: solr.cmd stop works only in english

 solr.cmd stop works only in english
 ---

 Key: SOLR-6928
 URL: https://issues.apache.org/jira/browse/SOLR-6928
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: german windows 7
Reporter: john.work
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-6928.patch, SOLR-6928.patch


 in solr.cmd the stop command doesn't work while executing 'netstat -nao ^| find /i 
 "listening" ^| find ":%SOLR_PORT%"', so "listening" is not found.
 E.g. in a German cmd.exe, netstat -nao prints the following output:
 {noformat}
   Proto  Lokale Adresse Remoteadresse  Status   PID
   TCP0.0.0.0:80 0.0.0.0:0  ABHÖREN 4
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286558#comment-14286558
 ] 

Steve Rowe commented on SOLR-6991:
--

bq. Steve Rowe: Can you commit to all 3 branches? I wanted to go to sleep. 
Thanks.

Will do.

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286562#comment-14286562
 ] 

Steve Rowe commented on SOLR-6991:
--

bq. I'm running all Solr tests now with this patch and -Dtests.slow=true 
-Dtests.locale=tr_TR.

All Solr tests passed with the patch.

Committing now.

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py

2015-01-21 Thread Ramkumar Aiyengar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286584#comment-14286584
 ] 

Ramkumar Aiyengar commented on LUCENE-6188:
---

Rob, the logic I have nuked is actually not a duplicate of doclint (I just 
didn't check that, and as you mention there might be differences) but of the 
checkJavadocLinks.py script, which is run prior to this script in 
documentation-lint. That does the exact same check in Python, except it uses a 
real parser rather than regex hacks.

 Remove HTML verification from checkJavaDocs.py
 --

 Key: LUCENE-6188
 URL: https://issues.apache.org/jira/browse/LUCENE-6188
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/javadocs
Reporter: Ramkumar Aiyengar
Assignee: Erick Erickson
Priority: Minor
 Attachments: LUCENE-6188.patch, LUCENE-6188.patch


 Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in 
 some cases (see SOLR-6902).
 On looking further to fix it with the {{html.parser}} package instead, 
 noticed that there is broken HTML verification already present (using 
 {{html.parser}}!) in {{checkJavadocLinks.py}} anyway, which takes care of 
 validation, and probably {{jTidy}} does it as well, going by the output 
 (haven't verified it).
 Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any 
 further value, so here's a patch to just nuke it instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved SOLR-6991.
--
Resolution: Fixed

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7011) fix OverseerCollectionProcessor.deleteCollection removal-done check

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286609#comment-14286609
 ] 

ASF subversion and git services commented on SOLR-7011:
---

Commit 1653716 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1653716 ]

SOLR-7011: Delete collection returns before collection is actually removed

 fix OverseerCollectionProcessor.deleteCollection removal-done check
 ---

 Key: SOLR-7011
 URL: https://issues.apache.org/jira/browse/SOLR-7011
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10.3, 5.0, Trunk
Reporter: Christine Poerschke
Assignee: Shalin Shekhar Mangar
Priority: Minor

 {{OverseerCollectionProcessor.java}} line 1184 has a
 {{.hasCollection(message.getStr("collection"))}} call which should be either
 {{.hasCollection(message.getStr("name"))}} or
 {{.hasCollection(collection)}} instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6188) Remove HTML verification from checkJavaDocs.py

2015-01-21 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286498#comment-14286498
 ] 

Erick Erickson commented on LUCENE-6188:


Hmmm, thanks for pointing this out, but it makes things... complicated.

Problem is that until this is done, SOLR-6902 is blocked, as that patch fails 
precommit for no good reason I can find. If I'm reading things right, doclint 
is only in Java 8, so it is simply not an option for 5x even if the problems 
you point out are fixed up.

If I'm reading this right, Ramkumar's claim is that the HTML checking this 
patch removes is unnecessary anyway, so removing it doesn't lose us anything. 
And it's incorrectly failing this doc for some reason: I checked the generated 
doc file and it looks fine; I think I even ran it through an XML validator. I 
could always have missed something, of course.

That said, the proposed changes in this JIRA take a lot of code out of 
checkJavaDocs.py, and I'll very much admit I haven't gone through the changes 
in much detail, but they do appear to just be doing HTML validation.

I can treat this somewhat as a black box, apply this patch locally, and:

1) create some invalid Javadoc links and ensure that they're flagged once this 
patch is applied (any suggestions for a candidate list?). If that works (or, 
more accurately, fails on the invalid Javadocs), commit this patch to trunk 
and 5x and then commit SOLR-6902;
or
2) just remove the Javadocs from SOLR-6902, or possibly munge them until that 
code passes precommit;
or
3) try to figure out what the false failure is here and fix checkJavaDocs.py.

I think 1) is my first choice, and 3) is a very distant third: spending time 
debugging code that it sounds like we're going to remove on trunk seems like a 
waste. I may do 2) anyway, i.e. remove the Javadocs and put them back if one 
of the other approaches works. SOLR-6902 is hard to keep up to date since it 
touches so much; Alan's checkin is already going to be a headache to 
reconcile. So keeping it out of the code line just because of a bad (and 
possibly redundant) bit of non-standard HTML checking seems like a poor 
tradeoff.

This last point can be argued, of course.

Anyway, I'll do some poking around and report back before committing anything.


 Remove HTML verification from checkJavaDocs.py
 --

 Key: LUCENE-6188
 URL: https://issues.apache.org/jira/browse/LUCENE-6188
 Project: Lucene - Core
  Issue Type: Improvement
  Components: general/javadocs
Reporter: Ramkumar Aiyengar
Assignee: Erick Erickson
Priority: Minor
 Attachments: LUCENE-6188.patch, LUCENE-6188.patch


 Currently, the broken HTML verification in {{checkJavaDocs.py}} has issues in 
 some cases (see SOLR-6902).
 On looking further to fix it with the {{html.parser}} package instead, 
 noticed that there is broken HTML verification already present (using 
 {{html.parser}}!) in {{checkJavadocLinks.py}} anyway, which takes care of 
 validation, and probably {{jTidy}} does it as well, going by the output 
 (haven't verified it).
 Given this, the validation in {{checkJavaDocs.py}} doesn't seem to add any 
 further value, so here's a patch to just nuke it instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-6193) Collapse identical catch branches in try-catch statements

2015-01-21 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created LUCENE-6193:
-

 Summary: Collapse identical catch branches in try-catch statements
 Key: LUCENE-6193
 URL: https://issues.apache.org/jira/browse/LUCENE-6193
 Project: Lucene - Core
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1


We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
statements into one. We did the same for solr in SOLR-7014.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7014) Collapse identical catch branches in try-catch statements

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286532#comment-14286532
 ] 

ASF subversion and git services commented on SOLR-7014:
---

Commit 1653698 from sha...@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1653698 ]

SOLR-7014: Collapse identical catch branches in try-catch statements in 
morphlines-core

 Collapse identical catch branches in try-catch statements
 -

 Key: SOLR-7014
 URL: https://issues.apache.org/jira/browse/SOLR-7014
 Project: Solr
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: SOLR-7014-more.patch, SOLR-7014.patch


 We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
 statements into one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Anshum Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286623#comment-14286623
 ] 

Anshum Gupta commented on SOLR-6991:


Thanks for fixing this everyone!

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2523 - Still Failing

2015-01-21 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2523/

5 tests failed.
REGRESSION:  
org.apache.solr.handler.component.SuggestComponentTest.testBuildOnStartupWithNewCores

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at 
__randomizedtesting.SeedInfo.seed([D0016418FD804417:4AC564B67B9CDB93]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:748)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:715)
at 
org.apache.solr.handler.component.SuggestComponentTest.doTestBuildOnStartup(SuggestComponentTest.java:395)
at 
org.apache.solr.handler.component.SuggestComponentTest.testBuildOnStartupWithNewCores(SuggestComponentTest.java:374)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
at 

[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286500#comment-14286500
 ] 

Uwe Schindler commented on SOLR-6991:
-

Ah, you already posted a patch. Thanks for testing. I have only Windows ready 
to use on my laptop :-)

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so replacements. Not sure if we 
 should do this for 5.0. In 5.0 we currently have the previous version, which 
 was not yet released with Solr. If we now bring this into 5.0, we wouldn't 
 have a new release 2 times. I can change the stuff this evening and let it 
 bake in 5.x, so maybe we backport this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6193) Collapse identical catch branches in try-catch statements

2015-01-21 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated LUCENE-6193:
--
Attachment: LUCENE-6193.patch

The only places where I did not make these changes are where the catch blocks 
have different comments or the code wasn't ASL.

The following were excluded:
# org.apache.lucene.analysis.core.TestFactories
# org.apache.lucene.index.TestReaderClosed
# org.apache.lucene.queryparser.flexible.messages.NLS (one instance)
# org.egothor.stemmer.Diff (license different from ASL)
# org.tartarus.snowball.SnowballProgram (license different from ASL)

 Collapse identical catch branches in try-catch statements
 -

 Key: LUCENE-6193
 URL: https://issues.apache.org/jira/browse/LUCENE-6193
 Project: Lucene - Core
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6193.patch


 We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
 statements into one. We did the same for solr in SOLR-7014.
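
 For readers who haven't used the Java 7 syntax, here is a minimal 
 self-contained sketch of the collapse (hypothetical class and method names, 
 not taken from the patch):
 {noformat}
 import java.io.IOException;
 import java.text.ParseException;

 public class MultiCatchDemo {
   // Hypothetical helper that can fail in two unrelated ways.
   static void parse(String s) throws IOException, ParseException {
     if (s.isEmpty()) throw new IOException("empty input");
     if (!s.startsWith("{")) throw new ParseException("not a JSON object", 0);
   }

   public static void main(String[] args) {
     try {
       parse("plain text");
     } catch (IOException | ParseException e) {
       // Java 7+ multi-catch: one branch replaces two identical catch blocks.
       System.err.println("parse failed: " + e.getMessage());
     }
   }
 }
 {noformat}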



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6928) solr.cmd stop works only in english

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286538#comment-14286538
 ] 

ASF subversion and git services commented on SOLR-6928:
---

Commit 1653699 from [~thelabdude] in branch 'dev/trunk'
[ https://svn.apache.org/r1653699 ]

SOLR-6928: solr.cmd stop works only in english

 solr.cmd stop works only in english
 ---

 Key: SOLR-6928
 URL: https://issues.apache.org/jira/browse/SOLR-6928
 Project: Solr
  Issue Type: Bug
  Components: scripts and tools
Affects Versions: 4.10.3
 Environment: german windows 7
Reporter: john.work
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-6928.patch, SOLR-6928.patch


 in solr.cmd the stop command doesn't work: it executes 'netstat -nao ^| find /i 
 "listening" ^| find ":%SOLR_PORT%"', and "listening" is not found because the 
 netstat output is localized.
 e.g. on a German cmd.exe, netstat -nao prints the following output:
 {noformat}
   Proto  Lokale Adresse Remoteadresse  Status   PID
   TCP0.0.0.0:80 0.0.0.0:0  ABHÖREN 4
 {noformat}
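
 A locale-independent way to make the "is the port in use" check, sketched in 
 Java rather than batch (an illustration of the alternative approach, not the 
 actual solr.cmd fix):
 {noformat}
 import java.io.IOException;
 import java.net.InetSocketAddress;
 import java.net.Socket;

 public class PortCheck {
   // True if something accepts connections on host:port. Unlike grepping
   // localized netstat output for "LISTENING", this works in any locale.
   static boolean isListening(String host, int port) {
     try (Socket s = new Socket()) {
       s.connect(new InetSocketAddress(host, port), 1000);
       return true;
     } catch (IOException e) {
       return false;
     }
   }

   public static void main(String[] args) {
     int port = args.length > 0 ? Integer.parseInt(args[0]) : 8983;
     System.out.println("port " + port
         + (isListening("127.0.0.1", port) ? " is listening" : " is not listening"));
   }
 }
 {noformat}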



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-7014) Collapse identical catch branches in try-catch statements

2015-01-21 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-7014.
-
Resolution: Fixed

I opened LUCENE-6193 for the lucene changes.

 Collapse identical catch branches in try-catch statements
 -

 Key: SOLR-7014
 URL: https://issues.apache.org/jira/browse/SOLR-7014
 Project: Solr
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: SOLR-7014-more.patch, SOLR-7014.patch


 We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
 statements into one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6193) Collapse identical catch branches in try-catch statements

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286570#comment-14286570
 ] 

ASF subversion and git services commented on LUCENE-6193:
-

Commit 1653707 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1653707 ]

LUCENE-6193: Collapse identical catch branches in try-catch statements

 Collapse identical catch branches in try-catch statements
 -

 Key: LUCENE-6193
 URL: https://issues.apache.org/jira/browse/LUCENE-6193
 Project: Lucene - Core
  Issue Type: Task
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Trivial
 Fix For: Trunk, 5.1

 Attachments: LUCENE-6193.patch


 We are on Java 7+ so we can reduce verbosity by collapsing identical catch 
 statements into one. We did the same for solr in SOLR-7014.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286572#comment-14286572
 ] 

ASF subversion and git services commented on SOLR-6991:
---

Commit 1653708 from [~sar...@syr.edu] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653708 ]

SOLR-6991,SOLR-6387: Under Turkish locale, don't run solr-cell and 
dataimporthandler-extras tests that use Tika (merged trunk r1653704)

 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991-forkfix.patch, 
 SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so mostly jar replacements. Not sure 
 if we should do this for 5.0. In 5.0 we currently have the previous version, 
 which was not yet released with Solr. If we bring this into 5.0 now, we won't 
 end up releasing a new version twice. I can change the stuff this evening and 
 let it bake in 5.x, and maybe we backport it then.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6387) Solr specific work around for JDK bug #8047340: posix_spawn error with turkish locale

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286573#comment-14286573
 ] 

ASF subversion and git services commented on SOLR-6387:
---

Commit 1653708 from [~sar...@syr.edu] in branch 'dev/branches/lucene_solr_5_0'
[ https://svn.apache.org/r1653708 ]

SOLR-6991,SOLR-6387: Under Turkish locale, don't run solr-cell and 
dataimporthandler-extras tests that use Tika (merged trunk r1653704)

 Solr specific work around for JDK bug #8047340: posix_spawn error with 
 turkish locale
 -

 Key: SOLR-6387
 URL: https://issues.apache.org/jira/browse/SOLR-6387
 Project: Solr
  Issue Type: Bug
 Environment: Linux, MacOSX, POSIX in general
Reporter: Hoss Man
Assignee: Uwe Schindler
Priority: Minor
  Labels: Java7, Java8
 Fix For: 4.10, Trunk

 Attachments: SOLR-6387.patch, SOLR-6387.patch


 Various versions of the Sun/Oracle/OpenJDK JVM have issues executing new 
 processes if the default language of the JVM is Turkish.
 The root bug reports of this affecting Runtime.exec() are here...
 * https://bugs.openjdk.java.net/browse/JDK-8047340
 * https://bugs.openjdk.java.net/browse/JDK-8055301
 On systems running the affected JVMs with a default language of Turkish, 
 this problem has historically manifested itself in Solr in a few ways:
 * SystemInfoHandler would throw nasty exceptions on these systems due to an 
 attempt at conditionally executing some native process to check system stats
 * RunExecutableListener would fail cryptically
 * some Solr tests involving either the SystemInfoHandler or the Hadoop 
 MapReduce code would fail if the test framework randomly selected a Turkish 
 language-based locale.
 Starting with Solr 4.10, we have worked around this JVM bug in Solr in 3 ways:
 * RunExecutableListener makes it clearer in the logs why it can't be used
 * SystemInfoHandler traps and ignores any Error related to posix_spawn in 
 the same way it traps and ignores other errors related to its conditional 
 attempts at exec'ing (ie: permission problems, executable not found, etc...)
 * our MapReduce-based tests that depend on exec'ing external processes now 
 skip themselves automatically if a Turkish locale is randomly selected.
 Users affected by this issue who, for whatever reason, cannot upgrade to 
 Solr 4.10 may wish to consider setting the 
 jdk.lang.Process.launchMechanism system property explicitly (see below).
 {panel:title=original issue report}
 Jenkins tests occasionally fail with the following cryptic error...
 {noformat}
 java.lang.Error: posix_spawn is not a supported process launch mechanism on 
 this platform.
 at __randomizedtesting.SeedInfo.seed([9219CAA3BCAA7365:7F07719937A772E1]:0)
 at java.lang.UNIXProcess$1.run(UNIXProcess.java:104)
 at java.lang.UNIXProcess$1.run(UNIXProcess.java:93)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.lang.UNIXProcess.<clinit>(UNIXProcess.java:91)
 at java.lang.ProcessImpl.start(ProcessImpl.java:130)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
 at java.lang.Runtime.exec(Runtime.java:617)
 {noformat}
 A commonality of most of these failures is that the Turkish locale has been 
 randomly selected, and apparently Runtime.exec is broken when you use 
 Turkish...
 http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8047340
 http://java.thedizzyheights.com/2014/07/java-error-posix_spawn-is-not-a-supported-process-launch-mechanism-on-this-platform-when-trying-to-spawn-a-process/
 We should consider hardcoding the jdk.lang.Process.launchMechanism sys 
 property mentioned as a workaround in the JDK bug report.
 {panel}
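
 For reference, a minimal sketch of the system-property workaround mentioned 
 above. The property is read when java.lang.UNIXProcess is first initialized, 
 so it must be set before the first process launch; normally it would be 
 passed on the command line as -Djdk.lang.Process.launchMechanism=FORK instead:
 {noformat}
 public class SpawnWorkaround {
   public static void main(String[] args) throws Exception {
     // Pin the launch mechanism to FORK so the affected POSIX JVMs never
     // try the posix_spawn path that breaks under the Turkish default locale.
     System.setProperty("jdk.lang.Process.launchMechanism", "FORK");
     Process p = Runtime.getRuntime().exec(new String[] {"echo", "ok"});
     p.waitFor();
   }
 }
 {noformat}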



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7011) fix OverseerCollectionProcessor.deleteCollection removal-done check

2015-01-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286611#comment-14286611
 ] 

ASF subversion and git services commented on SOLR-7011:
---

Commit 1653718 from sha...@apache.org in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1653718 ]

SOLR-7011: Delete collection returns before collection is actually removed

 fix OverseerCollectionProcessor.deleteCollection removal-done check
 ---

 Key: SOLR-7011
 URL: https://issues.apache.org/jira/browse/SOLR-7011
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10.3, 5.0, Trunk
Reporter: Christine Poerschke
Assignee: Shalin Shekhar Mangar
Priority: Minor

 {{OverseerCollectionProcessor.java}} line 1184 has a
 {{.hasCollection(message.getStr("collection"))}} call which should be either
 {{.hasCollection(message.getStr("name"))}} or
 {{.hasCollection(collection)}} instead.
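
 The effect of the wrong key, in miniature (a hypothetical stand-in using a 
 plain Map, not the actual ZkNodeProps/ClusterState code):
 {noformat}
 import java.util.HashMap;
 import java.util.Map;

 public class KeyMismatchDemo {
   public static void main(String[] args) {
     // The delete-collection message carries the target under the key "name".
     Map<String, String> message = new HashMap<>();
     message.put("name", "myCollection");

     // Reading it back under "collection" yields null, so a hasCollection(...)
     // check built on that value can never match the collection actually
     // being removed.
     System.out.println("getStr(\"collection\") -> " + message.get("collection"));
     System.out.println("getStr(\"name\")       -> " + message.get("name"));
   }
 }
 {noformat}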



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6991) Update to Apache TIKA 1.7

2015-01-21 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286470#comment-14286470
 ] 

Hoss Man commented on SOLR-6991:


bq. The last comment was just an idea, but doesn't work. ...

you fought a good fight Uwe, but alas...

+1 to your SOLR-6991-forkfix.patch for 5.0 .. but don't we need similar assumes 
in the dataimporthandler-extras tests that use TikaEntityProcessor? (I'm not sure 
why those wouldn't fail under Turkish now as well.)
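
The kind of assume being discussed, as a plain-JUnit sketch (the real tests 
would use the LuceneTestCase helpers; names here are illustrative):
{noformat}
import java.util.Locale;
import org.junit.Assume;
import org.junit.Before;

public class TikaLocaleAssumeDemo {
  @Before
  public void skipUnderTurkishLocale() {
    // Skip Tika-backed tests when the default locale is Turkish, where
    // Runtime.exec() fails on affected JVMs (JDK-8047340 / SOLR-6387).
    Assume.assumeFalse(
        "Tika tests disabled under Turkish default locale",
        "tr".equals(Locale.getDefault().getLanguage()));
  }
}
{noformat}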



 Update to Apache TIKA 1.7
 -

 Key: SOLR-6991
 URL: https://issues.apache.org/jira/browse/SOLR-6991
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, Trunk, 5.1

 Attachments: SOLR-6991-forkfix.patch, SOLR-6991.patch, SOLR-6991.patch


 Apache TIKA 1.7 was released: 
 [https://dist.apache.org/repos/dist/release/tika/CHANGES-1.7.txt]
 This is more or less a dependency update, so mostly jar replacements. Not sure 
 if we should do this for 5.0. In 5.0 we currently have the previous version, 
 which was not yet released with Solr. If we bring this into 5.0 now, we won't 
 end up releasing a new version twice. I can change the stuff this evening and 
 let it bake in 5.x, and maybe we backport it then.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


