[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392344#comment-14392344 ]

Michael McCandless commented on LUCENE-6386:

This is hard to think about :)

In your example I'm assuming result is just the same as level2, but converted into CFS (ie, s14 really should have been s13.cfs, but I'll just call it s14 here to try not to lose my sanity).

So, at the exact moment when IW finishes building the final (level3) CFS [s14], you have:

* original source segments - 1X
* level 1 segments - 1X
* level 2 segment [s13] non-CFS - 1X
* result segment [s14] CFS - 1X

We do NOT delete level 1 segments after merging to the level 2 non-CFS segment before creating the result s14 (we used to do this, but it caused complexity/problems because a non-CFS file can unexpectedly sneak into an IW commit or an NRT reader even when you demanded CFS).

So at that moment, the peak temp disk usage is 3X?

TestIndexWriterForceMerge still unreliable in NIGHTLY

Key: LUCENE-6386
URL: https://issues.apache.org/jira/browse/LUCENE-6386
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Assignee: Michael McCandless
Fix For: Trunk, 5.1
Attachments: LUCENE-6386.patch

Discovered by ryan beasting (trunk):

ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII

{noformat}
[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage
[junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE=
[junit4] _u.scf 146329
[junit4] _u.si 635
[junit4] |- (inside compound file) _u.fld 2214
[junit4] |- (inside compound file) _u.inf 392
[junit4] |- (inside compound file) _u.len 2381
[junit4] |- (inside compound file) _u.pst 36758
[junit4] |- (inside compound file) _u.vec 104144
[junit4] _s.pst 1338
[junit4] _s.inf 392
[junit4] _s.fld 94
[junit4] _s.len 221
[junit4] _s.vec 3744
[junit4] _s.si 624
[junit4] _t.fld 94
[junit4] _t.len 221
[junit4] _t.pst 1338
[junit4] _t.inf 392
[junit4] _t.vec 3744
[junit4] _t.si 624
[junit4] _v.fld 94
[junit4] _v.pst 1338
[junit4] _v.inf 392
[junit4] _v.vec 3744
[junit4] _v.si 624
[junit4] _v.len 221
[junit4] _w.len 221
[junit4] _w.pst 1338
[junit4] _w.inf 392
[junit4] _w.fld 94
[junit4] _w.si 624
[junit4] _w.vec 3744
[junit4] _x.vec 3744
[junit4] _x.inf 392
[junit4] _x.pst 1338
[junit4] _x.fld 94
[junit4] _x.si 624
[junit4] _x.len 221
[junit4] _y.fld 94
[junit4] _y.pst 1338
[junit4] _y.inf 392
[junit4] _y.si 624
[junit4] _y.vec 3744
[junit4] _y.len 221
[junit4] _z.fld 94
[junit4] _z.pst 1338
[junit4] _z.inf 392
[junit4] _z.len 221
[junit4] _z.vec 3744
[junit4] _z.si 624
[junit4] _10.si 630
[junit4] _10.fld 94
[junit4] _10.pst 1338
[junit4] _10.inf 392
[junit4] _10.vec 3744
[junit4] _10.len 221
[junit4] _11.len 221
[junit4] _11.si 630
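The tally in the comment above, and the byte counts in the quoted assertion message, can be checked with a few lines of arithmetic. This is only a sketch: sizes are expressed in units of the starting index size (1X), as the comment does, and the variable names are illustrative rather than anything taken from the test.

```python
# What is on disk at the moment IW finishes the final CFS, per the comment,
# in units of the starting index size (1X). The keys are labels only.
on_disk = {
    "original source segments": 1,
    "level 1 segments": 1,
    "level 2 segment [s13] non-CFS": 1,
    "result segment [s14] CFS": 1,
}
total_x = sum(on_disk.values())  # everything on disk at the peak
temp_x = total_x - 1             # space beyond the starting index

# Byte counts copied from the assertion message quoted in the issue.
starting_usage = 291570
max_temp_usage = 1079501
allowed = 3 * starting_usage     # the "3X starting usage" in the message
observed_x = max_temp_usage / starting_usage

print(total_x, temp_x)           # 4 3
print(allowed)                   # 874710
print(round(observed_x, 2))      # 3.7
```

The observed peak is roughly 3.7 times the starting usage, which is above the 3X the test allowed and below the 4X total in the tally.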
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 2127 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/2127/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseG1GC

1 tests failed.
FAILED: org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.test

Error Message:
There were too many update fails (38 > 20) - we expect it can happen, but shouldn't easily

Stack Trace:
java.lang.AssertionError: There were too many update fails (38 > 20) - we expect it can happen, but shouldn't easily
    at __randomizedtesting.SeedInfo.seed([E4454DD263D68F3A:6C117208CD2AE2C2]:0)
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.junit.Assert.assertFalse(Assert.java:68)
    at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.test(ChaosMonkeyNothingIsSafeTest.java:230)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
    at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
    at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at
[jira] [Updated] (SOLR-6220) Replica placement strategy for solrcloud
[ https://issues.apache.org/jira/browse/SOLR-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-6220:

Description:
h1.Objective
Most cloud-based systems allow rules to be specified for how the replicas/nodes of a cluster are allocated. Solr should have a flexible mechanism through which we can control the allocation of replicas, or later change it to suit the needs of the system.

All configuration is on a per-collection basis. The rules are applied whenever a replica is created in any of the shards in a given collection during

* collection creation
* shard splitting
* add replica
* createshard

There are two aspects to how replicas are placed: snitch and placement.

h2.snitch
How to identify the tags of nodes. Snitches are configured through the collection create command with the snitch prefix, e.g. snitch.type=EC2Snitch.

The system provides the following implicit tag names, which cannot be used by other snitches:
* node : The Solr node name
* host : The hostname
* ip : The IP address of the host
* cores : This is a dynamic variable which gives the core count at any given point
* disk : This is a dynamic variable which gives the available disk space at any given point

There will be a few snitches provided by the system, such as:

h3.EC2Snitch
Provides two tags called dc and rack from the region and zone values in EC2.

h3.IPSnitch
Uses the IP to infer the “dc” and “rack” values.

h3.NodePropertySnitch
This lets users provide system properties to each node with tag name and value. Example: -Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b. This means this particular node will have two tags, “tag-x” and “tag-y”.

h3.RestSnitch
Lets the user configure a URL which the server can invoke to get all the tags for a given node. This takes extra parameters in the create command, for example:
{{snitch={type=RestSnitch,url=http://snitchserverhost:port/[node]}}
The response of the REST call
{{http://snitchserverhost:port/?nodename=192.168.1:8080_solr}}
must be in JSON format, e.g.:
{code:JavaScript}
{
"tag-x":"x-val",
"tag-y":"y-val"
}
{code}

h3.ManagedSnitch
This snitch keeps a list of nodes and their tag value pairs in ZooKeeper. The user should be able to manage the tags and values of each node through a collection API.

h2.Rules
This tells how many replicas of a given shard need to be assigned to nodes with the given key value pairs. These parameters will be passed on to the collection CREATE API as a multivalued parameter "rule". The values will be saved in the state of the collection as follows:
{code:Javascript}
{
  "mycollection":{
    "snitch": {
      "type":"EC2Snitch"
    },
    "rules":[
      {"shard": "value1", "replica": "value2", "tag1":"val1"},
      {"shard": "value1", "replica": "value2", "tag2":"val2"}
    ]
  }
}
{code}

A rule is specified in a pseudo-JSON syntax, which is a map of keys and values.
* Each collection can have any number of rules. As long as the rules do not conflict with each other it should be OK; otherwise an error is thrown.
* In each rule, shard and replica can be omitted
** the default value of replica is {{\*}}, meaning ANY; or you can specify a count and an operand such as {{+}} or {{-}}
** the value of shard can be a shard name, or {{\*}} meaning EACH, or {{**}} meaning ANY. The default value is {{\*\*}} (ANY)
* There should be exactly one extra condition in a rule other than {{shard}} and {{replica}}.
* All keys other than {{shard}} and {{replica}} are called tags, and the tags are nothing but values provided by the snitch for each node
* By default certain tags such as {{node}}, {{host}}, {{port}} are provided by the system implicitly

Examples:
{noformat}
//in each rack there can be max two replicas of A given shard
{rack:*,shard:*,replica:2-}
//in each rack there can be max two replicas of ANY shard
{rack:*,shard:**,replica:2-}
{rack:*,replica:2-}

//in each node there should be a max one replica of EACH shard
{node:*,shard:*,replica:1-}
//in each node there should be a max one replica of ANY shard
{node:*,shard:**,replica:1-}
{node:*,replica:1-}

//In rack 738 and shard=shard1, there can be a max 0 replica
{rack:738,shard:shard1,replica:0-}

//All replicas of shard1 should go to rack 730
{shard:shard1,replica:*,rack:730}
{shard:shard1,rack:730}

// all replicas must be created in a node with at least 20GB disk
{replica:*,shard:*,disk:20+}
{replica:*,disk:20+}
{disk:20+}

// All replicas should be created in nodes with less than 5 cores
//In this ANY AND each for shard have same meaning
{replica:*,shard:**,cores:5-}
{replica:*,cores:5-}
{cores:5-}

//one replica of shard1 must go to node 192.168.1.2:8080_solr
{node:"192.168.1.2:8080_solr", shard:shard1, replica:1}

//No replica of shard1 should go to rack 738
{rack:!738,shard:shard1,replica:*}
{rack:!738,shard:shard1}

//No replica of ANY shard should go to rack 738
{rack:!738,shard:**,replica:*}
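To make the rule syntax in the description concrete, here is a minimal Python sketch of how such a rule string might be parsed and how one condition might be checked. It is not the Solr implementation. The function names `parse_rule` and `matches` are hypothetical, and the operand semantics are an assumption: `N+` is treated as "at least N" and `N-` as "at most N", even though one example comment reads `cores:5-` as "less than 5".

```python
def parse_rule(text):
    """Parse a pseudo-JSON rule such as '{rack:*,shard:*,replica:2-}' into a dict."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError(f"rule must be wrapped in braces: {text!r}")
    rule = {}
    for pair in body[1:-1].split(","):
        # Split on the first colon only, so values like host:port survive.
        key, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {pair!r}")
        rule[key.strip()] = value.strip().strip('"')
    # The description requires exactly one tag besides shard and replica.
    tags = [k for k in rule if k not in ("shard", "replica")]
    if len(tags) != 1:
        raise ValueError(f"expected exactly one tag condition, got {tags}")
    # Defaults from the description: replica '*' (ANY), shard '**' (ANY).
    rule.setdefault("replica", "*")
    rule.setdefault("shard", "**")
    return rule


def matches(condition, actual):
    """Check one condition value against an actual value (assumed semantics)."""
    if condition in ("*", "**"):
        return True                        # wildcard matches anything
    if condition.startswith("!"):
        return str(actual) != condition[1:]
    if condition and condition[-1] in "+-" and condition[:-1].isdigit():
        limit = int(condition[:-1])
        if condition.endswith("+"):
            return int(actual) >= limit    # assumed: at least N
        return int(actual) <= limit        # assumed: at most N
    return str(actual) == condition        # plain value: equality


rule = parse_rule("{rack:*,replica:2-}")
print(rule)                          # {'rack': '*', 'replica': '2-', 'shard': '**'}
print(matches(rule["replica"], 3))   # False
print(matches("20+", 25))            # True
print(matches("!738", "738"))        # False
```

Splitting on the first colon only is what lets the quoted node name in the last group of examples keep its port.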
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392380#comment-14392380 ]

Shai Erera commented on LUCENE-6386:

bq. s14 really should have been s13.cfs

You're right, my mistake. The example should have listed {{s13.cfs}}.

bq. We do NOT delete level 1 segments after merging to the level 2 non-CFS segment before creating the result s14

I see. That explains the 4X. But then what happens if there are multi-level merges? Do we delete any of these files? Let me give a concrete example:

{noformat}
source: [s1,s2] [s3,s4] [s5,s6] [s7,s8]
level1: [s9,s10], [s11,s12]
level2: [s13,s14]
level3: [s15]
final: [s15.cfs]
{noformat}

Would we take 5X disk space in that case??
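The multi-level question can be framed as a small simulation. The sketch below assumes every merge level is the same size as the source (1X) and compares two hypothetical retention policies. The thread does not say which policy IndexWriter follows for deeper merge trees, so neither result should be read as the answer. The function name `peak_total_x` is made up for illustration.

```python
def peak_total_x(levels, keep_intermediate):
    """Peak on-disk size, in units of the source size (1X).

    levels: number of non-CFS merge levels (3 in the example above).
    keep_intermediate: if True, no merged level is deleted before the
    final CFS exists; if False, each level is deleted as soon as the
    next one is complete. Both policies are assumptions.
    """
    on_disk = ["source"]
    peak = len(on_disk)
    previous = None
    for name in [f"level{i}" for i in range(1, levels + 1)] + ["final.cfs"]:
        on_disk.append(name)              # output is written while inputs still exist
        peak = max(peak, len(on_disk))
        if not keep_intermediate and previous is not None:
            on_disk.remove(previous)      # drop the level that was just merged away
        previous = name
    return peak


print(peak_total_x(3, keep_intermediate=True))   # 5
print(peak_total_x(3, keep_intermediate=False))  # 3
print(peak_total_x(2, keep_intermediate=True))   # 4
```

With two levels and nothing deleted, the sketch gives the 4X total discussed earlier in the thread. With three levels under the same assumption it gives the 5X asked about.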
[jira] [Updated] (LUCENE-6388) Optimize SpanNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-6388:

Attachment: LUCENE-6388.patch

Optimize SpanNearQuery

Key: LUCENE-6388
URL: https://issues.apache.org/jira/browse/LUCENE-6388
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir
Attachments: LUCENE-6388.patch

After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a little more:
* SpanNearQuery defaults to collectPayloads=true, but this requires a slower implementation, for an uncommon case. Use the faster no-payloads impl if the field doesn't actually have any payloads.
* Use a simple array of Spans rather than List in NearSpans classes. This is iterated over often in the logic.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392282#comment-14392282 ]

Shai Erera commented on LUCENE-6386:

Right, therefore I still think we consume up to 2X additional space?

When we build level2, we have source + level1 + level2. When level2 is complete, we delete level1, and so we have source + level2. When we build level3 (the CFS), we have source + level2 + level3. No?
[JENKINS] Lucene-Solr-5.1-Linux (32bit/ibm-j9-jdk7) - Build # 185 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.1-Linux/185/
Java: 32bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;}

No tests ran.

Build Log:
[...truncated 161 lines...]
ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to exception
hudson.AbortException: No test report files were found. Configuration error?
    at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:116)
    at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:93)
    at hudson.FilePath.act(FilePath.java:989)
    at hudson.FilePath.act(FilePath.java:967)
    at hudson.tasks.junit.JUnitParser.parseResult(JUnitParser.java:90)
    at hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:120)
    at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:137)
    at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:74)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:761)
    at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:721)
    at hudson.model.Build$BuildExecution.post2(Build.java:183)
    at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:670)
    at hudson.model.Run.execute(Run.java:1766)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:98)
    at hudson.model.Executor.run(Executor.java:374)
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
[jira] [Created] (LUCENE-6388) Optimize SpanNearQuery
Robert Muir created LUCENE-6388:

Summary: Optimize SpanNearQuery
Key: LUCENE-6388
URL: https://issues.apache.org/jira/browse/LUCENE-6388
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir

After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a little more:
* SpanNearQuery defaults to collectPayloads=true, but this requires a slower implementation, for an uncommon case. Use the faster no-payloads impl if the field doesn't actually have any payloads.
* Use a simple array of Spans rather than List in NearSpans classes. This is iterated over often in the logic.
[JENKINS] Lucene-Solr-trunk-Windows (64bit/jdk1.8.0_40) - Build # 4627 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4627/
Java: 64bit/jdk1.8.0_40 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED: org.apache.solr.TestDistributedSearch.test

Error Message:
Error from server at http://127.0.0.1:53417/au_/g/collection1: java.lang.NullPointerException
    at org.apache.solr.search.grouping.distributed.responseprocessor.TopGroupsShardResponseProcessor.process(TopGroupsShardResponseProcessor.java:102)
    at org.apache.solr.handler.component.QueryComponent.handleGroupedResponses(QueryComponent.java:744)
    at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:727)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:356)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1988)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:103)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:497)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

Stack Trace:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:53417/au_/g/collection1: java.lang.NullPointerException
    at org.apache.solr.search.grouping.distributed.responseprocessor.TopGroupsShardResponseProcessor.process(TopGroupsShardResponseProcessor.java:102)
    at org.apache.solr.handler.component.QueryComponent.handleGroupedResponses(QueryComponent.java:744)
    at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:727)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:356)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1988)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:103)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:83)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:364)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2876 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2876/

3 tests failed.
FAILED: org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test

Error Message:
IOException occured when talking to server at: http://127.0.0.1:24846/hff/p/c8n_1x3_commits_shard1_replica1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:24846/hff/p/c8n_1x3_commits_shard1_replica1
    at __randomizedtesting.SeedInfo.seed([177722754B01BEA:89234DFDFA4C7612]:0)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225)
    at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135)
    at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:483)
    at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:464)
    at org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.oneShardTest(LeaderInitiatedRecoveryOnCommitTest.java:132)
    at org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test(LeaderInitiatedRecoveryOnCommitTest.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886)
    at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960)
    at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[jira] [Commented] (LUCENE-6388) Optimize SpanNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1439#comment-1439 ]

Robert Muir commented on LUCENE-6388:
-------------------------------------

{noformat}
Report after iter 10:
Chart saved to out.png... (wd: /home/rmuir/workspace/util/src/python)
Task            QPS trunk  StdDev  QPS patch  StdDev  Pct diff
MedSpanNear         75.69  (2.0%)      80.58  (3.9%)     6.5% (  0% -  12%)
LowSpanNear        233.30  (3.8%)     259.44  (6.5%)    11.2% (  0% -  22%)
HighSpanNear         9.43  (3.6%)      10.76  (7.5%)    14.0% (  2% -  25%)
{noformat}

Optimize SpanNearQuery
----------------------
Key: LUCENE-6388
URL: https://issues.apache.org/jira/browse/LUCENE-6388
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir
Attachments: LUCENE-6388.patch

After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a little more:
* SpanNearQuery defaults to collectPayloads=true, but this requires a slower implementation, for an uncommon case. Use the faster no-payloads impl if the field doesn't actually have any payloads.
* Use a simple array of Spans rather than List in NearSpans classes. This is iterated over often in the logic.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-5.1-Linux (64bit/jdk1.7.0_80-ea-b05) - Build # 184 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.1-Linux/184/ Java: 64bit/jdk1.7.0_80-ea-b05 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.client.solrj.SolrSchemalessExampleTest.testCommitWithinOnDelete Error Message: IOException occured when talking to server at: https://127.0.0.1:48500/solr/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: https://127.0.0.1:48500/solr/collection1 at __randomizedtesting.SeedInfo.seed([B6307C15E8FF231A:DA248BA4327CC921]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:943) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:958) at org.apache.solr.client.solrj.SolrExampleTestsBase.testCommitWithinOnDelete(SolrExampleTestsBase.java:148) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at
Re: [DISCUSS] Change Query API to make queries immutable in 6.0
I did not close this door. I agree this is something that should be considered, and I tried to list the pros/cons that I could think of. However, I would like it to be dealt with in a different issue, as changing those 4 queries will already be a big change. Would it be ok to first make queries immutable up to the boost and then discuss if/how/when we should go fully immutable with a new API to change boosts?

On Wed, Apr 1, 2015 at 9:25 PM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote:

I’m +1 to going all the way (fully immutable) but the proposal stops short by skipping the boost. I agree with Terry’s comments: what a shame to make Queries “more immutable” but not really quite immutable. It kinda misses the point? Otherwise why bother? If this is about progress not perfection, then okay, but if we don’t ultimately go all the way then there isn’t the benefit we’re after and we’ve both changed the API and made it a bit more awkward to use.

I like the idea of a method like cloneWithBoost() or some-such. A no-arg clone() could be final and call that one with the current boost. While we’re at it, BooleanQuery and other variable aggregates could cache the hashCode at construction.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley

On Tue, Mar 31, 2015 at 11:06 AM, Adrien Grand jpou...@gmail.com wrote:

On Tue, Mar 31, 2015 at 4:32 PM, Terry Smith sheb...@gmail.com wrote:

Thanks for the explanation. It seems a pity to make queries just nearly immutable. Do you have any interest in adding a boost parameter to clone() so they really could be immutable?

We could have a single method, but if we do it I would rather do it in a different change since it would affect all queries as opposed to only a handful of them. Also there is some benefit in having clone() and setBoost() in that cloning and setters are things that are familiar to everyone. If we replace them with a new method, we would need to specify its semantics. (Not a blocker, just wanted to mention what the pros/cons are in my opinion.)

-- Adrien

-- Adrien
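The cloneWithBoost() idea and the cached hashCode mentioned in this thread can be sketched in plain Java. This is only an illustration under assumptions: ImmutableTermQuery is a hypothetical stand-in class, not Lucene's Query API, and the method name is simply the one floated in the discussion.

```java
import java.util.Objects;

/** Hypothetical immutable query; a stand-in for illustration, not Lucene's Query class. */
final class ImmutableTermQuery {
    private final String field;
    private final String term;
    private final float boost;
    private final int hash; // computed once; safe because no field can change afterwards

    ImmutableTermQuery(String field, String term, float boost) {
        this.field = Objects.requireNonNull(field);
        this.term = Objects.requireNonNull(term);
        this.boost = boost;
        this.hash = Objects.hash(field, term, boost);
    }

    /** Returns a copy with a different boost instead of mutating this instance. */
    ImmutableTermQuery cloneWithBoost(float newBoost) {
        return new ImmutableTermQuery(field, term, newBoost);
    }

    float getBoost() {
        return boost;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ImmutableTermQuery)) return false;
        ImmutableTermQuery other = (ImmutableTermQuery) o;
        return field.equals(other.field)
            && term.equals(other.term)
            && Float.compare(boost, other.boost) == 0;
    }

    @Override
    public int hashCode() {
        return hash; // no recomputation on every cache lookup
    }
}
```

Because every field is final, an instance can be used as a cache key without the risk that a later setBoost() call changes its hashCode while it sits in a map.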
Re: [DISCUSS] Change Query API to make queries immutable in 6.0
On Thu, Apr 2, 2015 at 3:40 AM, Adrien Grand jpou...@gmail.com wrote:

Would it be ok to first make queries immutable up to the boost and then discuss if/how/when we should go fully immutable with a new API to change boosts?

+1 ... progress not perfection.

Mike McCandless
http://blog.mikemccandless.com
[JENKINS] Lucene-Solr-5.x-Windows (64bit/jdk1.7.0_76) - Build # 4510 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4510/ Java: 64bit/jdk1.7.0_76 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.search.facet.TestJsonFacets.testComplex Error Message: mismatch: 'accord'!='a' @ facets/makes/buckets/[0]/models/buckets/[1]/val Stack Trace: java.lang.RuntimeException: mismatch: 'accord'!='a' @ facets/makes/buckets/[0]/models/buckets/[1]/val at __randomizedtesting.SeedInfo.seed([654BB82DFA2376A8:8494BDB1D66D3ACB]:0) at org.apache.solr.SolrTestCaseHS.matchJSON(SolrTestCaseHS.java:160) at org.apache.solr.SolrTestCaseHS.assertJQ(SolrTestCaseHS.java:142) at org.apache.solr.SolrTestCaseHS$Client.testJQ(SolrTestCaseHS.java:288) at org.apache.solr.search.facet.TestJsonFacets.testComplex(TestJsonFacets.java:155) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 10764 lines...] [junit4] Suite:
Re: CompressingTermVectors; per-field decompress?
Vectors are totally per-document. It's hard to do anything smarter with them. Basically, by this I mean that, IMO, vectors aren't going to get better until the semantics around them improve. From the original file formats, I get the impression they were modelled after stored fields a lot, and I think that's why they will be as slow as stored fields until things are fixed:

* removing the embedded per-document schema of vectors. I can't imagine a use case for this. I think in general you either have vectors for docs in a given field X or you do not.
* removing the ability to store broken offsets (going backward, etc) into vectors.
* removing the ability to store offsets without positions. Why?

As for the current impl, it has fallen behind the stored fields, which got a lot of improvements for 5.0. We at least gave it a little love: it has a super-fast bulk merge when no deletions are present (dirtyChunks, totalChunks, etc). But in all other cases it's very expensive. Compression block sizes, etc should be tuned. It should implement getMergeInstance() and keep state to avoid shittons of decompressions on merge. Maybe a high compression option should be looked at, though getMergeInstance() should be a prerequisite for that or it will be too slow. When the format matches between readers (typically the case, except when upgrading from older versions etc), it should avoid deserialization overhead if that is costly (still the case for stored fields).

Fixing some of the big problems with vectors (lots of metadata/complexity needed for embedded schema info, negative numbers when there should not be) would also enable better compression, maybe even underneath LZ4, like stored fields got in 5.0 too.

On Thu, Apr 2, 2015 at 2:51 PM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote:

I was looking at a JIRA issue someone posted pertaining to optimizing highlighting for when there are term vectors ( SOLR-5855 ). I dug into the details a bit and learned something unexpected: CompressingTermVectorsReader.get(docId) fully loads all term vectors for the document. The client/user consuming code in question might just want the term vectors for a subset of all fields that have term vectors. Was this overlooked or are there benefits to the current approach? I can’t think of any except that perhaps there’s better compression over all the data versus in smaller per-field chunks; although I’d trade that any day for being able to just get a subset of fields. I could imagine it being useful to ask for some fields or all, in much the same way we handle stored field data.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
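The point about a merge instance that keeps state can be illustrated with a toy model. This is a sketch under stated assumptions, not how Lucene's getMergeInstance() is implemented: ChunkedVectorsSketch, DOCS_PER_CHUNK and the fake decompress() step are all hypothetical, and "decompression" here only builds placeholder strings so that the number of calls can be counted.

```java
/** Toy model: documents live in compressed chunks; reading one doc decompresses its chunk. */
final class ChunkedVectorsSketch {
    static final int DOCS_PER_CHUNK = 4; // hypothetical chunk size

    private final boolean keepState; // true models a stateful "merge instance"
    private int cachedChunk = -1;
    private String[] cachedDocs;
    private int decompressions;

    ChunkedVectorsSketch(boolean keepState) {
        this.keepState = keepState;
    }

    /** Stand-in for the expensive step; a real reader would run LZ4 here. */
    private String[] decompress(int chunk) {
        decompressions++;
        String[] docs = new String[DOCS_PER_CHUNK];
        for (int i = 0; i < DOCS_PER_CHUNK; i++) {
            docs[i] = "tv-" + (chunk * DOCS_PER_CHUNK + i);
        }
        return docs;
    }

    /** Returns the (placeholder) term vectors for docId. */
    String get(int docId) {
        int chunk = docId / DOCS_PER_CHUNK;
        // A stateless reader decompresses on every call; a stateful one reuses the last chunk.
        if (!keepState || chunk != cachedChunk) {
            cachedDocs = decompress(chunk);
            cachedChunk = chunk;
        }
        return cachedDocs[docId % DOCS_PER_CHUNK];
    }

    int decompressions() {
        return decompressions;
    }
}
```

A merge visits documents in increasing docId order, so in this model reading docs 0..7 costs 8 decompressions without state and 2 with it. Random access, as in searching, would not see the same benefit.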
[jira] [Commented] (LUCENE-6352) Add global ordinal based query time join
[ https://issues.apache.org/jira/browse/LUCENE-6352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393544#comment-14393544 ]

ASF subversion and git services commented on LUCENE-6352:
---------------------------------------------------------
Commit 1670990 from [~martijn.v.groningen] in branch 'dev/trunk'
[ https://svn.apache.org/r1670990 ]

LUCENE-6352: Added a new query time join to the join module that uses global ordinals, which is faster for subsequent joins between reopens.

Add global ordinal based query time join
----------------------------------------
Key: LUCENE-6352
URL: https://issues.apache.org/jira/browse/LUCENE-6352
Project: Lucene - Core
Issue Type: Improvement
Reporter: Martijn van Groningen
Attachments: LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch

Global ordinal based query time join as an alternative to the current query time join. The implementation is faster for subsequent joins between reopens, but requires an OrdinalMap to be built. This join has certain restrictions and requirements:
* A document can only refer to one other document (but can be referred to by one or more documents).
* A type field must exist on all documents and each document must be categorized to a type. This is to distinguish between the from and to side.
* There must be a single sorted doc values field used by both the from and to documents. By encoding the join into a single doc values field it is trivial to build an ordinals map from it.
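The requirements above can be made concrete with a small stand-alone sketch of a join over global ordinals. It is an illustration only and does not use the Lucene join module: GlobalOrdinalJoinSketch, Segment and both methods are hypothetical, every "from" document is treated as matching the from-side query, and plain arrays replace sorted doc values.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

final class GlobalOrdinalJoinSketch {
    /** One index segment: sorted unique join values, plus a per-doc ordinal and type flag. */
    static final class Segment {
        final String[] sortedValues; // segment-local dictionary, sorted
        final int[] docOrd;          // docOrd[doc] indexes into sortedValues
        final boolean[] isFrom;      // the "type" field: from side or to side

        Segment(String[] sortedValues, int[] docOrd, boolean[] isFrom) {
            this.sortedValues = sortedValues;
            this.docOrd = docOrd;
            this.isFrom = isFrom;
        }
    }

    /** Maps each segment-local ordinal to its position in the merged, sorted value set. */
    static int[][] buildOrdinalMap(List<Segment> segments) {
        TreeSet<String> union = new TreeSet<>();
        for (Segment s : segments) {
            Collections.addAll(union, s.sortedValues);
        }
        List<String> global = new ArrayList<>(union);
        int[][] map = new int[segments.size()][];
        for (int seg = 0; seg < segments.size(); seg++) {
            String[] values = segments.get(seg).sortedValues;
            map[seg] = new int[values.length];
            for (int ord = 0; ord < values.length; ord++) {
                map[seg][ord] = Collections.binarySearch(global, values[ord]);
            }
        }
        return map;
    }

    /** Pass 1 records global ordinals seen on the from side; pass 2 matches to-side docs. */
    static List<String> join(List<Segment> segments, int[][] ordMap) {
        BitSet fromOrds = new BitSet();
        for (int seg = 0; seg < segments.size(); seg++) {
            Segment s = segments.get(seg);
            for (int doc = 0; doc < s.docOrd.length; doc++) {
                if (s.isFrom[doc]) {
                    fromOrds.set(ordMap[seg][s.docOrd[doc]]);
                }
            }
        }
        List<String> hits = new ArrayList<>();
        for (int seg = 0; seg < segments.size(); seg++) {
            Segment s = segments.get(seg);
            for (int doc = 0; doc < s.docOrd.length; doc++) {
                if (!s.isFrom[doc] && fromOrds.get(ordMap[seg][s.docOrd[doc]])) {
                    hits.add("seg" + seg + "/doc" + doc);
                }
            }
        }
        return hits;
    }
}
```

The ordinal map depends only on the segments, not on the query, which is why in this model it can be built once per reopen and reused by every later join; each join then compares integers in a bit set rather than term bytes.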
[jira] [Issue Comment Deleted] (SOLR-5855) re-use document term-vector Fields instance across fields in the DefaultSolrHighlighter
[ https://issues.apache.org/jira/browse/SOLR-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-5855:
-------------------------------
Comment: was deleted

(was: Nice work! I'll put some attention on this to try and get it in; not necessarily today/tomorrow but should make it for 5.2. Ideally the FVH should be supported as well; it'd be a shame to do one but not the other. A SolrCache of term vectors should definitely be a separate issue.)

re-use document term-vector Fields instance across fields in the DefaultSolrHighlighter
---------------------------------------------------------------------------------------
Key: SOLR-5855
URL: https://issues.apache.org/jira/browse/SOLR-5855
Project: Solr
Issue Type: Improvement
Components: highlighter
Affects Versions: Trunk
Reporter: Daniel Debray
Assignee: David Smiley
Fix For: 5.2
Attachments: SOLR-5855-without-cache.patch, highlight.patch

Hi folks,

while investigating possible performance bottlenecks in the highlight component I discovered two places where we can save some CPU cycles. Both are in the class org.apache.solr.highlight.DefaultSolrHighlighter.

First, in method doHighlighting (lines 411-417): in the loop we try to highlight every field that has been resolved from the params on each document. OK, but why not skip those fields that are not present on the current document? So I changed the code from:

    for (String fieldName : fieldNames) {
      fieldName = fieldName.trim();
      if( useFastVectorHighlighter( params, schema, fieldName ) )
        doHighlightingByFastVectorHighlighter( fvh, fieldQuery, req, docSummaries, docId, doc, fieldName );
      else
        doHighlightingByHighlighter( query, req, docSummaries, docId, doc, fieldName );
    }

to:

    for (String fieldName : fieldNames) {
      fieldName = fieldName.trim();
      if (doc.get(fieldName) != null) {
        if( useFastVectorHighlighter( params, schema, fieldName ) )
          doHighlightingByFastVectorHighlighter( fvh, fieldQuery, req, docSummaries, docId, doc, fieldName );
        else
          doHighlightingByHighlighter( query, req, docSummaries, docId, doc, fieldName );
      }
    }

The second place is where we try to retrieve the TokenStream from the document for a specific field. Line 472:

    TokenStream tvStream = TokenSources.getTokenStreamWithOffsets(searcher.getIndexReader(), docId, fieldName);

where:

    public static TokenStream getTokenStreamWithOffsets(IndexReader reader, int docId, String field) throws IOException {
      Fields vectors = reader.getTermVectors(docId);
      if (vectors == null) {
        return null;
      }
      Terms vector = vectors.terms(field);
      if (vector == null) {
        return null;
      }
      if (!vector.hasPositions() || !vector.hasOffsets()) {
        return null;
      }
      return getTokenStream(vector);
    }

Keep in mind that we currently hit the IndexReader n times, where n = requested rows (documents) * requested number of highlight fields. In my use case reader.getTermVectors(docId) takes around 150.000~250.000ns on a warm Solr and 1.100.000ns on a cold Solr. If we store the returned Fields vectors in a cache, these lookups only take 25000ns.

I would suggest something like the following code in the doHighlightingByHighlighter method in the DefaultSolrHighlighter class (line 472):

    Fields vectors = null;
    SolrCache termVectorCache = searcher.getCache("termVectorCache");
    if (termVectorCache != null) {
      vectors = (Fields) termVectorCache.get(Integer.valueOf(docId));
      if (vectors == null) {
        vectors = searcher.getIndexReader().getTermVectors(docId);
        if (vectors != null) termVectorCache.put(Integer.valueOf(docId), vectors);
      }
    } else {
      vectors = searcher.getIndexReader().getTermVectors(docId);
    }
    TokenStream tvStream = TokenSources.getTokenStreamWithOffsets(vectors, fieldName);

and in the TokenSources class:

    public static TokenStream getTokenStreamWithOffsets(Fields vectors, String field) throws IOException {
      if (vectors == null) {
        return null;
      }
      Terms vector = vectors.terms(field);
      if (vector == null) {
        return null;
      }
      if (!vector.hasPositions() || !vector.hasOffsets()) {
        return null;
      }
      return getTokenStream(vector);
    }

Timings on an index with 190.000 docs, with a numFound of 32000 and 80 different highlight fields:

    4000ms on 1000 docs without cache
     639ms on 1000 docs with cache
     102ms on 30 docs without cache
      22ms on 30 docs with cache

I think queries with only one field to highlight on a document do not benefit that much from a cache like this; that's why I think an optional cache would be the best solution there. As far as I saw, the FastVectorHighlighter uses more or less the same approach and could also benefit from this cache.
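The look-up-then-load pattern proposed above can be tried out without Solr. The following is a minimal sketch under assumptions: PerDocCache is a hypothetical class built on java.util.LinkedHashMap, standing in for a SolrCache, and the loader function stands in for reader.getTermVectors(docId). It is not thread-safe and is not the patch attached to the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical bounded per-document cache; stands in for a SolrCache in this sketch. */
final class PerDocCache<V> {
    private final Map<Integer, V> lru;
    private final IntFunction<V> loader; // the expensive call, e.g. loading term vectors
    private int loads;                   // how often the loader actually ran

    PerDocCache(final int maxSize, IntFunction<V> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.lru = new LinkedHashMap<Integer, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
                return size() > maxSize; // evict once the bound is exceeded
            }
        };
    }

    /** Returns the cached value for docId, loading and caching it on a miss. */
    V get(int docId) {
        V value = lru.get(docId);
        if (value == null) {
            value = loader.apply(docId);
            loads++;
            if (value != null) {
                lru.put(docId, value); // null results (no vectors) are not cached
            }
        }
        return value;
    }

    int loads() {
        return loads;
    }
}
```

With 80 highlight fields per document, calling get(docId) once per field runs the loader once per document instead of 80 times, which is the saving the timings above point to. A query that highlights a single field per document would see no benefit, matching the reporter's remark that the cache should be optional.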
[jira] [Updated] (LUCENE-6339) [suggest] Near real time Document Suggester
[ https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Areek Zillur updated LUCENE-6339:
---------------------------------
Fix Version/s: (was: 5.x) 5.1

[suggest] Near real time Document Suggester
-------------------------------------------
Key: LUCENE-6339
URL: https://issues.apache.org/jira/browse/LUCENE-6339
Project: Lucene - Core
Issue Type: New Feature
Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
Fix For: Trunk, 5.1
Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch

The idea is to index documents with one or more *SuggestField*(s) and be able to suggest documents with a *SuggestField* value that matches a given key. A SuggestField can be assigned a numeric weight to be used to score the suggestion at query time.

Document suggestion can be done on an indexed *SuggestField*. The document suggester can filter out deleted documents in near real-time. The suggester can filter out documents based on a Filter (note: may change to a non-scoring query?) at query time.

A custom postings format (CompletionPostingsFormat) is used to index SuggestField(s) and perform document suggestions.

h4. Usage
{code:java}
// hook up custom postings format
// indexAnalyzer for SuggestField
Analyzer analyzer = ...
IndexWriterConfig config = new IndexWriterConfig(analyzer);
Codec codec = new Lucene50Codec() {
  PostingsFormat completionPostingsFormat = new Completion50PostingsFormat();

  @Override
  public PostingsFormat getPostingsFormatForField(String field) {
    if (isSuggestField(field)) {
      return completionPostingsFormat;
    }
    return super.getPostingsFormatForField(field);
  }
};
config.setCodec(codec);
IndexWriter writer = new IndexWriter(dir, config);

// index some documents with suggestions
Document doc = new Document();
doc.add(new SuggestField("suggest_title", "title1", 2));
doc.add(new SuggestField("suggest_name", "name1", 3));
writer.addDocument(doc);
...

// open an nrt reader for the directory
DirectoryReader reader = DirectoryReader.open(writer, false);

// SuggestIndexSearcher is a thin wrapper over IndexSearcher
// queryAnalyzer will be used to analyze the query string
SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, queryAnalyzer);

// suggest 10 documents for "titl" on "suggest_title" field
TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
{code}

h4. Indexing
Index analyzer set through *IndexWriterConfig*
{code:java}
SuggestField(String name, String value, long weight)
{code}

h4. Query
Query analyzer set through *SuggestIndexSearcher*. Hits are collected in descending order of the suggestion's weight.
{code:java}
// full options for TopSuggestDocs (TopDocs)
TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)

// full options for Collector
// note: only collects, does not score
void suggest(String field, CharSequence key, int num, Filter filter, TopSuggestDocsCollector collector)
{code}

h4. Analyzer
*CompletionAnalyzer* can be used instead to wrap another analyzer to tune suggest field only parameters.
{code:java}
CompletionAnalyzer(Analyzer analyzer, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions)
{code}
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393318#comment-14393318 ] ASF subversion and git services commented on LUCENE-6386: - Commit 1670960 from [~mikemccand] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1670960 ] LUCENE-6386: correct javadocs about temp disk space required for forceMerge(1) TestIndexWriterForceMerge still unreliable in NIGHTLY - Key: LUCENE-6386 URL: https://issues.apache.org/jira/browse/LUCENE-6386 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: Trunk, 5.1 Attachments: LUCENE-6386.patch Discovered by ryan beasting (trunk): ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage [junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE= [junit4] _u.scf 146329 [junit4] _u.si 635 [junit4] |- (inside compound file) _u.fld 2214 [junit4] |- (inside compound file) _u.inf 392 [junit4] |- (inside compound file) _u.len 2381 [junit4] |- (inside compound file) _u.pst 36758 [junit4] |- (inside compound file) _u.vec 104144 [junit4] _s.pst 1338 [junit4] _s.inf 392 [junit4] _s.fld 94 
[junit4] _s.len 221 [junit4] _s.vec 3744 [junit4] _s.si 624 [junit4] _t.fld 94 [junit4] _t.len 221 [junit4] _t.pst 1338 [junit4] _t.inf 392 [junit4] _t.vec 3744 [junit4] _t.si 624 [junit4] _v.fld 94 [junit4] _v.pst 1338 [junit4] _v.inf 392 [junit4] _v.vec 3744 [junit4] _v.si 624 [junit4] _v.len 221 [junit4] _w.len 221 [junit4] _w.pst 1338 [junit4] _w.inf 392 [junit4] _w.fld 94 [junit4] _w.si 624 [junit4] _w.vec 3744 [junit4] _x.vec 3744 [junit4] _x.inf 392 [junit4] _x.pst 1338 [junit4] _x.fld 94 [junit4] _x.si 624 [junit4] _x.len 221 [junit4] _y.fld 94 [junit4] _y.pst 1338 [junit4] _y.inf 392 [junit4] _y.si 624 [junit4] _y.vec 3744 [junit4] _y.len 221 [junit4] _z.fld 94 [junit4] _z.pst 1338 [junit4] _z.inf 392 [junit4] _z.len 221 [junit4] _z.vec 3744 [junit4] _z.si 624 [junit4] _10.si 630 [junit4] _10.fld 94 [junit4] _10.pst 1338 [junit4] _10.inf 392 [junit4] _10.vec 3744 [junit4] _10.len 221 [junit4] _11.len 221 [junit4] _11.si 630 [junit4] _11.vec 3744 [junit4] _11.pst 1338 [junit4] _11.inf 392 [junit4] _11.fld 94 [junit4] _12.vec 3744 [junit4] _12.si 630 [junit4] _12.len 221 [junit4] _12.fld 94 [junit4] _12.pst 1338 [junit4] _12.inf 392 [junit4] _13.fld 94 [junit4] _13.vec 3744 [junit4] _13.si 630 [junit4] _13.pst
[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester
[ https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393425#comment-14393425 ]

ASF subversion and git services commented on LUCENE-6339:
---------------------------------------------------------
Commit 1670969 from [~areek] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1670969 ]

LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

[suggest] Near real time Document Suggester
-------------------------------------------
Key: LUCENE-6339
URL: https://issues.apache.org/jira/browse/LUCENE-6339
Project: Lucene - Core
Issue Type: New Feature
Components: core/search
Affects Versions: 5.0
Reporter: Areek Zillur
Assignee: Areek Zillur
Fix For: Trunk, 5.x
Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch
[jira] [Commented] (LUCENE-6385) NullPointerException from Highlighter.getBestFragment()
[ https://issues.apache.org/jira/browse/LUCENE-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393453#comment-14393453 ]

Robert Muir commented on LUCENE-6385:
-------------------------------------

I'm just running checks/tests and plan to commit this later tonight. Thanks for the fix and nice test, Terry.

NullPointerException from Highlighter.getBestFragment()
-------------------------------------------------------

Key: LUCENE-6385
URL: https://issues.apache.org/jira/browse/LUCENE-6385
Project: Lucene - Core
Issue Type: Bug
Components: modules/highlighter
Affects Versions: 5.1
Reporter: Terry Smith
Assignee: Robert Muir
Priority: Blocker
Attachments: LUCENE-6385.patch

When testing against the 5.1 nightly snapshots I've come across a NullPointerException in highlighting when nothing would be highlighted. This does not happen with 5.0.

{noformat}
java.lang.NullPointerException
  at __randomizedtesting.SeedInfo.seed([3EDC6EB0FA552B34:9971866E394F5FD0]:0)
  at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(WeightedSpanTermExtractor.java:311)
  at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:151)
  at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:515)
  at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:219)
  at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:187)
  at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:196)
  at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:156)
  at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:102)
  at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:80)
  at org.apache.lucene.search.highlight.MissesTest.testPhraseQuery(MissesTest.java:50)
{noformat}

I've written a small unit test and used git bisect to narrow the regression to the following commit:

{noformat}
commit 24e4eefaefb1837d1d4fa35f7669c2b264f872ac
Author: Michael McCandless mikemcc...@apache.org
Date: Tue Mar 31 08:48:28 2015 +

    LUCENE-6308: cutover Spans to DISI, reuse ConjunctionDISI, use two-phased iteration

    git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_5x@1670273 13f79535-47bb-0310-9956-ffa450edef68
{noformat}

The problem looks quite simple: WeightedSpanTermExtractor.extractWeightedSpanTerms() needs an early return if SpanQuery.getSpans() returns null. All other callers already check for this.

Unit test and fix (against the regressed commit) attached.
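The shape of the early return described in the report can be sketched without Lucene on the classpath. Everything in this block is a hypothetical stand-in: `SpanCursor`, `SpanSource`, and `extractTerms` are illustrative names, not Lucene types and not the attached patch. Only the idea (check for a null spans object before iterating) comes from the report.

```java
import java.util.ArrayList;
import java.util.List;

final class NullSpansGuard {

    /** Hypothetical stand-in for a spans iterator; not a Lucene type. */
    interface SpanCursor {
        boolean next();
        String term();
    }

    /** Hypothetical stand-in for a span query; returns null when nothing matches. */
    interface SpanSource {
        SpanCursor getSpans();
    }

    /** Collects the terms of all spans, tolerating a null spans object. */
    static List<String> extractTerms(SpanSource query) {
        List<String> terms = new ArrayList<>();
        SpanCursor spans = query.getSpans();
        if (spans == null) {
            // Early return: nothing matched, so there is nothing to iterate.
            return terms;
        }
        while (spans.next()) {
            terms.add(spans.term());
        }
        return terms;
    }

    public static void main(String[] args) {
        SpanSource noMatches = () -> null;
        System.out.println(extractTerms(noMatches)); // prints []
    }
}
```

Without the guard, the `while (spans.next())` line would throw the same kind of NullPointerException as in the stack trace above.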
Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_40) - Build # 12189 - Failure!
I have committed a fix for this.
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393315#comment-14393315 ] ASF subversion and git services commented on LUCENE-6386: - Commit 1670959 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1670959 ] LUCENE-6386: correct javadocs about temp disk space required for forceMerge(1) TestIndexWriterForceMerge still unreliable in NIGHTLY - Key: LUCENE-6386 URL: https://issues.apache.org/jira/browse/LUCENE-6386 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: Trunk, 5.1 Attachments: LUCENE-6386.patch Discovered by ryan beasting (trunk): ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage [junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE= [junit4] _u.scf 146329 [junit4] _u.si 635 [junit4] |- (inside compound file) _u.fld 2214 [junit4] |- (inside compound file) _u.inf 392 [junit4] |- (inside compound file) _u.len 2381 [junit4] |- (inside compound file) _u.pst 36758 [junit4] |- (inside compound file) _u.vec 104144 [junit4] _s.pst 1338 [junit4] _s.inf 392 [junit4] _s.fld 94 [junit4] _s.len 
221 [junit4] _s.vec 3744 [junit4] _s.si 624 [junit4] _t.fld 94 [junit4] _t.len 221 [junit4] _t.pst 1338 [junit4] _t.inf 392 [junit4] _t.vec 3744 [junit4] _t.si 624 [junit4] _v.fld 94 [junit4] _v.pst 1338 [junit4] _v.inf 392 [junit4] _v.vec 3744 [junit4] _v.si 624 [junit4] _v.len 221 [junit4] _w.len 221 [junit4] _w.pst 1338 [junit4] _w.inf 392 [junit4] _w.fld 94 [junit4] _w.si 624 [junit4] _w.vec 3744 [junit4] _x.vec 3744 [junit4] _x.inf 392 [junit4] _x.pst 1338 [junit4] _x.fld 94 [junit4] _x.si 624 [junit4] _x.len 221 [junit4] _y.fld 94 [junit4] _y.pst 1338 [junit4] _y.inf 392 [junit4] _y.si 624 [junit4] _y.vec 3744 [junit4] _y.len 221 [junit4] _z.fld 94 [junit4] _z.pst 1338 [junit4] _z.inf 392 [junit4] _z.len 221 [junit4] _z.vec 3744 [junit4] _z.si 624 [junit4] _10.si 630 [junit4] _10.fld 94 [junit4] _10.pst 1338 [junit4] _10.inf 392 [junit4] _10.vec 3744 [junit4] _10.len 221 [junit4] _11.len 221 [junit4] _11.si 630 [junit4] _11.vec 3744 [junit4] _11.pst 1338 [junit4] _11.inf 392 [junit4] _11.fld 94 [junit4] _12.vec 3744 [junit4] _12.si 630 [junit4] _12.len 221 [junit4] _12.fld 94 [junit4] _12.pst 1338 [junit4] _12.inf 392 [junit4] _13.fld 94 [junit4] _13.vec 3744 [junit4] _13.si 630 [junit4] _13.pst 1338
Re: [JENKINS] Lucene-Solr-5.1-Linux (32bit/jdk1.8.0_60-ea-b06) - Build # 189 - Failure!
I have committed a fix for this. On Thu, Apr 2, 2015 at 4:22 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.1-Linux/189/ Java: 32bit/jdk1.8.0_60-ea-b06 -client -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.lucene.search.suggest.document.SuggestFieldTest.testSuggestOnAllDeletedDocuments Error Message: MockDirectoryWrapper: cannot close: there are still open files: {_1_Lucene50_0.pay=1, _0_Lucene50_0.doc=1, _0_Lucene50_0.tim=1, _1.nvd=1, _0_Lucene50_0.pos=1, _0.nvd=1, _0_completion_0.lkp=1, _1_completion_0.pos=1, _0_completion_0.pay=1, _1_completion_0.doc=1, _1_completion_0.tim=1, _0.fdt=1, _0_completion_0.doc=1, _1_Lucene50_0.doc=1, _1_Lucene50_0.tim=1, _0_Lucene50_0.pay=1, _1_completion_0.lkp=1, _1_Lucene50_0.pos=1, _0_completion_0.pos=1, _0_completion_0.tim=1, _1_completion_0.pay=1, _1.fdt=1} Stack Trace: java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_1_Lucene50_0.pay=1, _0_Lucene50_0.doc=1, _0_Lucene50_0.tim=1, _1.nvd=1, _0_Lucene50_0.pos=1, _0.nvd=1, _0_completion_0.lkp=1, _1_completion_0.pos=1, _0_completion_0.pay=1, _1_completion_0.doc=1, _1_completion_0.tim=1, _0.fdt=1, _0_completion_0.doc=1, _1_Lucene50_0.doc=1, _1_Lucene50_0.tim=1, _0_Lucene50_0.pay=1, _1_completion_0.lkp=1, _1_Lucene50_0.pos=1, _0_completion_0.pos=1, _0_completion_0.tim=1, _1_completion_0.pay=1, _1.fdt=1} at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747) at org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester
[ https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393474#comment-14393474 ] ASF subversion and git services commented on LUCENE-6339: - Commit 1670978 from [~areek] in branch 'dev/branches/lucene_solr_5_1' [ https://svn.apache.org/r1670978 ] LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)
RE: [DISCUSS] Change Query API to make queries immutable in 6.0
Unfortunately, since boost is used in hashCode() and equals() calculations, changing the boost will still make the queries trappy. You will do all that work to make everything-but-boost immutable and still not fix the problem.

You can prove it to yourself like so (this test fails!):

    public void testMapOrphan() {
      Map<Query, Integer> map = new HashMap<>();
      BooleanQuery booleanAB = new BooleanQuery();
      booleanAB.add(new TermQuery(new Term("contents", "a")), BooleanClause.Occur.SHOULD);
      booleanAB.add(new TermQuery(new Term("contents", "b")), BooleanClause.Occur.SHOULD);
      map.put( booleanAB, 1 );
      booleanAB.setBoost( 33.3f ); // Set boost after map.put()
      assertTrue( map.containsKey(booleanAB) );
    }

Seems like the quick way to the coast is to write a failing test - before making changes. I realize this is easier said than done. Based on your testing that led you to start this discussion, can you narrow it down to a single Query class and/or IndexSearcher use case? Not that there will be only one case. But, at least, it will be a starting point. Once the first failing test has been written, it should be relatively easy to write test variations to cover the remaining mutable Query classes.

With the scale of the changes you are proposing, test first seems like a reasonable approach.

Another compromise approach might be to sub-class the mutable Query classes like so:

    class ImmutableBooleanQuery extends BooleanQuery {
      @Override
      public void add(BooleanClause clause) {
        throw new UnsupportedOperationException( "ImmutableBooleanQuery.add(BooleanClause)" );
      }
      @Override
      public void setBoost( float boost ) {
        throw new UnsupportedOperationException( "ImmutableBooleanQuery.setBoost(float)" );
      }
      // etc.

      public static ImmutableBooleanQuery cloneFrom(BooleanQuery original) {
        // Use field level access to by-pass mutator methods.
      }

      // Do NOT override rewrite(IndexReader)!
    }

In theory, such a proxy class could be generated at runtime to force immutability:

https://github.com/verhas/immutator

Which could make a lot of sense in JUnit tests, if not production runtime.

An immutable Query would be cloned from the original and placed on the cache instead. Any attempt to modify the cache entry should fail quickly.

To me, a less invasive approach seems like a faster and easier way to actually find and fix this bug. Once that is done, then it might make sense to perform the exhaustive updates to prevent a relapse in the future.

-Original Message-
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Thursday, April 02, 2015 9:46 AM
To: dev@lucene.apache.org
Subject: Re: [DISCUSS] Change Query API to make queries immutable in 6.0

Boosts might not make sense to become immutable, it might make the code too complex. Who is to say until the other stuff is fixed first. The downsides might outweigh the upsides.

So yeah, if you want to say "if anyone disagrees with what the future might look like i'm gonna -1 your progress", then i will bite right now.

Fixing the rest of Query to be immutable, so filter caching isn't trappy, we should really do that. And we have been doing it already. I remember Uwe suggested this approach when adding automaton and related queries a long time ago. It made things simpler and avoided bugs, we ultimately made as much of it immutable as we could.

Queries have to be well-behaved, they need a good hashcode/equals, thread safety, good error checking, etc. It is easier to do this when things are immutable. Someone today can make a patch for FooQuery that nukes setBar and moves it to a ctor parameter named 'bar' and chances are a lot of the time, it probably fixes bugs in FooQuery somehow. That's just what it is.

Boosts are the 'long tail'. They are simple primitive floating point values, so susceptible to fewer problems. The base class incorporates boosts into equals/hashcode already, which prevents the most common bugs with them. They are trickier because internal things like rewrite() might shuffle them around in conjunction with clone(), to do optimizations. They are also only relevant when scores are needed: so we can prevent nasty filter caching bugs as a step, by making everything else immutable.

On Thu, Apr 2, 2015 at 9:27 AM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote:

On Thu, Apr 2, 2015 at 3:40 AM, Adrien Grand jpou...@gmail.com wrote:

first make queries immutable up to the boost and then discuss if/how/when we should go fully immutable with a new API to change boosts?

The “if” part concerns me; I don’t mind it being a separate issue to make the changes more manageable (progress not perfection, and all that). I’m all for the whole shebang. But if others think “no” then…. will it have been worthwhile to do this big change and not go all the way? I think not. Does anyone feel the answer is “no” to make boosts immutable? And if so why? If nobody comes up with a
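The hashCode()/equals() trap discussed in this thread can be reproduced without Lucene. The block below is only a sketch under stated assumptions: `MutableQuery` and `ImmutableQuery` are hypothetical toy classes invented for illustration, not Lucene's Query API. It shows a key that becomes unreachable in a HashMap once its boost changes after insertion, and an immutable variant where a boost change produces a new object instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class MutableKeyDemo {

    /** Hypothetical mutable key: boost can change after the key is stored. */
    static final class MutableQuery {
        private final String term;
        private float boost = 1.0f;

        MutableQuery(String term) { this.term = term; }

        void setBoost(float boost) { this.boost = boost; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof MutableQuery)) return false;
            MutableQuery other = (MutableQuery) o;
            return term.equals(other.term) && Float.compare(boost, other.boost) == 0;
        }

        @Override
        public int hashCode() { return Objects.hash(term, boost); }
    }

    /** Hypothetical immutable key: changing the boost yields a new object. */
    static final class ImmutableQuery {
        private final String term;
        private final float boost;

        ImmutableQuery(String term, float boost) {
            this.term = term;
            this.boost = boost;
        }

        ImmutableQuery withBoost(float newBoost) { return new ImmutableQuery(term, newBoost); }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ImmutableQuery)) return false;
            ImmutableQuery other = (ImmutableQuery) o;
            return term.equals(other.term) && Float.compare(boost, other.boost) == 0;
        }

        @Override
        public int hashCode() { return Objects.hash(term, boost); }
    }

    /** Mutating the key after put() changes its hash, so the lookup misses. */
    static boolean mutableKeyStillFound() {
        Map<MutableQuery, Integer> cache = new HashMap<>();
        MutableQuery q = new MutableQuery("a");
        cache.put(q, 1);
        q.setBoost(33.3f); // same object, different hashCode from now on
        return cache.containsKey(q);
    }

    /** The stored key never changes; the boosted copy is a distinct key. */
    static boolean immutableKeyStillFound() {
        Map<ImmutableQuery, Integer> cache = new HashMap<>();
        ImmutableQuery q = new ImmutableQuery("a", 1.0f);
        cache.put(q, 1);
        ImmutableQuery boosted = q.withBoost(33.3f);
        return cache.containsKey(q) && !cache.containsKey(boosted);
    }

    public static void main(String[] args) {
        System.out.println(mutableKeyStillFound());   // prints false
        System.out.println(immutableKeyStillFound()); // prints true
    }
}
```

The second method is the behavior a cache wants: the entry stays reachable under its original key, and a query with a different boost is simply a different key.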
[jira] [Commented] (LUCENE-6352) Add global ordinal based query time join
[ https://issues.apache.org/jira/browse/LUCENE-6352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393569#comment-14393569 ]

ASF subversion and git services commented on LUCENE-6352:
---------------------------------------------------------

Commit 1670991 from [~martijn.v.groningen] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1670991 ]

LUCENE-6352: Added a new query time join to the join module that uses global ordinals, which is faster for subsequent joins between reopens.

Add global ordinal based query time join
----------------------------------------

Key: LUCENE-6352
URL: https://issues.apache.org/jira/browse/LUCENE-6352
Project: Lucene - Core
Issue Type: Improvement
Reporter: Martijn van Groningen
Attachments: LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch, LUCENE-6352.patch

Global ordinal based query time join as an alternative to the current query time join. The implementation is faster for subsequent joins between reopens, but requires an OrdinalMap to be built.

This join has certain restrictions and requirements:
* A document can only refer to one other document (but can be referred to by one or more documents).
* A type field must exist on all documents and each document must be categorized to a type. This is to distinguish between the from and to sides.
* There must be a single sorted doc values field used by both the from and to documents. By encoding the join into a single doc values field, it is trivial to build an ordinals map from it.
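To make the restrictions above concrete, here is a toy, Lucene-free sketch of how a join over global ordinals can work. It is an illustration under assumptions, not the committed implementation: `Segment`, `buildGlobalOrds`, and `join` are hypothetical names, every "from" document is treated as matching the from-side query, and the ordinal map is built naively from a merged sorted set.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

final class GlobalOrdinalJoinSketch {

    /** Toy segment: sorted distinct join values plus one local ordinal per document. */
    static final class Segment {
        final String[] sortedValues; // local ordinal -> join value
        final int[] docToOrd;        // doc id -> local ordinal of its join value
        final String[] docType;      // doc id -> "from" or "to" (the type field)

        Segment(String[] sortedValues, int[] docToOrd, String[] docType) {
            this.sortedValues = sortedValues;
            this.docToOrd = docToOrd;
            this.docType = docType;
        }
    }

    /** Maps each segment's local ordinals into one merged, sorted ordinal space. */
    static int[][] buildGlobalOrds(List<Segment> segments) {
        TreeSet<String> merged = new TreeSet<>();
        for (Segment s : segments) {
            Collections.addAll(merged, s.sortedValues);
        }
        List<String> global = new ArrayList<>(merged);
        int[][] map = new int[segments.size()][];
        for (int i = 0; i < segments.size(); i++) {
            String[] values = segments.get(i).sortedValues;
            map[i] = new int[values.length];
            for (int ord = 0; ord < values.length; ord++) {
                map[i][ord] = Collections.binarySearch(global, values[ord]);
            }
        }
        return map;
    }

    /** Returns "segment:doc" ids of "to" documents whose join value occurs on a "from" document. */
    static List<String> join(List<Segment> segments, int[][] globalOrds) {
        // Pass 1: record the global ordinal of every "from" document.
        BitSet fromOrds = new BitSet();
        for (int i = 0; i < segments.size(); i++) {
            Segment s = segments.get(i);
            for (int doc = 0; doc < s.docToOrd.length; doc++) {
                if (s.docType[doc].equals("from")) {
                    fromOrds.set(globalOrds[i][s.docToOrd[doc]]);
                }
            }
        }
        // Pass 2: a "to" document matches if its global ordinal was recorded.
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < segments.size(); i++) {
            Segment s = segments.get(i);
            for (int doc = 0; doc < s.docToOrd.length; doc++) {
                if (s.docType[doc].equals("to") && fromOrds.get(globalOrds[i][s.docToOrd[doc]])) {
                    hits.add(i + ":" + doc);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment(new String[] {"a", "c"}, new int[] {0, 1, 0},
                new String[] {"to", "to", "from"}));
        segments.add(new Segment(new String[] {"b", "c"}, new int[] {0, 1},
                new String[] {"to", "from"}));
        System.out.println(join(segments, buildGlobalOrds(segments))); // prints [0:0, 0:1]
    }
}
```

In this sketch the ordinal map depends only on the segments, so it can be reused across joins until the set of segments changes, and each join then compares integers in a bit set rather than term bytes.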
[jira] [Assigned] (SOLR-5855) Increasing solr highlight performance with caching
[ https://issues.apache.org/jira/browse/SOLR-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley reassigned SOLR-5855:
---------------------------------

Assignee: David Smiley

Increasing solr highlight performance with caching
--------------------------------------------------

Key: SOLR-5855
URL: https://issues.apache.org/jira/browse/SOLR-5855
Project: Solr
Issue Type: Improvement
Components: highlighter
Affects Versions: Trunk
Reporter: Daniel Debray
Assignee: David Smiley
Fix For: Trunk
Attachments: SOLR-5855-without-cache.patch, highlight.patch

Hi folks,

while investigating possible performance bottlenecks in the highlight component I discovered two places where we can save some CPU cycles. Both are in the class org.apache.solr.highlight.DefaultSolrHighlighter.

First, in method doHighlighting (lines 411-417):
In the loop we try to highlight every field that has been resolved from the params on each document. OK, but why not skip those fields that are not present on the current document?
So I changed the code from:

    for (String fieldName : fieldNames) {
      fieldName = fieldName.trim();
      if( useFastVectorHighlighter( params, schema, fieldName ) )
        doHighlightingByFastVectorHighlighter( fvh, fieldQuery, req, docSummaries, docId, doc, fieldName );
      else
        doHighlightingByHighlighter( query, req, docSummaries, docId, doc, fieldName );
    }

to:

    for (String fieldName : fieldNames) {
      fieldName = fieldName.trim();
      if (doc.get(fieldName) != null) {
        if( useFastVectorHighlighter( params, schema, fieldName ) )
          doHighlightingByFastVectorHighlighter( fvh, fieldQuery, req, docSummaries, docId, doc, fieldName );
        else
          doHighlightingByHighlighter( query, req, docSummaries, docId, doc, fieldName );
      }
    }

The second place is where we try to retrieve the TokenStream from the document for a specific field.
Line 472:

    TokenStream tvStream = TokenSources.getTokenStreamWithOffsets(searcher.getIndexReader(), docId, fieldName);

where:

    public static TokenStream getTokenStreamWithOffsets(IndexReader reader, int docId, String field) throws IOException {
      Fields vectors = reader.getTermVectors(docId);
      if (vectors == null) {
        return null;
      }
      Terms vector = vectors.terms(field);
      if (vector == null) {
        return null;
      }
      if (!vector.hasPositions() || !vector.hasOffsets()) {
        return null;
      }
      return getTokenStream(vector);
    }

Keep in mind that we currently hit the IndexReader n times, where n = requested rows (documents) * requested number of highlight fields.
In my use case reader.getTermVectors(docId) takes around 150.000~250.000ns on a warm Solr and 1.100.000ns on a cold Solr.
If we store the returned Fields vectors in a cache, these lookups take only 25000ns.

I would suggest something like the following code in the doHighlightingByHighlighter method in the DefaultSolrHighlighter class (line 472):

    Fields vectors = null;
    SolrCache termVectorCache = searcher.getCache("termVectorCache");
    if (termVectorCache != null) {
      vectors = (Fields) termVectorCache.get(Integer.valueOf(docId));
      if (vectors == null) {
        vectors = searcher.getIndexReader().getTermVectors(docId);
        if (vectors != null) termVectorCache.put(Integer.valueOf(docId), vectors);
      }
    } else {
      vectors = searcher.getIndexReader().getTermVectors(docId);
    }
    TokenStream tvStream = TokenSources.getTokenStreamWithOffsets(vectors, fieldName);

and in the TokenSources class:

    public static TokenStream getTokenStreamWithOffsets(Fields vectors, String field) throws IOException {
      if (vectors == null) {
        return null;
      }
      Terms vector = vectors.terms(field);
      if (vector == null) {
        return null;
      }
      if (!vector.hasPositions() || !vector.hasOffsets()) {
        return null;
      }
      return getTokenStream(vector);
    }

Timings on an index with 190.000 docs, with a numFound of 32000 and 80 different highlight fields:

    4000ms on 1000 docs without cache
    639ms on 1000 docs with cache
    102ms on 30 docs without cache
    22ms on 30 docs with cache

I think queries with only one field to highlight per document do not benefit that much from a cache like this; that's why I think an optional cache would be the best solution there.
As far as I can see, the FastVectorHighlighter uses more or less the same approach and could also benefit from this cache.
Re: CompressingTermVectors; per-field decompress?
On Thu, Apr 2, 2015 at 4:02 PM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote:

They are fundamentally per-document, yes, like stored fields - yes. But I don’t see how this fundamental constraint prevents the term vector format from returning a light “Fields” instance which loads per-field data on demand when asked for. I understand most of your ideas for a better term vector format below, to varying degrees, but again I don’t see these ideas as being blocking factors for having field term data be stored together so it could be accessed lazily. (don’t fetch fields you don’t need). Maybe you didn’t mean to imply they are? Although I think you did by saying “vectors aren't going to get better until the semantics around them improves”.

It is pretty much impossible to fix the underlying layout to be efficient fieldwise when the way vectors can be structured in the different documents is heterogeneous (per-doc) and there are so many crazy things that can happen.

If that were fixed, the file or block header could contain this metadata instead of per-field-per-doc. blocks could be compressed fieldwise across documents, maybe use preset dictionary for each field, etc.

Currently, everything must be decompressed.
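The "light Fields instance which loads per-field data on demand" idea from the quoted message can be pictured with a small, Lucene-free sketch. This is purely illustrative: `LazyFields` and its `decoder` function are hypothetical names, and the sketch assumes that one field's data can be decoded independently of the others, which is exactly what the reply says the current term vector layout does not allow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class LazyFieldsSketch {

    /** Hypothetical per-document view that decodes a field's terms only on first access. */
    static final class LazyFields {
        private final Function<String, String[]> decoder; // stand-in for per-field decompression
        private final Map<String, String[]> loaded = new HashMap<>();
        int decodeCount = 0; // how many fields were actually decoded

        LazyFields(Function<String, String[]> decoder) {
            this.decoder = decoder;
        }

        /** Returns the terms of one field, decoding them at most once. */
        String[] terms(String field) {
            String[] cached = loaded.get(field);
            if (cached == null) {
                cached = decoder.apply(field);
                decodeCount++;
                loaded.put(field, cached);
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        LazyFields fields = new LazyFields(name -> new String[] {name + "-term"});
        fields.terms("title");
        fields.terms("title"); // served from the map, no second decode
        System.out.println(fields.decodeCount); // prints 1
    }
}
```

Fields that are never requested are never decoded in this sketch, which is the saving the quoted message is after.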
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.9.0-ea-b54) - Build # 12190 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/12190/ Java: 32bit/jdk1.9.0-ea-b54 -server -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.test Error Message: There were too many update fails (24 20) - we expect it can happen, but shouldn't easily Stack Trace: java.lang.AssertionError: There were too many update fails (24 20) - we expect it can happen, but shouldn't easily at __randomizedtesting.SeedInfo.seed([CA316AE78D96C162:4265553D236AAC9A]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertFalse(Assert.java:68) at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.test(ChaosMonkeyNothingIsSafeTest.java:230) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:502) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
[JENKINS] Lucene-Solr-5.1-Linux (32bit/jdk1.8.0_60-ea-b06) - Build # 189 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.1-Linux/189/ Java: 32bit/jdk1.8.0_60-ea-b06 -client -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: org.apache.lucene.search.suggest.document.SuggestFieldTest.testSuggestOnAllDeletedDocuments Error Message: MockDirectoryWrapper: cannot close: there are still open files: {_1_Lucene50_0.pay=1, _0_Lucene50_0.doc=1, _0_Lucene50_0.tim=1, _1.nvd=1, _0_Lucene50_0.pos=1, _0.nvd=1, _0_completion_0.lkp=1, _1_completion_0.pos=1, _0_completion_0.pay=1, _1_completion_0.doc=1, _1_completion_0.tim=1, _0.fdt=1, _0_completion_0.doc=1, _1_Lucene50_0.doc=1, _1_Lucene50_0.tim=1, _0_Lucene50_0.pay=1, _1_completion_0.lkp=1, _1_Lucene50_0.pos=1, _0_completion_0.pos=1, _0_completion_0.tim=1, _1_completion_0.pay=1, _1.fdt=1} Stack Trace: java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_1_Lucene50_0.pay=1, _0_Lucene50_0.doc=1, _0_Lucene50_0.tim=1, _1.nvd=1, _0_Lucene50_0.pos=1, _0.nvd=1, _0_completion_0.lkp=1, _1_completion_0.pos=1, _0_completion_0.pay=1, _1_completion_0.doc=1, _1_completion_0.tim=1, _0.fdt=1, _0_completion_0.doc=1, _1_Lucene50_0.doc=1, _1_Lucene50_0.tim=1, _0_Lucene50_0.pay=1, _1_completion_0.lkp=1, _1_Lucene50_0.pos=1, _0_completion_0.pos=1, _0_completion_0.tim=1, _1_completion_0.pay=1, _1.fdt=1} at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747) at org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393339#comment-14393339 ] ASF subversion and git services commented on LUCENE-6386: - Commit 1670963 from [~mikemccand] in branch 'dev/branches/lucene_solr_5_1' [ https://svn.apache.org/r1670963 ] LUCENE-6386: correct javadocs about temp disk space required for forceMerge(1)
[jira] [Resolved] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-6386. Resolution: Fixed TestIndexWriterForceMerge still unreliable in NIGHTLY - Key: LUCENE-6386 URL: https://issues.apache.org/jira/browse/LUCENE-6386 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: Trunk, 5.1 Attachments: LUCENE-6386.patch Discovered by ryan beasting (trunk): ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage [junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE= [junit4] _u.scf 146329 [junit4] _u.si 635 [junit4] |- (inside compound file) _u.fld 2214 [junit4] |- (inside compound file) _u.inf 392 [junit4] |- (inside compound file) _u.len 2381 [junit4] |- (inside compound file) _u.pst 36758 [junit4] |- (inside compound file) _u.vec 104144 [junit4] _s.pst 1338 [junit4] _s.inf 392 [junit4] _s.fld 94 [junit4] _s.len 221 [junit4] _s.vec 3744 [junit4] _s.si 624 [junit4] _t.fld 94 [junit4] _t.len 221 [junit4] _t.pst 1338 [junit4] _t.inf 392 [junit4] _t.vec 3744 [junit4] _t.si 624 [junit4] _v.fld 94 [junit4] _v.pst 1338 [junit4] _v.inf 
392 [junit4] _v.vec 3744 [junit4] _v.si 624 [junit4] _v.len 221 [junit4] _w.len 221 [junit4] _w.pst 1338 [junit4] _w.inf 392 [junit4] _w.fld 94 [junit4] _w.si 624 [junit4] _w.vec 3744 [junit4] _x.vec 3744 [junit4] _x.inf 392 [junit4] _x.pst 1338 [junit4] _x.fld 94 [junit4] _x.si 624 [junit4] _x.len 221 [junit4] _y.fld 94 [junit4] _y.pst 1338 [junit4] _y.inf 392 [junit4] _y.si 624 [junit4] _y.vec 3744 [junit4] _y.len 221 [junit4] _z.fld 94 [junit4] _z.pst 1338 [junit4] _z.inf 392 [junit4] _z.len 221 [junit4] _z.vec 3744 [junit4] _z.si 624 [junit4] _10.si 630 [junit4] _10.fld 94 [junit4] _10.pst 1338 [junit4] _10.inf 392 [junit4] _10.vec 3744 [junit4] _10.len 221 [junit4] _11.len 221 [junit4] _11.si 630 [junit4] _11.vec 3744 [junit4] _11.pst 1338 [junit4] _11.inf 392 [junit4] _11.fld 94 [junit4] _12.vec 3744 [junit4] _12.si 630 [junit4] _12.len 221 [junit4] _12.fld 94 [junit4] _12.pst 1338 [junit4] _12.inf 392 [junit4] _13.fld 94 [junit4] _13.vec 3744 [junit4] _13.si 630 [junit4] _13.pst 1338 [junit4] _13.inf 392 [junit4] _13.len 221 [junit4] _14.fld 94 [junit4] _14.pst 1338 [junit4] _14.inf 392 [junit4] _14.si
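The arithmetic behind the assertion message above can be checked with a short sketch. The byte counts are copied from that message; the class and method names are illustrative only and are not part of the Lucene test.

```java
import java.util.Locale;

// Illustrative only: reproduces the numbers in the assertion message,
// it is not the logic of TestIndexWriterForceMerge itself.
final class TempSpaceCheck {
    /** Upper bound the test applied: a fixed multiple of the starting index size. */
    static long allowedBytes(long startingBytes, int multiplier) {
        return startingBytes * multiplier;
    }

    public static void main(String[] args) {
        long starting = 291_570L;      // "starting usage" from the message
        long observedMax = 1_079_501L; // "max temp usage" from the message
        long allowed = allowedBytes(starting, 3);

        System.out.println("allowed (3X starting) = " + allowed);
        System.out.println("observed max          = " + observedMax);
        System.out.println("overshoot in bytes    = " + (observedMax - allowed));
        System.out.println(String.format(Locale.ROOT,
            "observed / starting   = %.2f", (double) observedMax / starting));
    }
}
```

The observed peak sits between 3X and 4X of the starting size, which is why a 3X bound fails for this seed.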
[jira] [Assigned] (LUCENE-6385) NullPointerException from Highlighter.getBestFragment()
[ https://issues.apache.org/jira/browse/LUCENE-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-6385: --- Assignee: Robert Muir NullPointerException from Highlighter.getBestFragment() --- Key: LUCENE-6385 URL: https://issues.apache.org/jira/browse/LUCENE-6385 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Affects Versions: 5.1 Reporter: Terry Smith Assignee: Robert Muir Priority: Blocker Attachments: LUCENE-6385.patch When testing against the 5.1 nightly snapshots I've come across a NullPointerException in highlighting when nothing would be highlighted. This does not happen with 5.0. {noformat} java.lang.NullPointerException at __randomizedtesting.SeedInfo.seed([3EDC6EB0FA552B34:9971866E394F5FD0]:0) at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(WeightedSpanTermExtractor.java:311) at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:151) at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:515) at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:219) at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:187) at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:196) at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:156) at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:102) at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:80) at org.apache.lucene.search.highlight.MissesTest.testPhraseQuery(MissesTest.java:50) {noformat} I've written a small unit test and used git bisect to narrow the regression to the following commit: {noformat} commit 24e4eefaefb1837d1d4fa35f7669c2b264f872ac Author: Michael McCandless mikemcc...@apache.org Date: Tue Mar 31 08:48:28 2015 + 
LUCENE-6308: cutover Spans to DISI, reuse ConjunctionDISI, use two-phased iteration git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_5x@1670273 13f79535-47bb-0310-9956-ffa450edef68 {noformat} The problem looks quite simple, WeightedSpanTermExtractor.extractWeightedSpanTerms() needs an early return if SpanQuery.getSpans() returns null. All other callers check against this. Unit test and fix (against the regressed commit) attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
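The early return described in the report can be sketched roughly as follows. This is not the attached patch: PositionSource and SpanLookup are hypothetical stand-ins for the Lucene types, used only to show the shape of the null guard when a lookup returns null because nothing matches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these are Lucene classes.
final class NullSpansGuard {
    /** Stand-in for an iterator over match positions. */
    interface PositionSource { int[] positions(); }

    /** Stand-in for a query whose lookup may return null when nothing matches. */
    interface SpanLookup { PositionSource lookup(); }

    /** Collects positions, returning an empty list instead of dereferencing null. */
    static List<Integer> collectPositions(SpanLookup query) {
        List<Integer> out = new ArrayList<>();
        PositionSource source = query.lookup();
        if (source == null) {
            return out; // early return: nothing would be highlighted
        }
        for (int p : source.positions()) {
            out.add(p);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectPositions(() -> null));
        System.out.println(collectPositions(() -> () -> new int[] {3, 7}));
    }
}
```

Without the guard, the first call in main would throw a NullPointerException, which is the kind of failure shown in the stack trace.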
[jira] [Commented] (LUCENE-6385) NullPointerException from Highlighter.getBestFragment()
[ https://issues.apache.org/jira/browse/LUCENE-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393283#comment-14393283 ] Robert Muir commented on LUCENE-6385: - The fix looks fine. I probably messed this up on LUCENE-6308.
[jira] [Commented] (LUCENE-6339) [suggest] Near real time Document Suggester
[ https://issues.apache.org/jira/browse/LUCENE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393438#comment-14393438 ] ASF subversion and git services commented on LUCENE-6339: - Commit 1670972 from [~areek] in branch 'dev/trunk' [ https://svn.apache.org/r1670972 ] LUCENE-6339: fix test bug (ensure opening nrt reader with applyAllDeletes)

[suggest] Near real time Document Suggester
Key: LUCENE-6339 URL: https://issues.apache.org/jira/browse/LUCENE-6339 Project: Lucene - Core Issue Type: New Feature Components: core/search Affects Versions: 5.0 Reporter: Areek Zillur Assignee: Areek Zillur Fix For: Trunk, 5.x Attachments: LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch, LUCENE-6339.patch

The idea is to index documents with one or more *SuggestField*(s) and be able to suggest documents with a *SuggestField* value that matches a given key. A SuggestField can be assigned a numeric weight to be used to score the suggestion at query time. Document suggestion can be done on an indexed *SuggestField*. The document suggester can filter out deleted documents in near real-time. The suggester can filter out documents based on a Filter (note: may change to a non-scoring query?) at query time. A custom postings format (CompletionPostingsFormat) is used to index SuggestField(s) and perform document suggestions.

h4. Usage
{code:java}
// hook up custom postings format
// indexAnalyzer for SuggestField
Analyzer analyzer = ...
IndexWriterConfig config = new IndexWriterConfig(analyzer);
Codec codec = new Lucene50Codec() {
  PostingsFormat completionPostingsFormat = new Completion50PostingsFormat();

  @Override
  public PostingsFormat getPostingsFormatForField(String field) {
    if (isSuggestField(field)) {
      return completionPostingsFormat;
    }
    return super.getPostingsFormatForField(field);
  }
};
config.setCodec(codec);
IndexWriter writer = new IndexWriter(dir, config);
// index some documents with suggestions
Document doc = new Document();
doc.add(new SuggestField("suggest_title", "title1", 2));
doc.add(new SuggestField("suggest_name", "name1", 3));
writer.addDocument(doc);
...
// open an nrt reader for the directory
DirectoryReader reader = DirectoryReader.open(writer, false);
// SuggestIndexSearcher is a thin wrapper over IndexSearcher
// queryAnalyzer will be used to analyze the query string
SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader, queryAnalyzer);
// suggest 10 documents for "titl" on the "suggest_title" field
TopSuggestDocs suggest = indexSearcher.suggest("suggest_title", "titl", 10);
{code}

h4. Indexing
Index analyzer set through *IndexWriterConfig*
{code:java}
SuggestField(String name, String value, long weight)
{code}

h4. Query
Query analyzer set through *SuggestIndexSearcher*. Hits are collected in descending order of the suggestion's weight.
{code:java}
// full options for TopSuggestDocs (TopDocs)
TopSuggestDocs suggest(String field, CharSequence key, int num, Filter filter)
// full options for Collector
// note: only collects, does not score
void suggest(String field, CharSequence key, int num, Filter filter, TopSuggestDocsCollector collector)
{code}

h4. Analyzer
*CompletionAnalyzer* can be used instead to wrap another analyzer and tune parameters that apply only to suggest fields.
{code:java}
CompletionAnalyzer(Analyzer analyzer, boolean preserveSep, boolean preservePositionIncrements, int maxGraphExpansions)
{code}
Re: [DISCUSS] Change Query API to make queries immutable in 6.0
On Fri, Apr 3, 2015 at 12:10 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote:
> Unfortunately, since boost is used in hashCode() and equals() calculations, changing the boost will still make the queries trappy. You will do all that work to make everything-but-boost immutable and still not fix the problem.

This is why the query cache clones queries before putting them into the cache and sets the boost to 1. The issue is that this clone method is only shallow, since its purpose is to change the boost. So it works fine for the boost parameter but not for, e.g., boolean clauses.

Again, I don't dismiss the advantages of going fully immutable; I'm just arguing that making all queries immutable up to the boost would already be a good improvement, as Robert described. We can still discuss going fully immutable in a different issue, and it would benefit from the proposed change.

-- Adrien
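To make the trap concrete, here is a minimal, self-contained sketch. ToyQuery is a hypothetical class, not a Lucene type, and the normalized() helper only mirrors the "clone and set boost to 1" idea described above; it is an assumption for illustration, not the query cache's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class MutableKeyDemo {
    /** Hypothetical query-like class whose mutable boost takes part in equals/hashCode. */
    static final class ToyQuery {
        final String term;
        float boost = 1f;

        ToyQuery(String term) { this.term = term; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ToyQuery)) return false;
            ToyQuery q = (ToyQuery) o;
            return term.equals(q.term) && Float.compare(boost, q.boost) == 0;
        }

        @Override public int hashCode() { return Objects.hash(term, boost); }
    }

    /** Copy with the boost reset to 1, so later boost changes cannot affect the cache key. */
    static ToyQuery normalized(ToyQuery q) {
        return new ToyQuery(q.term); // boost defaults to 1
    }

    /** Caches the live object, then mutates it: the lookup no longer finds the entry. */
    static boolean foundAfterBoostChange() {
        Map<ToyQuery, String> cache = new HashMap<>();
        ToyQuery q = new ToyQuery("lucene");
        cache.put(q, "cached result");
        q.boost = 2f; // changes hashCode() of a key already stored in the map
        return cache.containsKey(q);
    }

    /** Caches and looks up a boost-normalized copy: boost changes are harmless. */
    static boolean foundWithNormalizedKey() {
        Map<ToyQuery, String> cache = new HashMap<>();
        ToyQuery q = new ToyQuery("lucene");
        q.boost = 2f;
        cache.put(normalized(q), "cached result");
        q.boost = 5f;
        return cache.containsKey(normalized(q));
    }

    public static void main(String[] args) {
        System.out.println("live key found after mutation: " + foundAfterBoostChange());
        System.out.println("normalized key found:          " + foundWithNormalizedKey());
    }
}
```

The copy here is safe only because ToyQuery has no nested mutable state. A shallow clone of a query that holds a list of clauses would still share that list, which is the limitation mentioned for boolean clauses.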
[jira] [Updated] (SOLR-6865) Upgrade HttpClient to 4.4.1
[ https://issues.apache.org/jira/browse/SOLR-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-6865: --- Fix Version/s: (was: 5.1) 5.2 Upgrade HttpClient to 4.4.1 --- Key: SOLR-6865 URL: https://issues.apache.org/jira/browse/SOLR-6865 Project: Solr Issue Type: Task Affects Versions: 5.0 Reporter: Shawn Heisey Priority: Minor Fix For: Trunk, 5.2 Attachments: SOLR-6865.patch HttpClient 4.4 has been released. 5.0 seems like a good time to upgrade.
[jira] [Commented] (SOLR-6865) Upgrade HttpClient to 4.4.1
[ https://issues.apache.org/jira/browse/SOLR-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14393996#comment-14393996 ] Shawn Heisey commented on SOLR-6865: I plan on committing to 5x and trunk tonight. I will leave the 5.1 branch alone.
[jira] [Updated] (SOLR-6865) Upgrade HttpClient to 4.4.1
[ https://issues.apache.org/jira/browse/SOLR-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey updated SOLR-6865: --- Summary: Upgrade HttpClient to 4.4.1 (was: Upgrade HttpClient to 4.4)
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2879 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2879/ 4 tests failed. FAILED: org.apache.solr.handler.component.DistributedMLTComponentTest.test Error Message: Timeout occured while waiting response from server at: http://127.0.0.1:33912//collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:33912//collection1 at __randomizedtesting.SeedInfo.seed([94BF5CA3001FF936:1CEB6379AEE394CE]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:566) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:943) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:958) at org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:558) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:606) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:588) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:567) at org.apache.solr.handler.component.DistributedMLTComponentTest.test(DistributedMLTComponentTest.java:126) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_40) - Build # 12191 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/12191/ Java: 64bit/jdk1.8.0_40 -XX:+UseCompressedOops -XX:+UseParallelGC 1 tests failed. FAILED: org.apache.solr.handler.TestReqParamsAPI.test Error Message: Could not get expected value 'CY val' for path 'params/c' full output: { responseHeader:{ status:0, QTime:0}, params:{ wt:json, useParams:}, context:{ webapp:, path:/dump, httpMethod:GET}} Stack Trace: java.lang.AssertionError: Could not get expected value 'CY val' for path 'params/c' full output: { responseHeader:{ status:0, QTime:0}, params:{ wt:json, useParams:}, context:{ webapp:, path:/dump, httpMethod:GET}} at __randomizedtesting.SeedInfo.seed([FC143A11B7F9A66F:744005CB1905CB97]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.core.TestSolrConfigHandler.testForResponseElement(TestSolrConfigHandler.java:405) at org.apache.solr.handler.TestReqParamsAPI.testReqParams(TestReqParamsAPI.java:180) at org.apache.solr.handler.TestReqParamsAPI.test(TestReqParamsAPI.java:71) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at
[JENKINS] Lucene-Solr-5.1-Linux (32bit/jdk1.8.0_40) - Build # 191 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.1-Linux/191/ Java: 32bit/jdk1.8.0_40 -client -XX:+UseParallelGC 1 tests failed. FAILED: org.apache.lucene.search.suggest.document.SuggestFieldTest.testSuggestOnMostlyFilteredOutDocuments Error Message: MockDirectoryWrapper: cannot close: there are still open files: {_z.cfs=1, _11.cfs=1, _13.cfs=1, _10.cfs=1, _12.cfs=1, _y.cfs=1} Stack Trace: java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_z.cfs=1, _11.cfs=1, _13.cfs=1, _10.cfs=1, _12.cfs=1, _y.cfs=1} at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747) at org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: unclosed IndexInput: 
_y.cfs at org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:622) at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:666) at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.init(Lucene50CompoundReader.java:71) at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:71) at org.apache.lucene.index.SegmentCoreReaders.init(SegmentCoreReaders.java:93) at org.apache.lucene.index.SegmentReader.init(SegmentReader.java:65) at
[JENKINS] Lucene-Solr-5.x-Windows (32bit/jdk1.7.0_76) - Build # 4513 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4513/ Java: 32bit/jdk1.7.0_76 -client -XX:+UseSerialGC 1 tests failed. FAILED: org.apache.solr.search.facet.TestJsonFacets.testComplex Error Message: mismatch: 'accord'!='a' @ facets/makes/buckets/[0]/models/buckets/[1]/val Stack Trace: java.lang.RuntimeException: mismatch: 'accord'!='a' @ facets/makes/buckets/[0]/models/buckets/[1]/val at __randomizedtesting.SeedInfo.seed([AAD23C52D0F93063:4B0D39CEFCB77C00]:0) at org.apache.solr.SolrTestCaseHS.matchJSON(SolrTestCaseHS.java:160) at org.apache.solr.SolrTestCaseHS.assertJQ(SolrTestCaseHS.java:142) at org.apache.solr.SolrTestCaseHS$Client.testJQ(SolrTestCaseHS.java:288) at org.apache.solr.search.facet.TestJsonFacets.testComplex(TestJsonFacets.java:155) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 10834 lines...] [junit4] Suite:
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2880 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2880/ 4 tests failed. REGRESSION: org.apache.solr.cloud.RecoveryZkTest.test Error Message: shard1 is not consistent. Got 501 from http://127.0.0.1:39753/mkrd/collection1lastClient and got 264 from http://127.0.0.1:39784/mkrd/collection1 Stack Trace: java.lang.AssertionError: shard1 is not consistent. Got 501 from http://127.0.0.1:39753/mkrd/collection1lastClient and got 264 from http://127.0.0.1:39784/mkrd/collection1 at __randomizedtesting.SeedInfo.seed([5E8BB946415F99B4:D6DF869CEFA3F44C]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.solr.cloud.RecoveryZkTest.test(RecoveryZkTest.java:123) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Created] (SOLR-7344) Use two thread pools, one for internal requests and one for external, to avoid distributed deadlock and decrease the number of threads that need to be created.
Mark Miller created SOLR-7344:
Summary: Use two thread pools, one for internal requests and one for external, to avoid distributed deadlock and decrease the number of threads that need to be created.
Key: SOLR-7344
URL: https://issues.apache.org/jira/browse/SOLR-7344
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Mark Miller
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
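The idea in the SOLR-7344 summary can be illustrated with a minimal sketch. This is not Solr's implementation: the class `TwoPoolSketch`, the method `handleAll`, and the pool sizes are hypothetical, and the whole thing runs in a single JVM rather than across nodes. Each "external" request fans out one "internal" sub-request and blocks on it. If both kinds of work shared one bounded pool, every thread could end up held by an external request waiting for a sub-request that can never be scheduled. Giving internal work its own pool avoids that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names and sizes are assumptions.
class TwoPoolSketch {

    // Runs `requests` external requests, each of which submits one internal
    // sub-request and waits for it. Returns how many requests completed.
    static int handleAll(int requests, int threadsPerPool) throws Exception {
        ExecutorService external = Executors.newFixedThreadPool(threadsPerPool);
        ExecutorService internal = Executors.newFixedThreadPool(threadsPerPool);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                results.add(external.submit(() -> {
                    // The sub-request goes to the separate internal pool, so it
                    // never competes with blocked external threads.
                    Future<Integer> sub = internal.submit(() -> id * 2);
                    return sub.get(5, TimeUnit.SECONDS);
                }));
            }
            int completed = 0;
            for (Future<Integer> f : results) {
                f.get(10, TimeUnit.SECONDS); // throws if a request timed out
                completed++;
            }
            return completed;
        } finally {
            external.shutdownNow();
            internal.shutdownNow();
        }
    }
}
```

Calling `TwoPoolSketch.handleAll(8, 2)` completes all eight requests even though each pool has only two threads, because external threads waiting on a sub-request never occupy the threads that sub-request needs.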
[JENKINS-MAVEN] Lucene-Solr-Maven-5.x #891: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-5.x/891/ No tests ran. Build Log: [...truncated 25354 lines...] BUILD FAILED /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/build.xml:545: The following error occurred while executing this line: /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/build.xml:186: The following error occurred while executing this line: /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/lucene/build.xml:406: The following error occurred while executing this line: /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/lucene/common-build.xml:1594: The following error occurred while executing this line: /usr/home/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-5.x/lucene/common-build.xml:583: Error deploying artifact 'org.apache.lucene:lucene-test-framework:jar': Error deploying artifact: Failed to transfer file: https://repository.apache.org/content/repositories/snapshots/org/apache/lucene/lucene-test-framework/5.2.0-SNAPSHOT/lucene-test-framework-5.2.0-20150402.101254-2-javadoc.jar.md5. Return code is: 502 Total time: 40 minutes 54 seconds Build step 'Invoke Ant' marked build as failure Email was triggered for: Failure Sending email for trigger: Failure
[JENKINS] Lucene-Solr-5.x-Linux (64bit/ibm-j9-jdk7) - Build # 12020 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/12020/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} No tests ran. Build Log: [...truncated 304 lines...] ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to exception hudson.AbortException: No test report files were found. Configuration error? at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:116) at hudson.tasks.junit.JUnitParser$ParseResultCallable.invoke(JUnitParser.java:93) at hudson.FilePath.act(FilePath.java:989) at hudson.FilePath.act(FilePath.java:967) at hudson.tasks.junit.JUnitParser.parseResult(JUnitParser.java:90) at hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:120) at hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:137) at hudson.tasks.BuildStepCompatibilityLayer.perform(BuildStepCompatibilityLayer.java:74) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:761) at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:721) at hudson.model.Build$BuildExecution.post2(Build.java:183) at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:670) at hudson.model.Run.execute(Run.java:1766) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:98) at hudson.model.Executor.run(Executor.java:374) Email was triggered for: Failure - Any Sending email for trigger: Failure - Any
[JENKINS] Lucene-Solr-5.x-MacOSX (64bit/jdk1.7.0) - Build # 2084 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/2084/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseG1GC 1 tests failed. FAILED: org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication Error Message: [/Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.handler.TestReplicationHandler 8BAAF8D1E50DC84E-001/solr-instance-017/./collection1/data/index.20150402093714022, /Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.handler.TestReplicationHandler 8BAAF8D1E50DC84E-001/solr-instance-017/./collection1/data, /Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.handler.TestReplicationHandler 8BAAF8D1E50DC84E-001/solr-instance-017/./collection1/data/index.20150402093715023] expected:2 but was:3 Stack Trace: java.lang.AssertionError: [/Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.handler.TestReplicationHandler 8BAAF8D1E50DC84E-001/solr-instance-017/./collection1/data/index.20150402093714022, /Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.handler.TestReplicationHandler 8BAAF8D1E50DC84E-001/solr-instance-017/./collection1/data, /Users/jenkins/workspace/Lucene-Solr-5.x-MacOSX/solr/build/solr-core/test/J1/temp/solr.handler.TestReplicationHandler 8BAAF8D1E50DC84E-001/solr-instance-017/./collection1/data/index.20150402093715023] expected:2 but was:3 at __randomizedtesting.SeedInfo.seed([8BAAF8D1E50DC84E:7CD9168923E567A8]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.solr.handler.TestReplicationHandler.checkForSingleIndex(TestReplicationHandler.java:813) at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigAliasReplication(TestReplicationHandler.java:1243) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0_40) - Build # 12185 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/12185/ Java: 64bit/jdk1.8.0_40 -XX:-UseCompressedOops -XX:+UseSerialGC 2 tests failed. FAILED: org.apache.solr.cloud.ShardSplitTest.test Error Message: Timeout occured while waiting response from server at: http://127.0.0.1:45162/fk Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:45162/fk at __randomizedtesting.SeedInfo.seed([2412D66B3867317A:AC46E9B1969B5C82]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:568) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225) at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220) at org.apache.solr.cloud.ShardSplitTest.splitShard(ShardSplitTest.java:497) at org.apache.solr.cloud.ShardSplitTest.incompleteOrOverlappingCustomRangeTest(ShardSplitTest.java:118) at org.apache.solr.cloud.ShardSplitTest.test(ShardSplitTest.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at
[jira] [Commented] (SOLR-7336) Add State enum to Replica
[ https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394044#comment-14394044 ]
Shai Erera commented on SOLR-7336:
[~markrmil...@gmail.com] if you have no objections, I will commit these changes.
Add State enum to Replica
Key: SOLR-7336
URL: https://issues.apache.org/jira/browse/SOLR-7336
Project: Solr
Issue Type: Improvement
Components: SolrJ
Reporter: Shai Erera
Assignee: Shai Erera
Attachments: SOLR-7336.patch, SOLR-7336.patch, SOLR-7336.patch
Following SOLR-7325, this issue adds a State enum to Replica.
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_40) - Build # 12193 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/12193/ Java: 32bit/jdk1.8.0_40 -client -XX:+UseParallelGC 1 tests failed. FAILED: org.apache.lucene.search.suggest.document.SuggestFieldTest.testDupSuggestFieldValues Error Message: MockDirectoryWrapper: cannot close: there are still open files: {_yp.cfs=1, _yo_completion_0.pos=1, _yo_completion_0.pay=1, _yo_completion_0.tim=1, _yo_completion_0.lkp=1, _yq.cfs=1, _yo.fdt=1, _yo_completion_0.doc=1, _yo.nvd=1} Stack Trace: java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still open files: {_yp.cfs=1, _yo_completion_0.pos=1, _yo_completion_0.pay=1, _yo_completion_0.tim=1, _yo_completion_0.lkp=1, _yq.cfs=1, _yo.fdt=1, _yo_completion_0.doc=1, _yo.nvd=1} at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:747) at org.apache.lucene.search.suggest.document.SuggestFieldTest.after(SuggestFieldTest.java:81) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:894) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: unclosed IndexInput: _yo_completion_0.pos at org.apache.lucene.store.MockDirectoryWrapper.addFileHandle(MockDirectoryWrapper.java:622) at org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:666) at org.apache.lucene.codecs.lucene50.Lucene50PostingsReader.init(Lucene50PostingsReader.java:89) at org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat.fieldsProducer(Lucene50PostingsFormat.java:443) at
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2881 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2881/ 3 tests failed. FAILED: org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test Error Message: IOException occured when talking to server at: http://127.0.0.1:59379/e_/tc/c8n_1x3_commits_shard1_replica1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:59379/e_/tc/c8n_1x3_commits_shard1_replica1 at __randomizedtesting.SeedInfo.seed([9B66993D745958DD:1332A6E7DAA53525]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135) at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:483) at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:464) at org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.oneShardTest(LeaderInitiatedRecoveryOnCommitTest.java:132) at org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test(LeaderInitiatedRecoveryOnCommitTest.java:64) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
CompressingTermVectors; per-field decompress?
I was looking at a JIRA issue someone posted pertaining to optimizing highlighting for when there are term vectors ( SOLR-5855 ). I dug into the details a bit and learned something unexpected: CompressingTermVectorsReader.get(docId) fully loads all term vectors for the document. The client/user consuming code in question might just want the term vectors for a subset of all fields that have term vectors. Was this overlooked or are there benefits to the current approach? I can’t think of any except that perhaps there’s better compression over all the data versus in smaller per-field chunks; although I’d trade that any day over being able to just get a subset of fields. I could imagine it being useful to ask for some fields or all — in much the same way we handle stored field data. ~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley
[jira] [Updated] (SOLR-7336) Add State enum to Replica
[ https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated SOLR-7336: - Attachment: SOLR-7336.patch Thanks Mark for the clarifications. I improved the javadocs of all the states and Replica.State based on your feedback and my understanding. Please confirm I got it all right. Add State enum to Replica - Key: SOLR-7336 URL: https://issues.apache.org/jira/browse/SOLR-7336 Project: Solr Issue Type: Improvement Components: SolrJ Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-7336.patch, SOLR-7336.patch, SOLR-7336.patch Following SOLR-7325, this issue adds a State enum to Replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2878 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2878/ 4 tests failed. REGRESSION: org.apache.solr.handler.component.DistributedMLTComponentTest.test Error Message: Timeout occured while waiting response from server at: http://127.0.0.1:50578/r/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:50578/r/collection1 at __randomizedtesting.SeedInfo.seed([7FD3AC6109AC3620:F78793BBA7505BD8]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:566) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:943) at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:958) at org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:558) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:606) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:588) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:567) at org.apache.solr.handler.component.DistributedMLTComponentTest.test(DistributedMLTComponentTest.java:126) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at
[jira] [Updated] (SOLR-6692) hl.maxAnalyzedChars should apply cumulatively on a multi-valued field
[ https://issues.apache.org/jira/browse/SOLR-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-6692: ---

Description: In DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to constrain how much text is analyzed before the highlighter stops, in the interests of performance. For a multi-valued field, it effectively treats each value anew, no matter how much text was previously analyzed for other values for the same field for the current document. The PostingsHighlighter doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget for a field for a document, no matter how many values there might be. It's not reset for each value. I think this makes more sense. When we loop over the values, we should subtract from hl.maxAnalyzedChars the length of the value just checked. The motivation here is consistency with PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down to term vector uninversion, which wouldn't be possible for multi-valued fields based on the current way this parameter is used. Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor hl.maxAnalyzedChars as the FVH doesn't have a knob for that. It does have hl.phraseLimit which is a limit that could be used for a similar purpose, albeit applied differently. Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit early from its field value loop if it reaches hl.snippets, and if hl.preserveMulti=true

was: I think hl.maxAnalyzedChars should apply cumulatively across the values of a multi-valued field. DefaultSolrHighlighter doesn't; I'm not sure yet about the other two. Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit early from its field value loop if it reaches hl.snippets.
hl.maxAnalyzedChars should apply cumulatively on a multi-valued field - Key: SOLR-6692 URL: https://issues.apache.org/jira/browse/SOLR-6692 Project: Solr Issue Type: Improvement Components: highlighter Reporter: David Smiley Fix For: 5.0

In DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to constrain how much text is analyzed before the highlighter stops, in the interests of performance. For a multi-valued field, it effectively treats each value anew, no matter how much text was previously analyzed for other values for the same field for the current document. The PostingsHighlighter doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget for a field for a document, no matter how many values there might be. It's not reset for each value. I think this makes more sense. When we loop over the values, we should subtract from hl.maxAnalyzedChars the length of the value just checked. The motivation here is consistency with PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down to term vector uninversion, which wouldn't be possible for multi-valued fields based on the current way this parameter is used. Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor hl.maxAnalyzedChars as the FVH doesn't have a knob for that. It does have hl.phraseLimit which is a limit that could be used for a similar purpose, albeit applied differently. Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit early from its field value loop if it reaches hl.snippets, and if hl.preserveMulti=true
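The cumulative budget proposed in the description can be sketched in a few lines of plain Java. This is only an illustration of "subtract the length of the value just checked": the class name CumulativeBudgetSketch and the method analyzedLengths are made up for this note and are not part of DefaultSolrHighlighter or any Solr API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Solr code.
final class CumulativeBudgetSketch {

    /**
     * Returns, for each value that gets looked at, how many of its characters
     * fall inside the budget when maxAnalyzedChars is shared by all values
     * of one field instead of being reset per value.
     */
    static List<Integer> analyzedLengths(List<String> values, int maxAnalyzedChars) {
        List<Integer> analyzed = new ArrayList<>();
        int remaining = maxAnalyzedChars;
        for (String value : values) {
            if (remaining <= 0) {
                break; // budget used up: later values are not analyzed at all
            }
            analyzed.add(Math.min(value.length(), remaining));
            remaining -= value.length(); // subtract the length of the value just checked
        }
        return analyzed;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("abcde", "fghij", "klmno");
        System.out.println(analyzedLengths(values, 8));
    }
}
```

With a budget of 8 and three five-character values, the sketch analyzes 5 characters of the first value, 3 of the second, and never reaches the third. A per-value budget would instead analyze all 15 characters.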
[jira] [Closed] (SOLR-7326) Reduce hl.maxAnalyzedChars budget for multi-valued fields in the default highlighter
[ https://issues.apache.org/jira/browse/SOLR-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley closed SOLR-7326. -- Resolution: Duplicate

Reduce hl.maxAnalyzedChars budget for multi-valued fields in the default highlighter Key: SOLR-7326 URL: https://issues.apache.org/jira/browse/SOLR-7326 Project: Solr Issue Type: Improvement Components: highlighter Reporter: David Smiley Assignee: David Smiley

In DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to constrain how much text is analyzed before the highlighter stops, in the interests of performance. For a multi-valued field, it effectively treats each value anew, no matter how much text was previously analyzed for other values for the same field for the current document. The PostingsHighlighter doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget for a field for a document, no matter how many values there might be. It's not reset for each value. I think this makes more sense. When we loop over the values, we should subtract from hl.maxAnalyzedChars the length of the value just checked. The motivation here is consistency with PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down to term vector uninversion, which wouldn't be possible for multi-valued fields based on the current way this parameter is used. Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor hl.maxAnalyzedChars as the FVH doesn't have a knob for that. It does have hl.phraseLimit which is a limit that could be used for a similar purpose, albeit applied differently.
[jira] [Commented] (LUCENE-5879) Add auto-prefix terms to block tree terms dict
[ https://issues.apache.org/jira/browse/LUCENE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392802#comment-14392802 ] ASF subversion and git services commented on LUCENE-5879: - Commit 1670918 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1670918 ] LUCENE-5879: add auto-prefix terms to block tree, and experimental AutoPrefixTermsPostingsFormat Add auto-prefix terms to block tree terms dict -- Key: LUCENE-5879 URL: https://issues.apache.org/jira/browse/LUCENE-5879 Project: Lucene - Core Issue Type: New Feature Components: core/codecs Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk Attachments: LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch This cool idea to generalize numeric/trie fields came from Adrien: Today, when we index a numeric field (LongField, etc.) we pre-compute (via NumericTokenStream) outside of indexer/codec which prefix terms should be indexed. But this can be inefficient: you set a static precisionStep, and always add those prefix terms regardless of how the terms in the field are actually distributed. Yet typically in real world applications the terms have a non-random distribution. So, it should be better if instead the terms dict decides where it makes sense to insert prefix terms, based on how dense the terms are in each region of term space. This way we can speed up query time for both term (e.g. infix suggester) and numeric ranges, and it should let us use less index space and get faster range queries. This would also mean that min/maxTerm for a numeric field would now be correct, vs today where the externally computed prefix terms are placed after the full precision terms, causing hairy code like NumericUtils.getMaxInt/Long. 
So optos like LUCENE-5860 become feasible. The terms dict can also do tricks not possible if you must live on top of its APIs, e.g. to handle the adversary/over-constrained case when a given prefix has too many terms following it but finer prefixes have too few (what block tree calls floor term blocks).
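As a rough illustration of "the terms dict decides where it makes sense to insert prefix terms, based on how dense the terms are", the toy Java below counts how many terms share each fixed-length prefix and keeps only the prefixes that are dense enough. This is not the block tree algorithm from the patch: the class AutoPrefixSketch, the method densePrefixes, the single fixed prefix length, and the minTerms threshold are all assumptions made for the sake of a small example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy; the real auto-prefix logic lives in the block tree terms writer.
final class AutoPrefixSketch {

    /**
     * Returns, in sorted order, every prefix of length prefixLen that is
     * shared by at least minTerms terms longer than the prefix itself.
     */
    static List<String> densePrefixes(List<String> terms, int prefixLen, int minTerms) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String term : terms) {
            if (term.length() > prefixLen) {
                counts.merge(term.substring(0, prefixLen), 1, Integer::sum);
            }
        }
        List<String> dense = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minTerms) {
                dense.add(e.getKey()); // enough terms here to justify a prefix term
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
            "apple", "apply", "apricot", "banana", "band", "bandit", "cat");
        System.out.println(densePrefixes(terms, 2, 3));
    }
}
```

In the sample term list, "ap" and "ba" each cover three terms and would get a prefix term, while "ca" covers only one and would not. A static precisionStep, by contrast, would add prefix terms in all three regions regardless of density.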
[jira] [Commented] (LUCENE-5879) Add auto-prefix terms to block tree terms dict
[ https://issues.apache.org/jira/browse/LUCENE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392803#comment-14392803 ] Michael McCandless commented on LUCENE-5879: I committed to trunk ... I'll let it bake a bit before backporting to 5.x (5.2). Add auto-prefix terms to block tree terms dict -- Key: LUCENE-5879 URL: https://issues.apache.org/jira/browse/LUCENE-5879 Project: Lucene - Core Issue Type: New Feature Components: core/codecs Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk Attachments: LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch This cool idea to generalize numeric/trie fields came from Adrien: Today, when we index a numeric field (LongField, etc.) we pre-compute (via NumericTokenStream) outside of indexer/codec which prefix terms should be indexed. But this can be inefficient: you set a static precisionStep, and always add those prefix terms regardless of how the terms in the field are actually distributed. Yet typically in real world applications the terms have a non-random distribution. So, it should be better if instead the terms dict decides where it makes sense to insert prefix terms, based on how dense the terms are in each region of term space. This way we can speed up query time for both term (e.g. infix suggester) and numeric ranges, and it should let us use less index space and get faster range queries. This would also mean that min/maxTerm for a numeric field would now be correct, vs today where the externally computed prefix terms are placed after the full precision terms, causing hairy code like NumericUtils.getMaxInt/Long. So optos like LUCENE-5860 become feasible. 
The terms dict can also do tricks not possible if you must live on top of its APIs, e.g. to handle the adversary/over-constrained case when a given prefix has too many terms following it but finer prefixes have too few (what block tree calls floor term blocks).
[jira] [Commented] (SOLR-7337) ZkController.registerConfListenerForCore should add the dir name it can't find to the message.
[ https://issues.apache.org/jira/browse/SOLR-7337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392807#comment-14392807 ] Erick Erickson commented on SOLR-7337: -- Noble: Thanks, I'm slammed and wasn't going to be able to get here until the weekend at best. ZkController.registerConfListenerForCore should add the dir name it can't find to the message. -- Key: SOLR-7337 URL: https://issues.apache.org/jira/browse/SOLR-7337 Project: Solr Issue Type: Improvement Reporter: Erick Erickson Assignee: Noble Paul Priority: Minor Fix For: Trunk, 5.2 At least adding the name of
[jira] [Resolved] (LUCENE-6387) Storing lucene indexes in HDFS
[ https://issues.apache.org/jira/browse/LUCENE-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved LUCENE-6387. Resolution: Not a Problem

Please raise issues like this on the user's list before raising a JIRA; there's lots more help to be had there. We try to keep the JIRAs for code and documentation changes _after_ it's determined that there's a code problem rather than a usage issue. And when you do ping the user's list, please include a _lot_ more information. Perhaps you should review: http://wiki.apache.org/solr/UsingMailingLists

Storing lucene indexes in HDFS -- Key: LUCENE-6387 URL: https://issues.apache.org/jira/browse/LUCENE-6387 Project: Lucene - Core Issue Type: Test Components: core/search, core/store Affects Versions: 4.10.2 Environment: Lucene 4.10.2, Accumulo 1.6.1, Hadoop 2.5.0 Reporter: madhvi gupta Labels: hadoop

I have created lucene indexes of data stored in accumulo in HDFS, but while querying over those indexes I am getting CorruptIndexException. Can anyone help me out with this or tell me why the accumulo data is not getting indexed? Is there anything I might be missing? When I indexed the data from local file system, it was working fine.
Re: svn commit: r1670918 [1/5] - in /lucene/dev/trunk/lucene: ./ codecs/src/java/org/apache/lucene/codecs/autoprefix/ codecs/src/resources/META-INF/services/ codecs/src/test/org/apache/lucene/codecs/a
NOTE: I hit a compile failure like this (TermRangeTermsEnum got removed). I am going to remove these asserts: to me they don't look very useful, and fix the build for now.

compile-test:
[mkdir] Created dir: /home/rmuir/workspace/trunk/lucene/build/core/classes/test
[javac] Compiling 431 source files to /home/rmuir/workspace/trunk/lucene/build/core/classes/test
[javac] /home/rmuir/workspace/trunk/lucene/core/src/test/org/apache/lucene/search/TestTermRangeQuery.java:123: error: cannot find symbol
[javac] assertFalse(query.getTermsEnum(terms) instanceof TermRangeTermsEnum);
[javac] ^
[javac] symbol: class TermRangeTermsEnum
[javac] location: class TestTermRangeQuery
[javac] /home/rmuir/workspace/trunk/lucene/core/src/test/org/apache/lucene/search/TestTermRangeQuery.java:128: error: cannot find symbol
[javac] assertFalse(query.getTermsEnum(terms) instanceof TermRangeTermsEnum);
[javac] ^
[javac] symbol: class TermRangeTermsEnum
[javac] location: class TestTermRangeQuery
[javac] /home/rmuir/workspace/trunk/lucene/core/src/test/org/apache/lucene/search/TestTermRangeQuery.java:132: error: cannot find symbol
[javac] assertFalse(query.getTermsEnum(terms) instanceof TermRangeTermsEnum);
[javac] ^
[javac] symbol: class TermRangeTermsEnum
[javac] location: class TestTermRangeQuery
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] 3 errors On Thu, Apr 2, 2015 at 11:05 AM, mikemcc...@apache.org wrote: Author: mikemccand Date: Thu Apr 2 15:05:48 2015 New Revision: 1670918 URL: http://svn.apache.org/r1670918 Log: LUCENE-5879: add auto-prefix terms to block tree, and experimental AutoPrefixTermsPostingsFormat Added: lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/autoprefix/ lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/autoprefix/AutoPrefixPostingsFormat.java (with props) lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/autoprefix/package-info.java (with props) lucene/dev/trunk/lucene/codecs/src/test/org/apache/lucene/codecs/autoprefix/ lucene/dev/trunk/lucene/codecs/src/test/org/apache/lucene/codecs/autoprefix/TestAutoPrefixPostingsFormat.java (with props) lucene/dev/trunk/lucene/codecs/src/test/org/apache/lucene/codecs/autoprefix/TestAutoPrefixTerms.java (with props) lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/AutoPrefixTermsWriter.java (with props) lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BitSetPostingsEnum.java (with props) lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BitSetTermsEnum.java (with props) lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene/index/RandomPostingsTester.java (with props) Removed: lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/TermRangeTermsEnum.java Modified: lucene/dev/trunk/lucene/CHANGES.txt lucene/dev/trunk/lucene/codecs/src/resources/META-INF/services/org.apache.lucene.codecs.PostingsFormat lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/BlockTermState.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/PostingsFormat.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsReader.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.java 
lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/FieldReader.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/IntersectTermsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/IntersectTermsEnumFrame.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/SegmentTermsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/SegmentTermsEnumFrame.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/Stats.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/AutomatonTermsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/FreqProxFields.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/MappingMultiPostingsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/TermContext.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/Terms.java
[jira] [Commented] (SOLR-7336) Add State enum to Replica
[ https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392817#comment-14392817 ] Mark Miller commented on SOLR-7336: ---

Notes on states:

bq. ACTIVE
The replica is ready to receive updates and queries.

bq. DOWN
Some of these names came before things were fully fleshed out :) DOWN is actually the first state before RECOVERING. I think tlog replay happens in DOWN, though that is a bit of a bug IMO. We should probably have a new state for it or something. A node in DOWN should be actively trying to move to RECOVERING.

bq. RECOVERING
The node is recovering from the leader. This might involve peersync or full replication or finding out things are already in sync.

bq. RECOVERY_FAILED
RECOVERY attempts have not worked, something is not right.

NOTE: This state doesn't matter if the node is not part of /live_nodes in zk - in that case the node is not part of the cluster and its state should be discarded.

Add State enum to Replica - Key: SOLR-7336 URL: https://issues.apache.org/jira/browse/SOLR-7336 Project: Solr Issue Type: Improvement Components: SolrJ Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-7336.patch Following SOLR-7325, this issue adds a State enum to Replica.
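The four states described in the notes above could be modelled roughly as follows. This is a standalone sketch, not the attached SOLR-7336.patch: the class ReplicaStateSketch, the lower-case string form, and the helper isServing are assumptions for illustration. Only the state names and the "ignore the state unless the node is under /live_nodes" rule come from the comment.

```java
import java.util.Locale;

// Hypothetical stand-in for a Replica.State enum; not the actual Solr class.
final class ReplicaStateSketch {

    enum State {
        /** Ready to receive updates and queries. */
        ACTIVE,
        /** First state before RECOVERING; should be trying to move on to it. */
        DOWN,
        /** Recovering from the leader (peersync, full replication, or already in sync). */
        RECOVERING,
        /** Recovery attempts have not worked; something is not right. */
        RECOVERY_FAILED;

        /** Lower-case form; assumed here to be how the state would be stored as a string. */
        @Override
        public String toString() {
            return name().toLowerCase(Locale.ROOT);
        }

        /** Parses the string form back into a State, ignoring case. */
        static State fromString(String s) {
            return valueOf(s.toUpperCase(Locale.ROOT));
        }
    }

    /** A recorded state only counts if the node is live; otherwise it is discarded. */
    static boolean isServing(State state, boolean nodeIsLive) {
        return nodeIsLive && state == State.ACTIVE;
    }

    public static void main(String[] args) {
        System.out.println(State.fromString("recovering"));
        System.out.println(isServing(State.ACTIVE, false));
    }
}
```

An enum makes a misspelling such as a hand-typed state string a compile-time error rather than a silent mismatch, which is the main attraction over plain string constants.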
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392855#comment-14392855 ] Shai Erera commented on LUCENE-6386: I thought we only write the final segments as .cfs if needed? Does what you say mean the following happens: {noformat} source: [s1,s2] [s3,s4] [s5,s6] [s7,s8] level1: [s9,s10], [s11,s12] level1: [s9.cfs,s10.cfs], [s11.cfs,s12.cfs] level2: [s13,s14] level2: [s13.cfs,s14.cfs] level3: [s15] final: [s15.cfs] {noformat} So when it gets to create level3, with the {{level2.cfs}} written, the index contains {{source}} (1X), {{level2.cfs}} (1X), {{level3}} (1X) and {{level3.cfs}} (1X) -- that's 4X indeed. Is there a reason why we don't delete {{level2.cfs}} after creating {{level3}}? If not, can we fix that and reduce 1X here? TestIndexWriterForceMerge still unreliable in NIGHTLY - Key: LUCENE-6386 URL: https://issues.apache.org/jira/browse/LUCENE-6386 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: Trunk, 5.1 Attachments: LUCENE-6386.patch Discovered by ryan beasting (trunk): ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage [junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp 
usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE= [junit4] _u.scf 146329 [junit4] _u.si 635 [junit4] |- (inside compound file) _u.fld 2214 [junit4] |- (inside compound file) _u.inf 392 [junit4] |- (inside compound file) _u.len 2381 [junit4] |- (inside compound file) _u.pst 36758 [junit4] |- (inside compound file) _u.vec 104144 [junit4] _s.pst 1338 [junit4] _s.inf 392 [junit4] _s.fld 94 [junit4] _s.len 221 [junit4] _s.vec 3744 [junit4] _s.si 624 [junit4] _t.fld 94 [junit4] _t.len 221 [junit4] _t.pst 1338 [junit4] _t.inf 392 [junit4] _t.vec 3744 [junit4] _t.si 624 [junit4] _v.fld 94 [junit4] _v.pst 1338 [junit4] _v.inf 392 [junit4] _v.vec 3744 [junit4] _v.si 624 [junit4] _v.len 221 [junit4] _w.len 221 [junit4] _w.pst 1338 [junit4] _w.inf 392 [junit4] _w.fld 94 [junit4] _w.si 624 [junit4] _w.vec 3744 [junit4] _x.vec 3744 [junit4] _x.inf 392 [junit4] _x.pst 1338 [junit4] _x.fld 94 [junit4] _x.si 624 [junit4] _x.len 221 [junit4] _y.fld 94 [junit4] _y.pst 1338 [junit4] _y.inf 392 [junit4] _y.si 624 [junit4] _y.vec 3744 [junit4] _y.len 221 [junit4] _z.fld 94 [junit4] _z.pst 1338 [junit4] _z.inf 392 [junit4] _z.len 221 [junit4] _z.vec 3744 [junit4] _z.si 624 [junit4] _10.si 630 [junit4] _10.fld 94 [junit4] _10.pst 1338 [junit4] _10.inf 392 [junit4] _10.vec 3744 [junit4] _10.len 221 [junit4] _11.len 221 [junit4] _11.si 630 [junit4] _11.vec 3744 [junit4] _11.pst 1338 [junit4] _11.inf 392 [junit4] _11.fld 94
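The 3X versus 4X question in this thread can be checked against the byte counts in the assertion message above. The following is only a small arithmetic sketch: the class and method names are made up for illustration and are not part of Lucene or of TestIndexWriterForceMerge. It uses the "starting usage" and "max temp usage" values from the failure and compares the observed peak with a 3X and a 4X budget.

```java
public class TempSpaceBudget {
    /** Allowed peak usage for a given multiple of the starting index size. */
    public static long limit(long startBytes, int multiple) {
        return startBytes * multiple;
    }

    /** Peak usage expressed as a multiple of the starting index size. */
    public static double ratio(long startBytes, long peakBytes) {
        return (double) peakBytes / startBytes;
    }

    public static void main(String[] args) {
        long start = 291_570L;   // "starting usage" from the assertion message
        long peak = 1_079_501L;  // "max temp usage" from the assertion message

        System.out.println("3X limit: " + limit(start, 3));
        System.out.println("4X limit: " + limit(start, 4));
        System.out.println("observed multiple: " + ratio(start, peak));
        System.out.println("within 3X? " + (peak <= limit(start, 3)));
        System.out.println("within 4X? " + (peak <= limit(start, 4)));
    }
}
```

With these inputs the observed peak falls between the two budgets, which is consistent with the accounting of source, level2.cfs, level3 and level3.cfs each costing up to 1X.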
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392859#comment-14392859 ] Robert Muir commented on LUCENE-6386: - Looking at the code, I think that is too aggressive/complicated to do right now. The CFS interaction with merging is already absolutely horrible. We should fix the documentation bug at the moment. But IMO we should not do this optimization until this stuff is cleaned up (LUCENE-5988) TestIndexWriterForceMerge still unreliable in NIGHTLY - Key: LUCENE-6386 URL: https://issues.apache.org/jira/browse/LUCENE-6386 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: Trunk, 5.1 Attachments: LUCENE-6386.patch Discovered by ryan beasting (trunk): ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage [junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE= [junit4] _u.scf 146329 [junit4] _u.si 635 [junit4] |- (inside compound file) _u.fld 2214 [junit4] |- (inside compound file) _u.inf 392 [junit4] |- (inside compound file) _u.len 2381 [junit4] |- (inside compound file) _u.pst 36758 [junit4] |- (inside compound 
file) _u.vec 104144 [junit4] _s.pst 1338 [junit4] _s.inf 392 [junit4] _s.fld 94 [junit4] _s.len 221 [junit4] _s.vec 3744 [junit4] _s.si 624 [junit4] _t.fld 94 [junit4] _t.len 221 [junit4] _t.pst 1338 [junit4] _t.inf 392 [junit4] _t.vec 3744 [junit4] _t.si 624 [junit4] _v.fld 94 [junit4] _v.pst 1338 [junit4] _v.inf 392 [junit4] _v.vec 3744 [junit4] _v.si 624 [junit4] _v.len 221 [junit4] _w.len 221 [junit4] _w.pst 1338 [junit4] _w.inf 392 [junit4] _w.fld 94 [junit4] _w.si 624 [junit4] _w.vec 3744 [junit4] _x.vec 3744 [junit4] _x.inf 392 [junit4] _x.pst 1338 [junit4] _x.fld 94 [junit4] _x.si 624 [junit4] _x.len 221 [junit4] _y.fld 94 [junit4] _y.pst 1338 [junit4] _y.inf 392 [junit4] _y.si 624 [junit4] _y.vec 3744 [junit4] _y.len 221 [junit4] _z.fld 94 [junit4] _z.pst 1338 [junit4] _z.inf 392 [junit4] _z.len 221 [junit4] _z.vec 3744 [junit4] _z.si 624 [junit4] _10.si 630 [junit4] _10.fld 94 [junit4] _10.pst 1338 [junit4] _10.inf 392 [junit4] _10.vec 3744 [junit4] _10.len 221 [junit4] _11.len 221 [junit4] _11.si 630 [junit4] _11.vec 3744 [junit4] _11.pst 1338 [junit4] _11.inf 392 [junit4] _11.fld 94 [junit4] _12.vec 3744 [junit4] _12.si 630 [junit4] _12.len 221 [junit4] _12.fld 94 [junit4] _12.pst 1338 [junit4] _12.inf 392 [junit4] _13.fld 94 [junit4] _13.vec 3744 [junit4]
[JENKINS] Lucene-Solr-Tests-5.x-Java7 - Build # 2877 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-5.x-Java7/2877/ 3 tests failed. FAILED: org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test Error Message: IOException occured when talking to server at: http://127.0.0.1:50644/c8n_1x3_commits_shard1_replica2 Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:50644/c8n_1x3_commits_shard1_replica2 at __randomizedtesting.SeedInfo.seed([EAB2019C6240E07D:62E63E46CCBC8D85]:0) at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:570) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:233) at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:225) at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:135) at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:483) at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:464) at org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.oneShardTest(LeaderInitiatedRecoveryOnCommitTest.java:132) at org.apache.solr.cloud.LeaderInitiatedRecoveryOnCommitTest.test(LeaderInitiatedRecoveryOnCommitTest.java:64) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1627) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:872) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:886) at 
org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:960) at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:935) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:845) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:747) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:781) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:792) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (LUCENE-5879) Add auto-prefix terms to block tree terms dict
[ https://issues.apache.org/jira/browse/LUCENE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392701#comment-14392701 ] Robert Muir commented on LUCENE-5879: - +1 I thought those tests were gonna be easy... but the refactoring of the test is great. Thanks! Add auto-prefix terms to block tree terms dict -- Key: LUCENE-5879 URL: https://issues.apache.org/jira/browse/LUCENE-5879 Project: Lucene - Core Issue Type: New Feature Components: core/codecs Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, Trunk Attachments: LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch, LUCENE-5879.patch This cool idea to generalize numeric/trie fields came from Adrien: Today, when we index a numeric field (LongField, etc.) we pre-compute (via NumericTokenStream) outside of indexer/codec which prefix terms should be indexed. But this can be inefficient: you set a static precisionStep, and always add those prefix terms regardless of how the terms in the field are actually distributed. Yet typically in real world applications the terms have a non-random distribution. So, it should be better if instead the terms dict decides where it makes sense to insert prefix terms, based on how dense the terms are in each region of term space. This way we can speed up query time for both term (e.g. infix suggester) and numeric ranges, and it should let us use less index space and get faster range queries. This would also mean that min/maxTerm for a numeric field would now be correct, vs today where the externally computed prefix terms are placed after the full precision terms, causing hairy code like NumericUtils.getMaxInt/Long. So optos like LUCENE-5860 become feasible. 
The terms dict can also do tricks not possible if you must live on top of its APIs, e.g. to handle the adversary/over-constrained case when a given prefix has too many terms following it but finer prefixes have too few (what block tree calls floor term blocks). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
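To make the "insert prefix terms where the terms are dense" idea concrete, here is a toy sketch. It is not the block tree terms dictionary algorithm and does not use any Lucene API; the class name, the fixed prefix length, and the minTerms threshold are assumptions made only for illustration. It counts how many terms share each prefix and reports the prefixes that are dense enough to deserve a prefix term of their own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DensePrefixSketch {
    /**
     * Returns, in sorted order, every prefix of length prefixLen that is shared
     * by at least minTerms of the given terms. Terms that are not longer than
     * prefixLen are ignored.
     */
    public static List<String> densePrefixes(List<String> terms, int prefixLen, int minTerms) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String term : terms) {
            if (term.length() > prefixLen) {
                counts.merge(term.substring(0, prefixLen), 1, Integer::sum);
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minTerms) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical term values: four terms cluster under "10", the rest are sparse.
        List<String> terms = Arrays.asList("1001", "1002", "1003", "1004", "2001", "3001");
        System.out.println(densePrefixes(terms, 2, 3));
    }
}
```

A static precisionStep would add prefix terms for the sparse regions as well; a density threshold like the one above only spends index space where a range query can actually skip many terms.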
[jira] [Updated] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration
[ https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Potter updated SOLR-7338:
    Attachment: SOLR-7338_test.patch

Not sure if it's useful to you, but here's the unit test I started working on (basically implements the scenario I described in the issue description) ... it currently fails as expected until the register code is fixed up.

A reloaded core will never register itself as active after a ZK session expiration
    Key: SOLR-7338
    URL: https://issues.apache.org/jira/browse/SOLR-7338
    Project: Solr
    Issue Type: Bug
    Components: SolrCloud
    Reporter: Timothy Potter
    Assignee: Mark Miller
    Attachments: SOLR-7338.patch, SOLR-7338_test.patch

If a collection gets reloaded, then a core's isReloaded flag is always true. If a core experiences a ZK session expiration after a reload, then it won't ever be able to set itself to active because of the check in {{ZkController#register}}:

{code}
UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
if (!core.isReloaded() && ulog != null) {
  // disable recovery in case shard is in construction state (for shard splits)
  Slice slice = getClusterState().getSlice(collection, shardId);
  if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
    Future<UpdateLog.RecoveryInfo> recoveryFuture = core.getUpdateHandler().getUpdateLog().recoverFromLog();
    if (recoveryFuture != null) {
      log.info("Replaying tlog for " + ourUrl + " during startup... NOTE: This can take a while.");
      recoveryFuture.get(); // NOTE: this could potentially block for
      // minutes or more!
      // TODO: public as recovering in the mean time?
      // TODO: in the future we could do peersync in parallel with recoverFromLog
    } else {
      log.info("No LogReplay needed for core=" + core.getName() + " baseURL=" + baseUrl);
    }
  }
  boolean didRecovery = checkRecovery(coreName, desc, recoverReloadedCores, isLeader, cloudDesc, collection, coreZkNodeName, shardId, leaderProps, core, cc);
  if (!didRecovery) {
    publish(desc, ZkStateReader.ACTIVE);
  }
}
{code}

I can easily simulate this on trunk by doing:

{code}
bin/solr -c -z localhost:2181
bin/solr create -c foo
bin/post -c foo example/exampledocs/*.xml
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=foo"
kill -STOP PID
sleep PAUSE_SECONDS
kill -CONT PID
{code}

Where PID is the process ID of the Solr node. Here are the logs after the CONT command. As you can see below, the core never gets to setting itself as active again. I think the bug is that the isReloaded flag needs to get set back to false once the reload is successful, but I don't understand what this flag is needed for anyway???

{code}
INFO - 2015-04-01 17:28:50.962; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@5519dba0 name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent state:Disconnected type:None path:null path:null type:None
INFO - 2015-04-01 17:28:50.963; org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
INFO - 2015-04-01 17:28:51.107; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@5519dba0 name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent state:Expired type:None path:null path:null type:None
INFO - 2015-04-01 17:28:51.107; org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper...
INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer; Overseer (id=93579450724974592-192.168.1.2:8983_solr-n_00) closing
INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$WatcherImpl; A node got unwatched for /configs/foo
INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Overseer Loop exiting : 192.168.1.2:8983_solr
INFO - 2015-04-01 17:28:51.109; org.apache.solr.cloud.OverseerCollectionProcessor; According to ZK I (id=93579450724974592-192.168.1.2:8983_solr-n_00) am no longer a leader.
INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$4; Running listeners for /configs/foo
INFO - 2015-04-01 17:28:51.109; org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - starting a new one...
INFO - 2015-04-01 17:28:51.109; org.apache.solr.core.SolrCore$11; config update listener called for
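The behaviour described in SOLR-7338 can be reduced to a toy model. The sketch below is not Solr code: ToyCore, register and activeAfterExpiration are hypothetical names, and only the outer isReloaded guard of the quoted ZkController#register snippet is mirrored (the update log check, recovery and leader handling are left out). It shows why a flag that stays true after a reload keeps the core from publishing itself as active after a session expiration, and what changes if the flag is cleared once the reload has succeeded, which is the idea floated in the issue description rather than a confirmed fix.

```java
public class ReloadFlagSketch {
    /** Toy stand-in for a core; this is not Solr's SolrCore. */
    static final class ToyCore {
        boolean reloaded; // mirrors the isReloaded flag from the issue description
        boolean active;   // whether the core has published itself as active
    }

    /** Mirrors only the outer guard of the quoted register code. */
    static void register(ToyCore core) {
        if (!core.reloaded) {
            core.active = true; // stands in for publish(desc, ZkStateReader.ACTIVE)
        }
    }

    /**
     * Runs the scenario: startup, collection reload, ZK session expiration,
     * re-registration. Returns whether the core ends up active.
     */
    public static boolean activeAfterExpiration(boolean clearFlagAfterReload) {
        ToyCore core = new ToyCore();
        register(core);                // initial startup: flag is false, core goes active
        core.reloaded = true;          // a collection RELOAD sets the flag
        if (clearFlagAfterReload) {
            core.reloaded = false;     // hypothetical change: reset once the reload succeeded
        }
        core.active = false;           // session expires, the core is no longer active
        register(core);                // re-register after reconnecting
        return core.active;
    }

    public static void main(String[] args) {
        System.out.println("flag left set:  active=" + activeAfterExpiration(false));
        System.out.println("flag cleared:   active=" + activeAfterExpiration(true));
    }
}
```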
[jira] [Commented] (SOLR-5855) Increasing solr highlight performance with caching
[ https://issues.apache.org/jira/browse/SOLR-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392752#comment-14392752 ]

Thomas Champagne commented on SOLR-5855:

I tested the patch again with 100 queries (created 20 docs with 80 fields, half of them null and half with a value, then ran 100 queries with hl.fl=*):

    Without patch: ~186 sec
    With patch: ~72 sec

Increasing solr highlight performance with caching
    Key: SOLR-5855
    URL: https://issues.apache.org/jira/browse/SOLR-5855
    Project: Solr
    Issue Type: Improvement
    Components: highlighter
    Affects Versions: Trunk
    Reporter: Daniel Debray
    Fix For: Trunk
    Attachments: SOLR-5855-without-cache.patch, highlight.patch

Hi folks, while investigating possible performance bottlenecks in the highlight component I discovered two places where we can save some CPU cycles. Both are in the class org.apache.solr.highlight.DefaultSolrHighlighter.

First, in method doHighlighting (lines 411-417): in the loop we try to highlight every field that has been resolved from the params on each document. OK, but why not skip those fields that are not present on the current document? So I changed the code from:

    for (String fieldName : fieldNames) {
      fieldName = fieldName.trim();
      if( useFastVectorHighlighter( params, schema, fieldName ) )
        doHighlightingByFastVectorHighlighter( fvh, fieldQuery, req, docSummaries, docId, doc, fieldName );
      else
        doHighlightingByHighlighter( query, req, docSummaries, docId, doc, fieldName );
    }

to:

    for (String fieldName : fieldNames) {
      fieldName = fieldName.trim();
      if (doc.get(fieldName) != null) {
        if( useFastVectorHighlighter( params, schema, fieldName ) )
          doHighlightingByFastVectorHighlighter( fvh, fieldQuery, req, docSummaries, docId, doc, fieldName );
        else
          doHighlightingByHighlighter( query, req, docSummaries, docId, doc, fieldName );
      }
    }

The second place is where we try to retrieve the TokenStream from the document for a specific field, at line 472:

    TokenStream tvStream = TokenSources.getTokenStreamWithOffsets(searcher.getIndexReader(), docId, fieldName);

where:

    public static TokenStream getTokenStreamWithOffsets(IndexReader reader, int docId, String field) throws IOException {
      Fields vectors = reader.getTermVectors(docId);
      if (vectors == null) {
        return null;
      }
      Terms vector = vectors.terms(field);
      if (vector == null) {
        return null;
      }
      if (!vector.hasPositions() || !vector.hasOffsets()) {
        return null;
      }
      return getTokenStream(vector);
    }

Keep in mind that we currently hit the IndexReader n times, where n = requested rows (documents) * requested number of highlight fields. In my use case reader.getTermVectors(docId) takes around 150.000~250.000ns on a warm Solr and 1.100.000ns on a cold Solr. If we store the returned Fields vectors in a cache, these lookups take only 25000ns.

I would suggest something like the following code in the doHighlightingByHighlighter method in the DefaultSolrHighlighter class (line 472):

    Fields vectors = null;
    SolrCache termVectorCache = searcher.getCache("termVectorCache");
    if (termVectorCache != null) {
      vectors = (Fields) termVectorCache.get(Integer.valueOf(docId));
      if (vectors == null) {
        vectors = searcher.getIndexReader().getTermVectors(docId);
        if (vectors != null) termVectorCache.put(Integer.valueOf(docId), vectors);
      }
    } else {
      vectors = searcher.getIndexReader().getTermVectors(docId);
    }
    TokenStream tvStream = TokenSources.getTokenStreamWithOffsets(vectors, fieldName);

and in the TokenSources class:

    public static TokenStream getTokenStreamWithOffsets(Fields vectors, String field) throws IOException {
      if (vectors == null) {
        return null;
      }
      Terms vector = vectors.terms(field);
      if (vector == null) {
        return null;
      }
      if (!vector.hasPositions() || !vector.hasOffsets()) {
        return null;
      }
      return getTokenStream(vector);
    }

Timings on an index with 190.000 docs, a numFound of 32000 and 80 different highlight fields:

    4000ms on 1000 docs without cache
    639ms on 1000 docs with cache
    102ms on 30 docs without cache
    22ms on 30 docs with cache

I think queries with only one field to highlight on a document do not benefit that much from a cache like this; that's why I think an optional cache would be the best solution there. As far as I can see, the FastVectorHighlighter uses more or less the same approach and could also benefit from this cache.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
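The saving from the proposed term vector cache comes from loading per-document data once instead of once per document and field. The sketch below illustrates only that counting argument. It does not use SolrCache, Fields or IndexReader; PerDocCache and countLoads are hypothetical names, and the loader is a stand-in for reader.getTermVectors(docId).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

public class PerDocCacheSketch {
    /** Caches one loaded value per docId so that repeated field lookups reuse it. */
    static final class PerDocCache<V> {
        private final Map<Integer, V> cache = new HashMap<>();
        private final IntFunction<V> loader;
        int loads = 0; // how often the underlying loader was actually called

        PerDocCache(IntFunction<V> loader) {
            this.loader = loader;
        }

        V get(int docId) {
            V value = cache.get(docId);
            if (value == null) {
                value = loader.apply(docId); // stand-in for the expensive reader call
                loads++;
                if (value != null) {
                    cache.put(docId, value);
                }
            }
            return value;
        }
    }

    /** Simulates highlighting rows * fields lookups and returns the number of loads. */
    public static int countLoads(int rows, int fields) {
        PerDocCache<String> cache = new PerDocCache<>(docId -> "vectors-for-doc-" + docId);
        for (int docId = 0; docId < rows; docId++) {
            for (int field = 0; field < fields; field++) {
                cache.get(docId); // every field of a document asks for the same vectors
            }
        }
        return cache.loads;
    }

    public static void main(String[] args) {
        int rows = 30;   // rows from the smaller measurement in the issue
        int fields = 80; // highlight fields from the issue
        System.out.println("lookups without cache: " + (rows * fields));
        System.out.println("loads with cache:      " + countLoads(rows, fields));
    }
}
```

With a single highlight field the number of loads equals the number of lookups, which matches the remark in the issue that such queries gain little and that the cache should be optional.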
[jira] [Commented] (LUCENE-5879) Add auto-prefix terms to block tree terms dict
[ https://issues.apache.org/jira/browse/LUCENE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392815#comment-14392815 ] ASF subversion and git services commented on LUCENE-5879: Commit 1670923 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1670923 ] LUCENE-5879: fix test compilation (this enum no longer exists)
[jira] [Commented] (SOLR-7274) Pluggable authentication module in Solr
[ https://issues.apache.org/jira/browse/SOLR-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392816#comment-14392816 ]

Ishan Chattopadhyaya commented on SOLR-7274:

[~gchanan] Thanks for pitching in! Cloudera's implementation was actually one of my starting points when I started looking into this.

bq. What I'm not sure about is how you will make the configuration general enough without mentioning Filters. I.e. will there be pre-approved authentication mechanisms? Will I be able to write my own?

My thought was to have configurations actually mention the filters (which may deal with any authentication mechanism, not just preapproved ones), without the user having to add it to the web.xml. For instance (and this may look different in the implementation), a user could have a configuration as HostnameFilter,SolrHadoopAuthenticationFilter and the authentication layer (which might itself be a servlet filter) would call the doFilter() on each of the two filters.

bq. This discussion also seems focused on the server side. Is the client side considered outside the scope of this jira? (i'm thinking something like SOLR-6625, but SOLR-4470 is related).

Client side configurations are in the scope of pluggable items for each authentication mechanism. My thought was that this issue (SOLR-7274) could leverage the callback frameworks of SOLR-6625, SOLR-4470 and focus on the pluggable aspects of the configurations for each authc mechanism.

Pluggable authentication module in Solr
    Key: SOLR-7274
    URL: https://issues.apache.org/jira/browse/SOLR-7274
    Project: Solr
    Issue Type: Sub-task
    Reporter: Anshum Gupta

It would be good to have Solr support different authentication protocols. To begin with, it'd be good to have support for kerberos and basic auth.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
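The configuration idea in this comment (a comma-separated list of filter names, each invoked in turn by an authentication layer) can be sketched as follows. This is an assumption-laden illustration, not Solr's design: AuthFilter is a minimal stand-in used instead of javax.servlet.Filter so that the example is self-contained, the registry and the request map are hypothetical, and the checks attached to the two filter names from the comment are invented purely for the demo.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfiguredFilterChainSketch {
    /** Minimal stand-in for a servlet filter; returning false rejects the request. */
    interface AuthFilter {
        boolean doFilter(Map<String, String> request);
    }

    /** Resolves a comma-separated configuration string into filters, in order. */
    static List<AuthFilter> resolve(String config, Map<String, AuthFilter> registry) {
        List<AuthFilter> chain = new ArrayList<>();
        for (String name : config.split(",")) {
            AuthFilter filter = registry.get(name.trim());
            if (filter == null) {
                throw new IllegalArgumentException("Unknown filter: " + name.trim());
            }
            chain.add(filter);
        }
        return chain;
    }

    /** Calls each configured filter in turn and stops at the first rejection. */
    static boolean authenticate(List<AuthFilter> chain, Map<String, String> request) {
        for (AuthFilter filter : chain) {
            if (!filter.doFilter(request)) {
                return false;
            }
        }
        return true;
    }

    /** Demo wiring; the behaviour given to each filter name is invented. */
    public static boolean demo(String host, String user) {
        Map<String, AuthFilter> registry = new HashMap<>();
        registry.put("HostnameFilter", req -> "localhost".equals(req.get("host")));
        registry.put("SolrHadoopAuthenticationFilter", req -> req.get("user") != null);

        Map<String, String> request = new HashMap<>();
        request.put("host", host);
        if (user != null) {
            request.put("user", user);
        }
        List<AuthFilter> chain = resolve("HostnameFilter,SolrHadoopAuthenticationFilter", registry);
        return authenticate(chain, request);
    }

    public static void main(String[] args) {
        System.out.println(demo("localhost", "solr")); // both filters accept
        System.out.println(demo("localhost", null));   // second filter rejects
    }
}
```

Resolving names through a registry is one way to let users plug in their own filters without editing web.xml; how the real implementation discovers filter classes is not specified in the comment.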
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392876#comment-14392876 ]

Michael McCandless commented on LUCENE-6386:

bq. I thought we only write the final segments as .cfs if needed?

No: every segment that's written (flush or merge) decides whether to be cfs or not and then builds the cfs immediately after writing...

bq. Is there a reason why we don't delete level2.cfs after creating level3? If not, can we fix that and reduce 1X here?

In the past we did that, by exposing level3 (not yet a CFS) into IW's SIS, and then deleting level2.cfs before building level3.cfs ... but this proved tricky (I think we had ref count bugs around this, and it also allowed commit / NRT readers to see a non-CFS segment when such external usage expected the segments to be CFS).
[jira] [Created] (SOLR-7342) zkcli command to delete node
Shawn Heisey created SOLR-7342: -- Summary: zkcli command to delete node Key: SOLR-7342 URL: https://issues.apache.org/jira/browse/SOLR-7342 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 5.0 Reporter: Shawn Heisey Priority: Minor Users encounter situations where SolrCloud detects the node IP address incorrectly, so a node gets inserted into the Zookeeper cluster with the incorrect address. It would be very useful if there were a command in zkcli to remove a node (host:port_context) from the SolrCloud cluster information stored in zookeeper. Manually editing the zookeeper database to remove a node is tedious and somewhat dangerous. Other situations where it would be useful are when a user needs to change the IP addressing on their SolrCloud cluster, and when an old node is retired. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-5.x-Windows (64bit/jdk1.8.0_40) - Build # 4511 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Windows/4511/ Java: 64bit/jdk1.8.0_40 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC 1 tests failed. FAILED: junit.framework.TestSuite.org.apache.solr.core.TestSolrConfigHandler Error Message: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf\params.json: java.nio.file.FileSystemException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf\params.json: The process cannot access the file because it is being used by another process. C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001 Stack Trace: java.io.IOException: Could not remove the following files (in the order of attempts): C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf\params.json: java.nio.file.FileSystemException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf\params.json: The process cannot access the file because it is being used by another process. 
C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1\conf C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010\collection1 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010: java.nio.file.DirectoryNotEmptyException: C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001\tempDir-010 C:\Users\JenkinsSlave\workspace\Lucene-Solr-5.x-Windows\solr\build\solr-core\test\J1\temp\solr.core.TestSolrConfigHandler CA19103DA1EA464E-001: java.nio.file.DirectoryNotEmptyException:
[jira] [Commented] (LUCENE-6388) Optimize SpanNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392831#comment-14392831 ] Paul Elschot commented on LUCENE-6388: -- Oops, I thought by now ArrayList would be JIT-ed away, thanks. Also the UOE's in the NearSpansOrdered payload methods have gone in this patch, I had put these in to check the tests. Optimize SpanNearQuery -- Key: LUCENE-6388 URL: https://issues.apache.org/jira/browse/LUCENE-6388 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-6388.patch After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a little more: * SpanNearQuery defaults to collectPayloads=true, but this requires a slower implementation, for an uncommon case. Use the faster no-payloads impl if the field doesn't actually have any payloads. * Use a simple array of Spans rather than List in NearSpans classes. This is iterated over often in the logic. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6388) Optimize SpanNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392847#comment-14392847 ] Robert Muir commented on LUCENE-6388: - I removed the UOE because now the no-payload impl is used if a segment doesn't happen to have any payloads. But this is valid, the documents might just not have any. Optimize SpanNearQuery -- Key: LUCENE-6388 URL: https://issues.apache.org/jira/browse/LUCENE-6388 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-6388.patch After the big spans overhaul in LUCENE-6308, we can speed up SpanNearQuery a little more: * SpanNearQuery defaults to collectPayloads=true, but this requires a slower implementation, for an uncommon case. Use the faster no-payloads impl if the field doesn't actually have any payloads. * Use a simple array of Spans rather than List in NearSpans classes. This is iterated over often in the logic. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7336) Add State enum to Replica
[ https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392811#comment-14392811 ] Mark Miller commented on SOLR-7336: --- SYNC is just cruft - part of some prototyping at the very start and never used. Add State enum to Replica - Key: SOLR-7336 URL: https://issues.apache.org/jira/browse/SOLR-7336 Project: Solr Issue Type: Improvement Components: SolrJ Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-7336.patch Following SOLR-7325, this issue adds a State enum to Replica. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7274) Pluggable authentication module in Solr
[ https://issues.apache.org/jira/browse/SOLR-7274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392822#comment-14392822 ] Ishan Chattopadhyaya commented on SOLR-7274: bq. Doesn't Apache Shiro do all of this and give you an ini file with which to configure everything? They also have a web-app filtering system too (also see http://shiro.apache.org/web.html#Web-EnablingandDisablingFilters) I did have a look at Shiro, but my initial thought was that it might not fit our bill due to a couple of reasons: * Shiro doesn't have out of the box support for Kerberos * Shiro's commit patterns indicated that it is not a very active project at the moment, and hence I wasn't sure if having Solr depend on Shiro was a good idea. Maybe someone more experienced with Shiro might help me understand if this isn't true. Hadoop Common's hadoop-auth library seems easier to leverage here, esp. for Kerberos, and it is already a dependency for Solr. Pluggable authentication module in Solr --- Key: SOLR-7274 URL: https://issues.apache.org/jira/browse/SOLR-7274 Project: Solr Issue Type: Sub-task Reporter: Anshum Gupta It would be good to have Solr support different authentication protocols. To begin with, it'd be good to have support for kerberos and basic auth. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7341) xjoin - join data from external sources
Tom Winch created SOLR-7341: --- Summary: xjoin - join data from external sources Key: SOLR-7341 URL: https://issues.apache.org/jira/browse/SOLR-7341 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.10.3 Reporter: Tom Winch Priority: Minor Fix For: Trunk h2. XJoin The xjoin SOLR contrib allows external results to be joined with SOLR results in a query and the SOLR result set to be filtered by the results of an external query. Values from the external results are made available in the SOLR results and may also be used to boost the scores of corresponding documents during the search. The contrib consists of the Java classes XJoinSearchComponent and XJoinValueSourceParser, which must be configured in solrconfig.xml, and the interfaces XJoinResultsFactory and XJoinResults, which are implemented by the user to provide the link between SOLR and the external results source. External results and SOLR documents are matched via a single configurable attribute (the join field). The contrib JAR solr-xjoin-4.10.3.jar contains these classes and interfaces and should be included in SOLR's class path from solrconfig.xml, as should a JAR containing the user implementations of the previously mentioned interfaces. For example: {code:xml} <config> .. <!-- XJoin contrib JAR file --> <lib dir="${solr.install.dir:../../..}/dist/" regex="solr-xjoin-\d.*\.jar" /> .. <!-- user implementations of XJoin interfaces --> <lib path="/path/to/xjoin_test.jar" /> .. </config> {code} h2. Java classes and interfaces h3. XJoinResultsFactory The user implementation of this interface is responsible for connecting to the external source to perform a query (or otherwise collect results). Parameters with prefix <component name>.external. are passed from the SOLR query URL to parameterise the search. 
The interface has the following methods: * void init(NamedList args) - this is called during SOLR initialisation, and passed parameters from the search component configuration (see below) * XJoinResults getResults(SolrParams params) - this is called during a SOLR search to generate external results, and is passed parameters from the SOLR query URL (as above) For example, the implementation might perform queries of an external source based on the 'q' SOLR query URL parameter (in full, <component name>.external.q). h3. XJoinResults A user implementation of this interface is returned by the getResults() method of the XJoinResultsFactory implementation. It has methods: * Object getResult(String joinId) - this should return a particular result given the value of the join attribute * Iterable<String> getJoinIds() - this should return the join attribute values for all results of the external search h3. XJoinSearchComponent This is the central Java class of the contrib. It is a SOLR search component, configured in solrconfig.xml and included in one or more SOLR request handlers. 
It has two main responsibilities: * Before the SOLR search, it connects to the external source and retrieves results, storing them in the SOLR request context * After the SOLR search, it matches SOLR documents in the results set and external results via the join field, adding attributes from the external results to documents in the SOLR results set It takes the following initialisation parameters: * factoryClass - this specifies the user-supplied class implementing XJoinResultsFactory, used to generate external results * joinField - this specifies the attribute on which to join between SOLR documents and external results * external - this parameter set is passed to configure the XJoinResultsFactory implementation For example, in solrconfig.xml: {code:xml} <searchComponent name="xjoin_test" class="org.apache.solr.search.xjoin.XJoinSearchComponent"> <str name="factoryClass">test.TestXJoinResultsFactory</str> <str name="joinField">id</str> <lst name="external"> <str name="values">1,2,3</str> </lst> </searchComponent> {code} Here, the search component instantiates a new TestXJoinResultsFactory during initialisation, and passes it the values parameter (1, 2, 3) to configure it. To properly use the XJoinSearchComponent in a request handler, it must be included at the start and end of the component list, and may be configured with the following query parameters: * listParameter - external join field values will be placed in this query parameter for reference by local query parameters * results - a comma-separated list of attributes from the XJoinResults implementation (created by the factory at search time) to be included in the SOLR results * fl - a comma-separated list of attributes from results objects (contained in an XJoinResults implementation) to be included in the SOLR results For example: {code:xml} <requestHandler name="/xjoin" class="solr.SearchHandler" startup="lazy"> <lst
Re: svn commit: r1670918 [1/5] - in /lucene/dev/trunk/lucene: ./ codecs/src/java/org/apache/lucene/codecs/autoprefix/ codecs/src/resources/META-INF/services/ codecs/src/test/org/apache/lucene/codecs/a
Woops, thanks Rob! Mike McCandless http://blog.mikemccandless.com On Thu, Apr 2, 2015 at 11:17 AM, Robert Muir rcm...@gmail.com wrote: NOTE: i hit compile failure like this (TermRangeTermsEnum got removed). I am going to remove these asserts: to me they don't look very useful, and fix the build for now. compile-test: [mkdir] Created dir: /home/rmuir/workspace/trunk/lucene/build/core/classes/test [javac] Compiling 431 source files to /home/rmuir/workspace/trunk/lucene/build/core/classes/test [javac] /home/rmuir/workspace/trunk/lucene/core/src/test/org/apache/lucene/search/TestTermRangeQuery.java:123: error: cannot find symbol [javac] assertFalse(query.getTermsEnum(terms) instanceof TermRangeTermsEnum); [javac] ^ [javac] symbol: class TermRangeTermsEnum [javac] location: class TestTermRangeQuery [javac] /home/rmuir/workspace/trunk/lucene/core/src/test/org/apache/lucene/search/TestTermRangeQuery.java:128: error: cannot find symbol [javac] assertFalse(query.getTermsEnum(terms) instanceof TermRangeTermsEnum); [javac] ^ [javac] symbol: class TermRangeTermsEnum [javac] location: class TestTermRangeQuery [javac] /home/rmuir/workspace/trunk/lucene/core/src/test/org/apache/lucene/search/TestTermRangeQuery.java:132: error: cannot find symbol [javac] assertFalse(query.getTermsEnum(terms) instanceof TermRangeTermsEnum); [javac] ^ [javac] symbol: class TermRangeTermsEnum [javac] location: class TestTermRangeQuery [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. 
[javac] 3 errors On Thu, Apr 2, 2015 at 11:05 AM, mikemcc...@apache.org wrote: Author: mikemccand Date: Thu Apr 2 15:05:48 2015 New Revision: 1670918 URL: http://svn.apache.org/r1670918 Log: LUCENE-5879: add auto-prefix terms to block tree, and experimental AutoPrefixTermsPostingsFormat Added: lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/autoprefix/ lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/autoprefix/AutoPrefixPostingsFormat.java (with props) lucene/dev/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/autoprefix/package-info.java (with props) lucene/dev/trunk/lucene/codecs/src/test/org/apache/lucene/codecs/autoprefix/ lucene/dev/trunk/lucene/codecs/src/test/org/apache/lucene/codecs/autoprefix/TestAutoPrefixPostingsFormat.java (with props) lucene/dev/trunk/lucene/codecs/src/test/org/apache/lucene/codecs/autoprefix/TestAutoPrefixTerms.java (with props) lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/AutoPrefixTermsWriter.java (with props) lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BitSetPostingsEnum.java (with props) lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BitSetTermsEnum.java (with props) lucene/dev/trunk/lucene/test-framework/src/java/org/apache/lucene/index/RandomPostingsTester.java (with props) Removed: lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/TermRangeTermsEnum.java Modified: lucene/dev/trunk/lucene/CHANGES.txt lucene/dev/trunk/lucene/codecs/src/resources/META-INF/services/org.apache.lucene.codecs.PostingsFormat lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/BlockTermState.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/PostingsFormat.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsReader.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/BlockTreeTermsWriter.java 
lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/FieldReader.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/IntersectTermsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/IntersectTermsEnumFrame.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/SegmentTermsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/SegmentTermsEnumFrame.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/codecs/blocktree/Stats.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/AutomatonTermsEnum.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/CheckIndex.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/FreqProxFields.java lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/index/MappingMultiPostingsEnum.java
[jira] [Updated] (SOLR-7341) xjoin - join data from external sources
[ https://issues.apache.org/jira/browse/SOLR-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Winch updated SOLR-7341: Description: h2. XJoin The xjoin SOLR contrib allows external results to be joined with SOLR results in a query and the SOLR result set to be filtered by the results of an external query. Values from the external results are made available in the SOLR results and may also be used to boost the scores of corresponding documents during the search. The contrib consists of the Java classes XJoinSearchComponent and XJoinValueSourceParser, which must be configured in solrconfig.xml, and the interfaces XJoinResultsFactory and XJoinResults, which are implemented by the user to provide the link between SOLR and the external results source. External results and SOLR documents are matched via a single configurable attribute (the join field). The contrib JAR solr-xjoin-4.10.3.jar contains these classes and interfaces and should be included in SOLR's class path from solrconfig.xml, as should a JAR containing the user implementations of the previously mentioned interfaces. For example: {code:xml} <config> .. <!-- XJoin contrib JAR file --> <lib dir="${solr.install.dir:../../..}/dist/" regex="solr-xjoin-\d.*\.jar" /> .. <!-- user implementations of XJoin interfaces --> <lib path="/path/to/xjoin_test.jar" /> .. </config> {code} h2. Java classes and interfaces h3. XJoinResultsFactory The user implementation of this interface is responsible for connecting to the external source to perform a query (or otherwise collect results). Parameters with prefix <component name>.external. are passed from the SOLR query URL to parameterise the search. 
The interface has the following methods: * void init(NamedList args) - this is called during SOLR initialisation, and passed parameters from the search component configuration (see below) * XJoinResults getResults(SolrParams params) - this is called during a SOLR search to generate external results, and is passed parameters from the SOLR query URL (as above) For example, the implementation might perform queries of an external source based on the 'q' SOLR query URL parameter (in full, <component name>.external.q). h3. XJoinResults A user implementation of this interface is returned by the getResults() method of the XJoinResultsFactory implementation. It has methods: * Object getResult(String joinId) - this should return a particular result given the value of the join attribute * Iterable<String> getJoinIds() - this should return the join attribute values for all results of the external search h3. XJoinSearchComponent This is the central Java class of the contrib. It is a SOLR search component, configured in solrconfig.xml and included in one or more SOLR request handlers. 
It has two main responsibilities: * Before the SOLR search, it connects to the external source and retrieves results, storing them in the SOLR request context * After the SOLR search, it matches SOLR documents in the results set and external results via the join field, adding attributes from the external results to documents in the SOLR results set It takes the following initialisation parameters: * factoryClass - this specifies the user-supplied class implementing XJoinResultsFactory, used to generate external results * joinField - this specifies the attribute on which to join between SOLR documents and external results * external - this parameter set is passed to configure the XJoinResultsFactory implementation For example, in solrconfig.xml: {code:xml} <searchComponent name="xjoin_test" class="org.apache.solr.search.xjoin.XJoinSearchComponent"> <str name="factoryClass">test.TestXJoinResultsFactory</str> <str name="joinField">id</str> <lst name="external"> <str name="values">1,2,3</str> </lst> </searchComponent> {code} Here, the search component instantiates a new TestXJoinResultsFactory during initialisation, and passes it the values parameter (1, 2, 3) to configure it. To properly use the XJoinSearchComponent in a request handler, it must be included at the start and end of the component list, and may be configured with the following query parameters: * results - a comma-separated list of attributes from the XJoinResults implementation (created by the factory at search time) to be included in the SOLR results * fl - a comma-separated list of attributes from results objects (contained in an XJoinResults implementation) to be included in the SOLR results For example: {code:xml} <requestHandler name="/xjoin" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> .. 
<bool name="xjoin_test">true</bool> <str name="xjoin_test.listParameter">xx</str> <str name="xjoin_test.results">test_count</str> <str name="xjoin_test.fl">id,value</str> </lst> <arr name="first-components"> <str>xjoin_test</str> </arr> <arr name="last-components"> <str>xjoin_test</str> </arr> </requestHandler> {code} h3. XJoinQParserPlugin This query
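The two user-implemented interfaces named in the SOLR-7341 description (XJoinResultsFactory and XJoinResults) could be sketched roughly as follows. This is only an illustration under stated assumptions: the interfaces are local stand-ins that mirror the method names from the description, plain Maps replace NamedList and SolrParams so the code compiles without Solr on the class path, and FixedValuesFactory is a hypothetical class that plays the role of the test factory configured with values=1,2,3.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class XJoinSketch {

    /** Stand-in for the XJoinResults interface described in the issue. */
    public interface XJoinResults {
        Object getResult(String joinId);
        Iterable<String> getJoinIds();
    }

    /**
     * Stand-in for XJoinResultsFactory. The issue describes NamedList and
     * SolrParams arguments; plain Maps are an assumption made here.
     */
    public interface XJoinResultsFactory {
        void init(Map<String, String> args);
        XJoinResults getResults(Map<String, String> params);
    }

    /** Hypothetical factory configured with a comma-separated "values" list. */
    public static class FixedValuesFactory implements XJoinResultsFactory {
        private String[] joinIds = new String[0];

        @Override
        public void init(Map<String, String> args) {
            String values = args.getOrDefault("values", "");
            joinIds = values.isEmpty() ? new String[0] : values.split(",");
        }

        @Override
        public XJoinResults getResults(Map<String, String> params) {
            // A real implementation would query the external source here,
            // e.g. using params.get("q"); this one returns a placeholder per id.
            final Map<String, Object> byId = new LinkedHashMap<>();
            for (String id : joinIds) {
                byId.put(id, "external result for " + id);
            }
            return new XJoinResults() {
                @Override public Object getResult(String joinId) { return byId.get(joinId); }
                @Override public Iterable<String> getJoinIds() { return byId.keySet(); }
            };
        }
    }

    public static void main(String[] args) {
        XJoinResultsFactory factory = new FixedValuesFactory();
        Map<String, String> config = new LinkedHashMap<>();
        config.put("values", "1,2,3");
        factory.init(config);
        XJoinResults results = factory.getResults(new LinkedHashMap<String, String>());
        for (String id : results.getJoinIds()) {
            System.out.println(id + " -> " + results.getResult(id));
        }
    }
}
```

The join ids returned by getJoinIds() are what the search component would match against the configured join field; getResult() supplies the attributes to attach to each matching document.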
[jira] [Commented] (SOLR-6583) Resuming connection with ZooKeeper causes log replay
[ https://issues.apache.org/jira/browse/SOLR-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392662#comment-14392662 ] Mark Miller commented on SOLR-6583: --- I linked to SOLR-7338, as that is fairly related to this issue and my sample patch attached there should address this. Resuming connection with ZooKeeper causes log replay Key: SOLR-6583 URL: https://issues.apache.org/jira/browse/SOLR-6583 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.10.1 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 5.0, Trunk If a node is partitioned from ZooKeeper for an extended period of time then upon resuming connection, the node re-registers itself causing recoverFromLog() method to be executed which fails with the following exception: {code} 8091124 [Thread-71] ERROR org.apache.solr.update.UpdateLog – Error inspecting tlog tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0009869 refcount=2} java.nio.channels.ClosedChannelException at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99) at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678) at org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784) at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89) at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125) at java.io.InputStream.read(InputStream.java:101) at org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218) at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800) at org.apache.solr.cloud.ZkController.register(ZkController.java:834) at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271) at org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166) 8091125 [Thread-71] ERROR org.apache.solr.update.UpdateLog – Error inspecting tlog 
tlog{file=/home/ubuntu/shalin-lusolr/solr/example/solr/collection_5x3_shard5_replica3/data/tlog/tlog.0009870 refcount=2} java.nio.channels.ClosedChannelException at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99) at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:678) at org.apache.solr.update.ChannelFastInputStream.readWrappedStream(TransactionLog.java:784) at org.apache.solr.common.util.FastInputStream.refill(FastInputStream.java:89) at org.apache.solr.common.util.FastInputStream.read(FastInputStream.java:125) at java.io.InputStream.read(InputStream.java:101) at org.apache.solr.update.TransactionLog.endsWithCommit(TransactionLog.java:218) at org.apache.solr.update.UpdateLog.recoverFromLog(UpdateLog.java:800) at org.apache.solr.cloud.ZkController.register(ZkController.java:834) at org.apache.solr.cloud.ZkController$1.command(ZkController.java:271) at org.apache.solr.common.cloud.ConnectionManager$1$1.run(ConnectionManager.java:166) {code} This is because the recoverFromLog uses transaction log references that were collected at startup and are no longer valid. We shouldn't even be running recoverFromLog code for ZK re-connect. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
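The failure mode described in SOLR-6583 (a reference captured at startup being used after the underlying log has been closed) can be reproduced in isolation with plain java.nio. The sketch below is not Solr code; the class and method names are made up for illustration, and the temporary file merely stands in for a transaction log.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class StaleChannelDemo {

    /** Returns true if reading through a handle that was closed elsewhere fails with ClosedChannelException. */
    public static boolean readFailsAfterClose() throws IOException {
        Path log = Files.createTempFile("tlog", ".bin");
        try {
            Files.write(log, new byte[] {1, 2, 3, 4});
            // Reference captured "at startup".
            FileChannel staleRef = FileChannel.open(log, StandardOpenOption.READ);
            // Simulates the log being closed while the old reference is still held.
            staleRef.close();
            try {
                staleRef.read(ByteBuffer.allocate(4));
                return false;
            } catch (ClosedChannelException expected) {
                return true;
            }
        } finally {
            Files.deleteIfExists(log);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read on stale channel failed: " + readFailsAfterClose());
    }
}
```

The read fails before any I/O happens, which matches the ensureOpen frame at the top of the stack traces above.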
Re: [DISCUSS] Change Query API to make queries immutable in 6.0
Boosts might not make sense to become immutable, it might make the code too complex. Who is to say until the other stuff is fixed first. The downsides might outweigh the upsides. So yeah, if you want to say "if anyone disagrees with what the future might look like i'm gonna -1 your progress", then i will bite right now. Fixing the rest of Query to be immutable, so filter caching isn't trappy, we should really do that. And we have been doing it already. I remember Uwe suggested this approach when adding automaton and related queries a long time ago. It made things simpler and avoided bugs, we ultimately made as much of it immutable as we could. Queries have to be well-behaved, they need a good hashcode/equals, thread safety, good error checking, etc. It is easier to do this when things are immutable. Someone today can make a patch for FooQuery that nukes setBar and moves it to a ctor parameter named 'bar' and chances are a lot of the time, it probably fixes bugs in FooQuery somehow. That's just what it is. Boosts are the 'long tail'. They are simple primitive floating point values, so susceptible to fewer problems. The base class incorporates boosts into equals/hashcode already, which prevents the most common bugs with them. They are trickier because internal things like rewrite() might shuffle them around in conjunction with clone(), to do optimizations. They are also only relevant when scores are needed: so we can prevent nasty filter caching bugs as a step, by making everything else immutable. On Thu, Apr 2, 2015 at 9:27 AM, david.w.smi...@gmail.com <david.w.smi...@gmail.com> wrote: On Thu, Apr 2, 2015 at 3:40 AM, Adrien Grand <jpou...@gmail.com> wrote: first make queries immutable up to the boost and then discuss if/how/when we should go fully immutable with a new API to change boosts? The “if” part concerns me; I don’t mind it being a separate issue to make the changes more manageable (progress not perfection, and all that). I’m all for the whole shebang. 
But if others think “no” then…. will it have been worthwhile to do this big change and not go all the way? I think not. Does anyone feel the answer is “no” to make boosts immutable? And if so why? If nobody comes up with a dissenting opinion to make boosts immutable within a couple days then count me as “+1” to your plans, else “-1” pending that discussion. ~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
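The caching hazard discussed in this thread can be illustrated with a small, self-contained sketch. The classes below are hypothetical (the FooQuery / setBar names are borrowed from the example in the message and are not Lucene classes), and a HashMap stands in for a filter cache keyed on the query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ImmutableQuerySketch {

    /** Hypothetical query with a setter, the style the thread proposes to remove. */
    public static final class MutableFooQuery {
        private String bar;
        public MutableFooQuery(String bar) { this.bar = bar; }
        public void setBar(String bar) { this.bar = bar; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableFooQuery && Objects.equals(bar, ((MutableFooQuery) o).bar);
        }
        @Override public int hashCode() { return Objects.hashCode(bar); }
    }

    /** Same query with 'bar' moved to a constructor parameter and made final. */
    public static final class FooQuery {
        private final String bar;
        public FooQuery(String bar) { this.bar = bar; }
        @Override public boolean equals(Object o) {
            return o instanceof FooQuery && Objects.equals(bar, ((FooQuery) o).bar);
        }
        @Override public int hashCode() { return Objects.hashCode(bar); }
    }

    /** Caches an entry under a mutable key, mutates the key, then looks it up again. */
    public static boolean mutableKeyStillFound() {
        Map<MutableFooQuery, String> cache = new HashMap<>();
        MutableFooQuery q = new MutableFooQuery("a");
        cache.put(q, "cached result for bar=a");
        q.setBar("b"); // hashCode changes after the entry was stored
        return cache.containsKey(q);
    }

    /** Caches an entry under an immutable key and looks it up with an equal instance. */
    public static boolean immutableKeyStillFound() {
        Map<FooQuery, String> cache = new HashMap<>();
        cache.put(new FooQuery("a"), "cached result for bar=a");
        return cache.containsKey(new FooQuery("a"));
    }

    public static void main(String[] args) {
        System.out.println("mutable key found after mutation: " + mutableKeyStillFound());
        System.out.println("immutable key found: " + immutableKeyStillFound());
    }
}
```

With the mutable key the cached entry becomes unreachable once setBar is called; with the constructor-only version there is no way to get the key and the cache out of sync.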
[jira] [Resolved] (SOLR-7340) Collations output changed for suggestions, SolrJ can't read collations anymore
[ https://issues.apache.org/jira/browse/SOLR-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeroen Steggink resolved SOLR-7340. --- Resolution: Won't Fix Never mind, SolrJ was version 4.10, which explains why it can't read from Solr 5. I guess this isn't backwards compatible. Collations output changed for suggestions, SolrJ can't read collations anymore -- Key: SOLR-7340 URL: https://issues.apache.org/jira/browse/SOLR-7340 Project: Solr Issue Type: Bug Components: SolrJ Affects Versions: 5.0 Reporter: Jeroen Steggink The output for the collations changed in the suggester. SolrJ can no longer read the collations part and always returns null. Version 4.10: {code:JavaScript} { "responseHeader":{ "status":0, "QTime":3}, "spellcheck":{ "suggestions":[ "innovatie",{ "numFound":9, "startOffset":0, "endOffset":9, "suggestion":["innovatieagenda", "innovatieagenda’s", "innovatieattaché", "innovatiebehoefte", "innovatiebeleid", "innovatiebox", "innovatiecatalogus", "innovatief", "innovatiefinanciering"]}, "correctlySpelled",false, "collation","innovatieagenda", "collation","innovatieagenda’s", "collation","innovatieattaché", "collation","innovatiebehoefte", "collation","innovatiebeleid", "collation","innovatiebox", "collation","innovatiecatalogus", "collation","innovatief", "collation","innovatiefinanciering"]}, "response":{"numFound":1070,"start":0,"docs":[] }} {code} vs version 5.0.0 {code:JavaScript} { "responseHeader":{ "status":0, "QTime":3}, "spellcheck":{ "suggestions":[ "innovatie",{ "numFound":9, "startOffset":0, "endOffset":9, "suggestion":["innovatieagenda", "innovatieagenda’s", "innovatieattaché", "innovatiebehoefte", "innovatiebeleid", "innovatiebox", "innovatiecatalogus", "innovatief", "innovatiefinanciering"]}], "correctlySpelled":false, "collations":[ "collation","innovatieagenda", "collation","innovatieagenda’s", "collation","innovatieattaché", "collation","innovatiebehoefte", "collation","innovatiebeleid", "collation","innovatiebox", "collation","innovatiecatalogus", "collation","innovatief", "collation","innovatiefinanciering"]}, "response":{"numFound":1070,"start":0,"docs":[] }} {code} -- This message 
was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
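The layout change reported in SOLR-7340 can be made concrete with a small sketch. It does not show how SolrJ actually parses the response; it only assumes, as the two JSON samples suggest, that collations arrive as a flat key/value list and that a 4.10-era client looks for them inside the suggestions list. The class name, helper name, and shortened sample data are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;

public class CollationSketch {

    /** Collects the value following each "collation" key in a flat [key, value, key, value, ...] list. */
    public static List<String> collations(List<?> flat) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < flat.size(); i += 2) {
            if ("collation".equals(flat.get(i))) {
                out.add(String.valueOf(flat.get(i + 1)));
            }
        }
        return out;
    }

    /** 4.10 layout: collations are mixed into the suggestions list. */
    public static List<Object> suggestions410() {
        return Arrays.<Object>asList("innovatie", new HashMap<String, Object>(),
                "correctlySpelled", false,
                "collation", "innovatieagenda",
                "collation", "innovatief");
    }

    /** 5.0 layout: the suggestions list no longer carries collations. */
    public static List<Object> suggestions50() {
        return Arrays.<Object>asList("innovatie", new HashMap<String, Object>());
    }

    /** 5.0 layout: collations live in their own list next to suggestions. */
    public static List<Object> collations50() {
        return Arrays.<Object>asList("collation", "innovatieagenda", "collation", "innovatief");
    }

    public static void main(String[] args) {
        System.out.println("4.10 suggestions: " + collations(suggestions410()));
        System.out.println("5.0 suggestions:  " + collations(suggestions50()));
        System.out.println("5.0 collations:   " + collations(collations50()));
    }
}
```

A reader that only inspects the suggestions list finds nothing in a 5.0 response, which is consistent with the older client reporting no collations.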
[jira] [Created] (SOLR-7340) Collations output changed for suggestions, SolrJ can't read collations anymore
Jeroen Steggink created SOLR-7340: - Summary: Collations output changed for suggestions, SolrJ can't read collations anymore Key: SOLR-7340 URL: https://issues.apache.org/jira/browse/SOLR-7340 Project: Solr Issue Type: Bug Components: SolrJ Affects Versions: 5.0 Reporter: Jeroen Steggink The output for the collations changed in the suggester. SolrJ can no longer read the collations part and always returns null. Version 4.10: {code:JavaScript} { "responseHeader":{ "status":0, "QTime":3}, "spellcheck":{ "suggestions":[ "innovatie",{ "numFound":9, "startOffset":0, "endOffset":9, "suggestion":["innovatieagenda", "innovatieagenda’s", "innovatieattaché", "innovatiebehoefte", "innovatiebeleid", "innovatiebox", "innovatiecatalogus", "innovatief", "innovatiefinanciering"]}, "correctlySpelled",false, "collation","innovatieagenda", "collation","innovatieagenda’s", "collation","innovatieattaché", "collation","innovatiebehoefte", "collation","innovatiebeleid", "collation","innovatiebox", "collation","innovatiecatalogus", "collation","innovatief", "collation","innovatiefinanciering"]}, "response":{"numFound":1070,"start":0,"docs":[] }} {code} vs version 5.0.0 {code:JavaScript} { "responseHeader":{ "status":0, "QTime":3}, "spellcheck":{ "suggestions":[ "innovatie",{ "numFound":9, "startOffset":0, "endOffset":9, "suggestion":["innovatieagenda", "innovatieagenda’s", "innovatieattaché", "innovatiebehoefte", "innovatiebeleid", "innovatiebox", "innovatiecatalogus", "innovatief", "innovatiefinanciering"]}], "correctlySpelled":false, "collations":[ "collation","innovatieagenda", "collation","innovatieagenda’s", "collation","innovatieattaché", "collation","innovatiebehoefte", "collation","innovatiebeleid", "collation","innovatiebox", "collation","innovatiecatalogus", "collation","innovatief", "collation","innovatiefinanciering"]}, "response":{"numFound":1070,"start":0,"docs":[] }} {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6386) TestIndexWriterForceMerge still unreliable in NIGHTLY
[ https://issues.apache.org/jira/browse/LUCENE-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392688#comment-14392688 ] Michael McCandless commented on LUCENE-6386: If there are more levels we don't use any more space, because we will fully delete the intermediate levels after they are merged. So in your example, once s13 is done (and made into .cfs if necessary) we delete s9 and s10, as long as no NRT reader was opened holding a reference to them (or, a commit). TestIndexWriterForceMerge still unreliable in NIGHTLY - Key: LUCENE-6386 URL: https://issues.apache.org/jira/browse/LUCENE-6386 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: Trunk, 5.1 Attachments: LUCENE-6386.patch Discovered by ryan beasting (trunk): ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII {noformat} [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestIndexWriterForceMerge -Dtests.method=testForceMergeTempSpaceUsage -Dtests.seed=DC9ADB74850A581B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.locale=sr__#Latn -Dtests.timezone=Indian/Chagos -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] FAILURE 1.20s | TestIndexWriterForceMerge.testForceMergeTempSpaceUsage [junit4] Throwable #1: java.lang.AssertionError: forceMerge used too much temporary space: starting usage was 291570 bytes; final usage was 262469 bytes; max temp usage was 1079501 but should have been 874710 (= 3X starting usage), BEFORE= [junit4] _u.scf 146329 [junit4] _u.si 635 [junit4] |- (inside compound file) _u.fld 2214 [junit4] |- (inside compound file) _u.inf 392 [junit4] |- (inside compound file) _u.len 2381 [junit4] |- (inside compound file) _u.pst 36758 [junit4] 
|- (inside compound file) _u.vec 104144 [junit4] _s.pst 1338 [junit4] _s.inf 392 [junit4] _s.fld 94 [junit4] _s.len 221 [junit4] _s.vec 3744 [junit4] _s.si 624 [junit4] _t.fld 94 [junit4] _t.len 221 [junit4] _t.pst 1338 [junit4] _t.inf 392 [junit4] _t.vec 3744 [junit4] _t.si 624 [junit4] _v.fld 94 [junit4] _v.pst 1338 [junit4] _v.inf 392 [junit4] _v.vec 3744 [junit4] _v.si 624 [junit4] _v.len 221 [junit4] _w.len 221 [junit4] _w.pst 1338 [junit4] _w.inf 392 [junit4] _w.fld 94 [junit4] _w.si 624 [junit4] _w.vec 3744 [junit4] _x.vec 3744 [junit4] _x.inf 392 [junit4] _x.pst 1338 [junit4] _x.fld 94 [junit4] _x.si 624 [junit4] _x.len 221 [junit4] _y.fld 94 [junit4] _y.pst 1338 [junit4] _y.inf 392 [junit4] _y.si 624 [junit4] _y.vec 3744 [junit4] _y.len 221 [junit4] _z.fld 94 [junit4] _z.pst 1338 [junit4] _z.inf 392 [junit4] _z.len 221 [junit4] _z.vec 3744 [junit4] _z.si 624 [junit4] _10.si 630 [junit4] _10.fld 94 [junit4] _10.pst 1338 [junit4] _10.inf 392 [junit4] _10.vec 3744 [junit4] _10.len 221 [junit4] _11.len 221 [junit4] _11.si 630 [junit4] _11.vec 3744 [junit4] _11.pst 1338 [junit4] _11.inf 392 [junit4] _11.fld 94 [junit4] _12.vec 3744 [junit4] _12.si 630 [junit4] _12.len 221 [junit4] _12.fld 94 [junit4] _12.pst 1338 [junit4] _12.inf 392 [junit4] _13.fld 94 [junit4] _13.vec
[jira] [Commented] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration
[ https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392770#comment-14392770 ]

Mark Miller commented on SOLR-7338:
-----------------------------------

Anyway, we can spin that off into a new issue if someone wants to try and refactor it. I just don't yet get the impetus for it. Onlookers feel free to chime in here until/unless a new issue is made.

bq. but here's the unit test I started working on

Cool.

A reloaded core will never register itself as active after a ZK session expiration

Key: SOLR-7338
URL: https://issues.apache.org/jira/browse/SOLR-7338
Project: Solr
Issue Type: Bug
Components: SolrCloud
Reporter: Timothy Potter
Assignee: Mark Miller
Attachments: SOLR-7338.patch, SOLR-7338_test.patch

If a collection gets reloaded, then a core's isReloaded flag is always true. If a core experiences a ZK session expiration after a reload, then it won't ever be able to set itself to active because of the check in {{ZkController#register}}:
{code}
UpdateLog ulog = core.getUpdateHandler().getUpdateLog();
if (!core.isReloaded() && ulog != null) {
  // disable recovery in case shard is in construction state (for shard splits)
  Slice slice = getClusterState().getSlice(collection, shardId);
  if (slice.getState() != Slice.State.CONSTRUCTION || !isLeader) {
    Future<UpdateLog.RecoveryInfo> recoveryFuture = core.getUpdateHandler().getUpdateLog().recoverFromLog();
    if (recoveryFuture != null) {
      log.info("Replaying tlog for " + ourUrl + " during startup... NOTE: This can take a while.");
      recoveryFuture.get(); // NOTE: this could potentially block for
      // minutes or more!
      // TODO: public as recovering in the mean time?
      // TODO: in the future we could do peersync in parallel with recoverFromLog
    } else {
      log.info("No LogReplay needed for core=" + core.getName() + " baseURL=" + baseUrl);
    }
  }
  boolean didRecovery = checkRecovery(coreName, desc, recoverReloadedCores, isLeader, cloudDesc, collection, coreZkNodeName, shardId, leaderProps, core, cc);
  if (!didRecovery) {
    publish(desc, ZkStateReader.ACTIVE);
  }
}
{code}

I can easily simulate this on trunk by doing:
{code}
bin/solr -c -z localhost:2181
bin/solr create -c foo
bin/post -c foo example/exampledocs/*.xml
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=foo"
kill -STOP PID
sleep PAUSE_SECONDS
kill -CONT PID
{code}
Where PID is the process ID of the Solr node.

Here are the logs after the CONT command. As you can see below, the core never gets to setting itself as active again.

I think the bug is that the isReloaded flag needs to get set back to false once the reload is successful, but I don't understand what this flag is needed for anyway???
{code}
INFO - 2015-04-01 17:28:50.962; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@5519dba0 name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent state:Disconnected type:None path:null path:null type:None
INFO - 2015-04-01 17:28:50.963; org.apache.solr.common.cloud.ConnectionManager; zkClient has disconnected
INFO - 2015-04-01 17:28:51.107; org.apache.solr.common.cloud.ConnectionManager; Watcher org.apache.solr.common.cloud.ConnectionManager@5519dba0 name:ZooKeeperConnection Watcher:localhost:2181 got event WatchedEvent state:Expired type:None path:null path:null type:None
INFO - 2015-04-01 17:28:51.107; org.apache.solr.common.cloud.ConnectionManager; Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper...
INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer; Overseer (id=93579450724974592-192.168.1.2:8983_solr-n_00) closing INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$WatcherImpl; A node got unwatched for /configs/foo INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.Overseer$ClusterStateUpdater; Overseer Loop exiting : 192.168.1.2:8983_solr INFO - 2015-04-01 17:28:51.109; org.apache.solr.cloud.OverseerCollectionProcessor; According to ZK I (id=93579450724974592-192.168.1.2:8983_solr-n_00) am no longer a leader. INFO - 2015-04-01 17:28:51.108; org.apache.solr.cloud.ZkController$4; Running listeners for /configs/foo INFO - 2015-04-01 17:28:51.109; org.apache.solr.common.cloud.DefaultConnectionStrategy; Connection expired - starting a new one... INFO - 2015-04-01 17:28:51.109;
[jira] [Commented] (LUCENE-3922) Add Japanese Kanji number normalization to Kuromoji
[ https://issues.apache.org/jira/browse/LUCENE-3922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392489#comment-14392489 ]

Ramkumar Aiyengar commented on LUCENE-3922:
-------------------------------------------

[~cm], just got interested in this patch. Any reason this hasn't gone to branch_5x yet?

Add Japanese Kanji number normalization to Kuromoji

Key: LUCENE-3922
URL: https://issues.apache.org/jira/browse/LUCENE-3922
Project: Lucene - Core
Issue Type: New Feature
Components: modules/analysis
Affects Versions: 4.0-ALPHA
Reporter: Kazuaki Hiraga
Assignee: Christian Moen
Labels: features
Fix For: 5.1
Attachments: LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch, LUCENE-3922.patch

Japanese people use Kanji numerals instead of Arabic numerals for writing prices, addresses and so on, e.g. 12万4800円 (124,800 JPY), 二番町三ノ二 (3-2 Nibancho) and 十二月 (December). So, we would like to normalize those Kanji numerals to Arabic numerals (I don't think we need to have a capability to normalize to Kanji numerals).
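As a rough illustration of the normalization requested in LUCENE-3922 above, the following sketch converts a string made up only of numerals (Kanji digits, Kanji units, or ASCII digits) into a number. It is a simplified assumption of how such a conversion can work and is not the attached patch; the class name KanjiNumbers and its parse method are hypothetical. The Kanji characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
/** Hypothetical sketch, not the LUCENE-3922 patch: parses a pure numeral string. */
final class KanjiNumbers {

    // Kanji digits zero through nine: U+3007, then U+4E00 (one) ... U+4E5D (nine).
    private static final String DIGITS =
        "\u3007\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d";

    static long parse(String s) {
        long total = 0;    // finished groups of 10^4, 10^8, 10^12
        long section = 0;  // current group below 10^4
        long number = 0;   // pending run of digits
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int kanjiDigit = DIGITS.indexOf(c);
            if (c >= '0' && c <= '9') {
                number = number * 10 + (c - '0');
            } else if (kanjiDigit >= 0) {
                number = number * 10 + kanjiDigit;
            } else if (c == '\u5341' || c == '\u767e' || c == '\u5343') {
                // U+5341 = ten, U+767E = hundred, U+5343 = thousand
                long unit = c == '\u5341' ? 10 : c == '\u767e' ? 100 : 1000;
                section += (number == 0 ? 1 : number) * unit;  // bare unit counts as 1 x unit
                number = 0;
            } else if (c == '\u4e07' || c == '\u5104' || c == '\u5146') {
                // U+4E07 = 10^4, U+5104 = 10^8, U+5146 = 10^12
                long unit = c == '\u4e07' ? 10_000L
                          : c == '\u5104' ? 100_000_000L : 1_000_000_000_000L;
                long group = section + number;
                total += (group == 0 ? 1 : group) * unit;
                section = 0;
                number = 0;
            } else {
                throw new IllegalArgumentException("not a numeral character: " + c);
            }
        }
        return total + section + number;
    }
}
```

With this sketch, the numeral part of 十二月 (十二) parses to 12 and the numeral part of 12万4800円 (12万4800) parses to 124800. Finding the numeral span inside running text such as 二番町三ノ二 is a separate tokenization problem that the sketch does not attempt.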
Re: [DISCUSS] Change Query API to make queries immutable in 6.0
On Thu, Apr 2, 2015 at 3:40 AM, Adrien Grand jpou...@gmail.com wrote:

> first make queries immutable up to the boost and then discuss
> if/how/when we should go fully immutable with a new API to change
> boosts?

The “if” part concerns me; I don’t mind it being a separate issue to make the changes more manageable (progress not perfection, and all that). I’m all for the whole shebang. But if others think “no” then…. will it have been worthwhile to do this big change and not go all the way? I think not. Does anyone feel the answer is “no” to make boosts immutable? And if so why?

If nobody comes up with a dissenting opinion to make boosts immutable within a couple days then count me as “+1” to your plans, else “-1” pending that discussion.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
[jira] [Updated] (SOLR-7336) Add State enum to Replica
[ https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera updated SOLR-7336:
-----------------------------
Attachment: SOLR-7336.patch

* Add Replica.getState()
* Remove the state-related constants from ZkStateReader and cut over to Replica.State
* Take a Replica.State where needed, instead of a String.

Tests pass, I think it's ready.

Add State enum to Replica

Key: SOLR-7336
URL: https://issues.apache.org/jira/browse/SOLR-7336
Project: Solr
Issue Type: Improvement
Components: SolrJ
Reporter: Shai Erera
Assignee: Shai Erera
Attachments: SOLR-7336.patch

Following SOLR-7325, this issue adds a State enum to Replica.
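The cutover from string constants to an enum described in SOLR-7336 above can be pictured with a small stand-alone sketch. The enum below is a stand-in, not the class from the patch: the name ReplicaState, the constants other than ACTIVE, and the fromString helper are all assumptions made for illustration.

```java
import java.util.Locale;

/** Illustrative stand-in for a replica state enum; constants other than ACTIVE are assumed. */
enum ReplicaState {
    ACTIVE, RECOVERING, DOWN;

    /** Lower-case form, matching how a string constant might have been stored. */
    @Override
    public String toString() {
        return name().toLowerCase(Locale.ROOT);
    }

    /** Parses the stored string form; null stays null, unknown strings are rejected. */
    static ReplicaState fromString(String s) {
        return s == null ? null : valueOf(s.toUpperCase(Locale.ROOT));
    }
}
```

The practical gain of taking an enum instead of a String is that a value such as "sync", which no replica is ever put into, is rejected at the boundary (valueOf throws IllegalArgumentException) instead of travelling through the code as an unchecked string.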
[jira] [Commented] (SOLR-6709) ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section
[ https://issues.apache.org/jira/browse/SOLR-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392527#comment-14392527 ]

Simon Endele commented on SOLR-6709:
------------------------------------

Thank you guys very much for fixing/reviewing and happy Easter!

ClassCastException in QueryResponse after applying XMLResponseParser on a response containing an expanded section

Key: SOLR-6709
URL: https://issues.apache.org/jira/browse/SOLR-6709
Project: Solr
Issue Type: Bug
Components: SolrJ
Reporter: Simon Endele
Assignee: Varun Thacker
Fix For: Trunk, 5.2
Attachments: SOLR-6709.patch, SOLR-6709.patch, SOLR-6709.patch, test-response.xml

Shouldn't the following code work on the attached input file? It matches the structure of a Solr response with wt=xml.
{code}
import java.io.InputStream;

import org.apache.solr.client.solrj.ResponseParser;
import org.apache.solr.client.solrj.impl.XMLResponseParser;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;
import org.junit.Test;

public class ParseXmlExpandedTest {
  @Test
  public void test() {
    ResponseParser responseParser = new XMLResponseParser();
    InputStream inStream = getClass().getResourceAsStream("test-response.xml");
    NamedList<Object> response = responseParser.processResponse(inStream, "UTF-8");
    QueryResponse queryResponse = new QueryResponse(response, null);
  }
}
{code}
Unexpectedly (for me), it throws a
java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap cannot be cast to java.util.Map
at org.apache.solr.client.solrj.response.QueryResponse.setResponse(QueryResponse.java:126)

Am I missing something, is XMLResponseParser deprecated or something?

We use a setup like this to mock a QueryResponse for unit tests in our service that post-processes the Solr response. Obviously, it works with the javabin format which SolrJ uses internally. But that is not an appropriate format for unit tests, where the response should be human readable.

I think there's some conversion missing in QueryResponse or XMLResponseParser.

Note: The null value supplied as SolrServer argument to the constructor of QueryResponse shouldn't have an effect as the error occurs before the parameter is even used.
[jira] [Commented] (SOLR-7336) Add State enum to Replica
[ https://issues.apache.org/jira/browse/SOLR-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392582#comment-14392582 ]

Shai Erera commented on SOLR-7336:
----------------------------------

Forgot to mention that I also removed ZkStateReader.SYNC which seemed unused except by a test which waited on replicas to be active. But I don't think a replica is put in that state? Also, would appreciate if someone can review the documentation of the Replica.State values.
[jira] [Commented] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration
[ https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392763#comment-14392763 ]

Mark Miller commented on SOLR-7338:
-----------------------------------

bq. isReloaded feels more like isReloading

What's the gain, what's the point, what's the alternative? I don't get that at all. It tells you if the core has been reloaded. This is often useful in things that happen on creating a new SolrCore. Who cares about isReloading? I'm lost. Is it just too difficult to understand what isReloaded means? I'd be more confused with this temporary isReloading call - seems so easy for that to be tricky. isReloaded is so permanent and easy to understand. The core has been reloaded or it hasn't. How the heck does trying to track exactly when the core is actually in the process of reload or not more useful? Anyone else?
[jira] [Commented] (SOLR-7143) MoreLikeThis Query Parser does not handle multiple field names
[ https://issues.apache.org/jira/browse/SOLR-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392608#comment-14392608 ]

Jens Wille commented on SOLR-7143:
----------------------------------

First of all, thanks to Vitaliy for providing patches and to Anshum for picking up this issue. Now I'm wondering what needs to happen to move this forward. I'm new here and don't really know what's expected of me.

I have verified that the latest patch works for this particular issue and I, too, would like to see the query parser functionality brought on par with the handler functionality (parameters and defaults, including an {{mlt.match.include}} equivalent).

However, there's still a problem with the current approach (multiple {{qf}} parameters: yes, comma-separated list: no): Only the latter would work properly with parameter dereferencing:
{code}
curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/A&mlt.fl=name,features'
{code}
Sets {{qf=name,features}}, which is not split into {{name}} and {{features}} with the latest patch.
{code}
curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/A&mlt.fl=name&mlt.fl=features'
{code}
Sets {{qf=name}}, ignores subsequent {{mlt.fl}} parameters.

Please let me know if I can do anything.

MoreLikeThis Query Parser does not handle multiple field names

Key: SOLR-7143
URL: https://issues.apache.org/jira/browse/SOLR-7143
Project: Solr
Issue Type: Bug
Components: query parsers
Affects Versions: 5.0
Reporter: Jens Wille
Assignee: Anshum Gupta
Attachments: SOLR-7143.patch, SOLR-7143.patch

The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return any results when supplied with multiple fields in the {{qf}} parameter. To reproduce within the techproducts example, compare:
{code}
curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A'
curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A'
curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A'
{code}
The first two queries return 8 and 5 results, respectively. The third query doesn't return any results (not even the matched document).

In contrast, the MoreLikeThis Handler works as expected (accounting for the default {{mintf}} and {{mindf}} values in SimpleMLTQParser):
{code}
curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1'
curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1'
curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1'
{code}
After adding the following line to {{example/techproducts/solr/techproducts/conf/solrconfig.xml}}:
{code:language=XML}
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
{code}
The first two queries return 7 and 4 results, respectively (excluding the matched document). The third query returns 7 results, as one would expect.
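The two ways of passing fields discussed in the SOLR-7143 comment above (repeated parameters versus one comma-separated value) can be reconciled by normalizing both into a single field list. The sketch below shows that idea in isolation; FieldLists and splitFields are hypothetical names, and this is not the code from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: flattens repeated and comma-separated field parameters. */
final class FieldLists {

    /** Each argument is one parameter value; a value may itself be a comma-separated list. */
    static List<String> splitFields(String... parameterValues) {
        List<String> fields = new ArrayList<>();
        for (String value : parameterValues) {
            for (String part : value.split(",")) {
                String field = part.trim();
                // Skip empty pieces (e.g. from "name,,features") and repeated names.
                if (!field.isEmpty() && !fields.contains(field)) {
                    fields.add(field);
                }
            }
        }
        return fields;
    }
}
```

Under this scheme a dereferenced value such as name,features and two separate values name and features both end up as the same two-element list, which is the behavior the comment asks for.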
[jira] [Commented] (SOLR-7338) A reloaded core will never register itself as active after a ZK session expiration
[ https://issues.apache.org/jira/browse/SOLR-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392603#comment-14392603 ]

Yonik Seeley commented on SOLR-7338:
------------------------------------

Without looking at the code/patches ;-) I understand what Tim is saying, and agree. isReloaded feels more like isReloading (i.e. it's state that is only temporarily used to make other decisions during initialization.) I don't know how hard it would be to factor out though... prob not worth it.
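The failure mode reported in SOLR-7338 can be reproduced in a toy model. The sketch below is not Solr code: every name in it (ReloadFlagModel, Core, registerAsQuoted) is hypothetical, and it only mirrors the bracing of the snippet quoted in the issue, where publishing ACTIVE happens inside the branch guarded by the reloaded flag.

```java
/** Toy model of the control flow quoted in SOLR-7338; all names are hypothetical. */
final class ReloadFlagModel {

    static final class Core {
        final boolean reloaded;           // set when the core was reloaded, never cleared
        String publishedState = "down";   // what the rest of the cluster would see
        Core(boolean reloaded) {
            this.reloaded = reloaded;
        }
    }

    /** Called on startup and again after a ZooKeeper session expiration. */
    static void registerAsQuoted(Core core, boolean hasUpdateLog) {
        if (!core.reloaded && hasUpdateLog) {
            // In the quoted snippet, tlog replay and the recovery check happen here,
            // and publishing ACTIVE sits inside this same branch.
            core.publishedState = "active";
        }
        // For a reloaded core nothing is published, so the state stays as it was.
    }
}
```

A core constructed with reloaded set to true stays "down" no matter how often registerAsQuoted runs, which matches the report that the core never sets itself active again after the session expires. Whether the flag should be reset or the publish step moved out of the branch is the question the thread is debating; the sketch takes no position on it.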
[jira] [Commented] (LUCENE-6385) NullPointerException from Highlighter.getBestFragment()
[ https://issues.apache.org/jira/browse/LUCENE-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392624#comment-14392624 ]

Ramkumar Aiyengar commented on LUCENE-6385:
-------------------------------------------

[~mikemccand], [~thelabdude]: Looks like this is a blocker for 5.1

NullPointerException from Highlighter.getBestFragment()

Key: LUCENE-6385
URL: https://issues.apache.org/jira/browse/LUCENE-6385
Project: Lucene - Core
Issue Type: Bug
Components: modules/highlighter
Affects Versions: 5.1
Reporter: Terry Smith
Attachments: LUCENE-6385.patch

When testing against the 5.1 nightly snapshots I've come across a NullPointerException in highlighting when nothing would be highlighted. This does not happen with 5.0.
{noformat}
java.lang.NullPointerException
  at __randomizedtesting.SeedInfo.seed([3EDC6EB0FA552B34:9971866E394F5FD0]:0)
  at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extractWeightedSpanTerms(WeightedSpanTermExtractor.java:311)
  at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.extract(WeightedSpanTermExtractor.java:151)
  at org.apache.lucene.search.highlight.WeightedSpanTermExtractor.getWeightedSpanTerms(WeightedSpanTermExtractor.java:515)
  at org.apache.lucene.search.highlight.QueryScorer.initExtractor(QueryScorer.java:219)
  at org.apache.lucene.search.highlight.QueryScorer.init(QueryScorer.java:187)
  at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:196)
  at org.apache.lucene.search.highlight.Highlighter.getBestFragments(Highlighter.java:156)
  at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:102)
  at org.apache.lucene.search.highlight.Highlighter.getBestFragment(Highlighter.java:80)
  at org.apache.lucene.search.highlight.MissesTest.testPhraseQuery(MissesTest.java:50)
{noformat}
I've written a small unit test and used git bisect to narrow the regression to the following commit:
{noformat}
commit 24e4eefaefb1837d1d4fa35f7669c2b264f872ac
Author: Michael McCandless mikemcc...@apache.org
Date: Tue Mar 31 08:48:28 2015 +

LUCENE-6308: cutover Spans to DISI, reuse ConjunctionDISI, use two-phased iteration

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_5x@1670273 13f79535-47bb-0310-9956-ffa450edef68
{noformat}
The problem looks quite simple, WeightedSpanTermExtractor.extractWeightedSpanTerms() needs an early return if SpanQuery.getSpans() returns null. All other callers check against this.

Unit test and fix (against the regressed commit) attached.
[jira] [Comment Edited] (SOLR-7143) MoreLikeThis Query Parser does not handle multiple field names
[ https://issues.apache.org/jira/browse/SOLR-7143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392608#comment-14392608 ] Jens Wille edited comment on SOLR-7143 at 4/2/15 12:16 PM: --- First of all, thanks to Vitaliy for providing patches and to Anshum for picking up this issue. Now I'm wondering what needs to happen to move this forward. I'm new here and don't really know what's expected of me. I have verified that the latest patch works for this particular issue and I, too, would like to see the query parser functionality brought on par with the handler functionality (parameters and defaults, including an -{{mlt.match.include}}-+exclude current document from results+ equivalent). However, there's still a problem with the current approach (multiple {{qf}} parameters: yes, comma-separated list: no): Only the latter would work properly with parameter dereferencing: {code} curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/Amlt.fl=name,features' {code} Sets {{qf=name,features}}, which is not split into {{name}} and {{features}} with the latest patch. {code} curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/Amlt.fl=namemlt.fl=features' {code} Sets {{qf=name}}, ignores subsequent {{mlt.fl}} parameters. Please let me know if I can do anything. was (Author: blackwinter): First of all, thanks to Vitaliy for providing patches and to Anshum for picking up this issue. Now I'm wondering what needs to happen to move this forward. I'm new here and don't really know what's expected of me. I have verified that the latest patch works for this particular issue and I, too, would like to see the query parser functionality brought on par with the handler functionality (parameters and defaults, including an {{mlt.match.include}} equivalent). 
However, there's still a problem with the current approach (multiple {{qf}} parameters: yes, comma-separated list: no): Only the latter would work properly with parameter dereferencing: {code} curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/Amlt.fl=name,features' {code} Sets {{qf=name,features}}, which is not split into {{name}} and {{features}} with the latest patch. {code} curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/Amlt.fl=namemlt.fl=features' {code} Sets {{qf=name}}, ignores subsequent {{mlt.fl}} parameters. Please let me know if I can do anything. MoreLikeThis Query Parser does not handle multiple field names -- Key: SOLR-7143 URL: https://issues.apache.org/jira/browse/SOLR-7143 Project: Solr Issue Type: Bug Components: query parsers Affects Versions: 5.0 Reporter: Jens Wille Assignee: Anshum Gupta Attachments: SOLR-7143.patch, SOLR-7143.patch The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return any results when supplied with multiple fields in the {{qf}} parameter. To reproduce within the techproducts example, compare: {code} curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A' curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A' curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A' {code} The first two queries return 8 and 5 results, respectively. The third query doesn't return any results (not even the matched document). 
In contrast, the MoreLikeThis Handler works as expected (accounting for the default {{mintf}} and {{mindf}} values in SimpleMLTQParser):

{code}
curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1'
curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1'
curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1'
{code}

After adding the following line to {{example/techproducts/solr/techproducts/conf/solrconfig.xml}}:

{code:language=XML}
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
{code}

The first two queries return 7 and 4 results, respectively (excluding the matched document). The third query returns 7 results, as one would expect.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
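The field-list handling asked for in SOLR-7143 above can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch and not Solr's API: the class {{QfFieldSplitter}} and the method {{splitFields}} are hypothetical names, and the sketch assumes every {{qf}} value (from repeated parameters or from a dereferenced comma-separated list such as {{$mlt.fl}}) is already available as a plain string.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; this class does not exist in Solr.
final class QfFieldSplitter {
    private QfFieldSplitter() {}

    /**
     * Flattens all supplied qf values into distinct field names.
     * Each value may be a single field ("name") or a comma-separated
     * list ("name,features"); first-seen order is preserved.
     */
    static List<String> splitFields(String... qfValues) {
        Set<String> fields = new LinkedHashSet<>();
        if (qfValues != null) {
            for (String value : qfValues) {
                if (value == null) {
                    continue; // skip missing values
                }
                for (String field : value.split(",")) {
                    String trimmed = field.trim();
                    if (!trimmed.isEmpty()) {
                        fields.add(trimmed);
                    }
                }
            }
        }
        return new ArrayList<>(fields);
    }

    public static void main(String[] args) {
        // One comma-separated value, as produced by qf=$mlt.fl with mlt.fl=name,features
        System.out.println(splitFields("name,features"));
        // Two separate values, as produced by repeated qf parameters
        System.out.println(splitFields("name", "features"));
    }
}
```

Both calls in {{main}} yield the same two field names, which is the behaviour the comment asks for: a comma-separated list and repeated parameters should be treated alike.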
Re: [DISCUSS] Change Query API to make queries immutable in 6.0
If we were designing things from scratch again, would boost really be on Query, or would it be on BooleanClause? Just throwing that out there... although it may make it easier to implement immutable queries (and perhaps make more sense too), it may also be too big of a change to be worth while.

-Yonik

On Thu, Apr 2, 2015 at 9:27 AM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote:

On Thu, Apr 2, 2015 at 3:40 AM, Adrien Grand jpou...@gmail.com wrote:

first make queries immutable up to the boost and then discuss if/how/when we should go fully immutable with a new API to change boosts?

The “if” part concerns me; I don’t mind it being a separate issue to make the changes more manageable (progress not perfection, and all that). I’m all for the whole shebang. But if others think “no” then…. will it have been worthwhile to do this big change and not go all the way? I think not. Does anyone feel the answer is “no” to make boosts immutable? And if so why?

If nobody comes up with a dissenting opinion to make boosts immutable within a couple days then count me as “+1” to your plans, else “-1” pending that discussion.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley
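To make the "boost on the clause rather than on the query" idea from this thread concrete, here is a rough sketch. All types ({{SketchTermQuery}}, {{SketchClause}}, {{SketchBooleanQuery}}) are hypothetical and are not Lucene classes; the sketch only assumes that a query holds no boost of its own and that changing a boost produces a new clause instead of mutating anything.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable leaf query; NOT Lucene's TermQuery. It carries no boost.
final class SketchTermQuery {
    private final String field;
    private final String text;

    SketchTermQuery(String field, String text) {
        this.field = field;
        this.text = text;
    }

    String field() { return field; }
    String text() { return text; }
}

// Hypothetical clause that owns the boost; NOT Lucene's BooleanClause.
final class SketchClause {
    private final SketchTermQuery query;
    private final float boost;

    SketchClause(SketchTermQuery query, float boost) {
        this.query = query;
        this.boost = boost;
    }

    SketchTermQuery query() { return query; }
    float boost() { return boost; }

    // Returns a new clause with a different boost; the wrapped query is shared, not copied.
    SketchClause withBoost(float newBoost) {
        return new SketchClause(query, newBoost);
    }
}

// Hypothetical immutable container of clauses.
final class SketchBooleanQuery {
    private final List<SketchClause> clauses;

    SketchBooleanQuery(List<SketchClause> clauses) {
        // Defensive copy so later changes to the caller's list are not visible here.
        this.clauses = Collections.unmodifiableList(new ArrayList<>(clauses));
    }

    List<SketchClause> clauses() { return clauses; }
}
```

With this shape, the same query instance can appear in several clauses with different boosts, and nothing reachable from a constructed {{SketchBooleanQuery}} can change afterwards. Whether that is worth the API churn is exactly the open question in the thread.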