[jira] [Commented] (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
[ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083884#comment-14083884 ] Shai Erera commented on SOLR-912: - Resurrecting this issue, I reviewed NamedList and I don't understand why it has to be so complicated:
* Its {{T}} generic causes nothing but TONS of warnings in eclipse, for no good reason. I don't buy the comment made on this issue that it's better to make it generic than not. A generic API which isn't actually used generically is a bad API, IMO. From what I can see, NamedList is not supposed to be generic at all, as its main purpose is to let you store a heterogeneous list of {{name,value}} pairs, where the name is always a String and the type of each {{value}} may differ. If we want to make it more convenient for people who know, e.g., that all values are Strings, we can add sugar methods like {{getInt(), getString()...}}. I've also briefly reviewed some classes that use NamedList (outside of tests), and none seem to rely on {{T}} at all. So I'd rather we remove that generic from the API signature.
* There is what seems to be a totally redundant {{SimpleOrderedMap}} class, which has contradicting documentation in its class javadocs:
** _a {{NamedList}} where access by key is more important than maintaining order_
** _This class does not provide efficient lookup by key_
But the class doesn't add any additional data structures, contains only 3 ctors which delegate as-is to NamedList, and offers a clone() which is identical to NamedList.clone(). Yet there are 574 references to it (per eclipse) ... I think this class is just confusing and has to go away.
Leaving performance aside for a second, NamedList could simply hold an internal {{Map<String,List<Object>>}} to enable efficient access by key, removal of all values of a key, access to a key's values in order, etc. It wouldn't allow accessing the {{name,value}} pairs by position though, i.e. {{getVal(i)}}. I don't know how important that functionality is ... i.e., if we replaced it with a {{namesIterator()}}, would it not allow roughly the same thing? I'm fairly sure it would, but there are so many uses of NamedList across the Solr code base that I might be missing a case which won't like it. So I'd like to ask the Solr folks who know this code better than me -- how important is {{getName/Val(i)}}?
Now back to performance for a sec: in order to not always allocate a {{List<Object>}} when NamedList is used to hold only one value per name, we can either:
* Use Collections.singletonList() on the first _add_, and switch to a concrete List only on the second _add_.
* Use an {{Object[]}}; it's less expensive than a List object.
* Use a {{Map<String,Object>}} internally and do instanceof checks on add/get as appropriate.
BUT, if accessing/removing values by name is not important and it's OK for get(i) to run in O(N), we can simply simplify the class, like Yonik's proposal above, to hold an Object[] array (instead of a List). I think we should remove the generic either way.
Maybe we should break this down into 3 issues:
* Get rid of SimpleOrderedMap -- if it's important to keep it in 4x, I can deprecate it and move all uses of it to NamedList directly.
* Remove the generics from NamedList's API. We can add sugar getters for specific types if we want.
* Simplify NamedList's internal implementation.
On the performance side -- how critical is NamedList on the execution path? I don't like micro-benchmarks too much, so if NamedList is only a fraction of an entire execution path, I'd rather it be a tad slower but readable and easier to use/maintain than overly complicated only to buy us a few nanos in the overall request.
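For illustration, the single-value optimization discussed above could be sketched as follows. This is a hypothetical sketch, not Solr's actual NamedList; the class and method names are mine, and a real implementation would still need to distinguish a caller-supplied List value from the promoted container:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: store a single value per name directly, and only
// allocate a concrete List when a second value is added under the same name.
class CompactNamedList {
    private final Map<String, Object> values = new LinkedHashMap<>();

    @SuppressWarnings("unchecked")
    void add(String name, Object value) {
        if (!values.containsKey(name)) {
            values.put(name, value);                  // first add: no List allocated
        } else {
            Object existing = values.get(name);
            if (existing instanceof List) {
                ((List<Object>) existing).add(value); // already promoted
            } else {
                List<Object> list = new ArrayList<>(); // second add: promote
                list.add(existing);
                list.add(value);
                values.put(name, list);
            }
        }
    }

    // Returns all values stored under the name, in insertion order.
    // Caveat: a caller-supplied List value would be indistinguishable from
    // the promoted container in this naive sketch.
    @SuppressWarnings("unchecked")
    List<Object> getAll(String name) {
        if (!values.containsKey(name)) return Collections.emptyList();
        Object v = values.get(name);
        return v instanceof List ? (List<Object>) v : Collections.singletonList(v);
    }
}
```

The Collections.singletonList() variant from the first bullet would behave much the same: the singleton is immutable, so the second add has to copy into a growable list either way.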
org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList Key: SOLR-912 URL: https://issues.apache.org/jira/browse/SOLR-912 Project: Solr Issue Type: Improvement Components: search Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies Reporter: Karthik K Priority: Minor Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch Original Estimate: 72h Remaining Estimate: 72h The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for keys and values (keys are always Strings, while values correspond to generics). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
[ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083888#comment-14083888 ] Shai Erera commented on SOLR-912: - One other thing: how important is it to be able to store {{null}} names? I haven't dug deep through the code -- do we actually use it? This doesn't prevent us from using a Map internally, as we can use our own key, something like $$NULL_STRING!! (or pick some other constant) and map the null-name requests to this key. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
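The sentinel-key workaround described above could look roughly like this. The class name and sentinel constant are made up for illustration (java.util.HashMap happens to accept null keys, but the mapping keeps the structure independent of the backing Map implementation):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: map a null name onto a private sentinel key so the
// backing Map never sees null, even if a Map implementation that rejects
// null keys is swapped in later.
class NullTolerantNames {
    // Sentinel for null names; the leading \u0000 makes collisions with
    // real names unlikely.
    private static final String NULL_NAME = "\u0000$$NULL_STRING$$";
    private final Map<String, Object> map = new HashMap<>();

    private static String key(String name) {
        return name == null ? NULL_NAME : name;
    }

    void put(String name, Object value) {
        map.put(key(name), value);
    }

    Object get(String name) {
        return map.get(key(name));
    }
}
```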
[jira] [Commented] (SOLR-6191) Self Describing SearchComponents, RequestHandlers, params. etc.
[ https://issues.apache.org/jira/browse/SOLR-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083889#comment-14083889 ] Noble Paul commented on SOLR-6191: -- I guess it is better to use java annotations to describe the params than using interfaces. The problem with interfaces is that you need to instantiate a class to get the details. I shall take another stab at this and post a patch. Self Describing SearchComponents, RequestHandlers, params. etc. --- Key: SOLR-6191 URL: https://issues.apache.org/jira/browse/SOLR-6191 Project: Solr Issue Type: Bug Reporter: Vitaliy Zhovtyuk Labels: features Attachments: SOLR-6191.patch We should have self describing parameters for search components, etc. I think we should support UNIX style short and long names and that you should also be able to get a short description of what a parameter does if you ask for INFO on it. For instance, fl could also be fieldList, etc. Also, we should put this into the base classes so that new components can add to it. -- This message was sent by Atlassian JIRA (v6.2#6252)
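The annotations-over-interfaces argument can be sketched as follows. The annotation, handler, and helper names are hypothetical, not from the attached patch; the point is that annotation metadata is readable from the Class object without ever constructing the handler:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical parameter-description annotation; RUNTIME retention is what
// makes it readable via reflection without instantiating the class.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface DescribesParam {
    String name();          // long name, e.g. "fieldList"
    String shortName();     // short name, e.g. "fl"
    String description();
}

@DescribesParam(name = "fieldList", shortName = "fl", description = "fields to return")
class ExamplePingHandler {
    ExamplePingHandler() {
        // Throwing here proves the describe() path never instantiates us.
        throw new IllegalStateException("never constructed");
    }
}

class ParamInspector {
    // Reads the metadata straight off the Class object.
    static String describe(Class<?> handlerClass) {
        DescribesParam p = handlerClass.getAnnotation(DescribesParam.class);
        return p == null ? "(no params)"
                         : p.name() + " (" + p.shortName() + "): " + p.description();
    }
}
```

With the interface approach, the same lookup would require calling a constructor first, which is exactly the problem the comment raises.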
[jira] [Updated] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elran Dvir updated SOLR-5972: - Attachment: SOLR-5972.patch new statistics facet capabilities to StatsComponent facet - limit, sort and missing. Key: SOLR-5972 URL: https://issues.apache.org/jira/browse/SOLR-5972 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Attachments: SOLR-5972.patch, SOLR-5972.patch I thought it would be very useful to enable limiting and sorting the StatsComponent facet response. I chose to implement it in the Stats Component rather than the Analytics component because Analytics doesn't support distributed queries yet. The default for limit is -1, which returns all facet values. The default for sort is no sorting. The default for missing is true. So if you use the stats component exactly as before, the response won't change. If you ask for sort or limit, the missing facet value will be last, as in regular faceting. Sort types supported: min, max, sum and countdistinct for stats fields, and count and index for facet fields (all sort types are lower cased). Sort directions asc and desc are supported. Sorting by multiple fields is supported. 
Our example use case will be employees' monthly salaries. The following query returns the 10 most expensive employees: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum desc&f.employee_name.stats.facet.limit=10 The following query returns the 10 least expensive employees: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary sum asc&f.employee_name.stats.facet.limit=10 The following query returns the employee that got the highest salary ever: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary max desc&f.employee_name.stats.facet.limit=1 The following query returns the employee that got the lowest salary ever: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary min asc&f.employee_name.stats.facet.limit=1 The following query returns the 10 first (lexicographically) employees: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name index asc&f.employee_name.stats.facet.limit=10 The following query returns the 10 employees that have worked for the longest period: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=employee_name count desc&f.employee_name.stats.facet.limit=10 The following query returns the 10 employees whose salaries vary the most: q=*:*&stats=true&stats.field=salary&stats.facet=employee_name&f.employee_name.stats.facet.sort=salary countdistinct desc&f.employee_name.stats.facet.limit=10 Attached is a patch implementing this in StatsComponent. -- This message was sent by Atlassian JIRA (v6.2#6252)
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #669: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/669/ 2 tests failed. FAILED: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: We have a failed SPLITSHARD task Stack Trace: java.lang.AssertionError: We have a failed SPLITSHARD task at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.MultiThreadedOCPTest.testTaskExclusivity(MultiThreadedOCPTest.java:125) at org.apache.solr.cloud.MultiThreadedOCPTest.doTest(MultiThreadedOCPTest.java:71) FAILED: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=3164, name=parallelCoreAdminExecutor-794-thread-6, state=RUNNABLE, group=TGRP-MultiThreadedOCPTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=3164, name=parallelCoreAdminExecutor-794-thread-6, state=RUNNABLE, group=TGRP-MultiThreadedOCPTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([8A4122FBD86B32D7]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:714) at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949) at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1017) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Build Log: [...truncated 53083 lines...] 
BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:490: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:182: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77: Java returned: 1 Total time: 206 minutes 14 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Commented] (SOLR-5972) new statistics facet capabilities to StatsComponent facet - limit, sort and missing.
[ https://issues.apache.org/jira/browse/SOLR-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083916#comment-14083916 ] Elran Dvir commented on SOLR-5972: -- I attached a newer patch with a fix for the calculation of existInDoc for multivalued fields.
[JENKINS] Lucene-Solr-4.x-Windows (64bit/jdk1.8.0_20-ea-b23) - Build # 4135 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/4135/ Java: 64bit/jdk1.8.0_20-ea-b23 -XX:+UseCompressedOops -XX:+UseG1GC 2 tests failed. REGRESSION: org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest.testMergeMultipleRequest Error Message: Test abandoned because suite timeout was reached. Stack Trace: java.lang.Exception: Test abandoned because suite timeout was reached. at __randomizedtesting.SeedInfo.seed([3869897B384A9BD8]:0) FAILED: junit.framework.TestSuite.org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest Error Message: Suite timeout exceeded (>= 720 msec). Stack Trace: java.lang.Exception: Suite timeout exceeded (>= 720 msec). at __randomizedtesting.SeedInfo.seed([3869897B384A9BD8]:0) Build Log: [...truncated 12878 lines...] [junit4] Suite: org.apache.solr.client.solrj.embedded.MergeIndexesEmbeddedTest [junit4] 2 Creating dataDir: C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\build\solr-solrj\test\J0\.\solr.client.solrj.embedded.MergeIndexesEmbeddedTest-3869897B384A9BD8-001\init-core-data-001 [junit4] 2 342428 T662 oas.SolrTestCaseJ4.buildSSLConfig Randomized ssl (true) and clientAuth (true) [junit4] 2 342431 T662 oas.SolrTestCaseJ4.setUp ###Starting testMergeIndexesByCoreName [junit4] 2 342433 T662 oasc.SolrResourceLoader.init new SolrResourceLoader for directory: 'C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\' [junit4] 2 342452 T662 oasc.ConfigSolr.fromFile Loading container configuration from C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\solr.xml [junit4] 2 342460 T662 oasc.CoreContainer.init New CoreContainer 1523286332 [junit4] 2 342460 T662 oasc.CoreContainer.load Loading cores into CoreContainer [instanceDir=C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\] [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting socketTimeout to: 0 [junit4] 2 342461 T662 
oashc.HttpShardHandlerFactory.getParameter Setting urlScheme to: https [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting connTimeout to: 0 [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting maxConnectionsPerHost to: 20 [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting corePoolSize to: 0 [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting maximumPoolSize to: 2147483647 [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting maxThreadIdleTime to: 5 [junit4] 2 342461 T662 oashc.HttpShardHandlerFactory.getParameter Setting sizeOfQueue to: -1 [junit4] 2 342462 T662 oashc.HttpShardHandlerFactory.getParameter Setting fairnessPolicy to: false [junit4] 2 342467 T662 oasu.UpdateShardHandler.init Creating UpdateShardHandler HTTP client with params: socketTimeout=0connTimeout=0retry=false [junit4] 2 342469 T662 oasl.LogWatcher.createWatcher SLF4J impl is org.slf4j.impl.Log4jLoggerFactory [junit4] 2 342469 T662 oasl.LogWatcher.newRegisteredLogWatcher Registering Log Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)] [junit4] 2 342469 T662 oasc.CoreContainer.load Host Name: [junit4] 2 342472 T663 oasc.SolrResourceLoader.init new SolrResourceLoader for directory: 'C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\core0\' [junit4] 2 342474 T664 oasc.SolrResourceLoader.init new SolrResourceLoader for directory: 'C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\core1\' [junit4] 2 342494 T663 oasc.SolrConfig.init Using Lucene MatchVersion: LUCENE_4_10 [junit4] 2 342497 T663 oasc.SolrConfig.init Loaded SolrConfig: solrconfig.xml [junit4] 2 342499 T663 oass.IndexSchema.readSchema Reading Solr Schema from schema.xml [junit4] 2 342501 T663 oass.IndexSchema.readSchema [core0] Schema name=example core zero [junit4] 2 342504 T663 oass.IndexSchema.readSchema default search field in schema is name [junit4] 2 342505 T663 
oass.IndexSchema.readSchema query parser default operator is OR [junit4] 2 342505 T663 oass.IndexSchema.readSchema unique key field: id [junit4] 2 342505 T663 oasc.CoreContainer.create Creating SolrCore 'core0' using configuration from instancedir C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\core0\ [junit4] 2 342505 T663 oasc.SolrCore.initDirectoryFactory solr.StandardDirectoryFactory [junit4] 2 342505 T663 oasc.SolrCore.init [core0] Opening new SolrCore at C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\example\multicore\core0\, dataDir=C:\Users\JenkinsSlave\workspace\Lucene-Solr-4.x-Windows\solr\build\solr-solrj\test\J0\solr.client.solrj.embedded.MergeIndexesEmbeddedTest-3869897B384A9BD8-001\tempDir-001\ [junit4] 2 342505 T663
[jira] [Assigned] (SOLR-6191) Self Describing SearchComponents, RequestHandlers, params. etc.
[ https://issues.apache.org/jira/browse/SOLR-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-6191: Assignee: Noble Paul
[jira] [Updated] (SOLR-6191) Self Describing SearchComponents, RequestHandlers, params. etc.
[ https://issues.apache.org/jira/browse/SOLR-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6191: - Attachment: SOLR-6191.patch This annotates the corresponding class with the parameter info. Take a look at PingRequestHandler.java for an example. There is a coreadmin API called methodinfo which can give you these details, and lookup can be done by the name or the class name.
[jira] [Comment Edited] (SOLR-6191) Self Describing SearchComponents, RequestHandlers, params. etc.
[ https://issues.apache.org/jira/browse/SOLR-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083927#comment-14083927 ] Noble Paul edited comment on SOLR-6191 at 8/3/14 9:14 AM: -- This annotates the corresponding class with the parameter info. Take a look at PingRequestHandler.java for an example. There is a coreadmin API called methodinfo which can give you these details, and lookup can be done by the name or the class name. Example: http://localhost:8983/solr/admin/cores?action=methodinfo&name=/admin/ping&core=collection1&wt=json
[jira] [Created] (SOLR-6315) Remove SimpleOrderedMap
Shai Erera created SOLR-6315: Summary: Remove SimpleOrderedMap Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing javadocs. We should remove it. I'll attach a patch shortly.
[JENKINS] Lucene-Solr-trunk-Linux (64bit/ibm-j9-jdk7) - Build # 10952 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10952/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} 1 tests failed. REGRESSION: org.apache.solr.update.AutoCommitTest.testMaxTime Error Message: Exception during query Stack Trace: java.lang.RuntimeException: Exception during query at __randomizedtesting.SeedInfo.seed([D9FA84AE2CFD88FD:430EF94CB26714C1]:0) at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:719) at org.apache.solr.update.AutoCommitTest.testMaxTime(AutoCommitTest.java:227) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:94) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) at java.lang.reflect.Method.invoke(Method.java:619) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at java.lang.Thread.run(Thread.java:853) Caused by: java.lang.RuntimeException: REQUEST FAILED: xpath=//result[@numFound=0] xml response was: <?xml version="1.0" encoding="UTF-8"?>
[jira] [Updated] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated SOLR-6315: - Attachment: SOLR-6315.patch Simple patch. It's huge because it's a rote search-and-replace, but there's nothing special about it. I ran tests and it seems that they aren't happy with the change though. I think one problem is with JSONResponseWriter, which serializes the response based on the type of NamedList, and then some tests rely on it. I.e. now that no code creates a SimpleOrderedMap, it never serializes the NamedList as {{writeNamedListAsMapWithDups()}} ... So does SimpleOrderedMap exist solely for response writing? If so, we should document that that's its only purpose, and that e.g. there's nothing simple about it. But maybe we can fix JSONResponseWriter somehow ... I'll dig there. If anyone knows this code and has a suggestion, please chime in.
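The coupling described above -- the response writer picking a serialization style from the runtime type of the list -- can be illustrated with a minimal sketch. These classes are simplified stand-ins for illustration only, not the real NamedList, SimpleOrderedMap, or JSONResponseWriter:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for NamedList: an ordered list of (name, value) pairs.
class NamedListSketch {
    final List<Map.Entry<String, Object>> pairs = new ArrayList<>();
    void add(String name, Object val) {
        pairs.add(new AbstractMap.SimpleEntry<>(name, val));
    }
}

// Stand-in for SimpleOrderedMap: adds no data or behavior, it exists
// purely as a type marker that the writer below dispatches on.
class SimpleOrderedMapSketch extends NamedListSketch {}

public class WriterDispatchSketch {
    // Whoever *created* the list decided, via its concrete type, whether
    // it is written JSON-map style or flat style.
    static String write(NamedListSketch nl) {
        boolean asMap = nl instanceof SimpleOrderedMapSketch;
        StringBuilder sb = new StringBuilder();
        sb.append(asMap ? '{' : '[');
        boolean first = true;
        for (Map.Entry<String, Object> e : nl.pairs) {
            if (!first) sb.append(',');
            first = false;
            sb.append('"').append(e.getKey())
              .append(asMap ? "\":" : "\",")   // map: "k":v   flat: "k",v
              .append(e.getValue());
        }
        sb.append(asMap ? '}' : ']');
        return sb.toString();
    }

    public static void main(String[] args) {
        NamedListSketch flat = new NamedListSketch();
        flat.add("status", 0);
        NamedListSketch map = new SimpleOrderedMapSketch();
        map.add("status", 0);
        System.out.println(write(flat)); // ["status",0]
        System.out.println(write(map));  // {"status":0}
    }
}
```

This also shows why a blanket replace of SimpleOrderedMap with NamedList flips every affected chunk of the response from map style to flat style, which is exactly what the tests trip over.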
[jira] [Comment Edited] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083936#comment-14083936 ] Shai Erera edited comment on SOLR-6315 at 8/3/14 10:34 AM: --- Simple patch. It's a rote search-and-replace which touches many places in the code, but the change is really simple. I ran tests and it seems that they aren't happy with the change though. I think one problem is with JSONResponseWriter, which serializes the response based on the type of NamedList, and then some tests rely on it. I.e. now that no code creates a SimpleOrderedMap, it never serializes the NamedList as {{writeNamedListAsMapWithDups()}} ... So does SimpleOrderedMap exist solely for response writing? If so, we should document that that's its only purpose, and that e.g. there's nothing simple about it. But maybe we can fix JSONResponseWriter somehow ... I'll dig there. If anyone knows this code and has a suggestion, please chime in.
[jira] [Updated] (SOLR-6191) Self Describing SearchComponents, RequestHandlers, params. etc.
[ https://issues.apache.org/jira/browse/SOLR-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-6191: - Attachment: SOLR-6191.patch With a testcase I guess [~shalinmangar]'s admin UI enhancements would need this. I would need to annotate other classes. But the core plumbing is here. Self Describing SearchComponents, RequestHandlers, params. etc. --- Key: SOLR-6191 URL: https://issues.apache.org/jira/browse/SOLR-6191 Project: Solr Issue Type: Bug Reporter: Vitaliy Zhovtyuk Assignee: Noble Paul Labels: features Attachments: SOLR-6191.patch, SOLR-6191.patch, SOLR-6191.patch We should have self describing parameters for search components, etc. I think we should support UNIX style short and long names and that you should also be able to get a short description of what a parameter does if you ask for INFO on it. For instance, fl could also be fieldList, etc. Also, we should put this into the base classes so that new components can add to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.8.0) - Build # 1748 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/1748/ Java: 64bit/jdk1.8.0 -XX:-UseCompressedOops -XX:+UseSerialGC 1 tests failed. REGRESSION: org.apache.solr.schema.TestCloudSchemaless.testDistribSearch Error Message: Timeout occured while waiting response from server at: https://127.0.0.1:51192/_d/ec/collection1 Stack Trace: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: https://127.0.0.1:51192/_d/ec/collection1 at __randomizedtesting.SeedInfo.seed([AE68B334D30A2408:2F8E3D2CA4554434]:0) at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:561) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54) at org.apache.solr.schema.TestCloudSchemaless.doTest(TestCloudSchemaless.java:140) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865) at sun.reflect.GeneratedMethodAccessor43.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0_11) - Build # 10831 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10831/ Java: 32bit/jdk1.8.0_11 -server -XX:+UseParallelGC 1 tests failed. REGRESSION: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: Task 3002 did not complete, final state: failed Stack Trace: java.lang.AssertionError: Task 3002 did not complete, final state: failed at __randomizedtesting.SeedInfo.seed([5386848B53BA9EE9:D2600A9324E5FED5]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.MultiThreadedOCPTest.testDeduplicationOfSubmittedTasks(MultiThreadedOCPTest.java:163) at org.apache.solr.cloud.MultiThreadedOCPTest.doTest(MultiThreadedOCPTest.java:72) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867) at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083954#comment-14083954 ] Shai Erera commented on SOLR-6315: -- I thought about it; I think it's odd that whoever creates the NamedList instance needs to decide on the type based on how it's going to be serialized. Likewise for JSONResponseWriter to decide to write a NamedList as map style if it's a SimpleOrderedMap. In fact, I think it's somewhat broken: even if you pass {{json.nl=flat}}, most of the response would still be serialized as a map, because most of the code today creates a SimpleOrderedMap. I think that in fact our default should have been map. Yes, today it's flat, but if you look at SolrTestCaseJ4 it assumes it's a map, and that happens because of that hack around SimpleOrderedMap. Also, because so much code creates SimpleOrderedMap, the effective default is map. I don't know what others feel about changing the default from flat to map (I personally think it's odd for a JSON writer to not use JSON style, that is map, by default), but I don't think we should keep SimpleOrderedMap in the code only because of response writing. So I'd appreciate some guidance here -- we should fix this somehow. If we change the default to map, I think it will let us remove SimpleOrderedMap, and fix the few places in the code which are badly written anyway, as they rely on default behavior that is not well defined (and somewhat arcane). SolrTestCaseJ4 can easily be fixed to assume a Map and not a List. But this changes back-compat, so I'm not sure -- should this be done for trunk only? I hope not, because it's a big change and would complicate future merges with 4x.
Re: Lucene versioning logic
On Sat, Aug 2, 2014 at 12:41 PM, Shai Erera ser...@gmail.com wrote: Another proposal that I made on LUCENE-5859 is to get rid of Version (for Analyzers) and follow the solution we have with Codecs. If an Analyzer changes its runtime behavior, and is e.g. not marked @experimental, it can create a Foo49Analyzer with the new behavior. That way, apps are still safe when they upgrade, since their Foo45Analyzer still exists (but deprecated). And they can always copy a Foo45Analyzer when they upgrade to Lucene 6.0, where it no longer exists... With this approach, there's no single version across the app - it just uses the specific Analyzer impls. But the usability here would be really bad. For codecs there isn't much of a better thing to name them anyway, and codecs are super-expert to change. For analyzers, usability is paramount. I do think it's ok to name _backwards_ compat tokenizer/tokenfilter classes this way. In fact it's already this way in trunk for any back compat *actually doing something*: Lucene43NGramTokenizer, Lucene47WordDelimiterFilter. The Version parameters are just for show, not doing anything!
Re: Lucene versioning logic
Yes, I agree that Foo49Analyzer is an odd name. It would be better if it were named FooAnalyzerWithNoApostrophe, and I'm fine if that Analyzer chose to name its different versions like that. But in the absence of better naming ideas, I proposed Foo49Analyzer. If we already have such Analyzers, then we are in fact implementing that approach, only we didn't make that decision globally. So whether it's odd or not, let's first agree on whether we are willing to have these analyzers in our code base (i.e. w/ the back-compat support). If we do, we can let each Analyzer decide on its naming. Analyzers aren't Codecs, I agree, and sticking the Lucene version in their name is probably not the best thing to do, as the Lucene version is more associated with the index format. But if we cannot come up w/ a better name for a fixed Analyzer, I think the Lucene version there is not that horrible. And it lets us easily remove Version.java. Shai
Re: Lucene versioning logic
You didn't read what I wrote. Read it again.
Re: Lucene versioning logic
Oh, I misread this part I do think its ok to name ... -- replaced do with don't :). So you say that if we have a FooAnalyzer in 4.5 and change its behavior in 4.9, then we add a Foo45Analyzer as a back-compat support, and FooAnalyzer in 4.9 keeps its name, but with different behavior? That means that an app who didn't read CHANGES will be broken upon upgrade, but if it does read CHANGES, it at least has a way to retain desired behavior. So the thing now is whether FooAnalyzer is always _current_ and an app should choose a backwards version of it (if it wants to), vs if FooAnalyzer is _always the same_, and if you want to move forward you have to explicitly use a NewFooAnalyzer? Of course, when FooAnalyzer takes a Version, then an app only needs to change its Version CONSTANT, to get best behavior ... but as you point out, seems like we failed to implement that approach in our code already, which suggests this approach is not intuitive to our committers, so why do we expect our users to understand it ... I am +1 on either of the approaches (both get rid of Version.java). I don't feel bad with asking users to read CHANGES before they upgrade, and it does mean that FooAnalyzer always gives you the best behavior, which is important for new users or if you always re-index. Vs the second approach which always prefers backwards compatibility, and telling users to read the javadocs (and CHANGES) in order to find the best version of FooAnalyzer. There is another issue w/ a global Version CONSTANT, which today we encourage apps to use -- if you use two analyzers, but you want to work with a different Version of each (because of all sorts of reasons), having a global constant is bad. The explicit Foo45Analyzer (or Foo49Analyzer, whichever) lets you mix whichever versions that you want. Shai On Sun, Aug 3, 2014 at 3:02 PM, Robert Muir rcm...@gmail.com wrote: You don't read what i wrote. Read it again. 
On Sun, Aug 3, 2014 at 7:49 AM, Shai Erera ser...@gmail.com wrote:

Yes, I agree that Foo49Analyzer is an odd name. Better if it were named FooAnalyzerWithNoApostrophe, and I'm fine if that Analyzer chose to name its different versions like that. But in the absence of better naming ideas, I proposed Foo49Analyzer. If we already have such Analyzers, then we are in fact implementing that approach, we just didn't make that decision globally. So whether it's odd or not, let's first agree whether we are willing to have these analyzers in our code base (i.e. w/ the back-compat support). If we do, we can let each Analyzer decide on its naming.

Analyzers aren't Codecs, I agree, and sticking the Lucene version in their name is probably not the best thing to do, as the Lucene version is more associated with the index format. But if a fixed Analyzer cannot come up w/ a better name, I think the Lucene version there is not that horrible. And it lets us easily remove Version.java.

Shai

On Sun, Aug 3, 2014 at 2:43 PM, Robert Muir rcm...@gmail.com wrote:

On Sat, Aug 2, 2014 at 12:41 PM, Shai Erera ser...@gmail.com wrote: Another proposal that I made on LUCENE-5859 is to get rid of Version (for Analyzers) and follow the solution we have with Codecs. If an Analyzer changes its runtime behavior, and is e.g. not marked @experimental, it can create a Foo49Analyzer with the new behavior. That way, apps are still safe when they upgrade, since their Foo45Analyzer still exists (but deprecated). And they can always copy a Foo45Analyzer when they upgrade to Lucene 6.0 where it no longer exists... with this approach, there's no single version across the app - it just uses the specific Analyzer impls.

But the usability here would be really bad. For codecs there isn't much better to name them anyway, and codecs are super-expert to change. For analyzers usability is paramount. I do think it's OK to name _backwards_ compat tokenizer/tokenfilter classes this way.
In fact it's already this way in trunk for any back compat *actually doing something*: Lucene43NgramTokenizer, Lucene47WordDelimiterFilter. The Version parameters are just for show, not doing anything!
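The two strategies being debated above can be sketched in a few lines of Java. This is a hypothetical illustration, not actual Lucene code: FooAnalyzer, Foo45Analyzer, and the apostrophe-stripping behavior change are all invented for the example.

```java
// Hypothetical sketch of the two back-compat strategies under discussion.
// None of these are real Lucene classes.

// Strategy 1: one class whose behavior is selected by a Version argument.
enum Version { LUCENE_45, LUCENE_49 }

class VersionedFooAnalyzer {
    private final Version matchVersion;
    VersionedFooAnalyzer(Version matchVersion) { this.matchVersion = matchVersion; }
    String normalize(String term) {
        // Suppose 4.9 changed behavior to strip apostrophes.
        return matchVersion == Version.LUCENE_49 ? term.replace("'", "") : term;
    }
}

// Strategy 2: versioned class names. FooAnalyzer always has the current
// behavior; the deprecated Foo45Analyzer preserves the 4.5 behavior.
class FooAnalyzer {
    String normalize(String term) { return term.replace("'", ""); }
}

@Deprecated
class Foo45Analyzer {
    String normalize(String term) { return term; }
}

public class BackCompatSketch {
    public static void main(String[] args) {
        System.out.println(new VersionedFooAnalyzer(Version.LUCENE_45).normalize("O'Brien")); // O'Brien
        System.out.println(new FooAnalyzer().normalize("O'Brien"));                           // OBrien
    }
}
```

Under strategy 1 an app upgrades behavior by changing one Version constant; under strategy 2 it upgrades by switching class names, which is what the thread means by "reading CHANGES" on upgrade.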
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083962#comment-14083962 ] Jack Krupansky commented on SOLR-6315: --

Is order a part of the contract for the usages of this class? I mean, the current Javadoc does explicitly say that repetition and null values are NOT part of the contract, but it doesn't say that order, another feature of NamedList, is unimportant, while the name itself says Ordered. Kind of ambiguous, so a first order (Hah!) of business is to clarify whether maintaining order is part of the contract, and then to validate that contract against actual usages.

Switching to a map implies that order is no longer part of the contract, so it will be free to vary from release to release or between JVMs. Personally, I wish that Map were named UnorderedMap, or even UnstableOrderMap, to make the contract crystal clear. In fact it would be great to have the ordering of Map serialization be a seeded random test-framework parameter, to catch cases where the code or test cases have become dependent on the order of map serialization -- or any other non-contract behavior, for that matter.

Will this change have ANY behavior change that will be visible to Solr application developers or users?

Remove SimpleOrderedMap --- Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class with confusing jdocs. We should remove it. I'll attach a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252)
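To make the ordering question concrete, here is a minimal sketch using plain JDK collections (not Solr's NamedList itself) of what a name/value pair list guarantees versus what a plain Map does:

```java
import java.util.*;

public class OrderingContractDemo {
    public static void main(String[] args) {
        // A list of name/value pairs (the NamedList model) preserves
        // insertion order and allows repeated names.
        List<Map.Entry<String, Object>> pairs = new ArrayList<>();
        pairs.add(new AbstractMap.SimpleEntry<>("k2", 1));
        pairs.add(new AbstractMap.SimpleEntry<>("k1", 2));
        pairs.add(new AbstractMap.SimpleEntry<>("k2", 3));
        System.out.println(pairs);  // [k2=1, k1=2, k2=3] -- order is part of the contract

        // A plain HashMap guarantees neither: the duplicate "k2" collapses,
        // and iteration order is unspecified and may vary across JVMs/releases.
        Map<String, Object> map = new HashMap<>();
        for (Map.Entry<String, Object> e : pairs) {
            map.put(e.getKey(), e.getValue());
        }
        System.out.println(map.size());     // 2 -- one k2 entry was lost
        System.out.println(map.get("k2"));  // 3 -- last value wins
    }
}
```

This is exactly the trade-off Jack raises: moving to a map drops ordering (and duplicate keys) from the contract, so any code or test that silently depended on them would only be caught by something like his seeded-random serialization-order idea.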
Re: Lucene versioning logic
No, I didn't say any of this, please read it again :)

Old back-compat *Tokenizer/TokenFilter* classes are named this way. *Tokenizer/TokenFilter* -- only the old ones.

Just to be clear: their *Tokenizer/TokenFilter* factories still use Version to produce the right one (we can't remove Version from there, or we will have complaints from Solr developers). So users who want the version-style back compat can just use the factories. Really. On the other hand, new users can do 'new LowerCaseFilter()' without the bullshit.

For Analyzers, there is a setter. Users who want to use *OUR ANALYZERS* with back compat call the setter. But it's not mandatory-in-your-face-ctor.

I am +1 to Ryan's proposal, so please look for more elaboration there. I am -1 to putting Versions in the name of Analyzers.

On Sun, Aug 3, 2014 at 8:21 AM, Shai Erera ser...@gmail.com wrote: ...
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 590 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/590/ 2 tests failed. REGRESSION: org.apache.solr.cloud.OverseerTest.testOverseerFailure Error Message: Could not register as the leader because creating the ephemeral registration node in ZooKeeper failed Stack Trace: org.apache.solr.common.SolrException: Could not register as the leader because creating the ephemeral registration node in ZooKeeper failed at __randomizedtesting.SeedInfo.seed([53BB0B78D4E89FC5:57B3848BC64D70E4]:0) at org.apache.solr.cloud.ShardLeaderElectionContextBase.runLeaderProcess(ElectionContext.java:144) at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163) at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125) at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:155) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:314) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221) at org.apache.solr.cloud.OverseerTest$MockZKController.publishState(OverseerTest.java:155) at org.apache.solr.cloud.OverseerTest.testOverseerFailure(OverseerTest.java:660) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
Re: Lucene versioning logic
OK I see -- it's the Tokenizers and Filters that usually change; the Analyzer only needs Version to determine which TokenStream chain to return, and we achieve that w/ Ryan's proposal of setVersion().

I'd still feel better if Version were a final setting on an Analyzer, i.e. that a single Analyzer instance always behaves consistently and cannot alternate behavior down-stream if someone calls setVersion() -- though doing that would be a really stupid thing to do anyway. Maybe setVersion() can return an Analyzer, so you're sure the original instance is not modified. But maybe this is just over-engineering...

I'm +0.5 on that. I prefer Version to be mandatory somehow (class name, ctor argument), but I can live with setVersion() as well...

Shai

On Sun, Aug 3, 2014 at 3:30 PM, Robert Muir rcm...@gmail.com wrote: ...
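Shai's "setVersion() can return an Analyzer" idea amounts to an immutable with-er. A hypothetical sketch of that pattern follows; the class name, the plain int version encoding, and the default value are all invented for illustration:

```java
// Hypothetical sketch of an immutable version setter: instead of a mutable
// setVersion(), withVersion() returns a new instance, so any constructed
// analyzer behaves consistently for its whole lifetime.
final class SketchAnalyzer {
    private final int version;

    SketchAnalyzer() { this(49); }                 // default: current behavior
    private SketchAnalyzer(int version) { this.version = version; }

    SketchAnalyzer withVersion(int version) {
        return new SketchAnalyzer(version);        // original stays unchanged
    }

    int version() { return version; }
}

public class ImmutableVersionDemo {
    public static void main(String[] args) {
        SketchAnalyzer current = new SketchAnalyzer();
        SketchAnalyzer legacy = current.withVersion(45);
        System.out.println(current.version()); // 49
        System.out.println(legacy.version());  // 45
    }
}
```

The design point is that `current` can never change behavior after construction, which addresses the "alternate behavior down-stream" worry while keeping the no-mandatory-ctor-argument usability Robert wants.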
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083970#comment-14083970 ] Shai Erera commented on SOLR-6315: --

First, this class is entirely bogus the way I see it. Its javadocs are completely unrelated to the class itself. SimpleOrderedMap = NamedList ... well, at least if you ignore the JSONResponseWriter hack around it. This issue is about fixing that -- removing the bogus class and making the rest of the code use the right thing, which is NamedList.

As for ordering of NamedList in general, I didn't see places in the code that rely on ordering between keys. That is, if you set the keys k2 and k1, nowhere does code rely on getting them back in that order. The only ordering I see is between a certain key's _values_.

For Solr users, changing the default json.nl from flat to map would require users to review their app on upgrade. But as I noted, since the majority of the code uses SimpleOrderedMap, and since JSONResponseWriter *always* serializes it as a map, totally ignoring json.nl, I doubt that's a real backwards break.

I'd welcome any solution that lets us get rid of that redundant class while keeping Solr acting as it does today ...
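For readers unfamiliar with the json.nl parameter, the difference between the flat and map styles discussed here can be sketched with some illustrative string building. This is not Solr's actual JSONResponseWriter; the class and methods below are invented to show the two output shapes:

```java
import java.util.*;

public class JsonNlSketch {
    // "flat" style: one array alternating names and values; ordering and
    // duplicate names survive serialization.
    static String flat(List<Map.Entry<String, Object>> nl) {
        StringBuilder sb = new StringBuilder("[");
        for (Map.Entry<String, Object> e : nl) {
            if (sb.length() > 1) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\",").append(e.getValue());
        }
        return sb.append("]").toString();
    }

    // "map" style: a JSON object; duplicate names would collide, and
    // consumers cannot rely on key order in a plain JSON object.
    static String map(List<Map.Entry<String, Object>> nl) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, Object> e : nl) {
            if (sb.length() > 1) sb.append(",");
            sb.append("\"").append(e.getKey()).append("\":").append(e.getValue());
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Object>> nl = new ArrayList<>();
        nl.add(new AbstractMap.SimpleEntry<>("apple", 3));
        nl.add(new AbstractMap.SimpleEntry<>("pear", 1));
        System.out.println(flat(nl)); // ["apple",3,"pear",1]
        System.out.println(map(nl));  // {"apple":3,"pear":1}
    }
}
```

Shai's point is that because JSONResponseWriter already emits SimpleOrderedMap in the map shape regardless of json.nl, collapsing the class into NamedList may not change what clients actually see.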
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083981#comment-14083981 ] Yonik Seeley commented on SOLR-6315:

bq. Will this change have ANY behavior change that will be visible to Solr application developers or users?

Most certainly. Aside from internal interface changes, it will change a great deal of JSON output. I doubt tests would pass. But this is also probably under-tested (anything that only tests via XML won't see the difference).
[jira] [Updated] (SOLR-6016) Failure indexing exampledocs with example-schemaless mode
[ https://issues.apache.org/jira/browse/SOLR-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaliy Zhovtyuk updated SOLR-6016: --- Attachment: SOLR-6016.patch Added test with random order of values on schemaless example config, asserted thrown exception Failure indexing exampledocs with example-schemaless mode - Key: SOLR-6016 URL: https://issues.apache.org/jira/browse/SOLR-6016 Project: Solr Issue Type: Bug Components: documentation, Schema and Analysis Affects Versions: 4.7.2, 4.8 Reporter: Shalin Shekhar Mangar Attachments: SOLR-6016.patch, SOLR-6016.patch, solr.log Steps to reproduce: # cd example; java -Dsolr.solr.home=example-schemaless/solr -jar start.jar # cd exampledocs; java -jar post.jar *.xml Output from post.jar {code} Posting files to base url http://localhost:8983/solr/update using content-type application/xml.. POSTing file gb18030-example.xml POSTing file hd.xml POSTing file ipod_other.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update POSTing file ipod_video.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update POSTing file manufacturers.xml POSTing file mem.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update POSTing file money.xml POSTing file monitor2.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update 
POSTing file monitor.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update POSTing file mp500.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update POSTing file sd500.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update POSTing file solr.xml POSTing file utf8-example.xml POSTing file vidcard.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update 14 files indexed. COMMITting Solr index changes to http://localhost:8983/solr/update.. Time spent: 0:00:00.401 {code} Exceptions in Solr (I am pasting just one of them): {code} 5105 [qtp697879466-14] ERROR org.apache.solr.core.SolrCore – org.apache.solr.common.SolrException: ERROR: [doc=EN7800GTX/2DHTV/256M] Error adding field 'price'='479.95' msg=For input string: 479.95 at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:167) at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77) at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:234) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) .. 
Caused by: java.lang.NumberFormatException: For input string: 479.95 at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:441) at java.lang.Long.parseLong(Long.java:483) at org.apache.solr.schema.TrieField.createField(TrieField.java:609) at org.apache.solr.schema.TrieField.createFields(TrieField.java:660) {code} The full solr.log is attached. I understand why these errors occur but since we ship example data with Solr to demonstrate our core features, I expect that indexing exampledocs should work without errors. -- This message was sent by Atlassian
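The root cause in the stack trace above is that schemaless field-type guessing had pinned the 'price' field to a Trie long type based on earlier documents, after which a decimal value cannot parse. A minimal reproduction of just the parsing failure (plain JDK, no Solr involved):

```java
public class SchemalessGuessDemo {
    public static void main(String[] args) {
        // Once the guessed field type is long, the indexing path effectively
        // does this, which throws for a decimal value:
        try {
            Long.parseLong("479.95");
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
            // -> NumberFormatException: For input string: "479.95"
        }
        // Had the guesser widened the field to a double type instead,
        // the same value would parse fine:
        System.out.println(Double.parseDouble("479.95")); // 479.95
    }
}
```

This is why document order matters in the reproduction steps: whichever exampledoc is posted first fixes the guessed type for everyone after it.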
[jira] [Commented] (SOLR-6016) Failure indexing exampledocs with example-schemaless mode
[ https://issues.apache.org/jira/browse/SOLR-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084004#comment-14084004 ] Mark Miller commented on SOLR-6016: ---

bq. The schemaless mode is useful for one reason: To automatically generate a schema/mapping that you can later modify!

I disagree. As you say, a tool that looks at a bunch of sample data is better for that. The tool, while cool, would not provide for the only reason schemaless is actually useful - the new user and the prototyper. You should be able to pass a flag that puts Solr in 'easy' mode like this, because that is the best getting-started experience. That is the easiest way to play around right away. There is a lot of benefit in taking obstacles out of the way as users learn, and then having them go back once they have firmer ideas in their head.

IMO, schemaless is *very* valuable for getting started and random prototyping. It's not very useful for production setups.
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1178: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1178/ 1 tests failed. FAILED: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: We have a failed SPLITSHARD task Stack Trace: java.lang.AssertionError: We have a failed SPLITSHARD task at __randomizedtesting.SeedInfo.seed([156D764E36BDF0F2:948BF85641E290CE]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.MultiThreadedOCPTest.testTaskExclusivity(MultiThreadedOCPTest.java:125) at org.apache.solr.cloud.MultiThreadedOCPTest.doTest(MultiThreadedOCPTest.java:71) Build Log: [...truncated 52712 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:490: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/build.xml:182: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/extra-targets.xml:77: Java returned: 1 Total time: 173 minutes 57 seconds Build step 'Invoke Ant' marked build as failure Recording test results Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084016#comment-14084016 ]

Yonik Seeley commented on SOLR-6315:

bq. First, this class is entirely bogus the way I see it. Its javadocs are completely unrelated to the class itself.

You know... I'm so tired of this style of communication (which seems to have become the norm in Lucene and is now spreading into Solr) that I almost didn't respond. The validity of one's arguments has become conflated with the force and aggressiveness with which the argument is made. It almost feels like a wolf pack, complete with alpha male. If you're not forceful and aggressive, you risk getting steamrolled. Hand in hand with that go adjectives designed to make more of an impact... shit, broken, horrible, fucked up, terrible, bogus, stupid, etc.

Making strong assertions also hampers collaboration, since it causes unnecessarily strong emotional attachment to initial positions (http://en.wikipedia.org/wiki/Confirmation_bias#Persistence_of_discredited_beliefs). We used to speak in terms of improving things, and avoided denigrating contributions (new or past). The history of some of these cultural changes is interesting (esp. the need to up the ante via adjectives... everything has become horrible or completely broken):
http://markmail.org/search/?q=shit+list%3Aorg.apache.lucene.java-dev
http://markmail.org/search/?q=horrible+list%3Aorg.apache.lucene.java-dev
http://markmail.org/search/?q=stupid+list%3Aorg.apache.lucene.java-dev

It's also instructive to go back and look at the first time that "horrible" was even used on the list (2004):
{quote}
I thought that it might be a good candidate for Lucene 2 as the FSDirectory code is horrible and non-J2EE compliant. Your constructive criticism is greatly appreciated!
Have a nice day, Doug
{quote}

I'm bringing this up because I think some committers have gone down this supercharge-all-your-arguments path without consciously realizing it, because it has become more common. To those who recognize and resist taking that path, thank you.

Remove SimpleOrderedMap
---
Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch

As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly.

-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084023#comment-14084023 ]

Mark Miller commented on SOLR-6315:
---

bq. supercharge all your arguments path without consciously realizing it because it has become more common.

I think it's up to each person to decide which of their arguments they want to supercharge with powerful adjectives such as "horrible" or "stupid". Easy obstructionism became so common as well that I think it's a natural reaction. It's also just how some of our committers communicate - and there has been some of it from some of the earliest committers, if you don't cherry-pick the terms. Let's not pretend it's some brand-new thing. It became so difficult to improve some things that it's no wonder a ton of people have ended up somewhat frustrated, in my opinion.

If Shai thinks this class is now spurious or bogus, why shouldn't he say so? Perhaps it is now. Perhaps it's not. I don't see the comment as negative or supercharged at all. He's discussing pretty politely. Judging the whole statement, and even the whole series of statements, by this one word just seems silly.

bq. http://markmail.org/search/?q=horrible+list%3Aorg.apache.lucene.java-dev

Reading through these, most of it is fine usage IMO.

All that said, I'd rather see this kind of effort spent on something else and just leave poor SimpleOrderedMap alone. Small win IMO, but to each his own.

Remove SimpleOrderedMap
---
Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch

As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly.
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084031#comment-14084031 ]

Yonik Seeley commented on SOLR-6315:

"Just how some people communicate" and the idea that it's OK to say anything that comes to mind (because that's what you think) can be used to excuse any form of communication whatsoever. Everyone was just saying what they thought in LUCENE-5859 too. No thank you. It's not like single instances of this are fine or not fine... I was trying to make a general point about how pervasive use of supercharged arguments fosters conflict and inhibits cooperation and consensus.

Remove SimpleOrderedMap
---
Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch

As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly.
[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.8.0_20-ea-b23) - Build # 4227 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4227/ Java: 32bit/jdk1.8.0_20-ea-b23 -client -XX:+UseG1GC 1 tests failed. REGRESSION: org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch Error Message: commitWithin did not work on node: http://127.0.0.1:58942/collection1 expected:68 but was:67 Stack Trace: java.lang.AssertionError: commitWithin did not work on node: http://127.0.0.1:58942/collection1 expected:68 but was:67 at __randomizedtesting.SeedInfo.seed([4BE8DDAB2D9629DC:CA0E53B35AC949E0]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.solr.cloud.BasicDistributedZkTest.doTest(BasicDistributedZkTest.java:346) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at
[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2046 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/2046/ 1 tests failed. FAILED: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: We have a failed SPLITSHARD task Stack Trace: java.lang.AssertionError: We have a failed SPLITSHARD task at __randomizedtesting.SeedInfo.seed([EA97083AC58390AA:6B718622B2DCF096]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.MultiThreadedOCPTest.testTaskExclusivity(MultiThreadedOCPTest.java:125) at org.apache.solr.cloud.MultiThreadedOCPTest.doTest(MultiThreadedOCPTest.java:71) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Comment Edited] (LUCENE-5861) CachingTokenFilter should use ArrayList not LinkedList
[ https://issues.apache.org/jira/browse/LUCENE-5861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083617#comment-14083617 ]

Paul Elschot edited comment on LUCENE-5861 at 8/3/14 6:10 PM:
--

TeeSinkTokenFilter.SinkTokenStream in the analysis/common module (package o.a.l.analysis.sinks) uses a LinkedList, too. I also prefer an ArrayList, but I used a LinkedList in PrefillTokenStream of LUCENE-5687 because the existing code uses it and I don't know of any existing performance tests for this. To grow an ArrayList, would it be good to use ArrayUtil.oversize()?

CachingTokenFilter should use ArrayList not LinkedList
--
Key: LUCENE-5861 URL: https://issues.apache.org/jira/browse/LUCENE-5861 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Reporter: David Smiley Assignee: David Smiley Priority: Minor

CachingTokenFilter, to my surprise, puts each new AttributeSource.State onto a LinkedList. I think it should be an ArrayList. On large fields that get analyzed, there can be a ton of State objects to cache. I also observe that State is itself a linked list of other State objects. Perhaps we could take this one step further and do parallel arrays of AttributeImpl, thereby bypassing State.
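On the ArrayUtil.oversize() question above: Lucene's helper over-allocates past the requested minimum (by roughly an eighth) so that repeated growth stays amortized, while a plain ArrayList already amortizes its internal growth — so switching away from LinkedList mainly removes per-node allocation and pointer chasing. A self-contained sketch; the oversize() below is a simplified stand-in, not Lucene's exact policy (which also accounts for bytes per element):

```java
import java.util.ArrayList;
import java.util.List;

public class StateCacheDemo {
    // Simplified stand-in for ArrayUtil.oversize(): grow about 1/8 past the
    // requested minimum so repeated growth is amortized.
    static int oversize(int minSize) {
        return minSize + (minSize >> 3);
    }

    public static void main(String[] args) {
        // ArrayList keeps elements in one contiguous array; a LinkedList
        // would allocate a node per cached state and chase pointers on replay.
        List<Object> cachedStates = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            cachedStates.add(new Object()); // stands in for AttributeSource.State
        }
        System.out.println(cachedStates.size()); // 10000
        System.out.println(oversize(64));        // 72
    }
}
```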
[jira] [Commented] (LUCENE-5687) Add PrefillTokenStream in analysis common module
[ https://issues.apache.org/jira/browse/LUCENE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084062#comment-14084062 ]

Paul Elschot commented on LUCENE-5687:
--

This depends on LUCENE-5861 for the use of ArrayList instead of LinkedList to save the token attribute states. Currently a LinkedList is used here.

Add PrefillTokenStream in analysis common module
Key: LUCENE-5687 URL: https://issues.apache.org/jira/browse/LUCENE-5687 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.9 Reporter: Paul Elschot Priority: Minor Fix For: 4.9
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084077#comment-14084077 ]

Shai Erera commented on SOLR-6315:
--

[~yo...@apache.org], please excuse me if I used the word "bogus" inappropriately. I'll admit that as a non-native English speaker, I often make mistakes. I assumed that the word "bogus" is just a synonym for "fake", but perhaps it's used in the English language in negative contexts. If this is the case, this was not my intention ... you know me better than that.

But I do think this class pretends to be something that it is not. Its documentation is misleading and conflicting ("use it if access by key is more important ..." versus "this class does not provide efficient lookup by key"). The class itself contains absolutely no code beyond NamedList. It contains 3 ctors which delegate to super() as-is, and a clone() which is exactly like NamedList's, only it returns a SimpleOrderedMap. Also, the class's name is misleading -- see Jack's comment above and the word 'order'. It looks as if the class pretends to be an ordered map, while its jdocs specifically say that you should use it if you prefer access by key over maintaining order... I was merely pointing out facts about the class, and why I think it should be removed. Maybe once it had another role, or actually did something different than NamedList, but it doesn't today. Having said all that, you don't see me commit anything, or jump to conclusions ... I sincerely ask for guidance from people who know this code better than I do, like you, on the best way to deal with it.

[~markrmil...@gmail.com], I don't think that leaving this class alone is a good resolution in this case. I consider redundant code something that needs to go -- especially if it's confusing, and more so if it's public API. Even if not a user-facing API, it's still an API with many uses in the code.

I realize there are probably other places I can contribute to Solr, much more important ones, but this is where I started when I reviewed the code. I'm BTW not trying to fix this class, but to fix/simplify NamedList overall (see my last comment on SOLR-912). This helps me get familiar with the code, and I'm sure that I'll learn more as I get comments about what I'm trying to do. Sometimes a fresh set of eyes leads to simplifications that others just don't have the time to notice.

More so, I feel that JSONResponseWriter misbehaves. I'm sure that there are good reasons for it, too. But the way I see it, no matter what style you ask it to output, if it encounters a SimpleOrderedMap, it always outputs it as a map. This feels wrong to me ... And worse, if I'm a Solr developer and need to create a NamedList instance, I need to choose between NamedList and SimpleOrderedMap because of how they are output differently.

And the tests that currently fail only fail because they assume some undocumented behavior of JSONResponseWriter (maybe, as Yonik said, it's because this isn't well tested). Why would a test assume that the JSON response it parsed is parsed to a Map, when the default style is flat? And of course, as soon as I removed the instanceof check in JSONResponseWriter, these tests failed to cast a List to a Map. This leads me to believe that the effective default style is map, no matter what the code says.

As a side comment, I stated that I find it odd that a JSON response writer/formatter's default style is not common JSON. Not that an array output is not valid, just that it's not common. And I admit that I don't know the reason for the flat style, but if it's there, it's probably used by someone.

Look at JSONWriterTest.toJson(). It creates a SolrQueryResponse (which uses SimpleOrderedMap internally) and a JSONResponseWriter with style arrarr (array-of-arrays). Yet it expects an output that looks like a map. In fact, it's a mixed style of map and flat/arr.
This is because JSONResponseWriter completely ignores the style once it encounters the SimpleOrderedMap. I don't understand if this is really expected behavior, or if this test tests buggy behavior... why would such a test ever pass, when the output is not what it expects?

So again, this started on my part as a simple attempt to improve and clean up NamedList and friends. But I now wonder whether our JSON formatter does the right thing or not.

Remove SimpleOrderedMap
---
Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch

As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly.
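For readers unfamiliar with the styles under discussion: Solr's json.nl parameter (flat, map, arrarr are real parameter values) selects how a NamedList is rendered in JSON. The sketch below uses a hypothetical Pair record as a stand-in for NamedList to show the three renderings of the same pair list; it is illustrative, not Solr's writer code:

```java
import java.util.List;
import java.util.stream.Collectors;

public class NamedListJsonStyles {
    // Hypothetical stand-in for NamedList: an ordered list of name/value
    // pairs (names may repeat). Values kept numeric for brevity.
    record Pair(String name, int value) {}

    // json.nl=flat: names and values alternate in a single array.
    static String flat(List<Pair> nl) {
        return "[" + nl.stream()
                .map(p -> "\"" + p.name() + "\"," + p.value())
                .collect(Collectors.joining(",")) + "]";
    }

    // json.nl=map: a JSON object; many clients do not preserve its order,
    // and repeated names collide.
    static String map(List<Pair> nl) {
        return "{" + nl.stream()
                .map(p -> "\"" + p.name() + "\":" + p.value())
                .collect(Collectors.joining(",")) + "}";
    }

    // json.nl=arrarr: an array of [name,value] arrays; order-safe.
    static String arrarr(List<Pair> nl) {
        return "[" + nl.stream()
                .map(p -> "[\"" + p.name() + "\"," + p.value() + "]")
                .collect(Collectors.joining(",")) + "]";
    }

    public static void main(String[] args) {
        List<Pair> nl = List.of(new Pair("foo", 10), new Pair("bar", 20));
        System.out.println(flat(nl));   // ["foo",10,"bar",20]
        System.out.println(map(nl));    // {"foo":10,"bar":20}
        System.out.println(arrarr(nl)); // [["foo",10],["bar",20]]
    }
}
```

The map form is what a SimpleOrderedMap always receives regardless of the requested style, which is the behavior being questioned above.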
[jira] [Commented] (LUCENE-5847) Improved implementation of language models in lucene
[ https://issues.apache.org/jira/browse/LUCENE-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084078#comment-14084078 ]

Hadas Raviv commented on LUCENE-5847:
-

Implementing a specific scorer for Dirichlet or JM is problematic in my opinion, since language models can be used for calculating a ranking score in response to a BooleanQuery composed of several terms, a TermQuery composed of a single term, or a SpanQuery composed of nearby terms. In order to have the correct score for each of the query types, multiple specific scorers would have to be implemented. This would mean that we would have BackgroundAwareBooleanScorer, BackgroundAwareTermScorer, etc. Effectively, it would duplicate many existing scorers. If anyone has a suggestion for a design that does not duplicate scorers while keeping the scorer interface unchanged, please share and I will try to implement it.

Improved implementation of language models in lucene
-
Key: LUCENE-5847 URL: https://issues.apache.org/jira/browse/LUCENE-5847 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Hadas Raviv Priority: Minor Fix For: 5.0 Attachments: LUCENE-2507.patch

The current implementation of language models in lucene is based on the paper "A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval" by Zhai and Lafferty ('01). Specifically, LMDirichletSimilarity and LMJelinekMercerSimilarity use a normalized smoothed score for a matching term in a document, as suggested in the above-mentioned paper. However, lucene doesn't assign a score to query terms that do not appear in a matched document. According to the pure LM approach, these terms should be assigned a collection-probability background score. If one uses the Jelinek-Mercer smoothing method, the final result list produced by lucene is rank-equivalent to the one that would have been created by a full LM implementation.

However, this is not the case for the Dirichlet smoothing method, because there the background score is document-dependent. Documents in which not all query terms appear are missing the document-dependent background score for the missing terms. This component affects the final ranking of documents in the list. Since LM is a baseline method in many works in the IR research field, I attach a patch that implements a full LM in lucene.

The basic issue that should be addressed here is assigning a document a score that depends on *all* the query terms, collection statistics and the document length. The general idea of what I did is adding a new getBackGroundScore(int docID) method to similarity, scorer and bulkScorer. Then, when a collector assigns a score to a document (score = scorer.score()) I added the background score (score = scorer.score() + scorer.background(doc)) that is assigned by the similarity class used for ranking. The patch also includes a correction of the document length such that it will be the real document length and not the encoded one. It is required for the full LM implementation.
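The rank-equivalence distinction above can be seen directly from the standard smoothing formulas in Zhai & Lafferty ('01), the paper cited in the issue; the notation here is the usual one from that literature, not taken from the patch:

```latex
% Jelinek-Mercer: the background term \lambda\,p(w \mid C) is the same for
% every document, so omitting it for missing query terms shifts all scores
% equally and does not change the ranking.
p_{\lambda}(w \mid d) = (1-\lambda)\,p_{ml}(w \mid d) + \lambda\,p(w \mid C)

% Dirichlet: the background term \frac{\mu\,p(w \mid C)}{|d| + \mu} depends
% on the document length |d|, so omitting it for missing query terms does
% change the ranking.
p_{\mu}(w \mid d) = \frac{tf(w,d) + \mu\,p(w \mid C)}{|d| + \mu}
```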
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084079#comment-14084079 ]

Mark Miller commented on SOLR-6315:
---

bq. I don't think that leaving this class alone is a good resolution in this case.

Oh, I'm not suggesting a resolution. I'm just giving my opinion on the effort to remove this class and maintain reasonable back compat. I don't feel strongly about this issue either way.

Remove SimpleOrderedMap
---
Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch

As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly.
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084081#comment-14084081 ]

Paul Elschot commented on LUCENE-4396:
--

I cannot quickly find the recent Java code discussed here, in spite of the github commits mentioned. Would it be possible to provide an easier reference to the recent code, e.g. a pull request or the name of a github repo containing it?

BooleanScorer should sometimes be used for MUST clauses
---
Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, stat.cpp, stat.cpp

Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT clauses. If there are one or more MUST clauses we always use BooleanScorer2. But I suspect that, unless the MUST clauses have a very low hit count compared to the other clauses, BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST, so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as a proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, e.g. if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near-term time to work on this so feel free to take it if you are inspired!
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084095#comment-14084095 ]

Yonik Seeley commented on SOLR-6315:

If one is unsure of something, questions will go further than bold assertions, and avoid putting unnecessary stakes in the ground... e.g. "It seems like SimpleOrderedMap doesn't do anything different than NamedList. Can it be removed?" Assertions like "Its javadocs are completely unrelated to the class itself." leave me totally at a loss as to where to begin... it's like saying it's all just wrong. Excuse us, we tried to make the javadoc as clear as possible.

So, to the technical issues explaining why things are the way they are:
1) Solr started with XML only, and NamedList was a container to hold data for XML serialization. NamedList appeared all over in the code.
2) When we added JSON support, we realized that for some data one would want it serialized as a map, and for other data one would want it serialized as something that would preserve order (unfortunately many/most clients that consume JSON do not preserve the order of objects).
3) We added a subclass of NamedList, called SimpleOrderedMap, to make this semantic distinction (think of it like a marker interface...).
4) The JSONResponseWriter "hack" you refer to is the entire point of the class... to make the semantic distinction visible to serializers.

So changing all uses of SimpleOrderedMap to NamedList would lose all that semantic information (and break back compat for anything other than XML responses).

Some of the javadoc snippets are being taken out of context. I'll quote the full javadoc for both NamedList and SimpleOrderedMap for people's reference:

NamedList:
{code}
/**
 * A simple container class for modeling an ordered list of name/value pairs.
 *
 * <p>
 * Unlike Maps:
 * </p>
 * <ul>
 * <li>Names may be repeated</li>
 * <li>Order of elements is maintained</li>
 * <li>Elements may be accessed by numeric index</li>
 * <li>Names and Values can both be null</li>
 * </ul>
 *
 * <p>
 * A NamedList provides fast access by element number, but not by name.
 * </p>
 * <p>
 * When a NamedList is serialized, order is considered more important than access
 * by key, so ResponseWriters that output to a format such as JSON will normally
 * choose a data structure that allows order to be easily preserved in various
 * clients (i.e. not a straight map).
 * If access by key is more important for serialization, see {@link SimpleOrderedMap},
 * or simply use a regular {@link Map}
 * </p>
 */
{code}

SimpleOrderedMap:
{code}
/** <code>SimpleOrderedMap</code> is a {@link NamedList} where access by key is more
 * important than maintaining order when it comes to representing the
 * held data in other forms, as ResponseWriters normally do.
 * It's normally not a good idea to repeat keys or use null keys, but this
 * is not enforced. If key uniqueness enforcement is desired, use a regular {@link Map}.
 * <p>
 * For example, a JSON response writer may choose to write a SimpleOrderedMap
 * as {"foo":10,"bar":20} and may choose to write a NamedList as
 * ["foo",10,"bar",20]. An XML response writer may choose to render both
 * the same way.
 * </p>
 * <p>
 * This class does not provide efficient lookup by key; its main purpose is
 * to hold data to be serialized. It aims to minimize overhead and to be
 * efficient at adding new elements.
 * </p>
 */
{code}

Are there changes to the javadoc we can make that would make it clearer?

Remove SimpleOrderedMap
---
Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch

As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing jdocs. We should remove it. I'll attach a patch shortly.
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5736) Separate the classifiers to online and caching where possible
[ https://issues.apache.org/jira/browse/LUCENE-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergő Törcsvári updated LUCENE-5736: Attachment: 0803-caching.patch Separate the classifiers to online and caching where possible - Key: LUCENE-5736 URL: https://issues.apache.org/jira/browse/LUCENE-5736 Project: Lucene - Core Issue Type: Sub-task Components: modules/classification Reporter: Gergő Törcsvári Assignee: Tommaso Teofili Attachments: 0803-caching.patch, CachingNaiveBayesClassifier.java The Lucene classifier implementations are now near-online if they get a near-real-time reader. That is good for users who have a continuously changing dataset, but slow for datasets that do not change. The idea: what if we implement a cache and speed up the results where possible? -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
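The caching idea could look roughly like the following wrapper. This is a sketch only; `Classifier` and `CachingClassifier` are illustrative names, not the Lucene classification module's actual interfaces:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the caching idea: wrap a (possibly slow) classifier and memoize
// its answers, clearing the cache whenever the underlying index changes.
public class CachingClassifier {
    // Illustrative single-method interface, not Lucene's real Classifier API.
    public interface Classifier { String assignClass(String text); }

    private final Classifier delegate;
    private final Map<String, String> cache = new HashMap<>();

    public CachingClassifier(Classifier delegate) { this.delegate = delegate; }

    // Returns the cached class if present, otherwise asks the delegate once.
    public String assignClass(String text) {
        return cache.computeIfAbsent(text, delegate::assignClass);
    }

    // Call when the training data changes, e.g. after an index commit.
    public void invalidate() { cache.clear(); }
}
```

For an unchanging dataset every repeated input hits the cache; for a continuously changing one, `invalidate()` restores the near-online behavior.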
[jira] [Updated] (LUCENE-5699) Lucene classification score calculation normalize and return lists
[ https://issues.apache.org/jira/browse/LUCENE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergő Törcsvári updated LUCENE-5699: Attachment: 0803-base.patch Lucene classification score calculation normalize and return lists -- Key: LUCENE-5699 URL: https://issues.apache.org/jira/browse/LUCENE-5699 Project: Lucene - Core Issue Type: Sub-task Components: modules/classification Reporter: Gergő Törcsvári Assignee: Tommaso Teofili Attachments: 06-06-5699.patch, 0730.patch, 0803-base.patch Now the classifiers can return only the best-matching class. If somebody wants to use them for more complex tasks, they need to modify these classes to get the second and third results too. If it is possible to return a list and it does not cost much, why don't we do that? (We iterate over a list anyway.) The Bayes classifier returned values that were too small, and there was a bug with zero floats; it was fixed by using logarithms. It would be nice to scale the sum of the class scores to one, so we could compare the returned scores and relevance of two documents. (If we don't do this, the word count in the test documents affects the result score.) In bullet points: * Normalize score values in the Bayes classifier, and return result lists. * Add the possibility to return a result list from the KNN classifier. * Make ClassificationResult Comparable for list sorting. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
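The normalization and sorting described in the bullet points might be sketched like this; `ClassScore` is an illustrative stand-in for Lucene's ClassificationResult, and the exact API is assumed, not the module's real one:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Sketch: scale raw class scores so they sum to 1, then sort descending so
// callers can take the top-k classes instead of only the single best one.
public class ScoreNormalizer {
    // Illustrative stand-in for ClassificationResult, made Comparable as the
    // issue proposes so a result list can be sorted.
    public static final class ClassScore implements Comparable<ClassScore> {
        public final String clazz;
        public final double score;
        public ClassScore(String clazz, double score) { this.clazz = clazz; this.score = score; }
        @Override public int compareTo(ClassScore o) {
            return Double.compare(o.score, score); // descending by score
        }
    }

    public static List<ClassScore> normalize(Map<String, Double> rawScores) {
        double sum = 0;
        for (double v : rawScores.values()) sum += v;
        List<ClassScore> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : rawScores.entrySet()) {
            out.add(new ClassScore(e.getKey(), sum == 0 ? 0 : e.getValue() / sum));
        }
        Collections.sort(out);
        return out;
    }
}
```

Because the scores sum to one after normalization, scores from two different documents become comparable regardless of their word counts.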
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084111#comment-14084111 ] Shai Erera commented on SOLR-6315: -- Thanks Yonik for explaining this. I realized, as I started to debug test failures, that SimpleOrderedMap is sort of a marker interface. What bothered me in the javadocs are the things that I noted above (the conflicting statements about when to use this class, its name in relation to Jack's comment). Maybe this could be named MapNamedList or something, without mentioning order, access by key, or efficiency, plus a comment about how this class is a marker interface for response writers to output its content as a map. I also wish that JSONResponseWriter would decouple itself from a particular type of NamedList. It feels like a hack to me, and I don't mean that negatively. For example, a better 'hack' IMO would be to add to NamedList a preferredOutputStyle member, which could be {{null}} for all NLs other than MapNamedList. Maybe we wouldn't even need that specific class; we could add a ctor argument .. there are options. And then JSONResponseWriter could check the preferredOutputStyle as an override to the default style it received ... maybe then it would make sense to me, I don't know. As a user of the API, maybe even Solr's REST API, it's not clear to me how I should refer to json.nl. It seems that when I set it, I determine the output of _some_ elements in the response JSON, but not all of them. It's as if I need to set the style, run some queries, note the elements that do not correspond to the style that I asked for, and rely on that output's style? bq. one would want it serialized as something that would preserve order What do you mean by preserve order? The order of all values of a single key, or the order between both keys and values? 
For example, if I make this series of sets: {{set(k1, v1); set(k2, v3); set(k1, v2);}}, which of the following outputs do you refer to as preserving order: {noformat} { k1: [ v1, v2 ], k2 : v3 } {noformat} OR {noformat} [ k1, v1, k2, v3, k1, v2 ] {noformat} I realize the second one adheres to any order preservation, but I don't understand if that's the intention, or whether the former format is considered order-preserving too. What's your recommendation then for this issue? I feel like there are some things we can do to improve the code here. Maybe they are minor, like a class rename and javadoc clarification, and that's it. Maybe they are slightly broader, e.g. the preferredOutputStyle parameter to NamedList. Also, what's your opinion about JSONResponseWriter ignoring the json.nl parameter for SimpleOrderedMap? Is this OK, API-wise? Remove SimpleOrderedMap --- Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing javadocs. We should remove it. I'll attach a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5698) Evaluate Lucene classification on publicly available datasets
[ https://issues.apache.org/jira/browse/LUCENE-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gergő Törcsvári updated LUCENE-5698: Attachment: 0803-test.patch I separated the assertions and the data preparator functions. It needs some renaming! Evaluate Lucene classification on publicly available datasets - Key: LUCENE-5698 URL: https://issues.apache.org/jira/browse/LUCENE-5698 Project: Lucene - Core Issue Type: Sub-task Components: modules/classification Reporter: Gergő Törcsvári Attachments: 0803-test.patch The Lucene classification module needs some publicly available datasets to keep track of development. It would be nice to have some fast generated test sets, and some bigger real-world datasets too. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Windows (32bit/jdk1.8.0_11) - Build # 4136 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/4136/ Java: 32bit/jdk1.8.0_11 -server -XX:+UseConcMarkSweepGC 1 tests failed. REGRESSION: org.apache.solr.cloud.MultiThreadedOCPTest.testDistribSearch Error Message: Task 3002 did not complete, final state: failed Stack Trace: java.lang.AssertionError: Task 3002 did not complete, final state: failed at __randomizedtesting.SeedInfo.seed([AA76C0C1081F95D9:2B904ED97F40F5E5]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.solr.cloud.MultiThreadedOCPTest.testDeduplicationOfSubmittedTasks(MultiThreadedOCPTest.java:163) at org.apache.solr.cloud.MultiThreadedOCPTest.doTest(MultiThreadedOCPTest.java:72) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867) at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at
[jira] [Commented] (LUCENE-5666) Add UninvertingReader
[ https://issues.apache.org/jira/browse/LUCENE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084125#comment-14084125 ] Mikhail Khludnev commented on LUCENE-5666: -- [~rcmuir] I'm porting SOLR-6234 to trunk, and observe an annoying issue. Before, FieldCache was always available, but now if doc values are not written (indexed), DocValues.get... yields emptyXxx, which breaks tests silently. I'd prefer to get an NPE or other Illegal..Exception explicitly. What am I seeing wrong? Add UninvertingReader - Key: LUCENE-5666 URL: https://issues.apache.org/jira/browse/LUCENE-5666 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Fix For: 5.0 Attachments: LUCENE-5666.patch Currently the FieldCache is not pluggable at all. It would be better if everything used the docvalues APIs. This would allow people to customize the implementation, extend the classes with custom subclasses with additional stuff, etc. FieldCache can be accessed via the docvalues APIs, using the FilterReader API. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
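The explicit failure Mikhail asks for could be expressed as a small guard around whatever doc-values lookup is used. This is a generic sketch, not Lucene's API; `requirePresent` and its parameters are invented for illustration:

```java
// Sketch: fail fast with an IllegalStateException instead of silently handing
// back an empty doc-values instance when the field was never indexed with
// doc values. The caller supplies whether the field actually has doc values
// (e.g. from field metadata); this class does not touch Lucene itself.
public class DocValuesGuard {
    public static <T> T requirePresent(T source, boolean fieldHasDocValues, String field) {
        if (!fieldHasDocValues) {
            throw new IllegalStateException(
                "field '" + field + "' was not indexed with doc values; refusing to return an empty instance");
        }
        return source;
    }
}
```

A test that forgets to index doc values would then fail loudly at the lookup instead of passing on empty values.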
[jira] [Commented] (SOLR-6315) Remove SimpleOrderedMap
[ https://issues.apache.org/jira/browse/SOLR-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084130#comment-14084130 ] Yonik Seeley commented on SOLR-6315: I guess my confusion is why this is confusing at all... I'll try to put it as succinctly as possible to reduce confusion: {code} If you want the data serialized as a JSON Object, use SimpleOrderedMap. Given that some clients will lose JSON Object ordering information, use NamedList to serialize to something that will preserve order (e.g. [key1,val1,key2,val2]) {code} At the end of the day, this is really all about JSON and client/language adapters. An unordered Map has a natural mapping to JSON... but something where order is more important (represented by NamedList) does not. This is why we picked one, [k1,v1,k2,v2,...], but allowed it to be overridden by json.nl: http://wiki.apache.org/solr/SolJSON#JSON_specific_parameters If someone feels the javadoc needs to be improved, please put up a patch. Remove SimpleOrderedMap --- Key: SOLR-6315 URL: https://issues.apache.org/jira/browse/SOLR-6315 Project: Solr Issue Type: Improvement Components: clients - java Reporter: Shai Erera Assignee: Shai Erera Attachments: SOLR-6315.patch As I described on SOLR-912, SimpleOrderedMap is a redundant and generally useless class, with confusing javadocs. We should remove it. I'll attach a patch shortly. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
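A client consuming the default flat style can rebuild the ordered pairs, including repeated names, that a plain JSON object would lose. This is a hedged sketch that assumes the `[k1,v1,k2,v2,...]` array has already been parsed into an `Object[]`; `FlatNamedListReader` is an invented name, not SolrJ API:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: turn a flat [name, value, name, value, ...] array (the default
// json.nl rendering of a NamedList, after JSON parsing) back into an
// ordered list of (name, value) pairs, preserving duplicates and order.
public class FlatNamedListReader {
    public static List<Map.Entry<String, Object>> pairs(Object[] flat) {
        if (flat.length % 2 != 0) {
            throw new IllegalArgumentException("odd element count: " + flat.length);
        }
        List<Map.Entry<String, Object>> out = new ArrayList<>();
        for (int i = 0; i < flat.length; i += 2) {
            out.add(new AbstractMap.SimpleEntry<>((String) flat[i], flat[i + 1]));
        }
        return out;
    }
}
```

Collapsing the same array into a map instead would silently drop the second `k1` pair, which is exactly the semantic information the flat style exists to keep.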
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084152#comment-14084152 ] Michael McCandless commented on LUCENE-4396: Da attached the latest patch here ... I'm not sure where the github commit is ... BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, stat.cpp, stat.cpp Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. If there is one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have very low hit count compared to the other clauses, that BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, eg if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near term time to work on this so feel free to take it if you are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Review Request 24220: Split clusterstate.json per collection and watch states selectively.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24220/ --- Review request for lucene. Repository: lucene Description --- Split clusterstate.json per collection and watch states selectively Diffs - trunk/solr/core/src/java/org/apache/solr/cloud/Overseer.java 1615458 trunk/solr/core/src/java/org/apache/solr/cloud/OverseerCollectionProcessor.java 1615458 trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java 1615458 trunk/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java 1615458 trunk/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java 1615458 trunk/solr/core/src/java/org/apache/solr/servlet/ZookeeperInfoServlet.java 1615458 trunk/solr/core/src/test/org/apache/solr/cloud/ExternalCollectionsTest.java PRE-CREATION trunk/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrServer.java 1615458 trunk/solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java 1615458 trunk/solr/solrj/src/java/org/apache/solr/common/cloud/DocCollection.java 1615458 trunk/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java 1615458 trunk/solr/solrj/src/test/org/apache/solr/client/solrj/impl/CloudSolrServerTest.java 1615458 trunk/solr/test-framework/src/java/org/apache/solr/cloud/AbstractFullDistribZkTestBase.java 1615458 Diff: https://reviews.apache.org/r/24220/diff/ Testing --- Thanks, Mark Miller
Re: Review Request 24220: Split clusterstate.json per collection and watch states selectively.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24220/ --- (Updated Aug. 4, 2014, 12:04 a.m.) Review request for lucene. Changes --- Here is a patch that focuses only on bringing the code up to standard project formatting guidelines. Repository: lucene Description --- Split clusterstate.json per collection and watch states selectively Diffs (updated) - trunk/solr/core/src/java/org/apache/solr/cloud/Overseer.java 1615458 trunk/solr/core/src/java/org/apache/solr/cloud/OverseerCollectionProcessor.java 1615458 trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java 1615458 trunk/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java 1615458 trunk/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java 1615458 trunk/solr/core/src/java/org/apache/solr/servlet/ZookeeperInfoServlet.java 1615458 trunk/solr/core/src/test/org/apache/solr/cloud/ExternalCollectionsTest.java PRE-CREATION trunk/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrServer.java 1615458 trunk/solr/solrj/src/java/org/apache/solr/common/cloud/ClusterState.java 1615458 trunk/solr/solrj/src/java/org/apache/solr/common/cloud/DocCollection.java 1615458 trunk/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java 1615458 trunk/solr/solrj/src/test/org/apache/solr/client/solrj/impl/CloudSolrServerTest.java 1615458 trunk/solr/test-framework/src/java/org/apache/solr/cloud/AbstractFullDistribZkTestBase.java 1615458 Diff: https://reviews.apache.org/r/24220/diff/ Testing --- Thanks, Mark Miller
[jira] [Updated] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5473: -- Fix Version/s: 4.10 Labels: SolrCloud (was: ) Split clusterstate.json per collection and watch states selectively Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Labels: SolrCloud Fix For: 5.0, 4.10 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node and watches state changes selectively. https://reviews.apache.org/r/24220/ -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5473: -- Description: As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node and watches state changes selectively. https://reviews.apache.org/r/24220/ was:As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node Split clusterstate.json per collection and watch states selectively Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Labels: SolrCloud Fix For: 5.0, 4.10 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node 
and watches state changes selectively. https://reviews.apache.org/r/24220/ -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084171#comment-14084171 ] Mark Miller commented on SOLR-5473: --- Another patch to follow and then some comments. Split clusterstate.json per collection and watch states selectively Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Labels: SolrCloud Fix For: 5.0, 4.10 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node and watches state changes selectively. 
https://reviews.apache.org/r/24220/ -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5473: -- Attachment: SOLR-5473.patch Here is a patch that focuses only on bringing the code up to standard project formatting guidelines. reviewboard: https://reviews.apache.org/r/24220/ changes: https://reviews.apache.org/r/24220/diff/1-2/ Split clusterstate.json per collection and watch states selectively Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Labels: SolrCloud Fix For: 5.0, 4.10 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node and watches state changes selectively. 
https://reviews.apache.org/r/24220/ -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084177#comment-14084177 ] Mark Miller commented on SOLR-5473: --- Also, FYI, I think 'SOLR-5810 State of external collections not displayed in cloud graph panel' should be resolved as part of this issue if it's still applicable. I think that's a pretty critical part of this as we are not considering this an optional 'mode' but an architecture transition. Looks like some good progress has already been made on SOLR-5810, so hopefully it's just a matter of buttoning that up and rolling it into this. Split clusterstate.json per collection and watch states selectively Key: SOLR-5473 URL: https://issues.apache.org/jira/browse/SOLR-5473 Project: Solr Issue Type: Sub-task Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Labels: SolrCloud Fix For: 5.0, 4.10 Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log As defined in the parent issue, store the states of each collection under /collections/collectionname/state.json node and watches state changes selectively. https://reviews.apache.org/r/24220/ -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5810) State of external collections not displayed in cloud graph panel
[ https://issues.apache.org/jira/browse/SOLR-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084179#comment-14084179 ] Mark Miller commented on SOLR-5810: --- How is this going [~thelabdude]? This seems like a critical part of SOLR-5473 , so I'd like to get them in together. State of external collections not displayed in cloud graph panel Key: SOLR-5810 URL: https://issues.apache.org/jira/browse/SOLR-5810 Project: Solr Issue Type: Improvement Components: SolrCloud, web gui Reporter: Timothy Potter Assignee: Timothy Potter Attachments: SOLR-5810-prelim.patch, SOLR-5810.prelim2.patch External collections (SOLR-5473) are not displayed in the Cloud - graph panel, which makes it very hard to see which external collections have problems, such as after a downed node comes back online. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084210#comment-14084210 ] Mark Miller commented on SOLR-5473: --- I'm not sure I agree with the use of a collections API param for controlling this anymore. Also:
* I think if we are transitioning to this, on 5x at least, it should default to true. We should probably remove the old mode as well, but I'm all for waiting on that so we don't diverge 4x from 5x too early.
* I don't really see the need to support a mixed install. It seems like you should either have the new mode or the deprecated mode. Exposing the option of the two in the collections API is leaking internals that made more sense when this was implemented as an optional collection feature. If an expert user with an existing install would like to turn this on for new collections, that's a consideration, but I think that should be an expert, perhaps undoc'd param, and this overall architecture change should be controlled at the top Solr config level, perhaps at the cluster config level?
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084212#comment-14084212 ] Mark Miller commented on SOLR-5473: --- I'm in the middle of a patch, but putting it down tonight. I'll come back to it tomorrow night.
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.8.0) - Build # 1714 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1714/
Java: 64bit/jdk1.8.0 -XX:+UseCompressedOops -XX:+UseParallelGC

1 tests failed.
REGRESSION: org.apache.solr.schema.TestCloudSchemaless.testDistribSearch

Error Message:
Timeout occured while waiting response from server at: https://127.0.0.1:54364/collection1

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: https://127.0.0.1:54364/collection1
  at __randomizedtesting.SeedInfo.seed([9745DA21048A19C8:16A3543973D579F4]:0)
  at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:561)
  at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
  at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
  at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
  at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
  at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
  at org.apache.solr.schema.TestCloudSchemaless.doTest(TestCloudSchemaless.java:140)
  at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:867)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
  at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
  at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at
[jira] [Updated] (SOLR-6214) Snapshots numberToKeep param only keeps n-1 backups
[ https://issues.apache.org/jira/browse/SOLR-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramana updated SOLR-6214: - Attachment: (was: SOLR-6214.patch) Snapshots numberToKeep param only keeps n-1 backups --- Key: SOLR-6214 URL: https://issues.apache.org/jira/browse/SOLR-6214 Project: Solr Issue Type: Bug Affects Versions: 4.9 Reporter: Mathias H. Assignee: Shalin Shekhar Mangar Priority: Minor The numberToKeep param for snapshots doesn't work anymore: if you set the param to '2', only '1' backup is kept. In ReplicationHandler, at line 377, snapShooter.validateCreateSnapshot() creates an empty directory for the new snapshot. The deleteOldBackups() method in SnapShooter, which is executed before the backup is created, now sees the two directories and deletes the old one. But this is wrong, because the empty directory for the new backup should not be counted.
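A minimal sketch of the off-by-one described above and one way to avoid it. The toDelete helper and the directory names here are hypothetical illustrations, not the actual SnapShooter code; the reported bug corresponds to forgetting to exclude the in-progress directory before counting.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BackupPruneSketch {
    /**
     * Returns the backups to delete, oldest first, so that numberToKeep
     * finished backups remain once the in-progress one completes.
     * Directory names are assumed to sort chronologically.
     */
    static List<String> toDelete(List<String> dirs, String inProgress, int numberToKeep) {
        List<String> finished = new ArrayList<>(dirs);
        finished.remove(inProgress);                      // the fix: ignore the empty, in-progress dir
        Collections.sort(finished);
        int excess = finished.size() + 1 - numberToKeep;  // +1 counts the backup being created
        if (excess <= 0) return Collections.emptyList();
        return finished.subList(0, excess);
    }

    public static void main(String[] args) {
        // Two finished backups plus a freshly created, still-empty directory; keep 2.
        List<String> dirs = List.of("snapshot.1", "snapshot.2", "snapshot.3");
        System.out.println(toDelete(dirs, "snapshot.3", 2)); // prints [snapshot.1]
    }
}
```

With the in-progress directory excluded, numberToKeep=2 keeps two finished backups as expected; counting it instead deletes one backup too many.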
[jira] [Updated] (SOLR-6214) Snapshots numberToKeep param only keeps n-1 backups
[ https://issues.apache.org/jira/browse/SOLR-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramana updated SOLR-6214: - Attachment: SOLR-6214.patch
[jira] [Commented] (SOLR-6214) Snapshots numberToKeep param only keeps n-1 backups
[ https://issues.apache.org/jira/browse/SOLR-6214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084290#comment-14084290 ] Ramana commented on SOLR-6214: -- Shalin, Updated the patch with test case. Please verify.
[jira] [Created] (SOLR-6316) API to know number of backups available
Ramana created SOLR-6316: Summary: API to know number of backups available Key: SOLR-6316 URL: https://issues.apache.org/jira/browse/SOLR-6316 Project: Solr Issue Type: Bug Affects Versions: 4.9 Reporter: Ramana Priority: Minor I am using the replication backup command to create a snapshot of my index: http://localhost:8983/solr/replication?command=backup&numberToKeep=2 At any point, if I would like to know how many backups are available, we don't have any API that supports this. The closest one I see is http://localhost:8983/solr/replication?command=details but that URL gives an overview of the snapshots available; it doesn't say how many snapshots are available:
<lst name="backup">
  <str name="startTime">Sat Aug 02 08:33:37 IST 2014</str>
  <int name="fileCount">24</int>
  <str name="status">success</str>
  <str name="snapshotCompletedAt">Sat Aug 02 08:33:37 IST 2014</str>
  <null name="snapshotName"/>
</lst>
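Lacking such an API, the count the reporter wants can be derived by counting snapshot directories in the backup location. A minimal sketch, assuming Solr's usual snapshot.<timestamp> directory naming; the class and method names here are invented for illustration, not a Solr API.

```java
import java.util.Arrays;

public class BackupCountSketch {
    /** Count entries that look like replication snapshot directories ("snapshot.<timestamp>"). */
    static long countBackups(String[] dirNames) {
        return Arrays.stream(dirNames)
                     .filter(n -> n.startsWith("snapshot."))
                     .count();
    }

    public static void main(String[] args) {
        // Hypothetical listing of a data directory holding two finished backups.
        String[] names = {"snapshot.20140801120000", "snapshot.20140802083337", "index"};
        System.out.println(countBackups(names)); // prints 2
    }
}
```

In practice the array would come from listing the backup directory (e.g. java.io.File.list()); a server-side implementation could expose the same count in the command=details response.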
[jira] [Updated] (SOLR-6316) API to know number of backups available
[ https://issues.apache.org/jira/browse/SOLR-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramana updated SOLR-6316: - Issue Type: Improvement (was: Bug)
[jira] [Updated] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Da Huang updated LUCENE-4396: - Attachment: tasks.cpp LUCENE-4396.patch And.tasks The patch is based on git mirror commit 67d17eb81b754fa242bb91e1b91070fd8b38ecd9. In this patch, I remove the unused classes, encapsulate some functions and fix some bugs. Besides, the tasks file used before has heavy correlation between cases, which I think is not good. Therefore, I generated a new tasks file. The file And.tasks is the new tasks file, while tasks.cpp is the program that generates it. You can generate the tasks file by running
{code}
g++ tasks.cpp -std=c++0x -o tasks
./tasks wikimedium.10M.nostopwords.tasks And.tasks
{code}
The perf. on the new tasks file is as follows.
{code}
Task               QPS baseline      QPS my_version      Pct diff
HighAnd5LowNot      5.40 (5.1%)       4.88 (4.2%)    -9.6% ( -18% -   0%)
HighAnd5LowOr       7.05 (10.2%)      6.87 (3.8%)    -2.6% ( -15% -  12%)
LowAnd5LowNot      27.17 (2.1%)      26.47 (2.6%)    -2.6% (  -7% -   2%)
HighAnd5HighOr      1.13 (3.8%)       1.11 (2.2%)    -1.8% (  -7% -   4%)
LowAnd5LowOr       31.82 (2.6%)      31.35 (2.3%)    -1.5% (  -6% -   3%)
PKLookup           98.80 (5.2%)     102.02 (6.3%)     3.3% (  -7% -  15%)
HighAnd5HighNot     1.95 (1.0%)       2.04 (2.1%)     4.7% (   1% -   7%)
LowAnd5HighNot      9.46 (2.9%)      10.32 (2.7%)     9.0% (   3% -  15%)
LowAnd5HighOr       7.56 (2.8%)       8.42 (2.8%)    11.4% (   5% -  17%)
LowAnd60HighOr      0.51 (2.5%)       0.82 (4.8%)    58.7% (  50% -  67%)
LowAnd60LowNot      2.61 (1.0%)       4.64 (3.4%)    78.0% (  72% -  83%)
HighAnd60LowNot     1.30 (1.2%)       2.36 (3.7%)    81.1% (  75% -  87%)
HighAnd60LowOr      1.18 (1.3%)       2.15 (3.7%)    82.0% (  76% -  88%)
LowAnd60LowOr       2.25 (0.6%)       4.61 (4.2%)   104.7% (  99% - 110%)
HighAnd60HighOr     0.10 (0.7%)       0.26 (4.8%)   151.2% ( 144% - 157%)
LowAnd60HighNot     0.53 (2.5%)       1.62 (8.0%)   204.0% ( 188% - 220%)
HighAnd60HighNot    0.14 (0.9%)       0.59 (8.9%)   328.4% ( 315% - 341%)
{code}
My next step is to do more tests to get better rules and to verify correctness. I think it can be finished by this Friday.
As the suggested pencils-down date is coming, I will begin to scrub the code, improve the comments, and write documentation to wrap up. BooleanScorer should sometimes be used for MUST clauses --- Key: LUCENE-4396 URL: https://issues.apache.org/jira/browse/LUCENE-4396 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Attachments: And.tasks, And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, SIZE.perf, all.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, stat.cpp, stat.cpp, tasks.cpp Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. If there are one or more MUST clauses we always use BooleanScorer2. But I suspect that unless the MUST clauses have very low hit count compared to the other clauses, BooleanScorer would perform better than BooleanScorer2. BooleanScorer still has some vestiges from when it used to handle MUST, so it shouldn't be hard to bring back this capability ... I think the challenging part might be the heuristics on when to use which (likely we would have to use firstDocID as a proxy for total hit count). Likely we should also have BooleanScorer sometimes use .advance() on the subs in this case, e.g. if suddenly the MUST clause skips 100 docs then you want to .advance() all the SHOULD clauses. I won't have near-term time to work on this, so feel free to take it if you are inspired!
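The idea in the last paragraph of the description, letting the MUST clause drive iteration and bulk-advancing the SHOULD clauses past skipped docs, can be sketched as follows. This is a hypothetical toy model: the Docs and MustDrivenSketch names are invented and only loosely mirror Lucene's DocIdSetIterator contract.

```java
import java.util.ArrayList;
import java.util.List;

public class MustDrivenSketch {
    static final int NO_MORE = Integer.MAX_VALUE;

    /** Toy stand-in for a doc-id iterator over a sorted, non-negative doc list. */
    static class Docs {
        final int[] docs;
        int pos = -1;
        Docs(int... docs) { this.docs = docs; }
        int doc() { return pos < 0 ? -1 : pos < docs.length ? docs[pos] : NO_MORE; }
        int advance(int target) {  // first doc >= target, or NO_MORE
            while (pos < docs.length && (pos < 0 || docs[pos] < target)) pos++;
            return doc() == -1 ? NO_MORE : doc();
        }
    }

    /** Iterate the MUST clause and bulk-advance the SHOULD clauses to each of its docs. */
    static List<Integer> matches(Docs must, List<Docs> shoulds) {
        List<Integer> hits = new ArrayList<>();
        for (int d = must.advance(0); d != NO_MORE; d = must.advance(d + 1)) {
            for (Docs s : shoulds) {
                if (s.doc() < d) s.advance(d);  // a 100-doc skip by MUST skips them here too
            }
            hits.add(d);  // every MUST hit matches; the SHOULD clauses would only affect the score
        }
        return hits;
    }

    public static void main(String[] args) {
        Docs must = new Docs(3, 8, 200);
        List<Docs> shoulds = List.of(new Docs(1, 3, 9, 200), new Docs(8));
        System.out.println(matches(must, shoulds)); // prints [3, 8, 200]
    }
}
```

This is the BooleanScorer2-style leapfrogging the description contrasts with BooleanScorer's window scoring; the open heuristic is deciding when each strategy wins.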
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14084332#comment-14084332 ] Da Huang commented on LUCENE-4396: -- Hi, [~paul.elsc...@xs4all.nl]. The commit hash mentioned here just indicates which commit the patch should be applied on. If you want to get the latest Java code discussed here, for example, you can do this:
{code}
git clone https://github.com/apache/lucene-solr
cd lucene-solr
git checkout 67d17eb81b754fa242bb91e1b91070fd8b38ecd9
git apply LUCENE-4396.patch
{code}
LUCENE-4396.patch is attached on this page; you can download it first. Hope this helps. btw, there is a repo where I'm maintaining the code, but it is on a server in my lab, and you're not able to clone from that repo without a password. Sorry for that.