[jira] [Updated] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated LUCENE-5257:

Attachment: LUCENE-5257.patch

New patch with a few changes:

* Added orphan checking for {{/org/name}} coordinate keys in {{ivy-versions.properties}} file that are not referred to in any {{ivy.xml}} file.
* In order to add orphan checking, I had to make the task run only at the top level, so that it sees all the {{ivy.xml}} files across the whole project. The simplest way to do that with the current build was to place the {{check-lib-versions}} target in {{lucene/build.xml}}, and then invoke it from the top-level {{validate}} target, recursively checking the {{ivy.xml}} files in the parent of the {{lucene/}} directory. (This is a little clunky - really it should live at the top level - but I can live with it.)
* Cleaned up the {{ivy-versions.properties}} file, so that versions shared by two or more libs are represented as a property named for their org/groupId (and sometimes part of their name/artifactId if the org/groupId didn't seem to be sufficiently differentiating).

I'll commit later today if there are no objections.

Lock down centralized versioning of ivy dependencies

Key: LUCENE-5257
URL: https://issues.apache.org/jira/browse/LUCENE-5257
Project: Lucene - Core
Issue Type: Improvement
Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor
Attachments: LUCENE-5257.patch, LUCENE-5257.patch

LUCENE-5249 introduced centralized versioning of 3rd party dependencies and converted all ivy.xml files across Lucene/Solr to use this scheme. But there is nothing preventing people from ignoring this setup and (intentionally or not) introducing non-centralized dependency versions.

SOLR-3664 discusses the problem of out-of-sync 3rd party dependencies between Lucene/Solr modules. Centralized versioning makes synchronization problems less likely but not impossible.

One fairly simple way to ensure that all modules use the same version of 3rd party deps would be to require that all deps in ivy.xml would have to use the {{rev=$\{/org/name}}} syntax, via a validation script.

The problem remains that there may eventually be a *requirement* to use different 3rd party libs in different modules. Any form of lockdown here should allow for this possibility.

Hoss's suggestion from a conversation on #lucene IRC earlier today:

{noformat}
hoss perhaps exceptions could be by naming convetion
sarowe can you give an example?
hoss ie: variables must match either ${group}/${artifact} or they must match /VERSION_MISTMATCH_EXCEPTION/${group}/${artifact}
sarowe nice idea no external config required
hoss right and it has to be real obvious when you are bucking convention
hoss or better yet: ${group}/${artifact}/VERSION_MISTMATCH_EXCEPTION ... and there is another check that the version file is in ascii order so you are garuntted that it has to be right there in the versions file one line after ${group}/${artifact}
sarowe i like it
hoss no change someone updating ${group}/${artifact} won't notice it i suppose really it should be ${group}/${artifact}/VERSION_MISTMATCH_EXCEPTION/${reason} ... since you might have more then one exception per ${group}/${artifact} but now i'm just making things up w/o evn really understanding the conventions you've alreay put in place
sarowe :)
hoss you get the idea
sarowe yes
{noformat}

--
This message was sent by Atlassian JIRA (v6.1#6144)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
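The two checks discussed in this thread (every ivy.xml dependency must use the rev=${/org/name} form, and every /org/name key in ivy-versions.properties must be referenced somewhere) can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not the Ant task from the patch: the function names, the in-memory string inputs, and the simple key = value parsing are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches a centralized revision reference such as ${/org.slf4j/slf4j-api}.
REV_PATTERN = re.compile(r"^\$\{(/[^/}]+/[^/}]+)\}$")


def parse_versions(properties_text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    versions = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        versions[key.strip()] = value.strip()
    return versions


def check_lib_versions(properties_text, ivy_xml_texts):
    """Return (violations, orphans).

    violations: (org, name, rev) for each dependency whose rev is not
                ${/org/name} or whose key is missing from the versions file.
    orphans:    /org/name keys that no ivy.xml refers to.
    """
    versions = parse_versions(properties_text)
    referenced = set()
    violations = []
    for xml_text in ivy_xml_texts:
        root = ET.fromstring(xml_text)
        for dep in root.iter("dependency"):
            org, name, rev = dep.get("org"), dep.get("name"), dep.get("rev", "")
            expected_key = "/%s/%s" % (org, name)
            match = REV_PATTERN.match(rev)
            if match is None or match.group(1) != expected_key:
                violations.append((org, name, rev))
            elif expected_key not in versions:
                violations.append((org, name, rev))
            else:
                referenced.add(expected_key)
    # Only coordinate keys (those starting with '/') can be orphans; shared
    # version properties named for an org/groupId are left alone.
    orphans = sorted(k for k in versions
                     if k.startswith("/") and k not in referenced)
    return violations, orphans
```

Because orphan detection needs to see every ivy.xml at once, the sketch takes the full list of files in a single call, which mirrors why the real target had to run only at the top level.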
[jira] [Updated] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated LUCENE-5257:

Attachment: LUCENE-5257.patch

Patch with cosmetic cleanups and a CHANGES.txt entry.

Lock down centralized versioning of ivy dependencies

Key: LUCENE-5257
URL: https://issues.apache.org/jira/browse/LUCENE-5257
Project: Lucene - Core
Issue Type: Improvement
Components: general/build
Reporter: Steve Rowe
Assignee: Steve Rowe
Priority: Minor
Attachments: LUCENE-5257.patch, LUCENE-5257.patch, LUCENE-5257.patch
[jira] [Commented] (LUCENE-5251) New Dictionary Implementation for Suggester consumption
[ https://issues.apache.org/jira/browse/LUCENE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785956#comment-13785956 ]

Areek Zillur commented on LUCENE-5251:

Thanks for the quick review! I will upload a patch soon, incorporating the suggested changes.

New Dictionary Implementation for Suggester consumption

Key: LUCENE-5251
URL: https://issues.apache.org/jira/browse/LUCENE-5251
Project: Lucene - Core
Issue Type: New Feature
Components: core/search
Reporter: Areek Zillur
Attachments: LUCENE-5251.patch, LUCENE-5251.patch

With the vast array of new suggesters, it would be nice to have a dictionary implementation that could feed the suggesters terms, weights and (optionally) payloads from the lucene index.

The idea of this dictionary implementation is to grab stored documents from the index and use user-configured fields for terms, weights and payloads.

use-case: If you have a document with three fields

- product_id
- product_name
- product_popularity_score

then using this implementation would enable you to have a suggester for product_name that uses product_popularity_score as the weight and returns product_id as the payload, which you can use for further processing (example: construct a url etc).
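The use-case above can be sketched as a small generator that walks stored documents and emits one (term, weight, payload) entry per document. This is only an illustration of the idea in Python, assuming documents are plain mappings of stored field names to values; `Suggestion` and `document_dictionary` are hypothetical names, not the classes from the patch.

```python
from typing import Iterable, Iterator, Mapping, NamedTuple, Optional


class Suggestion(NamedTuple):
    term: str
    weight: int
    payload: Optional[str]


def document_dictionary(docs: Iterable[Mapping[str, object]],
                        term_field: str,
                        weight_field: str,
                        payload_field: Optional[str] = None) -> Iterator[Suggestion]:
    """Yield one suggester entry per stored document that has a term value."""
    for doc in docs:
        term = doc.get(term_field)
        if term is None:
            continue  # nothing to suggest for this document
        # Assumption: a missing weight defaults to 0.
        weight = int(doc.get(weight_field, 0))
        payload = doc.get(payload_field) if payload_field else None
        yield Suggestion(str(term), weight,
                         None if payload is None else str(payload))
```

With the three fields from the use-case, `document_dictionary(docs, "product_name", "product_popularity_score", "product_id")` would yield the product name as the term, the popularity score as the weight, and the product id as the payload.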
[jira] [Created] (SOLR-5301) DELETEALIAS command prints CREATEALIAS in logs
Jan Høydahl created SOLR-5301:

Summary: DELETEALIAS command prints CREATEALIAS in logs
Key: SOLR-5301
URL: https://issues.apache.org/jira/browse/SOLR-5301
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.4, 4.5
Reporter: Jan Høydahl
Priority: Minor
Fix For: 5.0, 4.6

A simple copy/paste error in https://github.com/apache/lucene-solr/blob/33d22db31e63482a1b1aad0cf90c4030bc359ffe/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L265

Line 265 should say OverseerCollectionProcessor.DELETEALIAS

As far as I can see the bug only affects logging.
[jira] [Commented] (LUCENE-5254) SegmentCoreReader's owner reference back to the first SegmentReader causes leaks
[ https://issues.apache.org/jira/browse/LUCENE-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786085#comment-13786085 ]

ASF subversion and git services commented on LUCENE-5254:

Commit 1529135 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1529135 ]

LUCENE-5254: don't hold ref to original SR from SCR, to avoid bounded leak of things like live docs bitset

SegmentCoreReader's owner reference back to the first SegmentReader causes leaks

Key: LUCENE-5254
URL: https://issues.apache.org/jira/browse/LUCENE-5254
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 5.0, 4.6
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-5254.patch

Spinoff from LUCENE-5248, where Shai discovered this ...

SegmentCoreReaders has a SegmentReader owner member, that points to the first SegmentReader that was opened. When that SR is reopened to SR2, e.g. because new deletes or NDV updates happened, the same SCR is shared. But, even if you close SR1, anything it points to cannot be GCd because SCR is pointing to it.

I think the big things are liveDocs and the NDV update maps; Shai is going to fix the latter in LUCENE-5248, so this issue should fix liveDocs.

The simplest fix is to make liveDocs not final and null it out in doClose ... but that's sort of fragile (what if we add other members in the future and forget to null them on close?).

I think it'd be better to eliminate the owner reference; it's only used so we can evict FieldCache entry once the core is closed. Maybe we can just store the coreCacheKey instead?
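The leak described above comes down to a shared object holding a back-reference to its first user. The following Python sketch reproduces that shape and the proposed fix (keep an opaque cache key instead of the owner). It is an analogy only: `LeakyCore`, `FixedCore` and `Reader` are hypothetical stand-ins for SegmentCoreReaders and SegmentReader, and Python's reference counting is standing in for the JVM garbage collector.

```python
import gc
import weakref


class LeakyCore:
    """Shared core that keeps a reference back to the first reader."""
    def __init__(self):
        self.owner = None


class FixedCore:
    """Shared core that keeps only an opaque cache key, not the reader."""
    def __init__(self):
        self.cache_key = object()


class Reader:
    def __init__(self, core, live_docs):
        self.core = core
        self.live_docs = live_docs  # the "big thing" we want freed on close


def first_reader_survives_reopen(core_cls):
    """Return True if the first reader is still reachable after it is dropped."""
    core = core_cls()
    sr1 = Reader(core, live_docs=bytearray(1024))
    if hasattr(core, "owner"):
        core.owner = sr1  # back-reference from the shared core
    sr2 = Reader(core, live_docs=bytearray(1024))  # reopen shares the core
    probe = weakref.ref(sr1)
    del sr1  # stands in for closing SR1 and dropping all user references
    gc.collect()
    alive = probe() is not None
    assert sr2.core is core  # SR2 and the shared core are still in use here
    return alive
```

With `LeakyCore` the first reader (and its live docs) stays reachable for as long as the shared core lives; with `FixedCore` it is released as soon as the last user reference goes away.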
[jira] [Resolved] (LUCENE-5254) SegmentCoreReader's owner reference back to the first SegmentReader causes leaks
[ https://issues.apache.org/jira/browse/LUCENE-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-5254.

Resolution: Fixed
Fix Version/s: 4.6, 5.0

SegmentCoreReader's owner reference back to the first SegmentReader causes leaks

Key: LUCENE-5254
URL: https://issues.apache.org/jira/browse/LUCENE-5254
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 5.0, 4.6
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 5.0, 4.6
Attachments: LUCENE-5254.patch
[jira] [Commented] (LUCENE-5254) SegmentCoreReader's owner reference back to the first SegmentReader causes leaks
[ https://issues.apache.org/jira/browse/LUCENE-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786087#comment-13786087 ]

ASF subversion and git services commented on LUCENE-5254:

Commit 1529136 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1529136 ]

LUCENE-5254: don't hold ref to original SR from SCR, to avoid bounded leak of things like live docs bitset
[jira] [Commented] (LUCENE-5254) SegmentCoreReader's owner reference back to the first SegmentReader causes leaks
[ https://issues.apache.org/jira/browse/LUCENE-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786101#comment-13786101 ]

ASF subversion and git services commented on LUCENE-5254:

Commit 1529139 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1529139 ]

LUCENE-5254: just pass 'this' to the CoreClosedListeners
[jira] [Commented] (LUCENE-5254) SegmentCoreReader's owner reference back to the first SegmentReader causes leaks
[ https://issues.apache.org/jira/browse/LUCENE-5254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786104#comment-13786104 ]

ASF subversion and git services commented on LUCENE-5254:

Commit 1529141 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1529141 ]

LUCENE-5254: just pass 'this' to the CoreClosedListeners
[jira] [Commented] (LUCENE-5228) IndexWriter.addIndexes copies raw files but acquires no locks
[ https://issues.apache.org/jira/browse/LUCENE-5228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786107#comment-13786107 ]

Shai Erera commented on LUCENE-5228:

The problem is with Directories that don't support locking, e.g. on HDFS. But I guess NoLockFactory is a reasonable solution for them. I don't think there's any performance concern here because addIndexes(Directory...) is doing so much work (depending on the index size of course), that acquiring a lock on each Directory seems negligible.

Let's do that? And also change the javadoc to explicitly state that, along with the NoLockFactory solution for Directories that cannot support locking?

IndexWriter.addIndexes copies raw files but acquires no locks

Key: LUCENE-5228
URL: https://issues.apache.org/jira/browse/LUCENE-5228
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir

I see stuff like: merge problem with lucene 3 and 4 indices (from solr users list), and cannot even think how to respond to these users because so many things can go wrong with IndexWriter.addIndexes(Directory)

it currently has in its javadocs:

NOTE: the index in each Directory must not be changed (opened by a writer) while this method is running. This method does not acquire a write lock in each input Directory, so it is up to the caller to enforce this.

This method should be acquiring locks: it's copying *RAW FILES*. Otherwise we should remove it. If someone doesn't like that, or is mad because it's 10ns slower, they can use NoLockFactory.
[jira] [Updated] (SOLR-5301) DELETEALIAS command prints CREATEALIAS in logs
[ https://issues.apache.org/jira/browse/SOLR-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl updated SOLR-5301:

Attachment: SOLR-5301.patch

Simple patch

DELETEALIAS command prints CREATEALIAS in logs

Key: SOLR-5301
URL: https://issues.apache.org/jira/browse/SOLR-5301
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.4, 4.5
Reporter: Jan Høydahl
Priority: Minor
Fix For: 5.0, 4.6
Attachments: SOLR-5301.patch
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.6.0) - Build # 857 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/857/
Java: 64bit/jdk1.6.0 -XX:-UseCompressedOops -XX:+UseParallelGC

1 tests failed.
REGRESSION: org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
    at __randomizedtesting.SeedInfo.seed([FD919F175CE8CA35:CC2A2122F9D7DAE5]:0)
    at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:637)
    at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:604)
    at org.apache.solr.spelling.SpellCheckCollatorTest.testEstimatedHitCounts(SpellCheckCollatorTest.java:521)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.RuntimeException: REQUEST FAILED:
[jira] [Created] (SOLR-5302) Analytics Component
Steven Bower created SOLR-5302:

Summary: Analytics Component
Key: SOLR-5302
URL: https://issues.apache.org/jira/browse/SOLR-5302
Project: Solr
Issue Type: New Feature
Reporter: Steven Bower

This ticket is to track a replacement for the StatsComponent. The AnalyticsComponent supports the following features:

* All functionality of StatsComponent (SOLR-4499)
* Field Faceting (SOLR-3435)
** Support for limit
** Sorting (bucket name or any stat in the bucket)
** Support for offset
* Range Faceting
** Supports all options of standard range faceting
* Query Faceting (SOLR-2925)
* Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range)
* Support for more complex aggregate/mapping operations (SOLR-1622)
** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles
** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat
** Easily pluggable framework to add additional operations
* New / cleaner output format

Outstanding Issues:

* Multi-value field support for stats (supported for faceting)
* Multi-shard support (may not be possible for some operations, eg median)
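To make the feature list above concrete, here is a rough Python sketch of field faceting combined with the listed aggregations: documents are grouped by a facet field and the statistics are computed per bucket. Everything here is an assumption made for illustration (the function name, the dict-based documents, and the choice of population standard deviation); it does not describe the component's actual implementation or output format.

```python
import statistics
from collections import defaultdict


def field_facet_stats(docs, facet_field, value_field):
    """Group docs by facet_field and compute per-bucket statistics on value_field."""
    values = defaultdict(list)
    missing = defaultdict(int)
    for doc in docs:
        bucket = doc.get(facet_field)
        value = doc.get(value_field)
        if value is None:
            missing[bucket] += 1  # doc is in the bucket but has no value
        else:
            values[bucket].append(value)
    result = {}
    # Buckets whose docs all lack the value field are omitted in this sketch.
    for bucket, vals in values.items():
        result[bucket] = {
            "count": len(vals),
            "missing": missing[bucket],
            "min": min(vals),
            "max": max(vals),
            "sum": sum(vals),
            "sum_of_squares": sum(v * v for v in vals),
            "mean": sum(vals) / len(vals),
            "stddev": statistics.pstdev(vals),  # assumption: population stddev
            "median": statistics.median(vals),
        }
    return result
```

The multi-shard caveat in the outstanding issues shows up here as well: count, sum, min, max and sum of squares from several shards can be merged by simple combination, whereas an exact median generally needs the underlying values, not just each shard's median.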
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786162#comment-13786162 ]

Steven Bower commented on SOLR-5302:

Related tickets
[jira] [Updated] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Bower updated SOLR-5302:

Attachment: Statistical Expressions.pdf
            Search Analytics Component.pdf
            solr_analytics-2013.10.04.patch

Initial patch, please review/comment. Additionally PDF exports of some docs for using the component.

Analytics Component

Key: SOLR-5302
URL: https://issues.apache.org/jira/browse/SOLR-5302
Project: Solr
Issue Type: New Feature
Reporter: Steven Bower
Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786171#comment-13786171 ]

Uwe Schindler commented on SOLR-5302:

Hi, thanks for the patch! We also got your iCLA. Could you please remove this from every license header?:

{noformat}
+ * Copyright 2013 Bloomberg Finance L.P.
+ *
{noformat}

Uwe
Re: Welcome Joel Bernstein
Congrats Joel! Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa On Fri, Oct 4, 2013 at 12:43 AM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: Welcome to the team Joel! - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book -- View this message in context: http://lucene.472066.n3.nabble.com/Welcome-Joel-Bernstein-tp4093247p4093417.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5285) Solr response format should support child Docs
[ https://issues.apache.org/jira/browse/SOLR-5285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated SOLR-5285: Attachment: SOLR-5285.patch Syntax: [children parentFilter=content_type:parentDocument] This patch adds support for nested docs for wt=xml. I will add support for the other writers soon. There is no way to specify return fields for a childDoc. Should there be an option? I added ChildDocTransformerFactory#toSolrDocument which is the same as TextResponseWriter#toSolrDocument. It's not the best thing to do. [~mkhludnev] thanks for your suggestion :) Solr response format should support child Docs -- Key: SOLR-5285 URL: https://issues.apache.org/jira/browse/SOLR-5285 Project: Solr Issue Type: New Feature Reporter: Varun Thacker Fix For: 5.0, 4.6 Attachments: SOLR-5285.patch Solr has added support for taking childDocs as input (only XML so far). It's currently used for BlockJoinQuery. I feel that if a user indexes a document with child docs, even if he isn't using the BJQ features and is just searching which results in a hit on the parentDoc, its childDocs should be returned in the response format. [~hossman_luc...@fucit.org] on IRC suggested that the DocTransformers would be the place to add childDocs to the response. Now given a docId one needs to find out all the childDoc IDs. A couple of approaches which I could think of are 1. Maintain the relation between a parentDoc and its childDocs during indexing time in maybe a separate index? 2. Somehow emulate what happens in ToParentBlockJoinQuery.nextDoc() - Given a parentDoc it finds out all the childDocs but this requires a childScorer. Am I missing something obvious on how to find the relation between a parentDoc and its childDocs? None of the above solutions for this look right. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
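One way to picture the parent/child lookup asked about in SOLR-5285, as a hedged sketch and not the approach taken by the attached patch: block-join indexing writes a parent's children as the contiguous doc IDs immediately before the parent, so given a bit set of parent doc IDs, the children of a parent are the IDs after the previous parent. The names `ChildRangeSketch` and `childrenOf` are hypothetical, `java.util.BitSet` stands in for whatever parent filter Solr would actually use, and the sketch assumes a single level of nesting.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class ChildRangeSketch {
    /**
     * Returns the doc IDs of the children of parentDoc, assuming block layout:
     * children occupy the doc IDs between the previous parent and parentDoc.
     */
    static List<Integer> childrenOf(BitSet parents, int parentDoc) {
        if (!parents.get(parentDoc)) {
            throw new IllegalArgumentException(parentDoc + " is not a parent doc");
        }
        // previousSetBit returns -1 when there is no earlier parent.
        int prevParent = parentDoc == 0 ? -1 : parents.previousSetBit(parentDoc - 1);
        List<Integer> children = new ArrayList<>();
        for (int doc = prevParent + 1; doc < parentDoc; doc++) {
            children.add(doc);
        }
        return children;
    }

    public static void main(String[] args) {
        // Two blocks: docs 0-2 are children of parent 3; doc 4 is the child of parent 5.
        BitSet parents = new BitSet();
        parents.set(3);
        parents.set(5);
        System.out.println(childrenOf(parents, 3));
        System.out.println(childrenOf(parents, 5));
    }
}
```

This needs no separate index and no child scorer, but it only works if the parent set can be computed cheaply, which is presumably what the `parentFilter` argument in the syntax above is for.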
[jira] [Created] (SOLR-5303) numShards property is not properly taken into account
Federico Piai created SOLR-5303: --- Summary: numShards property is not properly taken into account Key: SOLR-5303 URL: https://issues.apache.org/jira/browse/SOLR-5303 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1, 4.2 Environment: Solr on 3 VMs each with an external ZooKeeper, multi-core startups Reporter: Federico Piai It looks like the 'numShards' argument is ignored by Solr. The number of shards always defaults to 1 unless cores are created dynamically with the Collections API. I saw this log line: INFO: numShards not found on descriptor - reading it from system property. I looked for the code that emits it (https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/cloud/ZkController.java) and found a possible error: log.info("numShards not found on descriptor - reading it from system property"); numShards = Integer.getInteger(ZkStateReader.NUM_SHARDS_PROP); -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
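For context on the line quoted in SOLR-5303: `Integer.getInteger(String)` does not parse its argument. It looks up the JVM system property with that name and returns `null` when the property is unset or not a valid integer. A minimal sketch, assuming the property name is `numShards` (the literal is an assumption standing in for `ZkStateReader.NUM_SHARDS_PROP`) and using a hypothetical helper class:

```java
class NumShardsLookup {
    // Assumed property name; stands in for ZkStateReader.NUM_SHARDS_PROP.
    static final String NUM_SHARDS_PROP = "numShards";

    /** Reads the shard count from the JVM system property, or null if absent or not an integer. */
    static Integer readNumShards() {
        return Integer.getInteger(NUM_SHARDS_PROP);
    }

    public static void main(String[] args) {
        System.clearProperty(NUM_SHARDS_PROP);
        System.out.println(readNumShards());        // property unset

        // Same effect as starting the JVM with -DnumShards=3.
        System.setProperty(NUM_SHARDS_PROP, "3");
        System.out.println(readNumShards());
    }
}
```

If the reported behaviour does come from this line, one possible explanation is that no `-DnumShards=...` was passed at startup. That is a guess from the snippet alone, not a confirmed diagnosis.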
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786198#comment-13786198 ] Shawn Heisey commented on SOLR-5302: I love new functionality. Thank you for all the time and effort! I was going to suggest that you just replace the existing StatsComponent rather than create a new component, but as I look a little bit into things, it looks like it might not be a new component from the user/admin perspective, just the code perspective. I haven't looked in-depth, but I do see a new class in the patch, so I'm slightly confused. That confusion may clear up after I've looked deeper. Side note, and most likely not your fault at all: Your PDF text is invisible in my in-browser PDF viewer. Windows 8 Pro, Firefox 24.0. Everything is fine if downloaded and opened in Adobe Reader. I think this is probably using the PDF viewer built into Windows 8, which *sucks*. Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. 
The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786200#comment-13786200 ] Steven Bower commented on SOLR-5302: We originally had this code integrated into the stats component but we wanted to change the output format which made that a bit more complex.. it easily can go back in and replace it... also the olap=true i am not wedded to for turning it on, it was just better than a shortened version of analytics ;) Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786206#comment-13786206 ] Robert Muir commented on SOLR-5302: --- Can we remove all the class.equals/isassignablefrom stuff? We should instead use proper fieldtype methods ... only use instanceof when absolutely necessary, and only instanceof, and please open an issue when you do, because it means Solr is broken. Using instanceof, isassignablefrom, class.equals, etc. completely breaks Solr's pluggability in increasingly bogus ways. Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional 
commands, e-mail: dev-h...@lucene.apache.org
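A small sketch of the pluggability concern in Robert Muir's comment on SOLR-5302, using hypothetical types (`FieldKind`, `BuiltinIntField`, `PluginIntField`) that are not Solr classes: an exact class comparison rejects a plugin subclass that a type-level method accepts.

```java
class PluggabilitySketch {
    /** Hypothetical stand-in for a field type that can describe itself. */
    interface FieldKind {
        boolean isNumeric();
    }

    static class BuiltinIntField implements FieldKind {
        @Override
        public boolean isNumeric() {
            return true;
        }
    }

    /** A third-party plugin that extends the built-in type. */
    static class PluginIntField extends BuiltinIntField { }

    // Brittle: only the exact built-in class passes.
    static boolean numericByClassEquals(FieldKind f) {
        return f.getClass().equals(BuiltinIntField.class);
    }

    // Pluggable: asks the type itself.
    static boolean numericByMethod(FieldKind f) {
        return f.isNumeric();
    }

    public static void main(String[] args) {
        FieldKind plugin = new PluginIntField();
        System.out.println(numericByClassEquals(plugin)); // false
        System.out.println(numericByMethod(plugin));      // true
    }
}
```

The same reasoning would apply to value sources: a method that reports the value kind lets user-supplied implementations participate, whereas a class check has to enumerate every known class.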
[jira] [Commented] (LUCENE-4978) Spatial search with point query won't find identical indexed point
[ https://issues.apache.org/jira/browse/LUCENE-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786287#comment-13786287 ] oli mcc commented on LUCENE-4978: - Hi David I think I've uncovered this same issue via Elasticsearch, see [issue 3795 | https://github.com/elasticsearch/elasticsearch/issues/3795] and have verified it with some test cases I've written. Any chance you'd have time to take a look at this? I'm digging in myself, but am still just at a stage of getting a sense for the codebase. Spatial search with point query won't find identical indexed point -- Key: LUCENE-4978 URL: https://issues.apache.org/jira/browse/LUCENE-4978 Project: Lucene - Core Issue Type: Bug Components: modules/spatial Affects Versions: 4.1 Reporter: David Smiley Assignee: David Smiley Priority: Minor Given a document with indexed POINT (10 20), when a search for INTERSECTS( POINT (10 20)) is issued, no results are returned. The work-around is to not search with a point shape, use a very small-radius circle or rectangle. (I'm marking this issue as minor because it's easy to do this). An unstated objective of the PrefixTree/grid approximation is that no matter what precision you use, an intersects query will find all true-positives. Due to approximations, it may also find some close false-positives. But in the case above, that unstated promise is violated. But it can also happen for query shapes other than points which do in fact barely enclose the point given at index time yet the indexed point is in-effect shifted to the center point of a cell which could be outside the query shape, and ultimately leading to a false-negative. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
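The false-negative described in LUCENE-4978 can be illustrated with a deliberately simplified uniform grid. This is a sketch under assumptions, not Lucene's PrefixTree code: the cell size, the coordinates, and the names `GridSnapSketch`, `snapToCellCenter`, and `contains` are all hypothetical. The indexed point is replaced by the center of its cell, and a small query rectangle that encloses the original point no longer encloses the snapped one.

```java
class GridSnapSketch {
    /** Snaps a coordinate to the center of its grid cell of the given size. */
    static double snapToCellCenter(double v, double cellSize) {
        return (Math.floor(v / cellSize) + 0.5) * cellSize;
    }

    /** True if (x, y) lies inside the axis-aligned rectangle, edges inclusive. */
    static boolean contains(double minX, double maxX, double minY, double maxY,
                            double x, double y) {
        return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }

    public static void main(String[] args) {
        double cell = 1.0;
        double x = 10.1, y = 20.1;               // point as supplied at index time
        double sx = snapToCellCenter(x, cell);   // center of the cell holding x
        double sy = snapToCellCenter(y, cell);   // center of the cell holding y

        // A small query rectangle that encloses the original point.
        boolean trueHit = contains(10.0, 10.2, 20.0, 20.2, x, y);
        boolean approxHit = contains(10.0, 10.2, 20.0, 20.2, sx, sy);
        System.out.println("true hit: " + trueHit + ", approximated hit: " + approxHit);
    }
}
```

Testing the query shape against the whole cell rather than its center would trade this false negative for possible false positives, which matches the objective stated in the issue.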
[jira] [Comment Edited] (LUCENE-4978) Spatial search with point query won't find identical indexed point
[ https://issues.apache.org/jira/browse/LUCENE-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786287#comment-13786287 ] oli mcc edited comment on LUCENE-4978 at 10/4/13 4:20 PM: -- Hi David, I think I've uncovered this same issue via Elasticsearch, see [issue 3795 | https://github.com/elasticsearch/elasticsearch/issues/3795] and have verified it with some test cases I've written. Any chance you'd have time to take a look at this? I'm digging in myself, but am still just at a stage of getting a sense for the codebase. was (Author: olimcc): Hi David I think I've uncovered this same issue via Elasticsearch, see [issue 3795 | https://github.com/elasticsearch/elasticsearch/issues/3795] and have verified it with some test cases I've written. Any chance you'd have time to take a look at this? I'm digging in myself, but am still just at a stage of getting a sense for the codebase. Spatial search with point query won't find identical indexed point -- Key: LUCENE-4978 URL: https://issues.apache.org/jira/browse/LUCENE-4978 Project: Lucene - Core Issue Type: Bug Components: modules/spatial Affects Versions: 4.1 Reporter: David Smiley Assignee: David Smiley Priority: Minor Given a document with indexed POINT (10 20), when a search for INTERSECTS( POINT (10 20)) is issued, no results are returned. The work-around is to not search with a point shape, use a very small-radius circle or rectangle. (I'm marking this issue as minor because it's easy to do this). An unstated objective of the PrefixTree/grid approximation is that no matter what precision you use, an intersects query will find all true-positives. Due to approximations, it may also find some close false-positives. But in the case above, that unstated promise is violated. 
But it can also happen for query shapes other than points which do in fact barely enclose the point given at index time yet the indexed point is in-effect shifted to the center point of a cell which could be outside the query shape, and ultimately leading to a false-negative. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786291#comment-13786291 ] Houston Paul Putman IV commented on SOLR-5302: -- The fieldtype methods should only work when working with fields though. I think we also use the class.equals stuff with ValueSource classes... Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786290#comment-13786290 ] Ryan Zezeski commented on SOLR-4509: I recently updated Yokozuna (1) to use Solr 4.4.0. After running a query benchmark I noticed that throughput had dropped to 44% of the baseline. After some head scratching I realized that my distributed search patch had not applied successfully. Sure enough, after I updated the patch for 4.4.0 throughput returned to 100%+ of baseline. Below is a table showing results of the query benchmark for 4.3.0, 4.4.0 and 4.4.0 without this patch. Without the patch, throughput drops to less than half of Solr 4.4.0 with the patch and the latency more than doubles.

|Measurement     |Solr 4.3.0 |Solr 4.4.0 |Solr 4.4.0 w/o Patch |
|----------------|-----------|-----------|---------------------|
|Mean Throughput |1512 ops/s |1525 ops/s |670 ops/s (44%)      |
|Median Latency  |22.0ms     |21.6ms     |46.2ms (2.1x)        |
|95th Latency    |29.8ms     |29.4ms     |76.8ms (2.6x)        |
|99th Latency    |35.3ms     |34.6ms     |86.2ms (2.5x)        |

These results are against a 4-node cluster all hosted on 1 physical machine. Manual distributed search is used for querying; there is no use of SolrCloud. There are only 1 million small text documents stored. The query matches only 1 of these documents. The query results and filter caches are enabled and should have a high hit ratio. The point is to make the queries as inexpensive as possible to see what other overhead might occur. There may very well be scenarios where this patch makes little to no difference. But in this case it seems to make a big one.

This update is not to prove that my patch makes a significant difference in all cases. Rather, I accidentally ran this benchmark and was surprised at the difference I saw. I wanted to ping this issue in hopes that others might try the patch to see if it helps. Here is the corresponding ticket on the Yokozuna repo: https://github.com/basho/yokozuna/pull/197

1: Yokozuna is a project which integrates Solr with the Riak database. 
https://github.com/basho/yokozuna Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg, IsStaleTime.java, SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
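The percentages and multiples in the benchmark table of Ryan Zezeski's SOLR-4509 comment can be re-derived from the raw figures, taking Solr 4.4.0 with the patch as the baseline. A minimal sketch; the class and method names are hypothetical, and only the numbers come from the table:

```java
class BenchmarkRatios {
    /** Value as a whole percentage of the baseline. */
    static long percentOf(double value, double baseline) {
        return Math.round(100.0 * value / baseline);
    }

    /** Value as a multiple of the baseline, rounded to one decimal place. */
    static double multipleOf(double value, double baseline) {
        return Math.round(10.0 * value / baseline) / 10.0;
    }

    public static void main(String[] args) {
        // Unpatched 4.4.0 versus patched 4.4.0.
        System.out.println("throughput: " + percentOf(670, 1525) + "%");
        System.out.println("median latency: " + multipleOf(46.2, 21.6) + "x");
        System.out.println("95th latency: " + multipleOf(76.8, 29.4) + "x");
        System.out.println("99th latency: " + multipleOf(86.2, 34.6) + "x");
    }
}
```

The results agree with the parenthesized values in the table (44%, 2.1x, 2.6x, 2.5x).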
[jira] [Updated] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Zezeski updated SOLR-4509: --- Attachment: SOLR-4509-4_4_0.patch Version of the patch that applies to 4.4.0. Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg, IsStaleTime.java, SOLR-4509-4_4_0.patch, SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786291#comment-13786291 ] Houston Putman edited comment on SOLR-5302 at 10/4/13 4:27 PM: --- The fieldtype methods should only work when working with fields though. I think we also use the class.equals stuff with ValueSource classes... Yeah, I just checked and we use it to check the type (numeric, string or date) of the value source or function. So we need to make a fix for that. was (Author: houstonputman): The fieldtype methods should only work when working with fields though. I think we also use the class.equals stuff with ValueSource classes... Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. 
The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5298) user versioning
[ https://issues.apache.org/jira/browse/SOLR-5298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786298#comment-13786298 ] Yonik Seeley commented on SOLR-5298: \_new\_version\_ ? Yeah, I like it! Since a user adding \_version\_ means they are specifying the existing version, \_new\_version\_ obviously means they are specifying the new version. user versioning --- Key: SOLR-5298 URL: https://issues.apache.org/jira/browse/SOLR-5298 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Assignee: Yonik Seeley Solr currently handles the assignment of version numbers, but it would be useful to allow the user to specify their own version numbers. For consistency, it would then be the user's responsibility to specify versions on all updates (i.e. it would be undefined behavior if sometimes the user specified their own versions and sometimes did not). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786410#comment-13786410 ] Steven Bower commented on SOLR-5302: [~thetaphi] Sent a mail over to our legal folks as this is what they instructed me to do.. will follow up and resolve Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786420#comment-13786420 ] Uwe Schindler commented on SOLR-5302: - Hi Steven, I refer to this one: http://www.apache.org/legal/src-headers.html {quote} *Source File Headers for Code Developed at the ASF* This section refers only to works submitted directly to the ASF by the copyright owner or owner's agent. If the source file is submitted with a copyright notice included in it, the copyright owner (or owner's agent) must either: - remove such notices, or - move them to the NOTICE file associated with each applicable project release, or - provide written permission for the ASF to make such removal or relocation of the notices. Each source file should include the following license header -- note that there should be no copyright notice in the header: {noformat} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the License); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. {noformat} {quote} Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. 
The AnalyticsComponent supports the following features: * All functionality of StatsComponent (SOLR-4499) * Field Faceting (SOLR-3435) ** Support for limit ** Sorting (bucket name or any stat in the bucket ** Support for offset * Range Faceting ** Supports all options of standard range faceting * Query Faceting (SOLR-2925) * Ability to use overall/field facet statistics as input to range/query faceting (ie calc min/max date and then facet over that range * Support for more complex aggregate/mapping operations (SOLR-1622) ** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles ** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat ** Easily pluggable framework to add additional operations * New / cleaner output format Outstanding Issues: * Multi-value field support for stats (supported for faceting) * Multi-shard support (may not be possible for some operations, eg median) -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786420#comment-13786420 ] Uwe Schindler edited comment on SOLR-5302 at 10/4/13 5:58 PM: -- Hi Steven, I refer to this one: http://www.apache.org/legal/src-headers.html {quote} *Source File Headers for Code Developed at the ASF* This section refers only to works submitted directly to the ASF by the copyright owner or owner's agent. If the source file is submitted with a copyright notice included in it, the copyright owner (or owner's agent) must either: - remove such notices, or - move them to the NOTICE file associated with each applicable project release, or - provide written permission for the ASF to make such removal or relocation of the notices. Each source file should include the following license header -- note that there should be no copyright notice in the header: {noformat} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the License); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. {noformat} {quote} was (Author: thetaphi): Hi Steven, I refer to this one: http://www.apache.org/legal/src-headers.html {quote} *Source File Headers for Code Developed at the ASF* This section refers only to works submitted directly to the ASF by the copyright owner or owner's agent. 
If the source file is submitted with a copyright notice included in it, the copyright owner (or owner's agent) must either: - remove such notices, or - move them to the NOTICE file associated with each applicable project release, or - provide written permission for the ASF to make such removal or relocation of the notices. Each source file should include the following license header -- note that there should be no copyright notice in the header: {noformat} Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the License); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. {noformat} {quote} Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04.patch, Statistical Expressions.pdf This ticket is to track a replacement for the StatsComponent. 
The AnalyticsComponent supports the following features:
* All functionality of StatsComponent (SOLR-4499)
* Field Faceting (SOLR-3435)
** Support for limit
** Sorting (bucket name or any stat in the bucket)
** Support for offset
* Range Faceting
** Supports all options of standard range faceting
* Query Faceting (SOLR-2925)
* Ability to use overall/field facet statistics as input to range/query faceting (i.e., calc min/max date and then facet over that range)
* Support for more complex aggregate/mapping operations (SOLR-1622)
** Aggregations: min, max, sum, sum-of-square, count, missing, stddev, mean, median, percentiles
** Operations: negation, abs, add, multiply, divide, power, log, date math, string reversal, string concat
** Easily pluggable framework to add additional operations
* New / cleaner output format

Outstanding Issues:
* Multi-value field support for stats (supported for faceting)
* Multi-shard support (may not be possible for some operations, e.g. median)

-- This message was sent by Atlassian JIRA (v6.1#6144)
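The multi-shard caveat under Outstanding Issues can be illustrated with a small sketch. It is not taken from the SOLR-5302 patch; the class name `ShardMergeSketch` and its helpers are hypothetical, and the numbers are made up. It shows that a mean can be merged exactly from per-shard sums and counts, while combining per-shard medians generally does not give the median of the combined data.

```java
import java.util.Arrays;

// Hypothetical illustration only; not code from the SOLR-5302 patch.
final class ShardMergeSketch {

    // Median of a non-empty array (average of the two middle values for even lengths).
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    static double sum(double[] values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up per-shard values and their union.
        double[] shardA = {1, 2, 3};
        double[] shardB = {4, 100, 200, 300, 400};
        double[] all = {1, 2, 3, 4, 100, 200, 300, 400};

        // The mean merges exactly from per-shard (sum, count) pairs.
        double mergedMean = (sum(shardA) + sum(shardB)) / (shardA.length + shardB.length);
        System.out.println("merged mean = " + mergedMean + ", true mean = " + sum(all) / all.length);

        // Taking the median of the per-shard medians does not recover the global median.
        double medianOfMedians = median(new double[] {median(shardA), median(shardB)});
        System.out.println("median of shard medians = " + medianOfMedians
            + ", true median = " + median(all));
    }
}
```

An exact distributed median would need either the raw values or a multi-pass protocol between shards, which is presumably why the description marks it as possibly unsupported.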
[jira] [Commented] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786425#comment-13786425 ] ASF subversion and git services commented on LUCENE-5257: - Commit 1529243 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1529243 ] LUCENE-5257: Lock down centralized versioning of ivy dependencies

Lock down centralized versioning of ivy dependencies Key: LUCENE-5257 URL: https://issues.apache.org/jira/browse/LUCENE-5257 Project: Lucene - Core Issue Type: Improvement Components: general/build Reporter: Steve Rowe Assignee: Steve Rowe Priority: Minor Attachments: LUCENE-5257.patch, LUCENE-5257.patch, LUCENE-5257.patch

LUCENE-5249 introduced centralized versioning of 3rd party dependencies and converted all ivy.xml files across Lucene/Solr to use this scheme. But there is nothing preventing people from ignoring this setup and (intentionally or not) introducing non-centralized dependency versions. SOLR-3664 discusses the problem of out-of-sync 3rd party dependencies between Lucene/Solr modules. Centralized versioning makes synchronization problems less likely but not impossible.

One fairly simple way to ensure that all modules use the same version of 3rd party deps would be to require, via a validation script, that all deps in ivy.xml use the {{rev=$\{/org/name}}} syntax. The problem remains that there may eventually be a *requirement* to use different 3rd party libs in different modules. Any form of lockdown here should allow for this possibility.

Hoss's suggestion from a conversation on #lucene IRC earlier today:
{noformat}
hoss perhaps exceptions could be by naming convention
sarowe can you give an example?
hoss ie: variables must match either ${group}/${artifact} or they must match /VERSION_MISTMATCH_EXCEPTION/${group}/${artifact}
sarowe nice idea no external config required
hoss right and it has to be real obvious when you are bucking convention
hoss or better yet: ${group}/${artifact}/VERSION_MISTMATCH_EXCEPTION ... and there is another check that the version file is in ascii order so you are guaranteed that it has to be right there in the versions file one line after ${group}/${artifact}
sarowe i like it
hoss no change someone updating ${group}/${artifact} won't notice it i suppose really it should be ${group}/${artifact}/VERSION_MISTMATCH_EXCEPTION/${reason} ... since you might have more than one exception per ${group}/${artifact} but now i'm just making things up w/o even really understanding the conventions you've already put in place
sarowe :)
hoss you get the idea
sarowe yes
{noformat}

-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
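The naming-convention idea from the IRC log above can be sketched roughly as follows. This is not the committed {{check-lib-versions}} task; the class `VersionKeyConventionSketch`, the leading-slash key shape (borrowed from the {{/org/name}} keys mentioned in the issue), and the `com.example` / `org.example` coordinates are assumptions for illustration. The marker string is spelled exactly as it was typed in the chat. The sketch checks two things: every key is either a plain coordinate or a coordinate followed by the exception marker and a reason, and the keys appear in strictly ascending ASCII order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of the convention discussed on IRC; not the actual Lucene build code.
final class VersionKeyConventionSketch {

    // Marker spelled as in the chat log.
    static final String MARKER = "VERSION_MISTMATCH_EXCEPTION";

    // "/group/artifact", optionally followed by "/MARKER/reason".
    static final Pattern KEY =
        Pattern.compile("/[^/\\s]+/[^/\\s]+(/" + MARKER + "/[^/\\s]+)?");

    // Returns one message per violation; an empty list means the keys follow the convention.
    static List<String> problems(List<String> keys) {
        List<String> problems = new ArrayList<>();
        String previous = null;
        for (String key : keys) {
            if (!KEY.matcher(key).matches()) {
                problems.add("bad key: " + key);
            }
            // String.compareTo orders by char value, which is ASCII order for ASCII keys.
            if (previous != null && previous.compareTo(key) >= 0) {
                problems.add("out of order: " + key);
            }
            previous = key;
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "/com.example/foo",
            "/com.example/foo/" + MARKER + "/legacy-module",
            "/org.example/bar",
            "/com.example/zzz",   // violates the ordering check
            "org.example:baz");   // violates the key shape
        for (String problem : problems(keys)) {
            System.out.println(problem);
        }
    }
}
```

Because an exception key shares its prefix with the plain key, the ordering check keeps the two close together in the file, which is the point hoss makes about exceptions being hard to overlook.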
[jira] [Created] (SOLR-5304) Typo in exception string in CurrencyField.java
Caleb Burns created SOLR-5304: - Summary: Typo in exception string in CurrencyField.java Key: SOLR-5304 URL: https://issues.apache.org/jira/browse/SOLR-5304 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Caleb Burns Priority: Trivial

There is a typo in an exception string in CurrencyField.java. As of today, in https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/CurrencyField.java on line 149:

throw new SolrException(ErrorCode.BAD_REQUEST, "Error instantiating exhange rate provider " + exchangeRateProviderClass + ": " + e.getMessage(), e);

should be:

throw new SolrException(ErrorCode.BAD_REQUEST, "Error instantiating exchange rate provider " + exchangeRateProviderClass + ": " + e.getMessage(), e);

"exchange" is misspelled as "exhange".
[jira] [Updated] (SOLR-5304) Typo in exception string in CurrencyField.java
[ https://issues.apache.org/jira/browse/SOLR-5304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Caleb Burns updated SOLR-5304: -- Description: There is a typo in an exception string in CurrencyField.java. As of today, in https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/schema/CurrencyField.java on line 149:
{code:title=CurrencyField.java}
throw new SolrException(ErrorCode.BAD_REQUEST, "Error instantiating exhange rate provider " + exchangeRateProviderClass + ": " + e.getMessage(), e);
{code}
should be:
{code:title=CurrencyField.java}
throw new SolrException(ErrorCode.BAD_REQUEST, "Error instantiating exchange rate provider " + exchangeRateProviderClass + ": " + e.getMessage(), e);
{code}
"exchange" is misspelled as "exhange".

Typo in exception string in CurrencyField.java -- Key: SOLR-5304 URL: https://issues.apache.org/jira/browse/SOLR-5304 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Caleb Burns Priority: Trivial Labels: typo Original Estimate: 5m Remaining Estimate: 5m
[jira] [Commented] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786438#comment-13786438 ] ASF subversion and git services commented on LUCENE-5257: - Commit 1529246 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1529246 ] LUCENE-5257: Fix top-level validate target subant invocation syntax
[jira] [Commented] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786458#comment-13786458 ] ASF subversion and git services commented on LUCENE-5257: - Commit 1529248 from [~steve_rowe] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1529248 ] LUCENE-5257: Lock down centralized versioning of ivy dependencies (merged trunk r1529243 and r1529246)
[jira] [Commented] (LUCENE-5249) All Lucene/Solr modules should use the same dependency versions
[ https://issues.apache.org/jira/browse/LUCENE-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786467#comment-13786467 ] ASF subversion and git services commented on LUCENE-5249: - Commit 1529250 from [~steve_rowe] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1529250 ] LUCENE-5257: merge CHANGES.txt entry with LUCENE-5249's entry (merged trunk r1529249)

All Lucene/Solr modules should use the same dependency versions --- Key: LUCENE-5249 URL: https://issues.apache.org/jira/browse/LUCENE-5249 Project: Lucene - Core Issue Type: Improvement Components: general/build Reporter: Steve Rowe Assignee: Steve Rowe Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5162.patch

[~markrmil...@gmail.com] wrote on the dev list: {quote} I'd like it for some things if we actually kept the versions somewhere else - for instance, Hadoop dependencies should match across the mr module and the core module. Perhaps we could define versions for dependencies across multiple modules that should probably match, in a prop file or ant file and use sys sub for them in the ivy files. For something like Hadoop, that would also make it simple to use Hadoop 1 rather than 2 with a single sys prop override. Same with some other dependencies. {quote}
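The idea quoted in the LUCENE-5249 description (versions kept in one properties file, substituted into the ivy files, and overridable with a single system property) can be modelled with a few lines of Java. This is only a sketch of the concept, not how Ivy or the Lucene build performs the substitution; the class `CentralVersionsSketch`, the `resolve` helper, and the `org.example/some-lib` coordinate are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of centralized versions with a system property override.
final class CentralVersionsSketch {

    // Matches ${key} placeholders such as ${/org.example/some-lib}.
    static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each placeholder with the system property of that name if set,
    // otherwise with the value from the central versions file.
    static String resolve(String text, Properties versions) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String value = System.getProperty(key, versions.getProperty(key));
            if (value == null) {
                throw new IllegalArgumentException("no version defined for " + key);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a central versions file with a made-up coordinate.
        Properties versions = new Properties();
        versions.load(new StringReader("/org.example/some-lib = 1.2.3\n"));

        String dependency =
            "<dependency org=\"org.example\" name=\"some-lib\" rev=\"${/org.example/some-lib}\"/>";
        System.out.println(resolve(dependency, versions));
    }
}
```

In this model, setting a system property with the same name as the key (for example via `-D` on the command line) would swap the version everywhere at once, which is the single-override behaviour the quoted mail asks for in the Hadoop 1 versus Hadoop 2 case.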
[jira] [Commented] (LUCENE-5249) All Lucene/Solr modules should use the same dependency versions
[ https://issues.apache.org/jira/browse/LUCENE-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786464#comment-13786464 ] ASF subversion and git services commented on LUCENE-5249: - Commit 1529249 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1529249 ] LUCENE-5257: merge CHANGES.txt entry with LUCENE-5249's entry
[jira] [Commented] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786466#comment-13786466 ] ASF subversion and git services commented on LUCENE-5257: - Commit 1529250 from [~steve_rowe] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1529250 ] LUCENE-5257: merge CHANGES.txt entry with LUCENE-5249's entry (merged trunk r1529249)
[jira] [Commented] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786463#comment-13786463 ] ASF subversion and git services commented on LUCENE-5257: - Commit 1529249 from [~steve_rowe] in branch 'dev/trunk' [ https://svn.apache.org/r1529249 ] LUCENE-5257: merge CHANGES.txt entry with LUCENE-5249's entry
[jira] [Resolved] (LUCENE-5257) Lock down centralized versioning of ivy dependencies
[ https://issues.apache.org/jira/browse/LUCENE-5257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved LUCENE-5257. Resolution: Fixed Fix Version/s: 4.6 5.0 Committed to trunk and branch_4x.
[jira] [Resolved] (SOLR-3664) risk of inconsistency in solr(contrib)-module-thirdparty dependencies
[ https://issues.apache.org/jira/browse/SOLR-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-3664. -- Resolution: Fixed Fix Version/s: 4.6 5.0 Assignee: Steve Rowe Should be fully addressed by LUCENE-5249 and LUCENE-5257.

risk of inconsistency in solr(contrib)-module-thirdparty dependencies --- Key: SOLR-3664 URL: https://issues.apache.org/jira/browse/SOLR-3664 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Steve Rowe Fix For: 5.0, 4.6

Currently, if something in solr has an indirect dependency on a third-party package via a dependency on a lucene module, that is tracked in the solr/\*\*/ivy.xml files, and redundant copies of the third-party LICENSE/NOTICE/jar.sha1 files are committed under solr/\*\*. This presents a risk that these files may fall out of sync if/when the dependencies of the lucene module are updated in the future (i.e., a developer could update a lucene module to depend on a new package -- or a new version of an existing package -- w/o remembering to upgrade the corresponding ivy-related files in solr). We should try to eliminate this risk.
[jira] [Updated] (LUCENE-4978) Spatial search with point query won't find identical indexed point
[ https://issues.apache.org/jira/browse/LUCENE-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-4978: - Fix Version/s: 4.6 Spatial search with point query won't find identical indexed point -- Key: LUCENE-4978 URL: https://issues.apache.org/jira/browse/LUCENE-4978 Project: Lucene - Core Issue Type: Bug Components: modules/spatial Affects Versions: 4.1 Reporter: David Smiley Assignee: David Smiley Priority: Minor Fix For: 4.6 Given a document with indexed POINT (10 20), when a search for INTERSECTS( POINT (10 20)) is issued, no results are returned. The work-around is to not search with a point shape, use a very small-radius circle or rectangle. (I'm marking this issue as minor because it's easy to do this). An unstated objective of the PrefixTree/grid approximation is that no matter what precision you use, an intersects query will find all true-positives. Due to approximations, it may also find some close false-positives. But in the case above, that unstated promise is violated. But it can also happen for query shapes other than points which do in fact barely enclose the point given at index time yet the indexed point is in-effect shifted to the center point of a cell which could be outside the query shape, and ultimately leading to a false-negative. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4978) Spatial search with point query won't find identical indexed point
[ https://issues.apache.org/jira/browse/LUCENE-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13786519#comment-13786519 ] David Smiley commented on LUCENE-4978: --

Hi Oli, This issue is indeed the root cause of the one you refer to in ES. I spent a little time on fixing the problem months ago but held off because I wanted to better understand the performance trade-off, and I hadn't yet developed a benchmark -- though I have one now in LUCENE-2844.

Correct me if I'm wrong, but I heard ES has a point-query optimization. At least I thought I saw something like that when I looked through ES's docs a couple of months ago. I would like to add such an optimization within Lucene-spatial, which would effectively avoid the particular issue you hit because it would end up being a simple Lucene term query. The underlying issue would still exist, though; it just wouldn't show up with a point query.

If you want a quick solution that only addresses intersection with a Point query, then you could modify the code I reference in the comment above to not use cell.getCenter() when queryShape is an instance of Point. Make sense?

To be clear, though, neither the quick solution nor a solution optimizing a point query addresses the underlying problem; each fixes it for point queries only. It's still possible to index a point that fits inside a query rectangle extremely close to the edge, and depending on which side of the grid line the rectangle border is, you might not match the point.
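The false negative described in LUCENE-4978 can be reproduced in a deliberately simplified one-dimensional model. This is not Lucene-spatial code; the class `CellCenterSketch`, the cell size, and the coordinates are assumptions chosen for illustration. An indexed value is approximated by the center of the grid cell that contains it, and the query is then tested against that center rather than the original value.

```java
// Hypothetical 1-D model of the grid approximation; not Lucene-spatial code.
final class CellCenterSketch {

    // Center of the grid cell of width cellSize that contains x.
    static double cellCenter(double x, double cellSize) {
        return (Math.floor(x / cellSize) + 0.5) * cellSize;
    }

    // Closed-interval containment test standing in for "query shape intersects point".
    static boolean inRange(double x, double min, double max) {
        return x >= min && x <= max;
    }

    public static void main(String[] args) {
        double cellSize = 1.0;
        double indexed = 10.9;                            // the value given at index time
        double approximated = cellCenter(indexed, cellSize);

        // A query range that barely encloses the indexed value.
        double queryMin = 10.8;
        double queryMax = 12.0;

        System.out.println("true value matches:  " + inRange(indexed, queryMin, queryMax));
        System.out.println("cell center matches: " + inRange(approximated, queryMin, queryMax));

        // A "point query" for the exact indexed value also misses the shifted center.
        System.out.println("point query matches: " + inRange(approximated, indexed, indexed));
    }
}
```

In this model the true value lies inside the query range, but the cell center does not, so the document is missed. Special-casing point queries, as suggested in the comment, would remove only the last of the three cases.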
[jira] [Updated] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Bower updated SOLR-5302: --- Attachment: (was: solr_analytics-2013.10.04.patch) Analytics Component --- Key: SOLR-5302 URL: https://issues.apache.org/jira/browse/SOLR-5302 Project: Solr Issue Type: New Feature Reporter: Steven Bower Attachments: Search Analytics Component.pdf, solr_analytics-2013.10.04-2.patch, Statistical Expressions.pdf
[jira] [Updated] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Bower updated SOLR-5302: --- Attachment: solr_analytics-2013.10.04-2.patch Updated patch: * Updated license comment to remove copyrights * Added copyright notice to NOTICE.txt * Cleaned up lots of Javadoc warnings * Cleaned up some exception handling
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786552#comment-13786552 ]

Steven Bower commented on SOLR-5302:

Removed original patch file as it contained incorrect copyright headers
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786579#comment-13786579 ]

Yonik Seeley commented on SOLR-5302:

Sweet... nice work guys! Implementation details are just that. But perhaps we should land this on trunk and let the interface bake so it doesn't accidentally get released early in a 4x release?

On a quick scroll through, it looks like mostly new files, which is great (i.e. it won't complicate the backporting/merging of other solr features from 4x to trunk)
[jira] [Commented] (SOLR-5302) Analytics Component
[ https://issues.apache.org/jira/browse/SOLR-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786627#comment-13786627 ]

Steven Bower commented on SOLR-5302:

Yup. We intentionally laid it out so that very little (only 2 files) needs to change in order to merge this in. Would love for this to end up on trunk. We are actively working on this as well: adding new functionality, performance tuning, etc.

If I had commit access to trunk I'd gladly keep it up to date, merged with the latest, as well as keep up patch releases for 4.x (as that is what we are currently deploying it against in our production environment).
[jira] [Commented] (SOLR-5301) DELETEALIAS command prints CREATEALIAS in logs
[ https://issues.apache.org/jira/browse/SOLR-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786696#comment-13786696 ]

ASF subversion and git services commented on SOLR-5301:

Commit 1529335 from jan...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1529335 ]
SOLR-5301: DELETEALIAS command prints CREATEALIAS in logs

DELETEALIAS command prints CREATEALIAS in logs
Key: SOLR-5301
URL: https://issues.apache.org/jira/browse/SOLR-5301
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.4, 4.5
Reporter: Jan Høydahl
Priority: Minor
Fix For: 5.0, 4.6
Attachments: SOLR-5301.patch

A simple copy/paste error in https://github.com/apache/lucene-solr/blob/33d22db31e63482a1b1aad0cf90c4030bc359ffe/solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java#L265

Line 265 should say OverseerCollectionProcessor.DELETEALIAS

As far as I can see the bug only affects logging.
[jira] [Resolved] (SOLR-5301) DELETEALIAS command prints CREATEALIAS in logs
[ https://issues.apache.org/jira/browse/SOLR-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl resolved SOLR-5301. --- Resolution: Fixed
[jira] [Commented] (SOLR-5301) DELETEALIAS command prints CREATEALIAS in logs
[ https://issues.apache.org/jira/browse/SOLR-5301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786703#comment-13786703 ]

ASF subversion and git services commented on SOLR-5301:

Commit 1529341 from jan...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1529341 ]
SOLR-5301: DELETEALIAS command prints CREATEALIAS in logs (merge from trunk)
[jira] [Commented] (SOLR-5277) Stamp core names on log entries for certain classes
[ https://issues.apache.org/jira/browse/SOLR-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786728#comment-13786728 ]

Shawn Heisey commented on SOLR-5277:

I've been trying to learn about MDC. It's supported by what I think are the top three slf4j binding choices. The docs say it works on a per-thread basis.

For requests, I think this makes things pretty simple. At the earliest point where the request enters Solr, we clear the MDC and put what we want in there. As the request dives further down, it's likely to continue being run by the same thread. If more info that we want to log is available at lower levels, we add to the MDC at each relevant point.

For other things that happen, especially things that aren't triggered by a request (like core initialization), is there a direct correlation between a particular thread and a core, or does one thread handle multiple cores or random cores? Please feel free to point me at particular classes where I can start my research.

Stamp core names on log entries for certain classes
Key: SOLR-5277
URL: https://issues.apache.org/jira/browse/SOLR-5277
Project: Solr
Issue Type: Bug
Components: search, update
Affects Versions: 4.3.1, 4.4, 4.5
Reporter: Dmitry Kan
Attachments: SOLR-5277.patch

It is handy that certain Java classes stamp a [coreName] on a log entry. It would be useful for multicore setups if more classes stamped this information. In particular, we came across a situation with commits coming in quick succession to the same multicore shard and found it hard to figure out whether it was the same core or different cores.

The classes in question with log sample output:

o.a.s.c.SolrCore
06:57:53.577 [qtp1640764503-13617] INFO org.apache.solr.core.SolrCore - SolrDeletionPolicy.onCommit: commits:num=2
11:53:19.056 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrCore - Soft AutoCommit: if uncommited for 1000ms;

o.a.s.u.UpdateHandler
14:45:24.447 [commitScheduler-9-thread-1] INFO org.apache.solr.update.UpdateHandler - start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
06:57:53.591 [qtp1640764503-13617] INFO org.apache.solr.update.UpdateHandler - end_commit_flush

o.a.s.s.SolrIndexSearcher
14:45:24.553 [commitScheduler-7-thread-1] INFO org.apache.solr.search.SolrIndexSearcher - Opening Searcher@1067e5a9 main

The original question was posted on #solr and on SO: http://stackoverflow.com/questions/19026577/how-to-output-solr-core-name-with-log4j
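The per-thread approach discussed in the comment (set the context where the request enters, let every log line pick it up, clear it afterwards) can be sketched with Python's standard logging module standing in for SLF4J's MDC. Everything here is an assumption made for illustration: `_context`, `CoreNameFilter`, `handle_request`, the logger name and the core names are hypothetical, and this is not the SOLR-5277 patch. In Java the analogous calls would be `MDC.put(...)` and `MDC.clear()` from `org.slf4j.MDC`.

```python
import io
import logging
import threading

# Per-thread storage, playing the role that the MDC plays in SLF4J.
_context = threading.local()

class CoreNameFilter(logging.Filter):
    """Copy the current thread's core name (if any) onto each log record."""
    def filter(self, record):
        record.core = getattr(_context, "core", "-")
        return True

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(core)s] %(name)s - %(message)s"))
handler.addFilter(CoreNameFilter())

log = logging.getLogger("demo.UpdateHandler")   # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)

def handle_request(core_name):
    """Set the context at the request entry point and always clear it on exit."""
    _context.core = core_name
    try:
        log.info("start commit")
    finally:
        del _context.core

threads = [threading.Thread(target=handle_request, args=(name,))
           for name in ("core0", "core1")]
for t in threads:
    t.start()
for t in threads:
    t.join()

log.info("no request context")   # logged from the main thread, which set nothing
output = stream.getvalue()
print(output, end="")
```

Each worker thread's line carries its own core name, while the line logged outside any request shows the placeholder. That is the situation the question about non-request threads (such as core initialization) is concerned with.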
[jira] [Commented] (SOLR-5277) Stamp core names on log entries for certain classes
[ https://issues.apache.org/jira/browse/SOLR-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786734#comment-13786734 ]

Shawn Heisey commented on SOLR-5277:

Some 'thinking out loud': The fact that newly spawned threads inherit the MDC is awesome. It can be fairly minimal or really verbose if we add to it based on whether certain logging levels are enabled.
Roadmap for fixing features broken by core autodiscovery
There are two use-cases that appear broken with the new core auto-discovery mechanism:

*1) The Core Admin Handler's CREATE command no longer works to create brand new cores* (unless you have logged on to the box and created the core's directory structure manually, which largely defeats the purpose of the CREATE command).

With the old solr.xml format, we could spin up as many cores as we wanted dynamically with the following command:

http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
...
http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data

In the new core discovery mode, this exception is now thrown:

Error CREATEing SolrCore 'newCore1': Could not create a new core in solr/collection1/as another core is already defined there

The exception is being intentionally thrown in CorePropertiesLocator.java because a core.properties file already exists in solr/collection1 (and only one can exist per directory).

*2) Having a shared configuration directory (instanceDir) across many cores no longer works*. Every core has to have its own conf/ directory, and this doesn't seem to be overridable any longer. Previously, it was possible to have many cores share the same instanceDir (and just override their dataDir for obvious reasons). Now, it is necessary to copy and paste identical config files for each Solr core.

I don't know if there's already a current roadmap for fixing this. I saw https://issues.apache.org/jira/browse/SOLR-4478, which suggested replacing instanceDir with the ability to specify a named configSet. This solves problem 2, but not problem 1 (since you still can't have multiple core.properties files in the same folder). Based on Erick's comments in the JIRA ticket, it also sounds like this ticket is dead at the moment.

There is definitely a need to have a shared config directory - whether that is through a configSet or an explicit indexDir doesn't matter to me. There's also a need to be able to dynamically create Solr cores from external systems. I currently can't upgrade to core auto discovery because it doesn't allow dynamic core creation.

Does anyone have some thoughts on how to best get these features working again under core autodiscovery? Adding instanceDir to core.properties seems like an easy solution, but there must be a desire not to do that or it would probably have already been done.

I'm happy to contribute some time to resolving this if there is an agreed-upon path forward.

Thanks,
-Trey
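For reference, the CREATE requests described in this message take four query parameters: action, name, instanceDir and dataDir. The sketch below only builds the request URLs with the Python standard library. The helper name `create_core_url` is hypothetical, the host and port are the ones quoted in the message, and nothing is actually sent to a Solr server.

```python
from urllib.parse import urlencode

# Base URL as quoted in the message; adjust for a real installation.
BASE = "http://localhost:8983/solr/admin/cores"

def create_core_url(name, instance_dir="collection1"):
    """Build a Core Admin CREATE URL that shares one instanceDir across cores."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,     # shared configuration directory
        "dataDir": f"{name}/data",       # per-core index data
    }
    # safe="/" keeps the slash in dataDir readable instead of encoding it as %2F.
    return f"{BASE}?{urlencode(params, safe='/')}"

urls = [create_core_url(f"newCore{i}") for i in range(1, 4)]
for url in urls:
    print(url)
```

Under the old solr.xml format each of these URLs would create one more core on the shared instanceDir. Under core discovery the same request is the one that fails with the exception quoted above.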
Re: Welcome Joel Bernstein
Welcome Joel!

Wolfgang.

On Oct 3, 2013, at 9:56 AM, Erick Erickson wrote:

Welcome Joel!

On Thu, Oct 3, 2013 at 9:33 AM, Martijn v Groningen martijn.v.gronin...@gmail.com wrote:

Welcome Joel!

On 3 October 2013 15:45, Shawn Heisey s...@elyograg.org wrote:

On 10/2/2013 11:24 PM, Grant Ingersoll wrote:

The Lucene PMC is happy to welcome Joel Bernstein as a committer on the Lucene and Solr project. Joel has been working on a number of issues on the project and we look forward to his continued contributions going forward.

Welcome to the project! Best of luck to you!

--
Met vriendelijke groet,
Martijn van Groningen
Re: Roadmap for fixing features broken by core autodiscovery
On 10/4/2013 7:21 PM, Trey Grainger wrote:
> There are two use-cases that appear broken with the new core auto-discovery mechanism:
>
> *1) The Core Admin Handler's CREATE command no longer works to create brand new cores* (unless you have logged on to the box and created the core's directory structure manually, which largely defeats the purpose of the CREATE command).
>
> With the old solr.xml format, we could spin up as many cores as we wanted dynamically with the following command:
>
> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
> ...
> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data
>
> In the new core discovery mode, this exception is now thrown:
>
> Error CREATEing SolrCore 'newCore1': Could not create a new core in solr/collection1/as another core is already defined there

The CREATE action has *always* required that you have your configuration on the disk before you call it. You are sharing the instanceDir, which is the only reason you can skip that step. If you want completely dynamic creation, use SolrCloud, which keeps the config in zookeeper and requires ZERO config information to exist on the disk.

> *2) Having a shared configuration directory (instanceDir) across many cores no longer works*. Every core has to have its own conf/ directory, and this doesn't seem to be overridable any longer. Previously, it was possible to have many cores share the same instanceDir (and just override their dataDir for obvious reasons). Now, it is necessary to copy and paste identical config files for each Solr core.

From what I understand talking to the people that worked on this, the lack of a shared instanceDir was completely deliberate. It's the only way that core discovery can work in any kind of predictable and sane manner. The entire point of it is that every core is self-contained and solr.xml isn't used to tell Solr about them.

I personally have never tried to share the instanceDir. I do have shared configs, though - my corename/conf directories have symlinks to a shared config directory. I also don't dynamically create cores - I have seven shards, each of which has a live core and a build core. There are two other cores that serve as frontends, with the shards parameter in the request handlers.

> I don't know if there's already a current roadmap for fixing this. I saw https://issues.apache.org/jira/browse/SOLR-4478, which suggested replacing instanceDir with the ability to specify a named configSet. This solves problem 2, but not problem 1 (since you still can't have multiple core.properties files in the same folder). Based on Erick's comments in the JIRA ticket, it also sounds like this ticket is dead at the moment.
>
> There is definitely a need to have a shared config directory - whether that is through a configSet or an explicit indexDir doesn't matter to me. There's also a need to be able to dynamically create Solr cores from external systems. I currently can't upgrade to core auto discovery because it doesn't allow dynamic core creation.
>
> Does anyone have some thoughts on how to best get these features working again under core autodiscovery? Adding instanceDir to core.properties seems like an easy solution, but there must be a desire not to do that or it would probably have already been done.

Thankfully, you do not need to upgrade to core discovery anytime soon. All future 4.x versions will support the old format, and any problems with that will be considered bugs. Core discovery will be mandatory in Solr 5.0, which currently doesn't have any kind of release roadmap or timeframe.

I suspect that what we currently call SolrCloud will also be mandatory in 5.0, and that gives you shared configs with zookeeper. Requiring zookeeper allows completely dynamic core/collection creation, because the only thing that will be on the disk is the index and transaction log data.

Thanks,
Shawn
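The symlink arrangement mentioned in this reply (per-core conf directories pointing at one shared config directory) could be laid out roughly as follows. This is a sketch of one possible reading, in which each core's conf/ is itself a symlink. The helper `make_core`, the directory names and the placeholder solrconfig.xml are all hypothetical, the layout is built in a temporary directory, and it has not been verified against a running Solr.

```python
import os
import tempfile

def make_core(solr_home, core_name, shared_conf):
    """Create a self-contained core directory whose conf/ links to shared config."""
    core_dir = os.path.join(solr_home, core_name)
    os.makedirs(os.path.join(core_dir, "data"))
    # conf/ is a symlink, so every core reads the same config files.
    os.symlink(shared_conf, os.path.join(core_dir, "conf"), target_is_directory=True)
    # One core.properties per core directory, as core discovery expects.
    with open(os.path.join(core_dir, "core.properties"), "w") as f:
        f.write(f"name={core_name}\n")
    return core_dir

solr_home = tempfile.mkdtemp()                       # stand-in for the Solr home
shared_conf = os.path.join(solr_home, "shared_conf") # hypothetical shared directory
os.makedirs(shared_conf)
with open(os.path.join(shared_conf, "solrconfig.xml"), "w") as f:
    f.write("<config/>\n")                           # placeholder content only

cores = [make_core(solr_home, name, shared_conf) for name in ("core0", "core1")]
for core_dir in cores:
    print(core_dir, "->", os.path.realpath(os.path.join(core_dir, "conf")))
```

Each core keeps its own directory, data/ and core.properties, which matches the self-contained-core constraint described in the reply, while the config files exist only once on disk.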
[jira] [Commented] (LUCENE-4956) the korean analyzer that has a korean morphological analyzer and dictionaries
[ https://issues.apache.org/jira/browse/LUCENE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786899#comment-13786899 ]

SooMyung Lee commented on LUCENE-4956:

Hi Christian,

I haven't heard any news from you since last August. Is there any problem with moving to the next step? I run a Korean developer community for the Korean Analyzer. I announced that the Arirang analyzer will be incorporated into Lucene and Solr soon, so many developers are waiting for that. I would like us to move to the next step quickly. If you need any help, please let me know.

the korean analyzer that has a korean morphological analyzer and dictionaries
Key: LUCENE-4956
URL: https://issues.apache.org/jira/browse/LUCENE-4956
Project: Lucene - Core
Issue Type: New Feature
Components: modules/analysis
Affects Versions: 4.2
Reporter: SooMyung Lee
Assignee: Christian Moen
Labels: newbie
Attachments: kr.analyzer.4x.tar, lucene-4956.patch, lucene4956.patch, LUCENE-4956.patch

The Korean language has specific characteristics. When developing a search service with Lucene and Solr in Korean, there are some problems in searching and indexing. The Korean analyzer solves these problems with a Korean morphological analyzer. It consists of a Korean morphological analyzer, dictionaries, a Korean tokenizer and a Korean filter. The Korean analyzer is made for Lucene and Solr. If you develop a search service with Lucene in Korean, the Korean analyzer is the best choice.
[jira] [Commented] (LUCENE-4956) the korean analyzer that has a korean morphological analyzer and dictionaries
[ https://issues.apache.org/jira/browse/LUCENE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786905#comment-13786905 ]

Christian Moen commented on LUCENE-4956:

Thanks for pushing me on this. I'll have a look at your recent changes and commit to trunk shortly if everything seems fine. I hope to have this committed to trunk early next week. Sorry for this having dragged out.
Re: Welcome Joel Bernstein
Welcome Joel!

koji

(13/10/03 14:24), Grant Ingersoll wrote:

Hi,

The Lucene PMC is happy to welcome Joel Bernstein as a committer on the Lucene and Solr project. Joel has been working on a number of issues on the project and we look forward to his continued contributions going forward.

Welcome aboard, Joel!

-Grant

--
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html