[jira] [Commented] (LUCENE-7655) Speed up geo-distance queries that match most documents
[ https://issues.apache.org/jira/browse/LUCENE-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115037#comment-16115037 ] Maciej Zasada commented on LUCENE-7655: --- Hi [~jpountz], I'd like to contribute this improvement. I opened a PR on GH, would you mind taking a look? Unit tests pass, I also ran luceneutil spatial test: {code} perf.IndexAndSearchOpenStreetMaps -points -distance {code} and after tuning queries to match ~75% of the docs, I did see ~ +20% QPS gain. Let me know if I can do anything more. Cheers! > Speed up geo-distance queries that match most documents > --- > > Key: LUCENE-7655 > URL: https://issues.apache.org/jira/browse/LUCENE-7655 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > > I think the same optimization that was applied in LUCENE-7641 would also work > with geo-distance queries? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
[ https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557684#comment-14557684 ] Maciej Zasada commented on SOLR-7585: - +1 for a fixed thread count and DefaultSolrThreadFactory. Cheers! > ConcurrentLFUCache throws NoSuchElementException under a write-heavy load > - > > Key: SOLR-7585 > URL: https://issues.apache.org/jira/browse/SOLR-7585 > Project: Solr > Issue Type: Bug >Affects Versions: 5.1 >Reporter: Maciej Zasada >Priority: Minor > Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch, > SOLR-7585.patch, SOLR-7585.patch > > > Under a write-heavy load {{ConcurrentLFUCache}} throws > {{NoSuchElementException}}. The problem lies within > {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a > race condition between the check and the call to {{markAndSweep}} method. > Despite that a thread must acquire a lock to perform sweeping, it's still > possible that multiple threads successfully detected a need for calling > markAndSweep. If they execute it sequentially, subsequent runs will fail with > {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
[ https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557555#comment-14557555 ] Maciej Zasada commented on SOLR-7585: - Looks good to me, +1. Cheers! > ConcurrentLFUCache throws NoSuchElementException under a write-heavy load > - > > Key: SOLR-7585 > URL: https://issues.apache.org/jira/browse/SOLR-7585 > Project: Solr > Issue Type: Bug >Affects Versions: 5.1 >Reporter: Maciej Zasada >Priority: Minor > Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch, > SOLR-7585.patch > > > Under a write-heavy load {{ConcurrentLFUCache}} throws > {{NoSuchElementException}}. The problem lies within > {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a > race condition between the check and the call to {{markAndSweep}} method. > Despite that a thread must acquire a lock to perform sweeping, it's still > possible that multiple threads successfully detected a need for calling > markAndSweep. If they execute it sequentially, subsequent runs will fail with > {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
[ https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557517#comment-14557517 ] Maciej Zasada commented on SOLR-7585: - Yes, I saw SOLR-3393 before, but I assumed it will take a while to deprecate and remove the old class, and we're currently experiencing some exceptions because of this race condition :) Having in mind that you're working on more performant LFU cache, I tried to make changes as small and noninvasive as possible. bq. it looks like markAndSweep was designed to reduce the cache size to lowerWaterMark. Exactly. I believe that your change will actually reduce (sometimes) the cache size to be equal to {{lowerWaterMark -1}}, instead of equal to {{lowerWaterMark}}. bq. Is it acceptable to avoid eviction if the current cache size is somewhere between the two watermarks? Eviction will be only avoided if the size was less than or equal to {{upperWaterMark}}, which is exactly the same condition as in {{org.apache.solr.util.ConcurrentLFUCache#put}} method (L#139), where it's decided whether to call {{markAndSweep}} at all or not. Essentially, the same condition is checked after acquiring a lock to mitigate a race condition. > ConcurrentLFUCache throws NoSuchElementException under a write-heavy load > - > > Key: SOLR-7585 > URL: https://issues.apache.org/jira/browse/SOLR-7585 > Project: Solr > Issue Type: Bug >Affects Versions: 5.1 >Reporter: Maciej Zasada >Priority: Minor > Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch > > > Under a write-heavy load {{ConcurrentLFUCache}} throws > {{NoSuchElementException}}. The problem lies within > {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a > race condition between the check and the call to {{markAndSweep}} method. > Despite that a thread must acquire a lock to perform sweeping, it's still > possible that multiple threads successfully detected a need for calling > markAndSweep. If they execute it sequentially, subsequent runs will fail with > {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
[ https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557504#comment-14557504 ] Maciej Zasada commented on SOLR-7585: - Personally I like exiting the code early. If I understand correctly, new implementation will always evict at least one entry, which means cache size may drop below {{lowerWaterMark}} (which somehow brakes a contract). To fix underlying problem we should get rid of a race condition at all, but I haven't measure the performance impact of such change. Moreover, first approach has 2 additional benefits: * not producing additional garbage * not executing unnecessary code (if cache size is below {{upperWaterMark}}, there's no need to evict anything) both seem to be important for a cache. WDYT? > ConcurrentLFUCache throws NoSuchElementException under a write-heavy load > - > > Key: SOLR-7585 > URL: https://issues.apache.org/jira/browse/SOLR-7585 > Project: Solr > Issue Type: Bug >Affects Versions: 5.1 >Reporter: Maciej Zasada >Priority: Minor > Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch > > > Under a write-heavy load {{ConcurrentLFUCache}} throws > {{NoSuchElementException}}. The problem lies within > {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a > race condition between the check and the call to {{markAndSweep}} method. > Despite that a thread must acquire a lock to perform sweeping, it's still > possible that multiple threads successfully detected a need for calling > markAndSweep. If they execute it sequentially, subsequent runs will fail with > {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
[ https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-7585: Attachment: SOLR-7585.patch Of course, latest patch contains same test compatible with Java 7. My bad, I tested against {{trunk}} (which is now on Java 8) instead of a 5.x branch. Cheers! > ConcurrentLFUCache throws NoSuchElementException under a write-heavy load > - > > Key: SOLR-7585 > URL: https://issues.apache.org/jira/browse/SOLR-7585 > Project: Solr > Issue Type: Bug >Affects Versions: 5.1 >Reporter: Maciej Zasada >Priority: Minor > Attachments: SOLR-7585.patch, SOLR-7585.patch > > > Under a write-heavy load {{ConcurrentLFUCache}} throws > {{NoSuchElementException}}. The problem lies within > {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a > race condition between the check and the call to {{markAndSweep}} method. > Despite that a thread must acquire a lock to perform sweeping, it's still > possible that multiple threads successfully detected a need for calling > markAndSweep. If they execute it sequentially, subsequent runs will fail with > {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
[ https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-7585: Attachment: SOLR-7585.patch Attached a patch containing a test case + a fix that mitigates the race condition. Cheers! > ConcurrentLFUCache throws NoSuchElementException under a write-heavy load > - > > Key: SOLR-7585 > URL: https://issues.apache.org/jira/browse/SOLR-7585 > Project: Solr > Issue Type: Bug >Affects Versions: 5.1 >Reporter: Maciej Zasada >Priority: Minor > Attachments: SOLR-7585.patch > > > Under a write-heavy load {{ConcurrentLFUCache}} throws > {{NoSuchElementException}}. The problem lies within > {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a > race condition between the check and the call to {{markAndSweep}} method. > Despite that a thread must acquire a lock to perform sweeping, it's still > possible that multiple threads successfully detected a need for calling > markAndSweep. If they execute it sequentially, subsequent runs will fail with > {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
Maciej Zasada created SOLR-7585: --- Summary: ConcurrentLFUCache throws NoSuchElementException under a write-heavy load Key: SOLR-7585 URL: https://issues.apache.org/jira/browse/SOLR-7585 Project: Solr Issue Type: Bug Affects Versions: 5.1 Reporter: Maciej Zasada Priority: Minor Under a write-heavy load {{ConcurrentLFUCache}} throws {{NoSuchElementException}}. The problem lies within {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a race condition between the check and the call to {{markAndSweep}} method. Despite that a thread must acquire a lock to perform sweeping, it's still possible that multiple threads successfully detected a need for calling markAndSweep. If they execute it sequentially, subsequent runs will fail with {{NoSuchElementException}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6202) Simplify the use of "rq" for query re-ranking
[ https://issues.apache.org/jira/browse/SOLR-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-6202: Attachment: SOLR-6202.patch Hi [~hossman], [~joel.bernstein] I'd like to summit a patch for this issue. I tried to make changes accordingly to your suggestions: * default parser for {{rq}} is {{ReRankQParser}} * {{reRankQuery}} local param is deprecated in favour of using {{v}} * query is parsed using {{subQuery()}} I duplicated rerank tests so they use simplified syntax as well. If you have any suggestions, I'm more than happy to hear them. Cheers, Maciej > Simplify the use of "rq" for query re-ranking > - > > Key: SOLR-6202 > URL: https://issues.apache.org/jira/browse/SOLR-6202 > Project: Solr > Issue Type: Improvement >Reporter: Hoss Man > Attachments: SOLR-6202.patch > > > It seems like there is some really easy tweaks we could make to the "rq" > (Re-Ranking Query) API to make it less verbose and easier to use: > * in QueryComponent > ** when parsing "rq", we should use "rerank" as the default parser > *** should be a trivial change to the QParser.getParser when initializin > rqparser > * in ReRankQParserPlugin > ** we should deprecate/remove the "reRankQuery" local param and just use "v" > *** this is the implicit param created when using the body of the query > string, or it can be explicit (to refer to variables) > ** we should leverage subQuery() (just like NestedQParser does) so that > "defType" is respected when parsing the wrapped query > Unless i'm overlooking something, the combination of these simple changes > should drastically simplify a lot of use cases... > {noformat} > BEFORE: rq={!rerank reRankQuery=$rqq}rqq=XXX > AFTER: rq=XXX > BEFORE: rq={!rerank reRankQuery=$rqq reRankDocs=200}$rqq={!func}XXX > AFTER: rq={!reRankDocs=200 defType=func}XXX > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063425#comment-14063425 ] Maciej Zasada commented on SOLR-5746: - Hi [~hossman], thanks for the code review and all your suggestions. I'll keep them in mind for the future work. Revised patch looks good to me. Cheers, Maciej > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, > SOLR-5746.patch, SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@name='shareSchema']")); > {code} > "shareSchema" is used internally as a boolean option, but as written the > parsing code will ignore it unless the user explicitly configures it as a > {{}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060685#comment-14060685 ] Maciej Zasada edited comment on SOLR-5746 at 7/15/14 11:08 AM: --- Hi [~hossman], I've attached updated patch file: * removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly modified) existing API of {{DOMUtil}} instead. * removed reading values from {{SolrParam}} - they are being read directly from the {{NamedList<>}} * added reporting of duplicated config options ({{DEBUG}} level) per parent node, as well as exception message containing list of duplicated parameters, e.g. {code} 1 STRING-1 … 2 STRING-2 {code} will cause an exception: {code} Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param] {code} However, if parameters with a same name are attached to different parent nodes everything will pass just fine, e.g. {code} 1 STRING-1 … … 2 STRING-2 {code} In this case no exception will be thrown. Some examples to sum it up: |{{solr.xml}} file fragment|Expected type|Parsing result| |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(x) {{ant clean test}} shows that there's no regression. [~jkrupan] this change clearly is not backward compatible with the existing {{solr.xml}} files. For instance - unknown config values won't be silently ignored - an exception will be thrown instead. However, I didn't realise that {{solr.xml}} files are versioned the same way as {{schema.xml}} files are. Should I bump the schema version to 1.6? Cheers, Maciej was (Author: maciej.zasada): Hi [~hossman], I've attached updated patch file: * removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly modified) existing API of {{DOMUtil}} instead. * removed reading values from {{SolrParam}} - they are being read directly from the {{NamedList<>}} * added reporting of duplicated config options ({{DEBUG}} level) per parent node, as well as exception message containing list of duplicated parameters, e.g. {code} 1 STRING-1 … 2 STRING-2 {code} will cause an exception: {code} Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param] {code} However, if parameters with a same name are attached to different parent nodes everything will pass just fine, e.g. {code} 1 STRING-1 … … 2 STRING-2 {code} In this case no exception will be thrown. Some examples to sum it up: |{{solr.xml}} file fragment|Expected type|Parsing result| |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(x) {{ant clean test}} shows that there's no regression. [~jkrupan] this change clearly is not backward compatible with the existing {{solr.xml}} files. For instance - unknown config values won't be silently ignored - an exception will be thrown instead. However, I didn't realise that {{solr.xml}} files are versioned the same way as {{schema.xml}} files are. Should I bump the schema version to 1.6? Cheers, Maciej > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, > SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@na
[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-5746: Attachment: SOLR-5746.patch Added some more unit test to the patch. > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, > SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@name='shareSchema']")); > {code} > "shareSchema" is used internally as a boolean option, but as written the > parsing code will ignore it unless the user explicitly configures it as a > {{}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-5746: Attachment: SOLR-5746.patch Hi [~hossman], I've attached updated patch file: * removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly modified) existing API of {{DOMUtil}} instead. * removed reading values from {{SolrParam}} - they are being read directly from the {{NamedList<>}} * added reporting of duplicated config options ({{DEBUG}} level) per parent node, as well as exception message containing list of duplicated parameters, e.g. {code} 1 STRING-1 … 2 STRING-2 {code} will cause an exception: {code} Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param] {code} However, if parameters with a same name are attached to different parent nodes everything will pass just fine, e.g. {code} 1 STRING-1 … … 2 STRING-2 {code} In this case no exception will be thrown. Some examples to sum it up: |{{solr.xml}} file fragment|Expected type|Parsing result| |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(x) {{ant clean test}} shows that there's no regression. [~jkrupan] this change clearly is not backward compatible with the existing {{solr.xml}} files. For instance - unknown config values won't be silently ignored - an exception will be thrown instead. However, I didn't realise that {{solr.xml}} files are versioned the same way as {{schema.xml}} files are. Should I bump the schema version to 1.6? Cheers, Maciej > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@name='shareSchema']")); > {code} > "shareSchema" is used internally as a boolean option, but as written the > parsing code will ignore it unless the user explicitly configures it as a > {{}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059771#comment-14059771 ] Maciej Zasada commented on SOLR-5746: - bq. 1) can you explain why the need for the new DOMUtil.readNamedChildrenAsNamedList method that you added instead of just using the existing DOMUtil.childNodesToNamedList The reason why I didn’t use {{DOMUtil.childNodesToNamedList}} is that I wanted to move the type conversion from reading the DOM to the point, when I knew what type exactly should be used. If I used that method, all values would be upcasted to {{Object}}, and I would have to do the type checking to make sure if the type is correct, something like {code} if (value instanceof Integer) { type is ok, just store the value as it is } else if(value instanceof String) { type is not ok, but let's try parsing this String to something useful } {code} Instead, I’m storing only the "raw" values, and trying to imply the correct type only once, without the type checking. But when I’m thinking it all over again your approach seems to have a significant advantage - in proposed implementation type mismatch would be permitted for all types, not only {{}}, e.g. {{3.14}} would be perfectly valid config option, if {{double}} type was expected. You’re right, I’ll remove new method and use {{DOMUtil.childNodesToNamedList}}. bq. 2) Speaking of which: what's the purpose exactly of "configAsSolrParams" if the original NamedList is still being passed to the storeConfigPropertyAs* methods - why not just get the values directly from there? Since I stored "raw" config values, I used {{SolrParam}} to do the type conversion, but I didn’t find any API for a parameter removal. That’s why I’m keeping the original NamedList, so that I can remove correctly read values and keep track of the unknown ones. bq. 3) One piece of validation that i believe is still missing here is to throw an errir if/when a config value is specified multiple times I should’ve thought about that, good catch! I’ll add detection of such errors. > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@name='shareSchema']")); > {code} > "shareSchema" is used internally as a boolean option, but as written the > parsing code will ignore it unless the user explicitly configures it as a > {{}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-5746: Attachment: SOLR-5746.patch Hi [~hossman], I've attached updated patch file: * used framework's randomisation wherever it made sense to me; * added exception messages assertions; * added reporting of multiple unexpended config options ({{DEBUG}} level), as well as exception message containing list of unknown parameters (e.g. {code}Unrecognised 3 config parameter(s) in solr.xml file: [foo, bar, baz]{code} {{ant clean test}} shows that there's no regression. Cheers, Maciej > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@name='shareSchema']")); > {code} > "shareSchema" is used internally as a boolean option, but as written the > parsing code will ignore it unless the user explicitly configures it as a > {{}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-5746: Attachment: SOLR-5746.patch Hi [~hossman], I'd like to submit a patch for that issue. I made changes accordingly to your suggestions: * parsing logic has changed, so that config parameters are transformed to the expected types at the parsing time, instead of value-reading time. I'm transforming each solr.xml section to the NamedList, and later on to the SolrParams; Essentially, if {{boolean}} type of {{foo}} parameter is expected, {{true}} will work just fine. Same goes for other types. * exception is thrown if any unexpected values are found in the config at the parse time. If you have any suggestions, I'm more than happy to hear them. Cheers, Maciej > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}} > * likewise for an int option {{32}} vs {{ name="bar">32}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the above > options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when > parsing config) then {{NOT A BOOLEAN}} (attempt to > parse the string as a bool the first time the config value is needed) > In addition, i notice this really confusing line... > {code} > propMap.put(CfgProp.SOLR_SHARESCHEMA, > doSub("solr/str[@name='shareSchema']")); > {code} > "shareSchema" is used internally as a boolean option, but as written the > parsing code will ignore it unless the user explicitly configures it as a > {{}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org