[jira] [Commented] (LUCENE-7655) Speed up geo-distance queries that match most documents

2017-08-04 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16115037#comment-16115037
 ] 

Maciej Zasada commented on LUCENE-7655:
---

Hi [~jpountz],
I'd like to contribute this improvement. I opened a PR on GH, would you mind 
taking a look? Unit tests pass, I also ran luceneutil spatial test: {code} 
perf.IndexAndSearchOpenStreetMaps -points -distance {code}
and after tuning queries to match ~75% of the docs, I did see ~ +20% QPS gain. 
Let me know if I can do anything more. Cheers!

> Speed up geo-distance queries that match most documents
> ---
>
> Key: LUCENE-7655
> URL: https://issues.apache.org/jira/browse/LUCENE-7655
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>
> I think the same optimization that was applied in LUCENE-7641 would also work 
> with geo-distance queries?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-24 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557684#comment-14557684
 ] 

Maciej Zasada commented on SOLR-7585:
-

+1 for a fixed thread count and DefaultSolrThreadFactory. Cheers!

> ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
> -
>
> Key: SOLR-7585
> URL: https://issues.apache.org/jira/browse/SOLR-7585
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1
>Reporter: Maciej Zasada
>Priority: Minor
> Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch, 
> SOLR-7585.patch, SOLR-7585.patch
>
>
> Under a write-heavy load {{ConcurrentLFUCache}} throws 
> {{NoSuchElementException}}. The problem lies within 
> {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a 
> race condition between the check and the call to {{markAndSweep}} method. 
> Despite that a thread must acquire a lock to perform sweeping, it's still 
> possible that multiple threads successfully detected a need for calling 
> markAndSweep. If they execute it sequentially, subsequent runs will fail with 
> {{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-23 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557555#comment-14557555
 ] 

Maciej Zasada commented on SOLR-7585:
-

Looks good to me, +1. Cheers!

> ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
> -
>
> Key: SOLR-7585
> URL: https://issues.apache.org/jira/browse/SOLR-7585
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1
>Reporter: Maciej Zasada
>Priority: Minor
> Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch, 
> SOLR-7585.patch
>
>
> Under a write-heavy load {{ConcurrentLFUCache}} throws 
> {{NoSuchElementException}}. The problem lies within 
> {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a 
> race condition between the check and the call to {{markAndSweep}} method. 
> Despite that a thread must acquire a lock to perform sweeping, it's still 
> possible that multiple threads successfully detected a need for calling 
> markAndSweep. If they execute it sequentially, subsequent runs will fail with 
> {{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-23 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557517#comment-14557517
 ] 

Maciej Zasada commented on SOLR-7585:
-

Yes, I saw SOLR-3393 before, but I assumed it will take a while to deprecate 
and remove the old class, and we're currently experiencing some exceptions 
because of this race condition :) Having in mind that you're working on more 
performant LFU cache, I tried to make changes as small and noninvasive as 
possible.

bq. it looks like markAndSweep was designed to reduce the cache size to 
lowerWaterMark.

Exactly. I believe that your change will actually reduce (sometimes) the cache 
size to be equal to {{lowerWaterMark -1}}, instead of equal to 
{{lowerWaterMark}}.

bq. Is it acceptable to avoid eviction if the current cache size is somewhere 
between the two watermarks?

Eviction will be only avoided if the size was less than or equal to 
{{upperWaterMark}}, which is exactly the same condition as in 
{{org.apache.solr.util.ConcurrentLFUCache#put}} method (L#139), where it's 
decided whether to call {{markAndSweep}} at all or not. Essentially, the same 
condition is checked after acquiring a lock to mitigate a race condition.

> ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
> -
>
> Key: SOLR-7585
> URL: https://issues.apache.org/jira/browse/SOLR-7585
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1
>Reporter: Maciej Zasada
>Priority: Minor
> Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch
>
>
> Under a write-heavy load {{ConcurrentLFUCache}} throws 
> {{NoSuchElementException}}. The problem lies within 
> {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a 
> race condition between the check and the call to {{markAndSweep}} method. 
> Despite that a thread must acquire a lock to perform sweeping, it's still 
> possible that multiple threads successfully detected a need for calling 
> markAndSweep. If they execute it sequentially, subsequent runs will fail with 
> {{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-23 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557504#comment-14557504
 ] 

Maciej Zasada commented on SOLR-7585:
-

Personally I like exiting the code early. If I understand correctly, new 
implementation will always evict at least one entry, which means cache size may 
drop below {{lowerWaterMark}} (which somehow brakes a contract). To fix 
underlying problem we should get rid of a race condition at all, but I haven't 
measure the performance impact of such change. Moreover, first approach has 2 
additional benefits:
* not producing additional garbage
* not executing unnecessary code (if cache size is below {{upperWaterMark}}, 
there's no need to evict anything)

both seem to be important for a cache. WDYT?

> ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
> -
>
> Key: SOLR-7585
> URL: https://issues.apache.org/jira/browse/SOLR-7585
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1
>Reporter: Maciej Zasada
>Priority: Minor
> Attachments: SOLR-7585.patch, SOLR-7585.patch, SOLR-7585.patch
>
>
> Under a write-heavy load {{ConcurrentLFUCache}} throws 
> {{NoSuchElementException}}. The problem lies within 
> {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a 
> race condition between the check and the call to {{markAndSweep}} method. 
> Despite that a thread must acquire a lock to perform sweeping, it's still 
> possible that multiple threads successfully detected a need for calling 
> markAndSweep. If they execute it sequentially, subsequent runs will fail with 
> {{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-23 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-7585:

Attachment: SOLR-7585.patch

Of course, latest patch contains same test compatible with Java 7. My bad, I 
tested against {{trunk}} (which is now on Java 8) instead of a 5.x branch. 
Cheers!

> ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
> -
>
> Key: SOLR-7585
> URL: https://issues.apache.org/jira/browse/SOLR-7585
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1
>Reporter: Maciej Zasada
>Priority: Minor
> Attachments: SOLR-7585.patch, SOLR-7585.patch
>
>
> Under a write-heavy load {{ConcurrentLFUCache}} throws 
> {{NoSuchElementException}}. The problem lies within 
> {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a 
> race condition between the check and the call to {{markAndSweep}} method. 
> Despite that a thread must acquire a lock to perform sweeping, it's still 
> possible that multiple threads successfully detected a need for calling 
> markAndSweep. If they execute it sequentially, subsequent runs will fail with 
> {{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-22 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-7585:

Attachment: SOLR-7585.patch

Attached a patch containing a test case + a fix that mitigates the race 
condition. Cheers!

> ConcurrentLFUCache throws NoSuchElementException under a write-heavy load
> -
>
> Key: SOLR-7585
> URL: https://issues.apache.org/jira/browse/SOLR-7585
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 5.1
>Reporter: Maciej Zasada
>Priority: Minor
> Attachments: SOLR-7585.patch
>
>
> Under a write-heavy load {{ConcurrentLFUCache}} throws 
> {{NoSuchElementException}}. The problem lies within 
> {{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a 
> race condition between the check and the call to {{markAndSweep}} method. 
> Despite that a thread must acquire a lock to perform sweeping, it's still 
> possible that multiple threads successfully detected a need for calling 
> markAndSweep. If they execute it sequentially, subsequent runs will fail with 
> {{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7585) ConcurrentLFUCache throws NoSuchElementException under a write-heavy load

2015-05-22 Thread Maciej Zasada (JIRA)
Maciej Zasada created SOLR-7585:
---

 Summary: ConcurrentLFUCache throws NoSuchElementException under a 
write-heavy load
 Key: SOLR-7585
 URL: https://issues.apache.org/jira/browse/SOLR-7585
 Project: Solr
  Issue Type: Bug
Affects Versions: 5.1
Reporter: Maciej Zasada
Priority: Minor


Under a write-heavy load {{ConcurrentLFUCache}} throws 
{{NoSuchElementException}}. The problem lies within 
{{org.apache.solr.util.ConcurrentLFUCache#put}} method, which allows for a race 
condition between the check and the call to {{markAndSweep}} method. Despite 
that a thread must acquire a lock to perform sweeping, it's still possible that 
multiple threads successfully detected a need for calling markAndSweep. If they 
execute it sequentially, subsequent runs will fail with 
{{NoSuchElementException}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6202) Simplify the use of "rq" for query re-ranking

2014-08-18 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-6202:


Attachment: SOLR-6202.patch

Hi [~hossman], [~joel.bernstein]

I'd like to summit a patch for this issue. I tried to make changes accordingly 
to your suggestions: 
* default parser for {{rq}} is {{ReRankQParser}}
* {{reRankQuery}} local param is deprecated in favour of using {{v}}
* query is parsed using {{subQuery()}}

I duplicated rerank tests so they use simplified syntax as well. If you have 
any suggestions, I'm more than happy to hear them. 

Cheers, 
Maciej

> Simplify the use of "rq" for query re-ranking
> -
>
> Key: SOLR-6202
> URL: https://issues.apache.org/jira/browse/SOLR-6202
> Project: Solr
>  Issue Type: Improvement
>Reporter: Hoss Man
> Attachments: SOLR-6202.patch
>
>
> It seems like there is some really easy tweaks we could make to the "rq" 
> (Re-Ranking Query) API to make it less verbose and easier to use:
> * in QueryComponent
> ** when parsing "rq", we should use "rerank" as the default parser
> *** should be a trivial change to the QParser.getParser when initializin 
> rqparser
> * in ReRankQParserPlugin
> ** we should deprecate/remove the "reRankQuery" local param and just use "v"
> *** this is the implicit param created when using the body of the query 
> string, or it can be explicit (to refer to variables)
> ** we should leverage subQuery() (just like NestedQParser does) so that 
> "defType" is respected when parsing the wrapped query
> Unless i'm overlooking something, the combination of these simple changes 
> should drastically simplify a lot of use cases...
> {noformat}
> BEFORE: rq={!rerank reRankQuery=$rqq}rqq=XXX
>  AFTER: rq=XXX
> BEFORE: rq={!rerank reRankQuery=$rqq reRankDocs=200}$rqq={!func}XXX
>  AFTER: rq={!reRankDocs=200 defType=func}XXX
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-16 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063425#comment-14063425
 ] 

Maciej Zasada commented on SOLR-5746:
-

Hi [~hossman],

thanks for the code review and all your suggestions. I'll keep them in mind for 
the future work. Revised patch looks good to me.

Cheers,
Maciej

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, 
> SOLR-5746.patch, SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the 
> parsing code will ignore it unless the user explicitly configures it as a 
> {{}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-15 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060685#comment-14060685
 ] 

Maciej Zasada edited comment on SOLR-5746 at 7/15/14 11:08 AM:
---

Hi [~hossman],

I've attached updated patch file:
* removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly 
modified) existing API of {{DOMUtil}} instead.
* removed reading values from {{SolrParam}} - they are being read directly from 
the {{NamedList<>}}
* added reporting of duplicated config options ({{DEBUG}} level) per parent 
node, as well as exception message containing list of duplicated parameters, 
e.g.

{code}

1
STRING-1
…
2
STRING-2


{code}
will cause an exception:
{code}
Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param]
{code}
However, if parameters with a same name are attached to different parent nodes 
everything will pass just fine, e.g.
{code}

1
STRING-1
…


…
2
STRING-2


{code}
In this case no exception will be thrown. 

Some examples to sum it up:

|{{solr.xml}} file fragment|Expected type|Parsing result|
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(/)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(/)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(x)

{{ant clean test}} shows that there's no regression.

[~jkrupan] this change clearly is not backward compatible with the existing 
{{solr.xml}} files. For instance - unknown config values won't be silently 
ignored - an exception will be thrown instead. However, I didn't realise that 
{{solr.xml}} files are versioned the same way as {{schema.xml}} files are. 
Should I bump the schema version to 1.6?

Cheers,
Maciej


was (Author: maciej.zasada):
Hi [~hossman],

I've attached updated patch file:
* removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly 
modified) existing API of {{DOMUtil}} instead.
* removed reading values from {{SolrParam}} - they are being read directly from 
the {{NamedList<>}}
* added reporting of duplicated config options ({{DEBUG}} level) per parent 
node, as well as exception message containing list of duplicated parameters, 
e.g.

{code}

1
STRING-1
…
2
STRING-2


{code}
will cause an exception:
{code}
Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param]
{code}
However, if parameters with a same name are attached to different parent nodes 
everything will pass just fine, e.g.
{code}

1
STRING-1
…


…
2
STRING-2


{code}
In this case no exception will be thrown. 

Some examples to sum it up:

|{{solr.xml}} file fragment|Expected type|Parsing result|
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(/)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(/)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(x)

{{ant clean test}} shows that there's no regression.

[~jkrupan] this change clearly is not backward compatible with the existing 
{{solr.xml}} files. For instance - unknown config values won't be silently 
ignored - an exception will be thrown instead. However, I didn't realise that 
{{solr.xml}} files are versioned the same way as {{schema.xml}} files are. 
Should I bump the schema version to 1.6?

Cheers,
Maciej

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, 
> SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@na

[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-15 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-5746:


Attachment: SOLR-5746.patch

Added some more unit test to the patch.

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, 
> SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the 
> parsing code will ignore it unless the user explicitly configures it as a 
> {{}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-14 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-5746:


Attachment: SOLR-5746.patch

Hi [~hossman],

I've attached updated patch file:
* removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly 
modified) existing API of {{DOMUtil}} instead.
* removed reading values from {{SolrParam}} - they are being read directly from 
the {{NamedList<>}}
* added reporting of duplicated config options ({{DEBUG}} level) per parent 
node, as well as exception message containing list of duplicated parameters, 
e.g.

{code}

1
STRING-1
…
2
STRING-2


{code}
will cause an exception:
{code}
Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param]
{code}
However, if parameters with a same name are attached to different parent nodes 
everything will pass just fine, e.g.
{code}

1
STRING-1
…


…
2
STRING-2


{code}
In this case no exception will be thrown. 

Some examples to sum it up:

|{{solr.xml}} file fragment|Expected type|Parsing result|
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(/)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{44}}|{{Integer}}|(x)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(/)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(/)
|{{true}}|{{Boolean}}|(x)
|{{true}}|{{Boolean}}|(x)

{{ant clean test}} shows that there's no regression.

[~jkrupan] this change clearly is not backward compatible with the existing 
{{solr.xml}} files. For instance - unknown config values won't be silently 
ignored - an exception will be thrown instead. However, I didn't realise that 
{{solr.xml}} files are versioned the same way as {{schema.xml}} files are. 
Should I bump the schema version to 1.6?

Cheers,
Maciej

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the 
> parsing code will ignore it unless the user explicitly configures it as a 
> {{}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-12 Thread Maciej Zasada (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059771#comment-14059771
 ] 

Maciej Zasada commented on SOLR-5746:
-

bq. 1) can you explain why the need for the new 
DOMUtil.readNamedChildrenAsNamedList method that you added instead of just 
using the existing DOMUtil.childNodesToNamedList
The reason why I didn’t use {{DOMUtil.childNodesToNamedList}} is that I wanted 
to move the type conversion from reading the DOM to the point, when I knew what 
type exactly should be used. If I used that method, all values would be 
upcasted to {{Object}}, and I would have to do the type checking to make sure 
if the type is correct, something like 
{code}
if (value instanceof Integer) {
 type is ok, just store the value as it is
} else if(value instanceof String) {
 type is not ok, but let's try parsing this String to something useful
}
{code} Instead, I’m storing only the "raw" values, and trying to imply the 
correct type only once, without the type checking. But when I’m thinking it all 
over again your approach seems to have a significant advantage - in proposed 
implementation type mismatch would be permitted for all types, not only 
{{}}, e.g. {{3.14}} would be perfectly valid config 
option, if {{double}} type was expected. You’re right, I’ll remove new method 
and use {{DOMUtil.childNodesToNamedList}}.
bq. 2) Speaking of which: what's the purpose exactly of "configAsSolrParams" if 
the original NamedList is still being passed to the storeConfigPropertyAs* 
methods - why not just get the values directly from there?
Since I stored "raw" config values, I used {{SolrParam}} to do the type 
conversion, but I didn’t find any API for a parameter removal. That’s why I’m 
keeping the original NamedList, so that I can remove correctly read values and 
keep track of the unknown ones.
bq. 3) One piece of validation that i believe is still missing here is to throw 
an errir if/when a config value is specified multiple times
I should’ve thought about that, good catch! I’ll add detection of such errors.

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the 
> parsing code will ignore it unless the user explicitly configures it as a 
> {{}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-10 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-5746:


Attachment: SOLR-5746.patch

Hi [~hossman],

I've attached updated patch file:
* used framework's randomisation wherever it made sense to me;
* added exception messages assertions;
* added reporting of multiple unexpended config options ({{DEBUG}} level), as 
well as exception message containing list of unknown parameters (e.g. 
{code}Unrecognised 3 config parameter(s) in solr.xml file: [foo, bar, baz]{code}
 
{{ant clean test}} shows that there's no regression.

Cheers,
Maciej

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the 
> parsing code will ignore it unless the user explicitly configures it as a 
> {{}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"

2014-07-10 Thread Maciej Zasada (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Zasada updated SOLR-5746:


Attachment: SOLR-5746.patch

Hi [~hossman],

I'd like to submit a patch for that issue. I made changes accordingly to your 
suggestions:
* parsing logic has changed, so that config parameters are transformed to the 
expected types at the parsing time, instead of value-reading time. I'm 
transforming each solr.xml section to the NamedList, and later on to the 
SolrParams; Essentially, if {{boolean}} type of {{foo}} parameter is expected, 
{{true}} will work just fine. Same goes for other types.
* exception is thrown if any unexpected values are found in the config at the 
parse time.

If you have any suggestions, I'm more than happy to hear them.

Cheers,
Maciej

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; 
> expects odd type for "shareSchema"   
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.3, 4.4, 4.5, 4.6
>Reporter: Hoss Man
> Attachments: SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing 
> that the parsing of solr.xml options here is very brittle and confusing.  In 
> particular:
> * if a boolean option "foo" is expected along the lines of {{ name="foo">true}} it will silently ignore {{ name="foo">true}}
> * likewise for an int option {{32}} vs {{ name="bar">32}}
> ... this is inconsistent with the way solrconfig.xml is parsed.  In 
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above 
> options will work in either form, but an invalid value such as {{ name="foo">NOT A BOOLEAN}} will generate an error earlier (when 
> parsing config) then {{NOT A BOOLEAN}} (attempt to 
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA, 
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the 
> parsing code will ignore it unless the user explicitly configures it as a 
> {{}}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org