[jira] [Commented] (SOLR-13787) An annotation based system to write v2 only APIs

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949918#comment-16949918
 ] 

ASF subversion and git services commented on SOLR-13787:


Commit 83c80376fa57f7218b45735dd39316684f68db4c in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=83c8037 ]

SOLR-13787: Better error logging


> An annotation based system to write v2 only APIs
> 
>
> Key: SOLR-13787
> URL: https://issues.apache.org/jira/browse/SOLR-13787
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: master (9.0), 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> An example v2 API may look as follows:
> {code:java}
> @V2EndPoint(method = POST, path = "/cluster/package", permission = 
> PermissionNameProvider.Name.ALL)
> public static class ApiTest {
>   @Command(name = "add")
>   public void add(SolrQueryRequest req, SolrQueryResponse rsp, AddVersion 
> addVersion) {
>   }
>   @Command(name = "delete")
>   public void del(SolrQueryRequest req, SolrQueryResponse rsp, List<String> names) {
>   }
> }
> public static class AddVersion {
>   @JsonProperty(value = "package", required = true)
>   public String pkg;
>   @JsonProperty(value = "version", required = true)
>   public String version;
>   @JsonProperty(value = "files", required = true)
>   public List<String> files;
> }
> {code}
> This expects you to already have a POJO annotated with Jackson annotations.
>  
> The annotations are:
>  
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target({ElementType.TYPE})
> public @interface EndPoint {
>   /** The supported HTTP methods */
>   SolrRequest.METHOD[] method();
>   /** The supported paths */
>   String[] path();
>   PermissionNameProvider.Name permission();
> }
> {code}
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target(ElementType.METHOD)
> public @interface Command {
>   /** If this is not a JSON command, leave it empty.
>    *  Keep in mind that you cannot have duplicates:
>    *  only one method per name.
>    */
>   String name() default "";
> }
> {code}
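>  
> A minimal sketch (not the actual Solr implementation) of how such an annotation-driven framework could dispatch an incoming JSON command like {"add": {...}} to the matching method. It assumes the @Command annotation defined above is on the classpath and that Jackson is used for binding:
> {code:java}
> import com.fasterxml.jackson.databind.ObjectMapper;
> import java.lang.reflect.Method;
> import java.util.HashMap;
> import java.util.Map;
> 
> public class CommandDispatcherSketch {
>   private final Object target;                      // e.g. an ApiTest instance
>   private final Map<String, Method> commands = new HashMap<>();
>   private final ObjectMapper mapper = new ObjectMapper();
> 
>   public CommandDispatcherSketch(Object target) {
>     this.target = target;
>     for (Method m : target.getClass().getMethods()) {
>       Command c = m.getAnnotation(Command.class);    // @Command as defined above
>       if (c != null) commands.put(c.name(), m);
>     }
>   }
> 
>   /** Bind the command body to the method's payload parameter type and invoke it. */
>   public void dispatch(String name, String jsonBody, Object req, Object rsp) throws Exception {
>     Method m = commands.get(name);
>     Class<?> payloadType = m.getParameterTypes()[2]; // e.g. AddVersion for the "add" command
>     Object payload = mapper.readValue(jsonBody, payloadType);
>     m.invoke(target, req, rsp, payload);             // req/rsp stand in for SolrQueryRequest/SolrQueryResponse
>   }
> }
> {code}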



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13787) An annotation based system to write v2 only APIs

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949917#comment-16949917
 ] 

ASF subversion and git services commented on SOLR-13787:


Commit 84126ea0eae452ff3cebbd5eb2b7d94573eb841e in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=84126ea ]

SOLR-13787: Better error logging


> An annotation based system to write v2 only APIs
> 
>
> Key: SOLR-13787
> URL: https://issues.apache.org/jira/browse/SOLR-13787
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: master (9.0), 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> An example v2 API may look as follows:
> {code:java}
> @V2EndPoint(method = POST, path = "/cluster/package", permission = 
> PermissionNameProvider.Name.ALL)
> public static class ApiTest {
>   @Command(name = "add")
>   public void add(SolrQueryRequest req, SolrQueryResponse rsp, AddVersion 
> addVersion) {
>   }
>   @Command(name = "delete")
>   public void del(SolrQueryRequest req, SolrQueryResponse rsp, List<String> names) {
>   }
> }
> public static class AddVersion {
>   @JsonProperty(value = "package", required = true)
>   public String pkg;
>   @JsonProperty(value = "version", required = true)
>   public String version;
>   @JsonProperty(value = "files", required = true)
>   public List<String> files;
> }
> {code}
> This expects you to already have a POJO annotated with Jackson annotations.
>  
> The annotations are:
>  
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target({ElementType.TYPE})
> public @interface EndPoint {
>   /** The supported HTTP methods */
>   SolrRequest.METHOD[] method();
>   /** The supported paths */
>   String[] path();
>   PermissionNameProvider.Name permission();
> }
> {code}
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target(ElementType.METHOD)
> public @interface Command {
>   /** If this is not a JSON command, leave it empty.
>    *  Keep in mind that you cannot have duplicates:
>    *  only one method per name.
>    */
>   String name() default "";
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-13838) igain query parser generating invalid output

2019-10-11 Thread Peter Davie (Jira)
Peter Davie created SOLR-13838:
--

 Summary: igain query parser generating invalid output
 Key: SOLR-13838
 URL: https://issues.apache.org/jira/browse/SOLR-13838
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: query parsers
Affects Versions: 8.2
 Environment: The issue is a generic Java defect and therefore will be 
independent of the operating system or software platform.
Reporter: Peter Davie
 Fix For: 8.3
 Attachments: IGainTermsQParserPlugin.java.patch

While investigating the output from the "features()" stream source, I found that 
terms are being returned with NaN for the score_f field:

    "docs": [
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "1,011.15",
    "idf_d": "-Infinity",
    "index_i": 1,
    "id": "business_1"
  },
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "10.3m",
    "idf_d": "-Infinity",
    "index_i": 2,
    "id": "business_2"
  },
  {
    "featureSet_s": "business",
    "score_f": "NaN",
    "term_s": "01",
    "idf_d": "-Infinity",
    "index_i": 3,
    "id": "business_3"
  },...

Looking into {{org/apache/solr/search/IGainTermsQParserPlugin.java}}, it seems 
that when a term is not included in the positive or negative documents, the 
docFreq calculation (docFreq = xc + nc) is 0, which means that subsequent 
calculations result in NaN (division by 0).

Attached is a patch which skips terms whose docFreq is 0 in the finish() method 
of IGainTermsQParserPlugin; this resolves the issue of NaN scores in the 
features() output.
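
A minimal, self-contained illustration (not the attached patch) of why a docFreq of 0 
produces NaN and how skipping such terms avoids it; xc and nc follow the names used above:

{code:java}
public class DocFreqGuardSketch {
  public static void main(String[] args) {
    int xc = 0, nc = 0;                        // term absent from both positive and negative docs
    int docFreq = xc + nc;                     // 0
    System.out.println((double) xc / docFreq); // prints NaN: 0.0 / 0.0 in double arithmetic
    if (docFreq == 0) {
      System.out.println("skipping term");     // what the patch does in finish(): skip the term
    }
  }
}
{code}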



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread Shalin Shekhar Mangar (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949901#comment-16949901
 ] 

Shalin Shekhar Mangar commented on SOLR-13815:
--

Probably too late for this comment but...

bq. Still... I don't think zookeeper can update multiple znodes at the same 
time, so we might still have a very small window where we see something like 
inactive/construction/construction. I'm not sure what the behavior of the 
current code would be in that case.

Actually, inactive/construction/construction is impossible because the shard 
state comes from the clusterstate, which is a single znode updated 
atomically. So the states will either be active/construction/construction, 
active/recovery/recovery, or inactive/active/active. No other state is possible.

> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread Shalin Shekhar Mangar (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949902#comment-16949902
 ] 

Shalin Shekhar Mangar commented on SOLR-13815:
--

Thanks for investigating and fixing the problem!

> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13787) An annotation based system to write v2 only APIs

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949900#comment-16949900
 ] 

ASF subversion and git services commented on SOLR-13787:


Commit 4c67f1645ea4e21d0a2fbaaa084c5433faffc751 in lucene-solr's branch 
refs/heads/branch_8_3 from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4c67f16 ]

SOLR-13787: Better error logging


> An annotation based system to write v2 only APIs
> 
>
> Key: SOLR-13787
> URL: https://issues.apache.org/jira/browse/SOLR-13787
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: master (9.0), 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> An example v2 API may look as follows:
> {code:java}
> @V2EndPoint(method = POST, path = "/cluster/package", permission = 
> PermissionNameProvider.Name.ALL)
> public static class ApiTest {
>   @Command(name = "add")
>   public void add(SolrQueryRequest req, SolrQueryResponse rsp, AddVersion 
> addVersion) {
>   }
>   @Command(name = "delete")
>   public void del(SolrQueryRequest req, SolrQueryResponse rsp, List<String> names) {
>   }
> }
> public static class AddVersion {
>   @JsonProperty(value = "package", required = true)
>   public String pkg;
>   @JsonProperty(value = "version", required = true)
>   public String version;
>   @JsonProperty(value = "files", required = true)
>   public List<String> files;
> }
> {code}
> This expects you to already have a POJO annotated with Jackson annotations.
>  
> The annotations are:
>  
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target({ElementType.TYPE})
> public @interface EndPoint {
>   /** The supported HTTP methods */
>   SolrRequest.METHOD[] method();
>   /** The supported paths */
>   String[] path();
>   PermissionNameProvider.Name permission();
> }
> {code}
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target(ElementType.METHOD)
> public @interface Command {
>   /** If this is not a JSON command, leave it empty.
>    *  Keep in mind that you cannot have duplicates:
>    *  only one method per name.
>    */
>   String name() default "";
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13760) Date Math in "start" attribute of routed alias causes exception

2019-10-11 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949855#comment-16949855
 ] 

Gus Heck commented on SOLR-13760:
-

I should add that the nature of the failure is that the assert does not see a 
change in ZooKeeper, despite the test having waited on a watch for changes to 
aliases.json. Clearly, since it passes at other times, this is a test timing issue. 
The seed does not reproduce.

> Date Math in "start" attribute of routed alias causes exception
> ---
>
> Key: SOLR-13760
> URL: https://issues.apache.org/jira/browse/SOLR-13760
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 8.3
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Fix For: 8.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The start parameter (for Time Routed Aliases and 2-Dimensional Routed Aliases 
> using time components) is meant to accept date math as well as a timestamp. 
> However it seems that none of the tests actually test this, and my changes 
> for DRA forgot to account for it in one place, so an exception is thrown 
> adding a document to an alias with such a configuration. Will add a test and 
> a fix. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13760) Date Math in "start" attribute of routed alias causes exception

2019-10-11 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949853#comment-16949853
 ] 

Gus Heck commented on SOLR-13760:
-

Noticed one test failure on fucit.org for the test added in this ticket, which 
is very, very irritating since I beasted the test with 40 simultaneous copies 
for 1000 runs before committing, completely saturating my CPU to the point 
where the machine was unusable for 2-3 hours, and yet didn't see a single failure. 
Will keep an eye on it. 

> Date Math in "start" attribute of routed alias causes exception
> ---
>
> Key: SOLR-13760
> URL: https://issues.apache.org/jira/browse/SOLR-13760
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 8.3
>Reporter: Gus Heck
>Assignee: Gus Heck
>Priority: Major
> Fix For: 8.3
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The start parameter (for Time Routed Aliases and 2-Dimensional Routed Aliases 
> using time components) is meant to accept date math as well as a timestamp. 
> However it seems that none of the tests actually test this, and my changes 
> for DRA forgot to account for it in one place, so an exception is thrown 
> adding a document to an alias with such a configuration. Will add a test and 
> a fix. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949755#comment-16949755
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit 503fe7e9a9d5e80890fa7fe63c4fd56a161d0619 in lucene-solr's branch 
refs/heads/branch_8_3 from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=503fe7e ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949757#comment-16949757
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit 503fe7e9a9d5e80890fa7fe63c4fd56a161d0619 in lucene-solr's branch 
refs/heads/branch_8_3 from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=503fe7e ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949756#comment-16949756
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit 503fe7e9a9d5e80890fa7fe63c4fd56a161d0619 in lucene-solr's branch 
refs/heads/branch_8_3 from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=503fe7e ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949752#comment-16949752
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit cc62b9fac2302b8db627490efb88482ff6bbde54 in lucene-solr's branch 
refs/heads/branch_8x from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cc62b9f ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949753#comment-16949753
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit cc62b9fac2302b8db627490efb88482ff6bbde54 in lucene-solr's branch 
refs/heads/branch_8x from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cc62b9f ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949754#comment-16949754
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit cc62b9fac2302b8db627490efb88482ff6bbde54 in lucene-solr's branch 
refs/heads/branch_8x from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cc62b9f ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949720#comment-16949720
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit a057b0d159f669d28565f48c3ee2bee76ab3d821 in lucene-solr's branch 
refs/heads/master from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a057b0d ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949721#comment-16949721
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit a057b0d159f669d28565f48c3ee2bee76ab3d821 in lucene-solr's branch 
refs/heads/master from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a057b0d ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13815) Live split can lose data

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949719#comment-16949719
 ] 

ASF subversion and git services commented on SOLR-13815:


Commit a057b0d159f669d28565f48c3ee2bee76ab3d821 in lucene-solr's branch 
refs/heads/master from Yonik Seeley
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a057b0d ]

SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards (#920)

* SOLR-13815: add simple live split test to help debugging possible issue

* SOLR-13815: fix live split data loss due to cluster state change between 
checking current shard state and getting list of subShards


> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] yonik merged pull request #920: SOLR-13815: add simple live split test to help debugging possible issue

2019-10-11 Thread GitBox
yonik merged pull request #920: SOLR-13815: add simple live split test to help 
debugging possible issue
URL: https://github.com/apache/lucene-solr/pull/920
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13827) Fail on Unknown operation in Request Parameters API

2019-10-11 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949715#comment-16949715
 ] 

Noble Paul commented on SOLR-13827:
---

I guess we should just fix this one issue and do the rewrite using annotations later.

> Fail on Unknown operation in Request Parameters API
> ---
>
> Key: SOLR-13827
> URL: https://issues.apache.org/jira/browse/SOLR-13827
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: config-api
>Reporter: Munendra S N
>Assignee: Munendra S N
>Priority: Minor
>
> The Request Parameters API supports set, update, and delete operations. For any 
> other operation, the API should fail and return an error.
> Currently, the API returns a 200 status for an unknown operation, whereas
> the config/overlay API fails on unknown operations.
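>  
> A minimal sketch (not Solr's actual request-parameters handler) of the validation this issue asks for, rejecting anything other than the three documented operations instead of silently returning 200; class and method names are illustrative:
> {code:java}
> import java.util.Arrays;
> import java.util.HashSet;
> import java.util.Set;
> 
> public class ParamsOpValidationSketch {
>   private static final Set<String> KNOWN_OPS =
>       new HashSet<>(Arrays.asList("set", "update", "delete"));
> 
>   /** Throw for unknown operations; the real handler would surface this as an error response. */
>   static void validate(String op) {
>     if (!KNOWN_OPS.contains(op)) {
>       throw new IllegalArgumentException("Unknown operation: " + op);
>     }
>   }
> }
> {code}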



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13793) HTTPSolrCall makes cascading calls even when all replicas are down for a collection

2019-10-11 Thread Kesharee Nandan Vishwakarma (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949702#comment-16949702
 ] 

Kesharee Nandan Vishwakarma commented on SOLR-13793:


Sounds good; let me know if you need any changes or a separate patch.

> HTTPSolrCall makes cascading calls even when all replicas are down for a 
> collection
> ---
>
> Key: SOLR-13793
> URL: https://issues.apache.org/jira/browse/SOLR-13793
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 6.6, master (9.0)
>Reporter: Kesharee Nandan Vishwakarma
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13793.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The REMOTEQUERY action in HTTPSolrCall ends up making too many cascading 
> remoteQuery calls when all the replicas of a collection are in the down 
> state. 
> This results in an increase in thread count, unresponsive Solr nodes, and 
> eventually the nodes hosting this collection dropping out of live nodes.
> *Example scenario*: Consider a cluster with 3 nodes (solr1, solrw1, 
> solr-overseer1). A collection is present on solr1 and solrw1, but both replicas 
> are in the down state. When a search request is made to solr-overseer1, a remote 
> query is made to solr1 since no replica is present locally (we also 
> consider inactive slices/coreUrls). solr1 also doesn't see an active replica 
> locally, so it forwards to solrw1, and solrw1 in turn forwards the request back to 
> solr1. This goes on until both solr1 and solrw1 become unresponsive. Logs for 
> this are attached.
> This is happening because we are considering [inactive 
> slices|https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L913
>  ], [inactive coreUrl| 
> https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L929]
>  while forwarding requests to nodes.
> *Steps to reproduce*:
> # Bring down all replicas of a collection but ensure the nodes containing them 
> are up.
> # Make any search call to any of the Solr nodes for this collection. 
>  
> *Possible fixes*: 
> # Ensure we select only active slices/coreUrls before making remote queries (a sketch of this option follows below)
> # Put a limit on cascading calls, probably bounded by the number of replicas 
>  
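> A hedged sketch (not the attached SOLR-13793.patch) of the first option: when choosing where to proxy, keep only active slices and active replicas that live on live nodes, and fail fast when nothing qualifies. The class and method names here are hypothetical; only the SolrJ cloud types (Slice, Replica) are assumed:
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import java.util.Set;
> import org.apache.solr.common.cloud.Replica;
> import org.apache.solr.common.cloud.Slice;
> 
> public class ActiveCoreUrlSketch {
>   /** Core URLs that are safe to forward to; an empty list means return an error instead of proxying. */
>   static List<String> activeCoreUrls(Iterable<Slice> slices, Set<String> liveNodes) {
>     List<String> urls = new ArrayList<>();
>     for (Slice slice : slices) {
>       if (slice.getState() != Slice.State.ACTIVE) continue;       // skip inactive slices
>       for (Replica replica : slice.getReplicas()) {
>         if (replica.getState() == Replica.State.ACTIVE
>             && liveNodes.contains(replica.getNodeName())) {       // only live, active replicas
>           urls.add(replica.getCoreUrl());
>         }
>       }
>     }
>     return urls;
>   }
> }
> {code}
> 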
> {noformat} 
> solrw1_1 |
> solrw1_1 | 2019-09-24 09:35:14.458 ERROR (qtp762152757-8772) [   ] 
> o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error trying 
> to proxy request for url: http://solr1:8983/solr/kg3/select
> solrw1_1 |at 
> org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:660)
> solrw1_1 |at 
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:514)
> solrw1_1 |at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> solrw1_1 |at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> solrw1_1 |at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> solrw1_1 |at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> solrw1_1 |at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> solrw1_1 |at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> solrw1_1 |at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> solrw1_1 |at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> solrw1_1 |at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> solrw1_1 |at 
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
> solrw1_1 |at 
> 

[GitHub] [lucene-solr] atris commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
atris commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r334107604
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -1691,4 +1954,180 @@ public void testBulkScorerLocking() throws Exception {
 t.start();
 t.join();
   }
+
+  public void testRejectedExecution() throws IOException {
+ExecutorService service = new TestIndexSearcher.RejectingMockExecutor();
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+
+final Query red = new TermQuery(new Term("color", "red"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true);
+
+searcher.setQueryCache(queryCache);
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// To ensure that failing ExecutorService still allows query to run
+// successfully
+
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
+
+reader.close();
+w.close();
+dir.close();
+service.shutdown();
+  }
+
+  public void testClosedReaderExecution() throws IOException {
+CountDownLatch latch = new CountDownLatch(1);
+ExecutorService service = new BlockedMockExecutor(latch);
+
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+for (int i = 0; i < 100; i++) {
+  Document doc = new Document();
+  StringField f = new StringField("color", "blue", Store.NO);
+  doc.add(f);
+  w.addDocument(doc);
+  f.setStringValue("red");
+  w.addDocument(doc);
+  f.setStringValue("green");
+  w.addDocument(doc);
+
+  if (i % 10 == 0) {
+w.commit();
+  }
+}
+
+final DirectoryReader reader = w.getReader();
+
+final Query red = new TermQuery(new Term("color", "red"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service) {
+  @Override
+  protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
+ArrayList<LeafSlice> slices = new ArrayList<>();
+for (LeafReaderContext ctx : leaves) {
+  slices.add(new LeafSlice(Arrays.asList(ctx)));
+}
+return slices.toArray(new LeafSlice[0]);
+  }
+};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true);
+
+searcher.setQueryCache(queryCache);
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// To ensure that failing ExecutorService still allows query to run
+// successfully
+
+ExecutorService tempService = new ThreadPoolExecutor(2, 2, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue<Runnable>(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+tempService.submit(new Runnable() {
+  @Override
+  public void run() {
+try {
+  Thread.sleep(100);
+  reader.close();
+} catch (Exception e) {
+  throw new RuntimeException(e.getMessage());
+}
+
+latch.countDown();
+
+  }
+});
+
+searcher.search(new ConstantScoreQuery(red), 1);
+
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
 
 Review comment:
   Hmm, yeah, it is kind of strange, since the reader definitely gets closed 
before LRUQueryCache tries to cache the value -- but the SegmentReader still 
seems to be open when the caching is attempted. (I attached a debugger and 
jumped around).
   
   Do we need to go over all LeafReaderContext instances in the associated 
searcher and manually close them for this to work the way we expect?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13778) Windows JDK SSL Test Failure trend: SSLException: Software caused connection abort: recv failed

2019-10-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949681#comment-16949681
 ] 

Chris M. Hostetter commented on SOLR-13778:
---


I just realized we're seeing a slightly _different_ SSLException from Uwe's 
java13 windows VMs...

{noformat}
   [junit4]> Throwable #1: 
org.apache.solr.client.solrj.SolrServerException: IOException occurred when 
talking to server at: https://127.0.0.1:551
21/solr
   [junit4]>at 
__randomizedtesting.SeedInfo.seed([E2C1EFE3F69FB5C6:35E9A23BE77FFC28]:0)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:679)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
   [junit4]>at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
   [junit4]>at 
org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
   [junit4]>at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1128)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:897)
   [junit4]>at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:829)
   [junit4]>at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
   [junit4]>at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:228)
   [junit4]>at 
org.apache.solr.cloud.MiniSolrCloudCluster.deleteAllCollections(MiniSolrCloudCluster.java:549)
   [junit4]>at 
org.apache.solr.cloud.TestCloudSearcherWarming.tearDown(TestCloudSearcherWarming.java:79)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]>at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]>at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]>at 
java.base/java.lang.reflect.Method.invoke(Method.java:567)
   [junit4]>at java.base/java.lang.Thread.run(Thread.java:830)
   [junit4]> Caused by: javax.net.ssl.SSLException: An established 
connection was aborted by the software in your host machine
   [junit4]>at 
java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:324)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:267)
   [junit4]>at 
java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:262)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1652)
   [junit4]>at 
java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1038)
   [junit4]>at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
   [junit4]>at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
   [junit4]>at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
   [junit4]>at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
   [junit4]>at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
   [junit4]>at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
   [junit4]>at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
   [junit4]>at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
   [junit4]>at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
   [junit4]>at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
   [junit4]>at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
   [junit4]>at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
   [junit4]>at 
org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
   [junit4]>at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
   [junit4]>at 

[GitHub] [lucene-solr] atris commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
atris commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r334101248
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -244,6 +275,213 @@ public void testLRUEviction() throws Exception {
 dir.close();
   }
 
+  public void testLRUConcurrentLoadAndEviction() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue<Runnable>(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch[] latch = {new CountDownLatch(1)};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch[0].countDown();
+  }
+};
+
+final Query blue = new TermQuery(new Term("color", "blue"));
+final Query red = new TermQuery(new Term("color", "red"));
+final Query green = new TermQuery(new Term("color", "green"));
+
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCache(queryCache);
+// the filter is not cached on any segment: no changes
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(green), 1);
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// First read should miss
+searcher.search(new ConstantScoreQuery(red), 1);
+
+
+// Let the cache load be completed
+latch[0].await();
+searcher.search(new ConstantScoreQuery(red), 1);
+
+// Second read should hit
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
 
 Review comment:
   The second search is there to test that once the value is loaded 
asynchronously -- it exists and does not trigger another load (hence the lack 
of a wait there). Removed the extra search. thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on issue #923: LUCENE-8988: Introduce Global Feature Based Early Termination For Sorted Fields

2019-10-11 Thread GitBox
atris commented on issue #923: LUCENE-8988: Introduce Global Feature Based 
Early Termination For Sorted Fields
URL: https://github.com/apache/lucene-solr/pull/923#issuecomment-541154942
 
 
   Any thoughts on this one? Seems useful enough?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949654#comment-16949654
 ] 

Adrien Grand commented on LUCENE-8920:
--

Right, this is what I had in mind, trying to reproduce the issue with values 
that look more real.

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?
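>  
> A hedged sketch (not the eventual Lucene fix; names are illustrative) of the first idea quoted above: track, per FST instance, how much extra space direct addressing is costing while building, and stop applying it once the overhead exceeds some budget:
> {code:java}
> public class DirectAddressingBudgetSketch {
>   private long directAddressingExtraBytes; // overhead of direct addressing vs. the packed layout so far
>   private long totalFstBytes;              // total bytes written to the FST so far
> 
>   void onNodeWritten(long packedBytes, long directAddressingBytes, boolean usedDirectAddressing) {
>     totalFstBytes += usedDirectAddressing ? directAddressingBytes : packedBytes;
>     if (usedDirectAddressing) {
>       directAddressingExtraBytes += directAddressingBytes - packedBytes;
>     }
>   }
> 
>   /** Keep using direct addressing only while its overhead stays under ~10% of the FST size. */
>   boolean allowDirectAddressing() {
>     return totalFstBytes == 0 || directAddressingExtraBytes * 10 < totalFstBytes;
>   }
> }
> {code}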



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949647#comment-16949647
 ] 

Michael Sokolov edited comment on LUCENE-8920 at 10/11/19 5:14 PM:
---

For posterity, this is the worst-case test that spreads out terms:
{code:java}
for (int i = 0; i < 100; ++i) {
  byte[] b = new byte[5];
  random().nextBytes(b);
  for (int j = 0; j < b.length; ++j) {
    b[j] &= 0xfc; // make this byte a multiple of 4
  }
  entries.add(new BytesRef(b));
}
buildFST(entries).ramBytesUsed();
{code}



was (Author: sokolov):
{{For posterity, this is the worst case test that spreads out terms}}

for (int i = 0; i < 100; ++i) {
   byte[] b = new byte[5];
   random().nextBytes(b);
   for (int j = 0; j < b.length; ++j)

{     b[j] &= 0xfc; // make this byte a multiple of 4   }

 entries.add(new BytesRef(b));
 }

buildFST(entries).ramBytesUsed();

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949647#comment-16949647
 ] 

Michael Sokolov edited comment on LUCENE-8920 at 10/11/19 5:13 PM:
---

{{For posterity, this is the worst case test that spreads out terms}}

for (int i = 0; i < 100; ++i) {
   byte[] b = new byte[5];
   random().nextBytes(b);
   for (int j = 0; j < b.length; ++j)

{     b[j] &= 0xfc; // make this byte a multiple of 4   }

 entries.add(new BytesRef(b));
 }

buildFST(entries).ramBytesUsed();


was (Author: sokolov):
{{For posterity, this is the worst case test that spreads out terms}}

{{}}for (int i = 0; i < 100; ++i) {
  byte[] b = new byte[5];
  random().nextBytes(b);
  for (int j = 0; j < b.length; ++j) {
    b[j] &= 0xfc; // make this byte a multiple of 4
  }
 entries.add(new BytesRef(b));
}

buildFST(entries).ramBytesUsed();

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949647#comment-16949647
 ] 

Michael Sokolov commented on LUCENE-8920:
-

For posterity, this is the worst case test that spreads out terms:
{code:java}
for (int i = 0; i < 100; ++i) {
  byte[] b = new byte[5];
  random().nextBytes(b);
  for (int j = 0; j < b.length; ++j) {
    b[j] &= 0xfc; // make this byte a multiple of 4
  }
  entries.add(new BytesRef(b));
}
buildFST(entries).ramBytesUsed();
{code}

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949635#comment-16949635
 ] 

Michael Sokolov commented on LUCENE-8920:
-

I think you had previously created a test case for this, [~jpountz], that 
demonstrated larger memory usage than we wanted. I was referring to the fact 
that it used a somewhat artificial data distribution; the main issue that 
seems to arise is from some regression tests at ES that may have a more 
realistic distribution of terms. I'm just not convinced that we need to handle 
every adversarial case.

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13817) Deprecate legacy SolrCache implementations

2019-10-11 Thread Ben Manes (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949630#comment-16949630
 ] 

Ben Manes commented on SOLR-13817:
--

You might also want to review whether atomic computations (loading through the 
cache) would provide a performance benefit. This is supported by Caffeine 
(built on {{computeIfAbsent}}) and avoids performing costly redundant work. It 
probably isn't worth the effort to implement it in the other caches if they are 
eventually removed.

For example {{RptWithGeometrySpatialField}} and {{BlockJoinParentQParser}} show 
classic patterns of a racy get-compute-put idiom:
{code}
SolrCache parentCache = request.getSearcher().getCache(CACHE_NAME);
// lazily retrieve from solr cache
Filter filter = null;
if (parentCache != null) {
  filter = (Filter) parentCache.get(parentList);
}
BitDocIdSetFilterWrapper result;
if (filter instanceof BitDocIdSetFilterWrapper) {
  result = (BitDocIdSetFilterWrapper) filter;
} else {
  result = new BitDocIdSetFilterWrapper(createParentFilter(parentList));
  if (parentCache != null) {
parentCache.put(parentList, result);
  }
}
return result;
{code}

If multiple threads require the same key then they will each observe a cache 
miss, perform the expensive call (or else why cache it?), and insert their own 
results. By using a {{computeIfAbsent}}-style call, the work is performed by one 
thread under a striped lock (hash-bin lock) while the others wait for the 
result. If the entry is already present then, in Caffeine's case, it is a 
lock-free read, so there is no locking overhead. This avoids cache stampedes, 
which can have a significant performance impact under load.
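
For illustration, a minimal sketch of the loading-cache style described above, using Caffeine directly. The key/value types and the {{expensiveCreateParentFilter}} helper are stand-ins for whatever the Solr cache actually stores (e.g. the {{BitDocIdSetFilterWrapper}} built by {{createParentFilter}} in the snippet above); this is not the Solr code itself:
{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class LoadingCacheSketch {
  // stand-in key/value types; in the real code the key would be the parent filter query
  private final Cache<String, Object> parentCache = Caffeine.newBuilder()
      .maximumSize(1_024)
      .build();

  public Object getOrCompute(String parentList) {
    // one thread computes the value under the hash-bin lock; concurrent callers
    // for the same key block until it is available instead of recomputing it
    return parentCache.get(parentList, key -> expensiveCreateParentFilter(key));
  }

  private Object expensiveCreateParentFilter(String key) {
    // placeholder for the costly work that would otherwise be done redundantly
    return new Object();
  }
}
{code}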

> Deprecate legacy SolrCache implementations
> --
>
> Key: SOLR-13817
> URL: https://issues.apache.org/jira/browse/SOLR-13817
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
>
> Now that SOLR-8241 has been committed I propose to deprecate other cache 
> implementations in 8x and remove them altogether from 9.0, in order to reduce 
> confusion and maintenance costs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949629#comment-16949629
 ] 

Chris M. Hostetter commented on SOLR-13835:
---

Jan: Maybe I'm missing something, but IIUC, in the context of how simple those 
blocks were when the code was initially added, it was reasonable for the first 
block to fall through to the second.

Back when the code was introduced in SOLR-7757:
 * If authResp.status == PROMPT: do some logging specific to the authResp, and 
add some HTTP response headers specified by the auth plugin
 * If authResp.status != OK: sendError(authResp.status)
 ** ie: it didn't matter if the authResp.status was PROMPT, or FORBIDDEN, or 
anything else ... it wasn't OK, so send an error with that status

...it's only as a result of changes introduced since then (with the addition of 
audit logging to each of the conditionals) that we now have a bug in the form 
of multiple AuditEvents when authResp is PROMPT.

IIUC: from the perspective of the external client the behavior is still 
entirely correct either way; it's only if/how an AuditLogger plugin is used, and 
what it expects, that seems to be at risk.

(Particularly since the AuthorizationPlugin API seems open enough -- ie: there is 
no fixed enum of authResponse.statusCode values -- that a custom plugin could 
return a lot of different non-200/202 error codes that the AuditLogger would all 
report as "UNAUTHORIZED".)

> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Priority: Major
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe 
> there is a code bug, not a test bug. In HttpSolrCall#471 in the 
> {{authorize()}} call, if authResponse == PROMPT, it will actually match both 
> blocks and emit two audit events: 
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
>  
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && 
> !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be 
> both a sendError and a return RETURN before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by 
> a corresponding {{UNAUTHORIZED}} AuditEvent.  
> It's not yet clear if, from the perspective of the external client, there are 
> any other bugs in behavior (TBD)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] chatman opened a new pull request #942: SOLR-13834: ZkController#getSolrCloudManager() now uses the same ZkStateReader

2019-10-11 Thread GitBox
chatman opened a new pull request #942: SOLR-13834: 
ZkController#getSolrCloudManager() now uses the same ZkStateReader
URL: https://github.com/apache/lucene-solr/pull/942
 
 
   Details in the JIRA. All tests pass.
   (FYI, without the changes to AddShardCmd and SplitShardCmd, the 
CollectionsTooManyReplicasTest was failing.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13834) ZkController#getSolrCloudManager() creates a new instance of ZkStateReader

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949619#comment-16949619
 ] 

ASF subversion and git services commented on SOLR-13834:


Commit 1a45b35baf765b4ab13bf2edf7fc664af7d6d6c4 in lucene-solr's branch 
refs/heads/jira/SOLR-13834 from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1a45b35 ]

SOLR-13834: ZkController#getSolrCloudManager() now uses the same ZkStateReader 
instance instead of instantiating a new one

ZkController#getSolrCloudManager() created a new instance of ZkStateReader, 
thereby causing mismatch in the
visibility of the cluster state and, as a result, undesired race conditions.


> ZkController#getSolrCloudManager() creates a new instance of ZkStateReader
> --
>
> Key: SOLR-13834
> URL: https://issues.apache.org/jira/browse/SOLR-13834
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Ishan Chattopadhyaya
>Priority: Major
>
> It should be reusing the existing ZkStateReader instance. Multiple 
> ZkStateReader instances have different visibility of the ZK state and cause 
> race conditions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13472) HTTP requests to a node that does not hold a core of the collection are unauthorized

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949593#comment-16949593
 ] 

ASF subversion and git services commented on SOLR-13472:


Commit b4242a1bfb418e8b1f1cedf4cf9f97e20e4cd866 in lucene-solr's branch 
refs/heads/branch_7_7 from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b4242a1 ]

SOLR-13472: Forwarded requests should skip authorization on receiving nodes


> HTTP requests to a node that does not hold a core of the collection are 
> unauthorized
> 
>
> Key: SOLR-13472
> URL: https://issues.apache.org/jira/browse/SOLR-13472
> Project: Solr
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 7.7.1, 8.0
>Reporter: adfel
>Assignee: Ishan Chattopadhyaya
>Priority: Minor
>  Labels: security
> Fix For: 8.2
>
> Attachments: SOLR-13472.patch, SOLR-13472.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When creating a collection in SolrCloud, the collection is available for queries 
> and updates through all Solr nodes, in particular nodes that do not hold 
> one of the collection's cores. This is expected behaviour that works when using 
> the SolrJ client or HTTP requests.
> When enabling authorization rules it seems that this behaviour is broken for 
> HTTP requests:
>  - executing a request against a node that holds part of the collection (a core) 
> obeys the authorization rules as expected.
>  - other nodes respond with code 403 - unauthorized request.
> SolrJ still works as expected.
> Tested both with the BasicAuthPlugin and KerberosPlugin authentication plugins.
> +Steps for reproduce:+
> 1. Create a cloud made of 2 nodes (node_1, node_2).
> 2. Configure authentication and authorization by uploading following 
> security.json file to zookeeper:
>  
> {code:java}
> {
>  "authentication": {
>"blockUnknown": true,
>"class": "solr.BasicAuthPlugin",
>"credentials": {
>  "solr": "'solr' user password_hash",
>  "indexer_app": "'indexer_app' password_hash",
>  "read_user": "'read_user' password_hash"
>}
>  },
>  "authorization": {
>"class": "solr.RuleBasedAuthorizationPlugin",
>"permissions": [
>  {
>"name": "read",
>"role": "*"
>  },
>  {
>"name": "update",
>"role": [
>  "indexer",
>  "admin"
>]
>  },
>  {
>"name": "all",
>"role": "admin"
>  }
>],
>"user-role": {
>  "solr": "admin",
>  "indexer_app": "indexer"
>}
>  }
> }{code}
>  
> 3. create 'test' collection with one shard on *node_1*.
> -- 
> The following requests are expected to succeed but return a 403 status 
> (unauthorized request):
> {code:java}
> curl -u read_user:read_user "http://node_2/solr/test/select?q=*:*"
> curl -u indexer_app:indexer_app "http://node_2/solr/test/select?q=*:*"
> curl -u indexer_app:indexer_app "http://node_2/solr/test/update?commit=true"
> {code}
>  
> Authenticated '_solr_' user requests work as expected. My guess is this is due to 
> the special '_all_' role.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8928) BKDWriter could make splitting decisions based on the actual range of values

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949591#comment-16949591
 ] 

ASF subversion and git services commented on LUCENE-8928:
-

Commit a9c77504023b3f1e0b81dbe52537fa19f4586200 in lucene-solr's branch 
refs/heads/branch_8x from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a9c7750 ]

LUCENE-8928: Compute exact bounds every N splits (#926)

When building a kd-tree for dimensions n > 2, compute exact bounds for an inner 
node every N splits to improve the quality of the tree. N is defined by 
SPLITS_BEFORE_EXACT_BOUNDS which is set to 4.
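
As a rough illustration of the idea (a standalone sketch, not the BKDWriter code; only the SPLITS_BEFORE_EXACT_BOUNDS value of 4 comes from the commit message above):
{code:java}
import java.util.Arrays;
import java.util.Comparator;

/** Sketch: recompute exact per-dimension bounds every SPLITS_BEFORE_EXACT_BOUNDS
 *  recursion levels so split decisions reflect actual value ranges without paying
 *  a full scan on every split. */
public class RecomputeBoundsSketch {
  static final int SPLITS_BEFORE_EXACT_BOUNDS = 4;

  static void build(long[][] points, int from, int to, int depth, long[] min, long[] max) {
    if (to - from <= 16) {
      return; // pretend this small range is a leaf block
    }
    if (depth % SPLITS_BEFORE_EXACT_BOUNDS == 0) {
      computeExactBounds(points, from, to, min, max); // the occasional exact pass
    }
    final int splitDim = widestDim(min, max);
    Arrays.sort(points, from, to, Comparator.comparingLong((long[] p) -> p[splitDim]));
    int mid = (from + to) >>> 1;
    long[] leftMax = max.clone();  leftMax[splitDim] = points[mid][splitDim];
    long[] rightMin = min.clone(); rightMin[splitDim] = points[mid][splitDim];
    // between exact passes the child bounds are only approximate, which is the point
    build(points, from, mid, depth + 1, min.clone(), leftMax);
    build(points, mid, to, depth + 1, rightMin, max.clone());
  }

  static void computeExactBounds(long[][] points, int from, int to, long[] min, long[] max) {
    Arrays.fill(min, Long.MAX_VALUE);
    Arrays.fill(max, Long.MIN_VALUE);
    for (int i = from; i < to; i++) {
      for (int d = 0; d < min.length; d++) {
        min[d] = Math.min(min[d], points[i][d]);
        max[d] = Math.max(max[d], points[i][d]);
      }
    }
  }

  static int widestDim(long[] min, long[] max) {
    int best = 0;
    for (int d = 1; d < min.length; d++) {
      if (max[d] - min[d] > max[best] - min[best]) best = d;
    }
    return best;
  }
}
{code}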


> BKDWriter could make splitting decisions based on the actual range of values
> 
>
> Key: LUCENE-8928
> URL: https://issues.apache.org/jira/browse/LUCENE-8928
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently BKDWriter assumes that splitting on one dimension has no effect on 
> values in other dimensions. While this may be ok for geo points, this is 
> usually not true for ranges (or geo shapes, which are ranges too). Maybe we 
> could get better indexing by re-computing the range of values on each 
> dimension before making the choice of the split dimension?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] iverase merged pull request #926: LUCENE-8928: Compute exact bounds every N splits

2019-10-11 Thread GitBox
iverase merged pull request #926: LUCENE-8928: Compute exact bounds every N 
splits
URL: https://github.com/apache/lucene-solr/pull/926
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949575#comment-16949575
 ] 

Adrien Grand commented on LUCENE-8920:
--

Ah sorry it was not clear to me this was blocking you. I should be able to make 
a standalone test that reproduces the memory usage increase.

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Michael Sokolov (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949568#comment-16949568
 ] 

Michael Sokolov commented on LUCENE-8920:
-

Fine by me. I find it too difficult to iterate on a more refined solution given 
limited access to the benchmarking tools we are using for evaluation.

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13815) Live split can lose data

2019-10-11 Thread Yonik Seeley (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-13815:

Fix Version/s: 8.3

> Live split can lose data
> 
>
> Key: SOLR-13815
> URL: https://issues.apache.org/jira/browse/SOLR-13815
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Yonik Seeley
>Priority: Major
> Fix For: 8.3
>
> Attachments: fail.191004_053129, fail.191004_093307
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> split happens while updates are flowing)
> This was discovered during the shared storage work which was based on a 
> non-release branch_8x sometime before 8.3, hence the first steps are to try 
> and reproduce on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949539#comment-16949539
 ] 

ASF subversion and git services commented on SOLR-13105:


Commit 4af0b9f46256b7ce1ce203aee6fe891f5693657f in lucene-solr's branch 
refs/heads/SOLR-13105-visual from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4af0b9f ]

SOLR-13105: Improve ML docs 21


> A visual guide to Solr Math Expressions and Streaming Expressions
> -
>
> Key: SOLR-13105
> URL: https://issues.apache.org/jira/browse/SOLR-13105
> Project: Solr
>  Issue Type: New Feature
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot 
> 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, 
> Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 
> AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png
>
>
> Visualization is now a fundamental element of Solr Streaming Expressions and 
> Math Expressions. This ticket will create a visual guide to Solr Math 
> Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* 
> visualization examples.
> It will also cover using the JDBC expression to *analyze* and *visualize* 
> results from any JDBC compliant data source.
> Intro from the guide:
> {code:java}
> Streaming Expressions exposes the capabilities of Solr Cloud as composable 
> functions. These functions provide a system for searching, transforming, 
> analyzing and visualizing data stored in Solr Cloud collections.
> At a high level there are four main capabilities that will be explored in the 
> documentation:
> * Searching, sampling and aggregating results from Solr.
> * Transforming result sets after they are retrieved from Solr.
> * Analyzing and modeling result sets using probability and statistics and 
> machine learning libraries.
> * Visualizing result sets, aggregations and statistical models of the data.
> {code}
>  
> A few sample visualizations are attached to the ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8920) Reduce size of FSTs due to use of direct-addressing encoding

2019-10-11 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949498#comment-16949498
 ] 

Adrien Grand commented on LUCENE-8920:
--

Changing the constant would work for me; I just wonder whether it would be 
easier to revert, in order to have fewer version numbers of the FST 
class to deal with in the future.

Maybe another way we could fix the worst-case memory usage while keeping the 
improved runtime would be to have the factor depend on how deep we are in the 
FST since this change is more useful on frequently accessed nodes, which are 
likely the nodes that are closer to the root?

I wouldn't want to hold the release too long because of this change so I'm 
suggesting reverting from all branches on Monday, and we can work on some of 
the options that have been mentioned above to keep the worst-case scenario more 
contained. Any objections?
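
Purely as an illustration of the "factor depends on depth" idea (the thresholds and numbers below are made up, not a proposal of actual values):
{code:java}
// Sketch only: allow more direct-addressing oversizing near the root, where nodes
// are visited most often, and little or none for deep, rarely-visited nodes.
static float allowedOversizingFactor(int depth) {
  if (depth <= 1) return 4f;
  if (depth <= 3) return 2f;
  return 1f;
}
{code}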

> Reduce size of FSTs due to use of direct-addressing encoding 
> -
>
> Key: LUCENE-8920
> URL: https://issues.apache.org/jira/browse/LUCENE-8920
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael Sokolov
>Priority: Blocker
> Fix For: 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some data can lead to worst-case ~4x RAM usage due to this optimization. 
> Several ideas were suggested to combat this on the mailing list:
> bq. I think we can improve the situation here by tracking, per-FST instance, 
> the size increase we're seeing while building (or perhaps do a preliminary 
> pass before building) in order to decide whether to apply the encoding. 
> bq. we could also make the encoding a bit more efficient. For instance I 
> noticed that arc metadata is pretty large in some cases (in the 10-20 bytes) 
> which make gaps very costly. Associating each label with a dense id and 
> having an intermediate lookup, ie. lookup label -> id and then id->arc offset 
> instead of doing label->arc directly could save a lot of space in some cases? 
> Also it seems that we are repeating the label in the arc metadata when 
> array-with-gaps is used, even though it shouldn't be necessary since the 
> label is implicit from the address?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13787) An annotation based system to write v2 only APIs

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949455#comment-16949455
 ] 

ASF subversion and git services commented on SOLR-13787:


Commit 5b6561eadb522150c8ea2954d60077ac445ad1d7 in lucene-solr's branch 
refs/heads/master from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5b6561e ]

SOLR-13787: Support for Payload as 3rd param


> An annotation based system to write v2 only APIs
> 
>
> Key: SOLR-13787
> URL: https://issues.apache.org/jira/browse/SOLR-13787
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: master (9.0), 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> example v2 API may look as follows
> {code:java}
> @V2EndPoint(method = POST, path = "/cluster/package", permission = 
> PermissionNameProvider.Name.ALL)
> public static class ApiTest {
>   @Command(name = "add")
>   public void add(SolrQueryRequest req, SolrQueryResponse rsp, AddVersion 
> addVersion) {
>   }
>   @Command(name = "delete")
>   public void del(SolrQueryRequest req, SolrQueryResponse rsp, List 
> names) {
>   }
> }
> public static class AddVersion {
>   @JsonProperty(value = "package", required = true)
>   public String pkg;
>   @JsonProperty(value = "version", required = true)
>   public String version;
>   @JsonProperty(value = "files", required = true)
>   public List files;
> }
> {code}
> This expects you to already have a POJO annotated with Jackson annotations
>  
> The annotations are:
>  
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target({ElementType.TYPE})
> public @interface EndPoint {
> /**The supported http methods*/
>   SolrRequest.METHOD[] method();
> /**supported paths*/
>   String[] path();
>   PermissionNameProvider.Name permission();
> }
> {code}
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target(ElementType.METHOD)
> public @interface Command {
>/**if this is not a json command , leave it empty.
>* Keep in mind that you cannot have duplicates.
>* Only one method per name
>*
>*/
>   String name() default "";
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13787) An annotation based system to write v2 only APIs

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949453#comment-16949453
 ] 

ASF subversion and git services commented on SOLR-13787:


Commit dcb7abfc0ee3e9ac8827bf7b0128f1249fb7fc7e in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dcb7abf ]

SOLR-13787: Added support for PayLoad as 3rd param


> An annotation based system to write v2 only APIs
> 
>
> Key: SOLR-13787
> URL: https://issues.apache.org/jira/browse/SOLR-13787
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: master (9.0), 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> example v2 API may look as follows
> {code:java}
> @V2EndPoint(method = POST, path = "/cluster/package", permission = 
> PermissionNameProvider.Name.ALL)
> public static class ApiTest {
>   @Command(name = "add")
>   public void add(SolrQueryRequest req, SolrQueryResponse rsp, AddVersion 
> addVersion) {
>   }
>   @Command(name = "delete")
>   public void del(SolrQueryRequest req, SolrQueryResponse rsp, List 
> names) {
>   }
> }
> public static class AddVersion {
>   @JsonProperty(value = "package", required = true)
>   public String pkg;
>   @JsonProperty(value = "version", required = true)
>   public String version;
>   @JsonProperty(value = "files", required = true)
>   public List files;
> }
> {code}
> This expects you to already have a POJO annotated with Jackson annotations
>  
> The annotations are:
>  
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target({ElementType.TYPE})
> public @interface EndPoint {
> /**The supported http methods*/
>   SolrRequest.METHOD[] method();
> /**supported paths*/
>   String[] path();
>   PermissionNameProvider.Name permission();
> }
> {code}
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target(ElementType.METHOD)
> public @interface Command {
>/**if this is not a json command , leave it empty.
>* Keep in mind that you cannot have duplicates.
>* Only one method per name
>*
>*/
>   String name() default "";
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13787) An annotation based system to write v2 only APIs

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949454#comment-16949454
 ] 

ASF subversion and git services commented on SOLR-13787:


Commit 71e9564e0d520449b6eeb52a6f67ede91ff091a7 in lucene-solr's branch 
refs/heads/branch_8x from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=71e9564 ]

SOLR-13787: Support for Payload as 3rd param


> An annotation based system to write v2 only APIs
> 
>
> Key: SOLR-13787
> URL: https://issues.apache.org/jira/browse/SOLR-13787
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Major
> Fix For: master (9.0), 8.3
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> example v2 API may look as follows
> {code:java}
> @V2EndPoint(method = POST, path = "/cluster/package", permission = 
> PermissionNameProvider.Name.ALL)
> public static class ApiTest {
>   @Command(name = "add")
>   public void add(SolrQueryRequest req, SolrQueryResponse rsp, AddVersion 
> addVersion) {
>   }
>   @Command(name = "delete")
>   public void del(SolrQueryRequest req, SolrQueryResponse rsp, List 
> names) {
>   }
> }
> public static class AddVersion {
>   @JsonProperty(value = "package", required = true)
>   public String pkg;
>   @JsonProperty(value = "version", required = true)
>   public String version;
>   @JsonProperty(value = "files", required = true)
>   public List files;
> }
> {code}
> This expects you to already have a POJO annotated with Jackson annotations
>  
> The annotations are:
>  
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target({ElementType.TYPE})
> public @interface EndPoint {
> /**The supported http methods*/
>   SolrRequest.METHOD[] method();
> /**supported paths*/
>   String[] path();
>   PermissionNameProvider.Name permission();
> }
> {code}
> {code:java}
> @Retention(RetentionPolicy.RUNTIME)
> @Target(ElementType.METHOD)
> public @interface Command {
>/**if this is not a json command , leave it empty.
>* Keep in mind that you cannot have duplicates.
>* Only one method per name
>*
>*/
>   String name() default "";
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-13829) RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch

2019-10-11 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein resolved SOLR-13829.
---
Fix Version/s: 8.3
   Resolution: Resolved

> RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing 
> mismatch
> -
>
> Key: SOLR-13829
> URL: https://issues.apache.org/jira/browse/SOLR-13829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Trey Grainger
>Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-13829.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In trying to use the "sort" streaming evaluator on float field (pfloat), I am 
> getting casting errors back based upon which values are calculated based upon 
> underlying values in a field.
> Example:
> *Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")
>  
> {code:java}
> {"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
> {"id": "2", "name":"cheese 
> pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}{code}
>  
> *Streaming Expression:*
>  
> {code:java}
> sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id 
> asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as 
> sim, id), by="sim desc"){code}
>  
> *Response:*
>  
> {code:java}
> { 
>   "result-set": {
> "docs": [
>   {
> "EXCEPTION": "class java.lang.Double cannot be cast to class 
> java.lang.Long (java.lang.Double and java.lang.Long are in module java.base 
> of loader 'bootstrap')",
> "EOF": true,
> "RESPONSE_TIME": 13
>   }
> ]
>   }
> }{code}
>  
>  
> This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, 
> there is a line which examines a numeric (BigDecimal) value and - regardless 
> of the type of the field the value originated from - converts it to a Long if 
> it looks like a whole number. This is the code in question from that class:
> {code:java}
> protected Object normalizeOutputType(Object value) {
> if(null == value){
>   return null;
> } else if (value instanceof VectorFunction) {
>   return value;
> } else if(value instanceof BigDecimal){
>   BigDecimal bd = (BigDecimal)value;
>   if(bd.signum() == 0 || bd.scale() <= 0 || 
> bd.stripTrailingZeros().scale() <= 0){
> try{
>   return bd.longValueExact();
> }
> catch(ArithmeticException e){
>   // value was too big for a long, so use a double which can handle 
> scientific notation
> }
>   }
>   
>   return bd.doubleValue();
> }
> ... [other type conversions]
> {code}
> Because of the *return bd.longValueExact()*; line, the calculated value for 
> "sim" in doc 1 is "Float(1)", whereas the calculated value for "sim" for doc 
> 2 is "Double(0.88938313). These are coming back as incompatible data types, 
> even though the source data is all of the same type and should be comparable.
> Thus when the *sort* evaluator streaming expression (and probably others) 
> runs on these calculated values and the list should contain ["0.88938313", 
> "1.0"], an exception is thrown because the it's trying to compare 
> incompatible data types [Double("0.99"), Long(1)].
> This bug is occurring on master currently, but has probably existed in the 
> codebase since at least August 2017.
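
A tiny standalone illustration of the branch described above, using plain {{java.math.BigDecimal}} (not Solr code), showing why a computed value of 1.0 becomes a Long while 0.88938313 stays a Double:
{code:java}
import java.math.BigDecimal;

public class NormalizeBranchDemo {
  public static void main(String[] args) {
    BigDecimal whole = new BigDecimal("1.0");
    // stripTrailingZeros() yields scale 0, so the evaluator takes the longValueExact() branch
    System.out.println(whole.stripTrailingZeros().scale()); // 0
    System.out.println(whole.longValueExact());             // 1 -> boxed as a Long

    BigDecimal fractional = new BigDecimal("0.88938313");
    // scale stays at 8, so the evaluator falls through to doubleValue()
    System.out.println(fractional.scale());                 // 8
    System.out.println(fractional.doubleValue());           // 0.88938313 -> a Double
  }
}
{code}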



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13829) RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949427#comment-16949427
 ] 

ASF subversion and git services commented on SOLR-13829:


Commit 30feba4045967a95820af670d4e8a9b02e57b536 in lucene-solr's branch 
refs/heads/branch_8_3 from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=30feba4 ]

SOLR-13829: Update CHANGES.txt


> RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing 
> mismatch
> -
>
> Key: SOLR-13829
> URL: https://issues.apache.org/jira/browse/SOLR-13829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Trey Grainger
>Priority: Major
> Attachments: SOLR-13829.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In trying to use the "sort" streaming evaluator on float field (pfloat), I am 
> getting casting errors back based upon which values are calculated based upon 
> underlying values in a field.
> Example:
> *Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")
>  
> {code:java}
> {"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
> {"id": "2", "name":"cheese 
> pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}{code}
>  
> *Streaming Expression:*
>  
> {code:java}
> sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id 
> asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as 
> sim, id), by="sim desc"){code}
>  
> *Response:*
>  
> {code:java}
> { 
>   "result-set": {
> "docs": [
>   {
> "EXCEPTION": "class java.lang.Double cannot be cast to class 
> java.lang.Long (java.lang.Double and java.lang.Long are in module java.base 
> of loader 'bootstrap')",
> "EOF": true,
> "RESPONSE_TIME": 13
>   }
> ]
>   }
> }{code}
>  
>  
> This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, 
> there is a line which examines a numeric (BigDecimal) value and - regardless 
> of the type of the field the value originated from - converts it to a Long if 
> it looks like a whole number. This is the code in question from that class:
> {code:java}
> protected Object normalizeOutputType(Object value) {
> if(null == value){
>   return null;
> } else if (value instanceof VectorFunction) {
>   return value;
> } else if(value instanceof BigDecimal){
>   BigDecimal bd = (BigDecimal)value;
>   if(bd.signum() == 0 || bd.scale() <= 0 || 
> bd.stripTrailingZeros().scale() <= 0){
> try{
>   return bd.longValueExact();
> }
> catch(ArithmeticException e){
>   // value was too big for a long, so use a double which can handle 
> scientific notation
> }
>   }
>   
>   return bd.doubleValue();
> }
> ... [other type conversions]
> {code}
> Because of the *return bd.longValueExact()*; line, the calculated value for 
> "sim" in doc 1 is "Float(1)", whereas the calculated value for "sim" for doc 
> 2 is "Double(0.88938313). These are coming back as incompatible data types, 
> even though the source data is all of the same type and should be comparable.
> Thus when the *sort* evaluator streaming expression (and probably others) 
> runs on these calculated values and the list should contain ["0.88938313", 
> "1.0"], an exception is thrown because the it's trying to compare 
> incompatible data types [Double("0.99"), Long(1)].
> This bug is occurring on master currently, but has probably existed in the 
> codebase since at least August 2017.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13829) RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing mismatch

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949422#comment-16949422
 ] 

ASF subversion and git services commented on SOLR-13829:


Commit bed9e7c47432777ff09fa8d03d435ad0e59b518a in lucene-solr's branch 
refs/heads/master from Joel Bernstein
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bed9e7c ]

SOLR-13829: Update CHANGES.txt


> RecursiveEvaluator casts Continuous numbers to Discrete Numbers, causing 
> mismatch
> -
>
> Key: SOLR-13829
> URL: https://issues.apache.org/jira/browse/SOLR-13829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Trey Grainger
>Priority: Major
> Attachments: SOLR-13829.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In trying to use the "sort" streaming evaluator on float field (pfloat), I am 
> getting casting errors back based upon which values are calculated based upon 
> underlying values in a field.
> Example:
> *Docs:* (paste each into "Documents" pane in Solr Admin UI as type:"json")
>  
> {code:java}
> {"id": "1", "name":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}
> {"id": "2", "name":"cheese 
> pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}{code}
>  
> *Streaming Expression:*
>  
> {code:java}
> sort(select(search(food_collection, q="*:*", fl="id,vector_fs", sort="id 
> asc"), cosineSimilarity(vector_fs, array(5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as 
> sim, id), by="sim desc"){code}
>  
> *Response:*
>  
> {code:java}
> { 
>   "result-set": {
> "docs": [
>   {
> "EXCEPTION": "class java.lang.Double cannot be cast to class 
> java.lang.Long (java.lang.Double and java.lang.Long are in module java.base 
> of loader 'bootstrap')",
> "EOF": true,
> "RESPONSE_TIME": 13
>   }
> ]
>   }
> }{code}
>  
>  
> This is because in org.apache.solr.client.solrj.io.eval.RecursiveEvaluator, 
> there is a line which examines a numeric (BigDecimal) value and - regardless 
> of the type of the field the value originated from - converts it to a Long if 
> it looks like a whole number. This is the code in question from that class:
> {code:java}
> protected Object normalizeOutputType(Object value) {
> if(null == value){
>   return null;
> } else if (value instanceof VectorFunction) {
>   return value;
> } else if(value instanceof BigDecimal){
>   BigDecimal bd = (BigDecimal)value;
>   if(bd.signum() == 0 || bd.scale() <= 0 || 
> bd.stripTrailingZeros().scale() <= 0){
> try{
>   return bd.longValueExact();
> }
> catch(ArithmeticException e){
>   // value was too big for a long, so use a double which can handle 
> scientific notation
> }
>   }
>   
>   return bd.doubleValue();
> }
> ... [other type conversions]
> {code}
> Because of the *return bd.longValueExact()*; line, the calculated value for 
> "sim" in doc 1 is "Float(1)", whereas the calculated value for "sim" for doc 
> 2 is "Double(0.88938313). These are coming back as incompatible data types, 
> even though the source data is all of the same type and should be comparable.
> Thus when the *sort* evaluator streaming expression (and probably others) 
> runs on these calculated values and the list should contain ["0.88938313", 
> "1.0"], an exception is thrown because the it's trying to compare 
> incompatible data types [Double("0.99"), Long(1)].
> This bug is occurring on master currently, but has probably existed in the 
> codebase since at least August 2017.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13828) Improve ExecutePlanAction error handling

2019-10-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949414#comment-16949414
 ] 

ASF subversion and git services commented on SOLR-13828:


Commit 9f9e19c2a647cb24e3ae3ec951a84112cb70ae0e in lucene-solr's branch 
refs/heads/branch_7_7 from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9f9e19c ]

SOLR-13828: Improve ExecutePlanAction error handling.


> Improve ExecutePlanAction error handling
> 
>
> Key: SOLR-13828
> URL: https://issues.apache.org/jira/browse/SOLR-13828
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 7.7.2, 8.2, 8.3
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
>
> There's a bug in {{ExecutePlanAction}} where it's possible that in some 
> situations it would create duplicate asyncId-s for events with multiple 
> operations. Unit tests probably didn't catch it because operations took 
> less time than the default task timeout of 120 sec; the 
> situation would arise if the task timeout was reached while the task was still 
> running.
> Also, error handling in ExecutePlanAction should be improved to correctly 
> throw exceptions when an operation fails to complete - it's currently possible 
> for an operation to fail yet for ExecutePlanAction to report success.
> This also raises the question of the task timeout - currently it's not 
> configurable, but it should be (it could be configured in the action properties).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13837) AuditLogger must handle V2 requests better

2019-10-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13837:
---
Description: 
Spinoff from SOLR-13741

Turns out that Audit logger does not log the body of V2 Admin API requests and 
needs a general improvement in how V2 requests are handled, i.e:
 * We do not audit log the BODY of the request (which is where the action is)
 * We do not detect what collections the request is for (so the 
AuditEvent#collections array is null)
 * The resource path is internal format {{/v2/c}} instead of {{/api/c}} 
(should we convert the prefix in the AuditEvent?)

  was:
Spinoff from SOLR-13741

Turns out that Audit logger does not log the body of V2 Admin API requests and 
needs a general improvement in how V2 requests are handled.


> AuditLogger must handle V2 requests better
> --
>
> Key: SOLR-13837
> URL: https://issues.apache.org/jira/browse/SOLR-13837
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Auditlogging
>Affects Versions: 8.2
>Reporter: Jan Høydahl
>Priority: Major
>
> Spinoff from SOLR-13741
> Turns out that Audit logger does not log the body of V2 Admin API requests 
> and needs a general improvement in how V2 requests are handled, i.e:
>  * We do not audit log the BODY of the request (which is where the action is)
>  * We do not detect what collections the request is for (so the 
> AuditEvent#collections array is null)
>  * The resource path is internal format {{/v2/c}} instead of {{/api/c}} 
> (should we convert the prefix in the AuditEvent?)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-11 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949366#comment-16949366
 ] 

Jan Høydahl commented on SOLR-13741:


Ok, uploaded yet another patch with a new test for V2 API. Discovered that the 
path is {{/v2/c}} and not {{/api/c}} as expected, so modified the ADMIN 
detection based on that.

Also, for V2 requests we are lacking in several ways:
 * We do not audit log the BODY of the request (which is where the action is)
 * We do not detect what collections the request is for (so the 
AuditEvent#collections array is null)
 * The resource path is internal format {{/v2/c}} instead of {{/api/c}} 
(should we convert the prefix in the AuditEvent?)

I spun the V2 improvements off into SOLR-13837 so as not to delay this effort.

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch, SOLR-13741.patch
>
>
> A while back I saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When I started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch-specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect.  The summary of these 
> concerns is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333896983
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if ((cost > skipCacheCost || cost > leadCost * skipCacheFactor)
+  && in.getQuery() instanceof IndexOrDocValuesQuery) {
 
 Review comment:
   This PR is mainly for IndexOrDocValuesQuery now. 
   
   As discussed earlier, the reason why IndexOrDocValuesQuery slows down is that 
a large amount of data will be read during the caching action, while only a small 
amount of data will be read from doc values when not caching. I haven't found any 
other type of query that reads much more data for caching than it really needs. 
   
   @jpountz Looking forward to more discussions if you think this PR should 
apply to all query types.
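   For context, a minimal sketch of the query shape this targets (field names, 
values, and the time window below are purely illustrative):

```java
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.SortedNumericDocValuesField;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.*;

class IndexOrDocValuesQueryShape {
  // A selective term filter combined with a wide range filter. Evaluated
  // lazily, the range clause answers from doc values and only has to verify
  // roughly leadCost candidate docs; building a cache entry instead forces
  // the points-based clause to enumerate every matching doc in the segment.
  static Query build(long from, long to) {
    Query ip = new TermQuery(new Term("ip", "1.2.3.4"));
    Query time = new IndexOrDocValuesQuery(
        LongPoint.newRangeQuery("time", from, to),
        SortedNumericDocValuesField.newSlowRangeQuery("time", from, to));
    return new BooleanQuery.Builder()
        .add(ip, BooleanClause.Occur.FILTER)
        .add(time, BooleanClause.Occur.FILTER)
        .build();
  }
}
```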


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333896831
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if ((cost > skipCacheCost || cost > leadCost * skipCacheFactor)
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without 
caching in an online ES cluster. Here is the result:
   
   | queryPattern | latencyWithoutCaching | latencyWithCaching | leadCost | rangeQueryCost | skipCacheFactor |
   | -- | :---: | :---: | :---: | :---: | :---: |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms (+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms (+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms (+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms (+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms (+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms (+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and is related to the 
final result set, while query latency with caching is much higher and is mainly 
related to _rangeQueryCost_. According to the above test, we set the default 
value of _skipCacheFactor_ to 250, which makes the query slower by no more than 
10 times.
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in 
LUCENE-8027, we add a new parameter _skipCacheCost_. The main reasons are:
   - control the time used for caching, as the caching time is related to the 
cost of the range query.
   - skip caching very large range queries, which would consume too much memory 
and evict cache entries frequently.
   
   What do you think? Looking forward to your ideas. @jpountz 
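   To make the thresholds concrete, here is a tiny worked example of the proposed 
check using the t-4h and t-8h rows above (the skipCacheCost value below is just a 
placeholder for illustration, not the actual default):

```java
// Sketch of the proposed skip-cache decision, using numbers from the table above.
long skipCacheCost = 100_000_000L; // placeholder threshold, not the real default
long skipCacheFactor = 250L;       // proposed default from this discussion
long leadCost = 20_528L;           // cost of the leading "ip:xxx" term clause

long costT4h = 4_365_870L;         // ratio ~212 -> still cached (worst case ~10x slower)
long costT8h = 8_724_483L;         // ratio ~425 -> caching is skipped

boolean skipT4h = costT4h > skipCacheCost || costT4h > leadCost * skipCacheFactor; // false
boolean skipT8h = costT8h > skipCacheCost || costT8h > leadCost * skipCacheFactor; // true
```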


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13741:
---
Attachment: SOLR-13741.patch

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, 
> SOLR-13741.patch
>
>
> A while back I saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When I started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect.  The summary of these 
> concerns is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-13741:
---
Attachment: SOLR-13741.patch

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch
>
>
> A while back I saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When I started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits about things that seem suspect.  The summary of these 
> concerns is:
> * SolrException status codes that do not match what the existing test says 
> they should (but doesn't assert)
> * extra AuditEvents occurring that the existing test does not expect
> * AuditEvents for incorrect credentials that do not at all match the expected 
> AuditEvent in the existing test -- which the current test seems to miss in 
> its assertions because it's picking up some extra events triggered by 
> previous requests earlier in the test that just happen to also match the 
> assertions.
> ...it's not clear to me if the test logic is correct and these are "code 
> bugs" or if the test is faulty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest

2019-10-11 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949321#comment-16949321
 ] 

Jan Høydahl commented on SOLR-13741:


{quote}why did the comment for a "wrong password" claim it was going to get a 
403 exception + audit log ?
{quote}
It should expect 401 for a wrong password; this was probably all confused by 
SOLR-13835 in the initial test.
{quote}Are {{/admin/info/key}} events expected when auth is enabled?

...is it ok for the test to explicitly ignore these events
{quote}
I can't recall dealing specially with this path, so muting it during tests sounds 
like the right thing to do. I guess you could argue that it should be muted by 
default by the framework, since it is a public, always-open path? 
{quote}why the _actual_ audit log received in the "wrong password" situation is 
so different (and sparse) compared to other audit log events ?
// - the resource is *JUST* '/solr'
// - note that "resource" for every other expected event in this test class 
doesn't even
// *START* with (or include) the "/solr" portion of the URL
// - event 'resource' values are typically "/admin/etc..."
// - the requestType is 'UNKNOWN'
// - as opposed to the ADMIN that the existing test expects (and seems like it 
should be correct){quote}
I will attach a new patch with some of this fixed:
 * Parsing "resource" from {{httpRequest.getPathInfo()}} instead of 
{{httpRequest.getContextPath()}}, which is always /solr.
 * Detecting {{/admin/..}} as an admin path in {{AuditEvent.findRequestType}} now 
that the resource is changed, giving requestType=ADMIN.
 * However, principal is not filled since BasicAuth failed, which I believe is 
correct. But the HTTP headers are there for inspection... It would be nice to 
have the user field in AuditEvent also in this case, but that would mean that 
AuthPlugins would need to set it on MDC or something. It would be wrong to set 
principal on the request, since that always means an authenticated user, doesn't it?
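A rough sketch of what the first two points amount to (simplified for illustration, not the exact patch):
{code:java}
// Simplified illustration, not the actual SOLR-13741 patch.
// Use the servlet path (e.g. "/admin/info/key") as the audit "resource"
// instead of the context path, which is always "/solr".
String resource = httpRequest.getPathInfo();

// Classify "/admin/..." resources as ADMIN now that the resource no longer
// carries the "/solr" prefix; anything unrecognised stays UNKNOWN.
if (resource != null && resource.startsWith("/admin/")) {
  requestType = AuditEvent.RequestType.ADMIN;
} else {
  requestType = AuditEvent.RequestType.UNKNOWN;
}
{code}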

{quote}// - this event has no solrParams at all
// - even though the httpQueryString shows it's from the CREATE test2 req{quote}
This event is generated from the {{HttpServletRequest}}, so we have no 
solrParams at this stage. In the new patch I have initialized the solrParams 
map from the httpRequest for a more consistent AuditEvent experience.

Hoss, this test is now so much better than what I managed to whip up the first 
time, thanks a ton for digging!

 

> possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
> --
>
> Key: SOLR-13741
> URL: https://issues.apache.org/jira/browse/SOLR-13741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Chris M. Hostetter
>Assignee: Chris M. Hostetter
>Priority: Major
> Attachments: SOLR-13741.patch, SOLR-13741.patch
>
>
> A while back I saw a weird non-reproducible failure from 
> AuditLoggerIntegrationTest.  When I started reading through that code, 2 
> things jumped out at me:
> # the way the 'delay' option works is brittle, and makes assumptions about 
> CPU scheduling that aren't necessarily going to be true (and also suffers 
> from the problem that Thread.sleep isn't guaranteed to sleep as long as you 
> ask it to)
> # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by 
> checking the size of a (List) {{buffer}} of received events in a sleep/poll 
> loop, until it contains at least N items -- but the code that adds items to 
> that buffer in the async Callback thread runs _before_ the code that updates 
> other state variables (like the global {{count}} and the patch specific 
> {{resourceCounts}}), meaning that a test waiting on 3 events could "see" 3 
> events added to the buffer, but calling {{assertEquals(3, 
> receiver.getTotalCount())}} could subsequently fail because that variable 
> hadn't been updated yet.
> #2 was the source of the failures I was seeing, and while a quick fix for 
> that specific problem would be to update all other state _before_ adding the 
> event to the buffer, I set out to try and make more general improvements to 
> the test:
> * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data 
> structures
> * harden the assertions made about the expected events received (updating 
> some test methods that currently just assert the number of events received)
> * add new assertions that _only_ the expected events are received.
> In the process of doing this, I've found several oddities/discrepancies 
> between things the test currently claims/asserts, and what *actually* happens 
> under more rigorous scrutiny/assertions.
> I'll attach a patch shortly that has my (in progress) updates and includes 
> copious nocommits 

[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333896448
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -244,6 +275,213 @@ public void testLRUEviction() throws Exception {
 dir.close();
   }
 
+  public void testLRUConcurrentLoadAndEviction() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch[] latch = {new CountDownLatch(1)};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch[0].countDown();
+  }
+};
+
+final Query blue = new TermQuery(new Term("color", "blue"));
+final Query red = new TermQuery(new Term("color", "red"));
+final Query green = new TermQuery(new Term("color", "green"));
+
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCache(queryCache);
+// the filter is not cached on any segment: no changes
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(green), 1);
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// First read should miss
+searcher.search(new ConstantScoreQuery(red), 1);
+
+
+// Let the cache load be completed
+latch[0].await();
+searcher.search(new ConstantScoreQuery(red), 1);
 
 Review comment:
   I think we should assert that the hit count incremented, in addition to 
searching again?
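   Something along these lines, perhaps, given that LRUQueryCache exposes 
getHitCount():

```java
// Sketch of the suggested assertion: the repeated search should now be served
// from the cache, so the cache hit count should go up.
long hitsBefore = queryCache.getHitCount();
searcher.search(new ConstantScoreQuery(red), 1);
assertEquals(hitsBefore + 1, queryCache.getHitCount());
```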


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333892366
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,6 +734,21 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
+  boolean cacheSynchronously = executor == null;
+
+  // If asynchronous caching is requested, perform the same and return
+  // the uncached iterator
+  if (cacheSynchronously == false) {
+boolean asyncCachingSucceeded;
+asyncCachingSucceeded = cacheAsynchronously(context, cacheHelper);
 
 Review comment:
   merge declaration and assignment?
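   i.e. roughly:

```java
boolean asyncCachingSucceeded = cacheAsynchronously(context, cacheHelper);
```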


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333896563
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -244,6 +275,213 @@ public void testLRUEviction() throws Exception {
 dir.close();
   }
 
+  public void testLRUConcurrentLoadAndEviction() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch[] latch = {new CountDownLatch(1)};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch[0].countDown();
+  }
+};
+
+final Query blue = new TermQuery(new Term("color", "blue"));
+final Query red = new TermQuery(new Term("color", "red"));
+final Query green = new TermQuery(new Term("color", "green"));
+
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCache(queryCache);
+// the filter is not cached on any segment: no changes
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(green), 1);
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// First read should miss
+searcher.search(new ConstantScoreQuery(red), 1);
+
+
+// Let the cache load be completed
+latch[0].await();
+searcher.search(new ConstantScoreQuery(red), 1);
+
+// Second read should hit
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+searcher.search(new ConstantScoreQuery(green), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(red, green), queryCache.cachedQueries());
+
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Arrays.asList(green, red), queryCache.cachedQueries());
 
 Review comment:
   Check that the hit count incremented?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333894810
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -244,6 +275,213 @@ public void testLRUEviction() throws Exception {
 dir.close();
   }
 
+  public void testLRUConcurrentLoadAndEviction() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch[] latch = {new CountDownLatch(1)};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch[0].countDown();
+  }
+};
+
+final Query blue = new TermQuery(new Term("color", "blue"));
+final Query red = new TermQuery(new Term("color", "red"));
+final Query green = new TermQuery(new Term("color", "green"));
+
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCache(queryCache);
+// the filter is not cached on any segment: no changes
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(green), 1);
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// First read should miss
+searcher.search(new ConstantScoreQuery(red), 1);
+
+
+// Let the cache load be completed
+latch[0].await();
+searcher.search(new ConstantScoreQuery(red), 1);
+
+// Second read should hit
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
 
 Review comment:
   shouldn't we be able to assert on this directly after the call to 
`latch[0].await();` returns?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333900618
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -1691,4 +1954,180 @@ public void testBulkScorerLocking() throws Exception {
 t.start();
 t.join();
   }
+
+  public void testRejectedExecution() throws IOException {
+ExecutorService service = new TestIndexSearcher.RejectingMockExecutor();
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+
+final Query red = new TermQuery(new Term("color", "red"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true);
+
+searcher.setQueryCache(queryCache);
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// To ensure that failing ExecutorService still allows query to run
+// successfully
+
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
+
+reader.close();
+w.close();
+dir.close();
+service.shutdown();
+  }
+
+  public void testClosedReaderExecution() throws IOException {
+CountDownLatch latch = new CountDownLatch(1);
+ExecutorService service = new BlockedMockExecutor(latch);
+
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+for (int i = 0; i < 100; i++) {
+  Document doc = new Document();
+  StringField f = new StringField("color", "blue", Store.NO);
+  doc.add(f);
+  w.addDocument(doc);
+  f.setStringValue("red");
+  w.addDocument(doc);
+  f.setStringValue("green");
+  w.addDocument(doc);
+
+  if (i % 10 == 0) {
+w.commit();
+  }
+}
+
+final DirectoryReader reader = w.getReader();
+
+final Query red = new TermQuery(new Term("color", "red"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service) {
+  @Override
+  protected LeafSlice[] slices(List leaves) {
+ArrayList slices = new ArrayList<>();
+for (LeafReaderContext ctx : leaves) {
+  slices.add(new LeafSlice(Arrays.asList(ctx)));
+}
+return slices.toArray(new LeafSlice[0]);
 
 Review comment:
   nit: with recent versions of Java I like `slices.toArray(LeafSlice[]::new);` 
better


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333892145
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -449,12 +452,8 @@ void assertConsistent() {
   }
 
   @Override
-  public Weight doCache(Weight weight, QueryCachingPolicy policy) {
-while (weight instanceof CachingWrapperWeight) {
-  weight = ((CachingWrapperWeight) weight).in;
-}
-
-return new CachingWrapperWeight(weight, policy);
+  public Weight doCache(final Weight weight, QueryCachingPolicy policy, 
Executor executor) {
+return new CachingWrapperWeight(weight, policy, executor);
 
 Review comment:
   should we keep the unwrapping?
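   For reference, keeping the unwrapping under the new signature would look 
roughly like this (the weight parameter can no longer be final):

```java
@Override
public Weight doCache(Weight weight, QueryCachingPolicy policy, Executor executor) {
  // Unwrap weights that are already cache wrappers so we never cache a wrapper.
  while (weight instanceof CachingWrapperWeight) {
    weight = ((CachingWrapperWeight) weight).in;
  }
  return new CachingWrapperWeight(weight, policy, executor);
}
```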


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333898028
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -244,6 +275,213 @@ public void testLRUEviction() throws Exception {
 dir.close();
   }
 
+  public void testLRUConcurrentLoadAndEviction() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch[] latch = {new CountDownLatch(1)};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch[0].countDown();
+  }
+};
+
+final Query blue = new TermQuery(new Term("color", "blue"));
+final Query red = new TermQuery(new Term("color", "red"));
+final Query green = new TermQuery(new Term("color", "green"));
+
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCache(queryCache);
+// the filter is not cached on any segment: no changes
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(green), 1);
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// First read should miss
+searcher.search(new ConstantScoreQuery(red), 1);
+
+
+// Let the cache load be completed
+latch[0].await();
+searcher.search(new ConstantScoreQuery(red), 1);
+
+// Second read should hit
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+searcher.search(new ConstantScoreQuery(green), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(red, green), queryCache.cachedQueries());
+
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Arrays.asList(green, red), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+
+searcher.search(new ConstantScoreQuery(blue), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(red, blue), queryCache.cachedQueries());
+
+searcher.search(new ConstantScoreQuery(blue), 1);
+assertEquals(Arrays.asList(red, blue), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+
+searcher.search(new ConstantScoreQuery(green), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(blue, green), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Arrays.asList(blue, green), queryCache.cachedQueries());
 
 Review comment:
   maybe move the call to service.shutdown() above this line and also call 
`awaitTermination` to make sure that any ongoing cache operations are done, so 
that the assertion doesn't succeed only because we got lucky with timing?
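   Roughly, using the standard ExecutorService API (the test method would need 
to handle the InterruptedException):

```java
// Drain any in-flight asynchronous caching before the final assertions so the
// expected cache state does not depend on lucky timing.
service.shutdown();
assertTrue(service.awaitTermination(10, TimeUnit.SECONDS));

searcher.setQueryCachingPolicy(NEVER_CACHE);
searcher.search(new ConstantScoreQuery(red), 1);
assertEquals(Arrays.asList(blue, green), queryCache.cachedQueries());
```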


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333898690
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -244,6 +275,213 @@ public void testLRUEviction() throws Exception {
 dir.close();
   }
 
+  public void testLRUConcurrentLoadAndEviction() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch[] latch = {new CountDownLatch(1)};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch[0].countDown();
+  }
+};
+
+final Query blue = new TermQuery(new Term("color", "blue"));
+final Query red = new TermQuery(new Term("color", "red"));
+final Query green = new TermQuery(new Term("color", "green"));
+
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCache(queryCache);
+// the filter is not cached on any segment: no changes
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(green), 1);
+assertEquals(Collections.emptyList(), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// First read should miss
+searcher.search(new ConstantScoreQuery(red), 1);
+
+
+// Let the cache load be completed
+latch[0].await();
+searcher.search(new ConstantScoreQuery(red), 1);
+
+// Second read should hit
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+searcher.search(new ConstantScoreQuery(green), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(red, green), queryCache.cachedQueries());
+
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Arrays.asList(green, red), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+
+searcher.search(new ConstantScoreQuery(blue), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(red, blue), queryCache.cachedQueries());
+
+searcher.search(new ConstantScoreQuery(blue), 1);
+assertEquals(Arrays.asList(red, blue), queryCache.cachedQueries());
+
+latch[0] = new CountDownLatch(1);
+
+searcher.search(new ConstantScoreQuery(green), 1);
+
+// Let the cache load be completed
+latch[0].await();
+assertEquals(Arrays.asList(blue, green), queryCache.cachedQueries());
+
+searcher.setQueryCachingPolicy(NEVER_CACHE);
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Arrays.asList(blue, green), queryCache.cachedQueries());
+
+reader.close();
+w.close();
+dir.close();
+service.shutdown();
+  }
+
+  public void testLRUConcurrentLoadsOfSameQuery() throws Exception {
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+ExecutorService service = new ThreadPoolExecutor(4, 4, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+ExecutorService stressService = new ThreadPoolExecutor(15, 15, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache2"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final CountDownLatch latch = new CountDownLatch(1);
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true) {
+  @Override
+  protected void onDocIdSetCache(Object readerCoreKey, long ramBytesUsed) {
+super.onDocIdSetCache(readerCoreKey, ramBytesUsed);
+latch.countDown();
+  }
+};
+
+final Query green 

[GitHub] [lucene-solr] jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache

2019-10-11 Thread GitBox
jpountz commented on a change in pull request #916: LUCENE-8213: Asynchronous 
Caching in LRUQueryCache
URL: https://github.com/apache/lucene-solr/pull/916#discussion_r333907193
 
 

 ##
 File path: lucene/core/src/test/org/apache/lucene/search/TestLRUQueryCache.java
 ##
 @@ -1691,4 +1954,180 @@ public void testBulkScorerLocking() throws Exception {
 t.start();
 t.join();
   }
+
+  public void testRejectedExecution() throws IOException {
+ExecutorService service = new TestIndexSearcher.RejectingMockExecutor();
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+Document doc = new Document();
+StringField f = new StringField("color", "blue", Store.NO);
+doc.add(f);
+w.addDocument(doc);
+f.setStringValue("red");
+w.addDocument(doc);
+f.setStringValue("green");
+w.addDocument(doc);
+final DirectoryReader reader = w.getReader();
+
+final Query red = new TermQuery(new Term("color", "red"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service);
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true);
+
+searcher.setQueryCache(queryCache);
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// To ensure that failing ExecutorService still allows query to run
+// successfully
+
+searcher.search(new ConstantScoreQuery(red), 1);
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
+
+reader.close();
+w.close();
+dir.close();
+service.shutdown();
+  }
+
+  public void testClosedReaderExecution() throws IOException {
+CountDownLatch latch = new CountDownLatch(1);
+ExecutorService service = new BlockedMockExecutor(latch);
+
+Directory dir = newDirectory();
+final RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+for (int i = 0; i < 100; i++) {
+  Document doc = new Document();
+  StringField f = new StringField("color", "blue", Store.NO);
+  doc.add(f);
+  w.addDocument(doc);
+  f.setStringValue("red");
+  w.addDocument(doc);
+  f.setStringValue("green");
+  w.addDocument(doc);
+
+  if (i % 10 == 0) {
+w.commit();
+  }
+}
+
+final DirectoryReader reader = w.getReader();
+
+final Query red = new TermQuery(new Term("color", "red"));
+
+IndexSearcher searcher = new IndexSearcher(reader, service) {
+  @Override
+  protected LeafSlice[] slices(List leaves) {
+ArrayList slices = new ArrayList<>();
+for (LeafReaderContext ctx : leaves) {
+  slices.add(new LeafSlice(Arrays.asList(ctx)));
+}
+return slices.toArray(new LeafSlice[0]);
+  }
+};
+
+final LRUQueryCache queryCache = new LRUQueryCache(2, 10, context -> 
true);
+
+searcher.setQueryCache(queryCache);
+searcher.setQueryCachingPolicy(ALWAYS_CACHE);
+
+// To ensure that failing ExecutorService still allows query to run
+// successfully
+
+ExecutorService tempService = new ThreadPoolExecutor(2, 2, 0L, 
TimeUnit.MILLISECONDS,
+new LinkedBlockingQueue(),
+new NamedThreadFactory("TestLRUQueryCache"));
+
+tempService.submit(new Runnable() {
+  @Override
+  public void run() {
+try {
+  Thread.sleep(100);
+  reader.close();
+} catch (Exception e) {
+  throw new RuntimeException(e.getMessage());
+}
+
+latch.countDown();
+
+  }
+});
+
+searcher.search(new ConstantScoreQuery(red), 1);
+
+assertEquals(Collections.singletonList(red), queryCache.cachedQueries());
 
 Review comment:
   This assertion is actually proving that the test is not working? We would 
expect that nothing gets cached, since the reader is already closed by the time 
that the executor needs to cache the query?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333896983
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if ((cost > skipCacheCost || cost > leadCost * skipCacheFactor)
+  && in.getQuery() instanceof IndexOrDocValuesQuery) {
 
 Review comment:
   This PR is mainly for IndexOrDocValuesQuery now. 
   
   As discussed earlier, the reason why IndexOrDocValuesQuery slows down is that 
a large amount of data will be read during the caching action, while only a small 
amount of data will be read from doc values when not caching. I haven't found any 
other type of query that reads much more data for caching than it really needs. 
   
   @jpountz Looking forward to more discussions if you think this PR should 
apply to all query types.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333896831
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if ((cost > skipCacheCost || cost > leadCost * skipCacheFactor)
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without 
caching in an online ES cluster. Here is the result:
   
   | queryPattern | latencyWithoutCaching | latencyWithCaching | leadCost | rangeQueryCost | skipCacheFactor |
   | -- | :---: | :---: | :---: | :---: | :---: |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms (+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms (+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms (+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms (+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms (+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms (+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and is related to the 
final result set, while query latency with caching is much higher and is mainly 
related to _rangeQueryCost_. According to the above test, we set the default 
value of _skipCacheFactor_ to 250, which makes the query slower by no more than 
10 times.
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in 
LUCENE-8027, we add a new parameter _skipCacheCost_. The main reasons are:
   - control the time used for caching, as the caching time is related to the 
cost of the range query.
   - skip caching very large range queries, which would consume too much memory 
and evict cache entries frequently.
   
   What do you think? Looking forward to your ideas. @jpountz 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333815178
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
+  && in.getQuery() instanceof IndexOrDocValuesQuery) {
 
 Review comment:
   This PR is mainly for IndexOrDocValuesQuery now. 
   
   As discussed earlier, the reason why IndexOrDocValuesQuery slows down is that 
a large amount of data will be read during the caching action, while only a small 
amount of data will be read from doc values when not caching. I haven't found any 
other type of query that reads much more data for caching than it really needs. 
   
   @jpountz Looking forward to more discussions if you think this PR should 
apply to all query types.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333851594
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without 
caching in an online ES cluster. Here is the result:
   
   | queryPattern | latencyWithoutCaching | latencyWithCaching | leadCost | rangeQueryCost | skipCacheFactor |
   | -- | :---: | :---: | :---: | :---: | :---: |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms (+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms (+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms (+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms (+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms (+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms (+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and is related to the 
final result set, while query latency with caching is much higher and is mainly 
related to _rangeQueryCost_. According to the above test, we set the default 
value of _skipCacheFactor_ to 250, which makes the query slower by no more than 
10 times.
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in 
LUCENE-8027, we add a new parameter _skipCacheCost_. The main reasons are:
   - control the time used for caching, as the caching time is related to the 
cost of the range query.
   - skip caching very large range queries, which would consume too much memory 
and evict cache entries frequently.
   
   What do you think? Looking forward to your ideas. @jpountz 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333851594
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext 
context) throws IOExcepti
 
   if (docIdSet == null) {
 if (policy.shouldCache(in.getQuery())) {
-  docIdSet = cache(context);
-  putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+  final ScorerSupplier supplier = in.scorerSupplier(context);
+  if (supplier == null) {
+putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+return null;
+  }
+
+  final long cost = supplier.cost();
+  return new ScorerSupplier() {
+@Override
+public Scorer get(long leadCost) throws IOException {
+  // skip cache operation which would slow query down too much
+  if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without 
cacheing in an online ES cluster. Here is the result:
   
   | queryPattern  | latencyWithoutCaching  | latencyWithCaching | leadCost | rangeQueryCost  | skipCacheFactor |
   | -- | :---:  | :---: | :---:  | :---: | :---:  |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms(+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms(+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms(+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms(+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms(+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms(+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and mainly depends on the size of the final result set. Query latency with caching is much higher and mainly depends on _rangeQueryCost_. According to the above test, we set the default value of _skipCacheFactor_ to 250, which makes a query slower by no more than a factor of 10.
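
   To make the numbers concrete, here is a rough sketch of the proposed skip decision applied to the 1h and 48h rows. The `skipCacheCost` value below is only an assumed placeholder, not a number taken from this PR:
   
   ```java
   // Rough sketch of the skip-cache check described above, not the patch itself.
   class SkipCacheSketch {
     static final long SKIP_CACHE_FACTOR = 250;       // proposed default
     static final long SKIP_CACHE_COST = 1_000_000;   // assumed placeholder threshold
   
     static boolean skipCaching(long rangeQueryCost, long leadCost) {
       return rangeQueryCost > SKIP_CACHE_COST
           && rangeQueryCost > leadCost * SKIP_CACHE_FACTOR;
     }
   
     public static void main(String[] args) {
       long leadCost = 20_528;  // cost of the ip:xxx term query in the table
       // 1h range: 878_979 < 20_528 * 250 = 5_132_000 -> false, keep caching
       System.out.println(skipCaching(878_979, leadCost));
       // 48h range: 52_310_616 exceeds both thresholds -> true, skip caching
       System.out.println(skipCaching(52_310_616, leadCost));
     }
   }
   ```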
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in LUCENE-8027, we add a new parameter _skipCacheCost_. The main reasons are:
   - to bound the time spent on caching, since caching time is related to the cost of the range query.
   - to skip caching very large range queries, which would consume too much memory and cause frequent cache evictions.
   
   What do you think? Looking forward to your ideas. @jpountz 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333851594
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException
 
       if (docIdSet == null) {
         if (policy.shouldCache(in.getQuery())) {
-          docIdSet = cache(context);
-          putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+          final ScorerSupplier supplier = in.scorerSupplier(context);
+          if (supplier == null) {
+            putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+            return null;
+          }
+
+          final long cost = supplier.cost();
+          return new ScorerSupplier() {
+            @Override
+            public Scorer get(long leadCost) throws IOException {
+              // skip cache operation which would slow query down too much
+              if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without caching in an online metric ES cluster. The result is as follows:
   
   | queryPattern  | latencyWithoutCaching  | latencyWithCaching | leadCost | rangeQueryCost  | skipCacheFactor |
   | -- | :---:  | :---: | :---:  | :---: | :---:  |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms(+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms(+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms(+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms(+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms(+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms(+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and mainly depends on the size of the final result set, while query latency with caching is much higher and mainly depends on _rangeQueryCost_. According to the above test, we set the default value of _skipCacheFactor_ to 250, which makes a query slower by no more than a factor of 10.
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in LUCENE-8027, we have added a new parameter _skipCacheCost_. The main reasons are:
   - to bound the time spent on caching, since caching time is related to the cost of the range query.
   - to skip caching very large range queries, which would consume too much memory and cause frequent cache evictions.
   
   What do you think? Looking forward to your ideas. @jpountz 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333815178
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException
 
       if (docIdSet == null) {
         if (policy.shouldCache(in.getQuery())) {
-          docIdSet = cache(context);
-          putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+          final ScorerSupplier supplier = in.scorerSupplier(context);
+          if (supplier == null) {
+            putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+            return null;
+          }
+
+          final long cost = supplier.cost();
+          return new ScorerSupplier() {
+            @Override
+            public Scorer get(long leadCost) throws IOException {
+              // skip cache operation which would slow query down too much
+              if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
+                  && in.getQuery() instanceof IndexOrDocValuesQuery) {
 
 Review comment:
   This PR is mainly for IndexOrDocValuesQuery now. 
   
   As discussed earlier, the reason why IndexOrDocValuesQuery slows down is that a large amount of data has to be read during the caching action, while only a small amount of data is read from doc values when not caching. I haven't found any other type of query that reads much more data for caching than it really needs.
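
   For reference, a minimal sketch of the query shape from the table above (field names and time bounds are made up for illustration; this is not code from the patch). The cheap term clause leads, so when the range clause is not cached it can be answered lazily from doc values instead of materializing the full points-based bitset:
   
   ```java
   import org.apache.lucene.document.LongPoint;
   import org.apache.lucene.document.SortedNumericDocValuesField;
   import org.apache.lucene.index.Term;
   import org.apache.lucene.search.BooleanClause;
   import org.apache.lucene.search.BooleanQuery;
   import org.apache.lucene.search.IndexOrDocValuesQuery;
   import org.apache.lucene.search.Query;
   import org.apache.lucene.search.TermQuery;
   
   class IpTimeQuerySketch {
     // ip:xxx AND time:[t-48h, t] -- the pattern used in the table above.
     static Query ipAndTimeRange(long t) {
       long tMinus48h = t - 48L * 3600 * 1000;
       Query time = new IndexOrDocValuesQuery(
           LongPoint.newRangeQuery("time", tMinus48h, t),                         // points execution
           SortedNumericDocValuesField.newSlowRangeQuery("time", tMinus48h, t));  // doc-values execution
       return new BooleanQuery.Builder()
           .add(new TermQuery(new Term("ip", "xxx")), BooleanClause.Occur.MUST)   // cheap lead clause
           .add(time, BooleanClause.Occur.MUST)
           .build();
     }
   }
   ```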
   
   @jpountz Looking forward to more discussion on whether this PR should apply to all query types.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9004) Approximate nearest vector search

2019-10-11 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949250#comment-16949250
 ] 

Adrien Grand commented on LUCENE-9004:
--

Pretty cool! I don't know HNSW so I can't comment on that part but it made me 
wonder about a couple things:
* +1 to per-segment structure and rebuild graphs when merging.
* You hacked doc-value formats for this POC, but I guess your end idea would 
have a dedicated file-format (in the Lucene API sense) to support this, e.g. 
VectorFileFormat, like we have PostingsFormat or PointsFormat?
* You added a TODO about supporting ints and floats; I worry this would 
complicate things too much. Supporting only floats has a great advantage: you 
can compute distances with doubles and never have to worry about overflows or 
underflows (see the small sketch after this list). This would be much more 
challenging if we supported doubles. Regarding ints, codecs could optimize for 
the case when no dimension has a fractional part (bfloat16 is another type that 
we might want to optimize for).
* You said there is "no Query implementation", but I suspect getting one will 
be challenging with the current Query API, which requires ordered iterators of 
doc IDs and accepts arbitrary filters. So if you were to intersect with a 
selective filter, you wouldn't be able to know up-front how many 
nearest neighbors you'd need to filter. Something like LongDistanceFeatureQuery 
or LatLonPointDistanceFeatureQuery, which further filters documents as more 
documents get collected, would be nice, but this sounds very challenging with 
high numbers of dimensions?
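
For the float-only point above, a small sketch (not code from the POC) of why accumulating in double is safe: squared differences of float inputs can never overflow a double, so distances stay finite without any special handling.

{code:java}
// Squared Euclidean distance over float vectors, accumulated in double.
// Worst case per term is (2 * Float.MAX_VALUE)^2 ~ 4.6e77, far below
// Double.MAX_VALUE (~1.8e308); with double inputs the same product could overflow.
static double squaredDistance(float[] a, float[] b) {
  double sum = 0;
  for (int i = 0; i < a.length; i++) {
    double diff = (double) a[i] - b[i];  // widen before subtracting and multiplying
    sum += diff * diff;
  }
  return sum;
}
{code}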

> Approximate nearest vector search
> -
>
> Key: LUCENE-9004
> URL: https://issues.apache.org/jira/browse/LUCENE-9004
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Michael Sokolov
>Priority: Major
>
> "Semantic" search based on machine-learned vector "embeddings" representing 
> terms, queries and documents is becoming a must-have feature for a modern 
> search engine. SOLR-12890 is exploring various approaches to this, including 
> providing vector-based scoring functions. This is a spinoff issue from that.
> The idea here is to explore approximate nearest-neighbor search. Researchers 
> have found an approach based on navigating a graph that partially encodes the 
> nearest neighbor relation at multiple scales can provide accuracy > 95% (as 
> compared to exact nearest neighbor calculations) at a reasonable cost. This 
> issue will explore implementing HNSW (hierarchical navigable small-world) 
> graphs for the purpose of approximate nearest vector search (often referred 
> to as KNN or k-nearest-neighbor search).
> At a high level the way this algorithm works is this. First assume you have a 
> graph that has a partial encoding of the nearest neighbor relation, with some 
> short and some long-distance links. If this graph is built in the right way 
> (has the hierarchical navigable small world property), then you can 
> efficiently traverse it to find nearest neighbors (approximately) in log N 
> time where N is the number of nodes in the graph. I believe this idea was 
> pioneered in  [1]. The great insight in that paper is that if you use the 
> graph search algorithm to find the K nearest neighbors of a new document 
> while indexing, and then link those neighbors (undirectedly, ie both ways) to 
> the new document, then the graph that emerges will have the desired 
> properties.
> The implementation I propose for Lucene is as follows. We need two new data 
> structures to encode the vectors and the graph. We can encode vectors using a 
> light wrapper around {{BinaryDocValues}} (we also want to encode the vector 
> dimension and have efficient conversion from bytes to floats). For the graph 
> we can use {{SortedNumericDocValues}} where the values we encode are the 
> docids of the related documents. Encoding the interdocument relations using 
> docids directly will make it relatively fast to traverse the graph since we 
> won't need to lookup through an id-field indirection. This choice limits us 
> to building a graph-per-segment since it would be impractical to maintain a 
> global graph for the whole index in the face of segment merges. However 
> graph-per-segment is a very natural fit at search time - we can traverse each 
> segments' graph independently and merge results as we do today for term-based 
> search.
> At index time, however, merging graphs is somewhat challenging. While 
> indexing we build a graph incrementally, performing searches to construct 
> links among neighbors. When merging segments we must construct a new graph 
> containing elements of all the merged segments. Ideally we would somehow 
> preserve the work done when building the initial graphs, but at least as a 
> start I'd propose we construct a new graph 

[jira] [Commented] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-11 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949225#comment-16949225
 ] 

Jan Høydahl commented on SOLR-13835:


The first if block was introduced back in 2015 as part of SOLR-7757. 
[~noble.paul] why doesn't that if block return? It will *always* fall through 
and trigger the next if block!
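
A sketch of the short-circuit being suggested (based on the snippet quoted below; this is not the committed fix, and the exact error-message and audit plumbing is elided):

{code:java}
if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {
  // ... log the REJECTED AuditEvent and set the authentication challenge headers ...
  response.sendError(authResponse.statusCode, "Authentication required");
  return RETURN; // short-circuit so we never reach the UNAUTHORIZED block below
}
if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED)
    && !(authResponse.statusCode == HttpStatus.SC_OK)) {
  // ... log the UNAUTHORIZED AuditEvent ...
}
{code}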

> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Priority: Major
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe 
> there is a code bug, not a test bug. In HttpSolrCall#471 in the 
> {{authorize()}} call, if authResponse == PROMPT, it will actually match both 
> blocks and emit two audit events: 
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
>  
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && 
> !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be 
> both a sendError and a return RETURN before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by 
> a corresponding {{UNAUTHORIZED}} AuditEvent.  
> It's not yet clear if, from the perspective of the external client, there are 
> any other bugs in behavior (TBD)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT

2019-10-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-13835:
--

Assignee: (was: Jan Høydahl)

> HttpSolrCall produces incorrect extra AuditEvent on 
> AuthorizationResponse.PROMPT
> 
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Authentication, Authorization
>Reporter: Chris M. Hostetter
>Priority: Major
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe 
> there is a code bug, not a test bug. In HttpSolrCall#471 in the 
> {{authorize()}} call, if authResponse == PROMPT, it will actually match both 
> blocks and emit two audit events: 
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
>  
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && 
> !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be 
> both a sendError and a return RETURN before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by 
> a corresponding {{UNAUTHORIZED}} AuditEvent.  
> It's not yet clear if, from the perspective of the external client, there are 
> any other bugs in behavior (TBD)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333851594
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException
 
       if (docIdSet == null) {
         if (policy.shouldCache(in.getQuery())) {
-          docIdSet = cache(context);
-          putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+          final ScorerSupplier supplier = in.scorerSupplier(context);
+          if (supplier == null) {
+            putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+            return null;
+          }
+
+          final long cost = supplier.cost();
+          return new ScorerSupplier() {
+            @Override
+            public Scorer get(long leadCost) throws IOException {
+              // skip cache operation which would slow query down too much
+              if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without caching in an online metric ES cluster. The result is as follows:
   
   | queryPattern  | latencyWithoutCaching  | latencyWithCaching | leadCost | rangeQueryCost  | skipCacheFactor |
   | -- | :---:  | :---: | :---:  | :---: | :---:  |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms(+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms(+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms(+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms(+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms(+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms(+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and mainly depends on the size of the final result set, while query latency with caching is much higher and mainly depends on _rangeQueryCost_.
   
   We set the default value of _skipCacheFactor_ to 250, which makes a query slower by no more than a factor of 10.
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in LUCENE-8027, we have added a new parameter _skipCacheCost_. The main reasons are:
   - to bound the time spent on caching, since caching time is related to the cost of the range query.
   - to skip caching very large range queries, which would consume too much memory and cause frequent cache evictions.
   
   What do you think? Looking forward to your ideas. @jpountz 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries

2019-10-11 Thread GitBox
jgq2008303393 commented on a change in pull request #940: LUCENE-9002: Query 
caching leads to absurdly slow queries
URL: https://github.com/apache/lucene-solr/pull/940#discussion_r333851594
 
 

 ##
 File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java
 ##
 @@ -732,8 +741,39 @@ public ScorerSupplier scorerSupplier(LeafReaderContext context) throws IOException
 
       if (docIdSet == null) {
         if (policy.shouldCache(in.getQuery())) {
-          docIdSet = cache(context);
-          putIfAbsent(in.getQuery(), docIdSet, cacheHelper);
+          final ScorerSupplier supplier = in.scorerSupplier(context);
+          if (supplier == null) {
+            putIfAbsent(in.getQuery(), DocIdSet.EMPTY, cacheHelper);
+            return null;
+          }
+
+          final long cost = supplier.cost();
+          return new ScorerSupplier() {
+            @Override
+            public Scorer get(long leadCost) throws IOException {
+              // skip cache operation which would slow query down too much
+              if (cost > skipCacheCost && cost > leadCost * skipCacheFactor
 
 Review comment:
   We have tested different scenarios to observe the query latency with/without caching in an online metric ES cluster. The result is as follows:
   
   | query  | latencyWithoutCaching  | latencyWithCaching | leadCost | rangeQueryCost  | skipCacheFactor |
   | -- | :---:  | :---: | :---:  | :---: | :---:  |
   | ip:xxx AND time:[t-1h, t] | 10ms | 36ms(+260%) | 20528 | 878979 | 42 |
   | ip:xxx AND time:[t-4h, t] | 10ms | 100ms(+900%) | 20528 | 4365870 | 212 |
   | ip:xxx AND time:[t-8h, t] | 11ms | 200ms(+1700%) | 20528 | 8724483 | 425 |
   | ip:xxx AND time:[t-12h, t] | 12ms | 300ms(+2400%) | 20528 | 13083096 | 637 |
   | ip:xxx AND time:[t-24h, t] | 16ms | 500ms(+3000%) | 20528 | 26158936 | 1274 |
   | ip:xxx AND time:[t-48h, t] | 30ms | 1200ms(+3900%) | 20528 | 52310616 | 2548 |
   
   As the table shows, query latency without caching is low and mainly depends on the size of the final result set, while query latency with caching is much higher and mainly depends on _rangeQueryCost_.
   
   We set the default value of _skipCacheFactor_ to 250, which makes a query slower by no more than a factor of 10.
   
   In addition to _skipCacheFactor_, which is similar to _maxCostFactor_ in LUCENE-8027, we have added a new parameter _skipCacheCost_. The main reasons are:
   - to bound the time spent on caching, since caching time is related to the cost of the range query.
   - to skip caching very large range queries, which would consume too much memory and cause frequent cache evictions.
   
   What do you think? Looking forward to your ideas. @jpountz 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] treygrainger opened a new pull request #941: SOLR-13836: Add 'streaming_expression' QParser

2019-10-11 Thread GitBox
treygrainger opened a new pull request #941: SOLR-13836: Add 
'streaming_expression' QParser
URL: https://github.com/apache/lucene-solr/pull/941
 
 
   # Description
   
   It is currently possible to hit the search handler in a streaming expression 
("search(...)"), but it is not currently possible to invoke a streaming 
expression from within a regular search within the search handler. In some 
cases, it would be useful to leverage the power of streaming expressions to 
generate a result set and then join that result set with a normal set of search 
results. This likely won't be particularly efficient for high cardinality 
streaming expression results, but it will be a pretty powerful feature that could 
enable a bunch of use cases that aren't possible today within a normal search.
   
   See https://issues.apache.org/jira/browse/SOLR-13836 for usage information.
   
   # Solution
   
   The current solution adds a StreamingExpressionQParserPlugin, which executes a streaming expression and joins the returned tuples with the main docset on an id field. The field name used from the streaming expression tuples can be overridden ("f" param), as can the method of joining ("method" param). 
   
   # Usage
   *Docs:*
   
   ```
   curl -X POST -H "Content-Type: application/json" http://localhost:8983/solr/food_collection/update?commit=true --data-binary '
   [
   {"id": "1", "name_s":"donut","vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]},
   {"id": "2", "name_s":"apple juice","vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]},
   {"id": "3", "name_s":"cappuccino","vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]},
   {"id": "4", "name_s":"cheese pizza","vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]},
   {"id": "5", "name_s":"green tea","vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]},
   {"id": "6", "name_s":"latte","vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]},
   {"id": "7", "name_s":"soda","vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]},
   {"id": "8", "name_s":"cheese bread sticks","vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]},
   {"id": "9", "name_s":"water","vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]},
   {"id": "10", "name_s":"cinnamon bread sticks","vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]}
   ]'
   ```
   
    
   
   *Query:*
   ```
   http://localhost:8983/solr/food/select?q=*:*&fq={!streaming_expression}top(select(search(food,%20q=%22*:*%22,%20fl=%22id,vector_fs%22,%20sort=%22id%20asc%22),%20cosineSimilarity(vector_fs,%20array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0))%20as%20cos,%20id),%20n=5,%20sort=%22cos%20desc%22)&fl=id,name_s
   ```
   
    
   
   *Response:*
   ```
   {
 "responseHeader":{
   "zkConnected":true,
   "status":0,
   "QTime":7,
   "params":{
 "q":"*:*",
 "fl":"id,name_s",
 "fq":"{!streaming_expression}top(select(search(food, q=\"*:*\", 
fl=\"id,vector_fs\", sort=\"id asc\"), cosineSimilarity(vector_fs, 
array(5.2,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), n=5, sort=\"cos desc\")"}},
 "response":{"numFound":5,"start":0,"docs":[
 {
   "name_s":"donut",
   "id":"1"},
 {
   "name_s":"apple juice",
   "id":"2"},
 {
   "name_s":"cheese pizza",
   "id":"4"},
 {
   "name_s":"cheese bread sticks",
   "id":"8"},
 {
   "name_s":"cinnamon bread sticks",
   "id":"10"}]
 }}
   ```
   
   # Tests
   
   No tests written yet. First draft.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I am authorized to contribute this code to the ASF and have removed 
any code I do not have a license to distribute.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [ ] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org